Server Admin Log/Archive 34

From Wikitech

2018-04-30

  • 23:38 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.DiffPage.init.js: T192755 (duration: 00m 59s)
  • 23:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Set $wgKartographerUsePageLanguage to false everywhere (T192955) (duration: 00m 59s)
  • 23:33 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/AbuseFilter/includes/AbuseFilter.php: Fix notices when disallowing edits (duration: 00m 59s)
  • 23:21 catrope@tin: Synchronized wmf-config/: Use internal cluster for SPARQL services (T192942) (duration: 01m 02s)
  • 23:14 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Config cleanup patches from SWAT (duration: 01m 00s)
  • 23:05 mutante: ores1001: rm -rf /srv/deployment/ores/venv/ (T193422)
  • 21:46 ebernhardson: T192972 increase eqiad elasticsearch disk watermarks from 75/80 to 85/85
  • 20:27 arlolra: Updated Parsoid to 50b0588 (T186358, T191700, T192909)
  • 20:22 awight@tin: Started deploy [ores/deploy@bf182e2]: ORES: Include bot edits in precaching wikidata itemquality; T187927
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@d8d7b42]: Updating Parsoid to 50b0588 (duration: 09m 46s)
  • 20:20 awight@tin: Finished deploy [ores/deploy@5b27205]: Rollback ores1001 to master (duration: 02m 56s)
  • 20:19 bsitzmann@tin: Finished deploy [mobileapps/deploy@d3724d2]: Update mobileapps to cc00cae (T191869) (duration: 07m 32s)
  • 20:17 awight@tin: Started deploy [ores/deploy@5b27205]: Rollback ores1001 to master
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@d3724d2]: Update mobileapps to cc00cae (T191869)
  • 20:11 arlolra@tin: Started deploy [parsoid/deploy@d8d7b42]: Updating Parsoid to 50b0588
  • 20:09 awight@tin: Finished deploy [ores/deploy@4601497]: Trial LFS deployment to ORES canary; T181678 (take 2) (duration: 02m 10s)
  • 20:06 awight@tin: Started deploy [ores/deploy@4601497]: Trial LFS deployment to ORES canary; T181678 (take 2)
  • 20:06 ppchelko@tin: Finished deploy [changeprop/deploy@8cd45ed]: Don't filter bots from the ORES stream T187927 (duration: 01m 15s)
  • 20:05 ppchelko@tin: Started deploy [changeprop/deploy@8cd45ed]: Don't filter bots from the ORES stream T187927
  • 19:10 awight@tin: Finished deploy [ores/deploy@25579e7]: Trial LFS deployment to ORES canary; T181678 (duration: 02m 06s)
  • 19:08 awight@tin: Started deploy [ores/deploy@25579e7]: Trial LFS deployment to ORES canary; T181678
  • 19:01 mutante: hafnium - sudo service navtiming stop; sudo service statsv stop - downtimed in icinga, decom
  • 18:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T191584 (duration: 01m 00s)
  • 18:23 awight@tin: Finished deploy [ores/deploy@5b27205]: Rollback ORES canary to master (duration: 00m 21s)
  • 18:22 awight@tin: Started deploy [ores/deploy@5b27205]: Rollback ORES canary to master
  • 18:17 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/CodeMirror/resources/modules/ve-cm/ve.ui.CodeMirrorAction.js: T191923 (duration: 01m 00s)
  • 18:16 ottomata: starting rolling reimage of kafka main-eqiad brokers kafka100[123] - T192832
  • 18:06 awight@tin: Finished deploy [ores/deploy@46824bb]: Canary-only test deployment for ORES + git-lfs, T181678 (take 2) (duration: 01m 58s)
  • 18:04 awight@tin: Started deploy [ores/deploy@46824bb]: Canary-only test deployment for ORES + git-lfs, T181678 (take 2)
  • 17:41 awight@tin: Finished deploy [ores/deploy@8c586ab]: Canary-only test deployment for ORES + git-lfs, T181678 (duration: 01m 59s)
  • 17:39 ariel@tin: Finished deploy [dumps/dumps@8398f53]: write checksums of dump files into separate hashfiles, reusing their contents as appropriate (duration: 00m 03s)
  • 17:39 ariel@tin: Started deploy [dumps/dumps@8398f53]: write checksums of dump files into separate hashfiles, reusing their contents as appropriate
  • 17:39 awight@tin: Started deploy [ores/deploy@8c586ab]: Canary-only test deployment for ORES + git-lfs, T181678
  • 17:26 gehel: restart blazegraph and updater on wdqs1003 to activate UseNUMA -T193365
  • 17:15 gehel@tin: Finished deploy [wdqs/wdqs@2579bfa]: deploying wdqs gui (duration: 04m 16s)
  • 17:11 gehel@tin: Started deploy [wdqs/wdqs@2579bfa]: deploying wdqs gui
  • 17:10 gehel: removing stale scap log for wdqs on tin.eqiad.wmnet
  • 16:50 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch LocalRenameUserJob to EventBus for all wikis - T193254 T190327 (duration: 00m 59s)
  • 16:50 ppchelko@tin: Finished deploy [cpjobqueue/deploy@01630f2]: Switch LocalRenameUserJob for all wikis. T193254 (duration: 00m 49s)
  • 16:49 ppchelko@tin: Started deploy [cpjobqueue/deploy@01630f2]: Switch LocalRenameUserJob for all wikis. T193254
  • 15:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 and db1069 with full weight (duration: 00m 59s)
  • 15:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 and db1069 with low load (duration: 00m 59s)
  • 14:31 jynus: shutting down db1056 for upgrade/maintenance and cloning
  • 14:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Move db1069 from s7 to x1, depool db1056 (duration: 00m 59s)
  • 14:27 elukey: upgrade druid on druid100[1-3] from 0.9.2 to 0.10
  • 14:26 marostegui: Power off db2081 for HW maintenance - T193325
  • 14:17 gehel: rolling restart blazegraph on all wdqs nodes for new configuration - T192759
  • 13:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 after alter table (duration: 00m 59s)
  • 13:40 zeljkof: EU SWAT finished
  • 13:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow bureaucrats to remove flood group for real, allow flooders to strip the group from them (T193350) (duration: 00m 59s)
  • 13:30 zfilipin@tin: Synchronized php-1.32.0-wmf.1/extensions/AbuseFilter: SWAT: Don't use an empty string for block parameters (T189681) (duration: 01m 02s)
  • 13:30 marostegui: Poweroff db1098 for HW maintenance - T193331
  • 13:26 marostegui: Stop MySQL on db1098 - T193331
  • 13:21 ottomata: beginning rolling reimage of kafka200[23] to stretch T192832
  • 13:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RCPatrol in cswiki (T193242) (duration: 00m 59s)
  • 13:16 marostegui: Drop unused _old tables from a few wikis - https://phabricator.wikimedia.org/T54932#4167221
  • 13:13 gehel: restarting elasticsearch codfw rolling restart for plugin update and NUMA config - T191543 / T191236
  • 13:11 elukey: reimage analytics1049 and 1050 to Debian Stretch
  • 13:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Datetime Selector on Special:Block on all wikis except Meta, MediaWiki, and German Wikipedia (T192962) (duration: 01m 00s)
  • 12:48 arturo: aborrero@labtestnet2001:~ $ sudo rm /var/log/upstart/nova-api.log.1 <--- disk full, logrotate refuses to work bc that
  • 10:34 vgutierrez: Updating puppet compiler facts
  • 10:30 vgutierrez: Repool (Re-enable BGP) lvs3001 - T191897
  • 10:06 elukey: restart hdfs namenode on analytics1002 to pick up new heap settings (last step of the maintenance)
  • 10:00 elukey: set analytics1001 as active HDFS Namenode using manual failover
  • 09:50 elukey: restart HDFS Namenode on analytics1001 (current standby) again with Xmx/Xms set to 8g
  • 09:47 elukey: restart HDFS Namenode on analytics1001 (current standby)
  • 09:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060, fully pool db1090 (duration: 00m 59s)
  • 09:15 ariel@tin: Finished deploy [dumps/dumps@a6baf69]: do not update existing rss feed file if the dump job it covers is more recent than the one for which a feed is requested (duration: 00m 04s)
  • 09:15 ariel@tin: Started deploy [dumps/dumps@a6baf69]: do not update existing rss feed file if the dump job it covers is more recent than the one for which a feed is requested
  • 09:03 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 59s)
  • 09:01 vgutierrez: Depool and reimage lvs3001 as stretch - T191897
  • 08:39 marostegui: Deploy schema change on db1076 - T191519 T188299 T190148
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 00m 59s)
  • 08:38 elukey: restart HDFS namenode on analytics1001 (standby master) to pick up new JVM settings - T193257
  • 08:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 00s)
  • 08:23 godog: swift eqiad-prod more weight to ms-be104[0-3] - T191896
  • 08:16 elukey: force a manual failover of the HDFS Namenode from analytics1001 to analytics1002 to test new GC Settings - T193257
  • 08:15 vgutierrez: Repool (Re-enable BGP) in lvs3002 - T191897
  • 08:02 jynus: stopping replication on both db1090 db instances to finish maintenance
  • 07:33 jynus: restarting dbstore1001@s1 to apply config change
  • 07:31 elukey: restart HDFS namenode on analytics1002 (standby master) to pick up new JVM settings - T193257
  • 07:06 marostegui: Restart replication on db1095:s3
  • 07:05 marostegui: Temporary stop replication on db1095:s3
  • 06:48 vgutierrez: Depool and reimage lvs3002 - T191897
  • 06:11 marostegui: Drop table edit_page_tracking from s3 - T57385
  • 06:04 marostegui: Drop table edit_page_tracking from s2 - T57385
  • 05:59 marostegui: Drop table edit_page_tracking from s1 - T57385
  • 05:50 marostegui: Drop table edit_page_tracking from s4, s5 and s7 - T57385
  • 05:47 marostegui: Drop table edit_page_tracking from s6 - T57385
  • 05:28 marostegui: Deploy schema change on db1074 - T191519 T188299 T190148
  • 05:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 09s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.1) (duration: 08m 18s)

2018-04-29

  • 17:46 brion: rebuilding image metadata for PDFs on commons on terbium

2018-04-28

  • 23:42 volans@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098 (crashed) (duration: 01m 01s)
  • 15:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2081, crashed (duration: 01m 00s)
  • 05:19 apergos: reimaged snapshot1005 to stretch

2018-04-27

  • 22:45 mutante: mw2171,mw2172,mw2173 ff. - reinstalling with stretch and raid1-LVM
  • 22:07 hashar: Running quibble-vendor-mysql-php70-docker against ~ 900 MediaWiki extensions. Triggered with a custom gear-client.py script from contint1001. PID 29710
  • 19:58 tgr: T193254 ran fixStuckGlobalRename.php for: Aliya klein Hasselb Husseinzadeh02 Jswf845 Lorraine Fgr Mikeypugs0134 Ncanty STEEEPGlobal Sunlight me THOR Global Defense Group TPBox Zenas Gao אֲבִי גְדוֹר ぽっぽ大将軍
  • 18:16 mutante: mw2167,mw2168,mw2169 - reinstalling with stretch and raid1-lvm
  • 16:26 imarlier@tin: Finished deploy [performance/navtiming@c059a60]: Deploying navtiming.py with support for enable/disable via etcd (duration: 00m 05s)
  • 16:26 imarlier@tin: Started deploy [performance/navtiming@c059a60]: Deploying navtiming.py with support for enable/disable via etcd
  • 16:19 imarlier@tin: Finished deploy [statsv/statsv@d5108c4]: Update statsv to force the Kafka broker API version (duration: 00m 05s)
  • 16:19 imarlier@tin: Started deploy [statsv/statsv@d5108c4]: Update statsv to force the Kafka broker API version
  • 14:23 anomie: Running populateRevisionLength.php on group 2 for T192189
  • 13:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 after alter table (duration: 00m 59s)
  • 11:41 moritzm: reimaging mwdebug2002 to stretch
  • 11:21 Amir1: ladsgroup@terbium:/var/log/wikidata$ mwscript updateCollation.php --wiki=fawiki --previous-collation=xx-uca-fa
  • 11:13 moritzm: installing uwsgi/Django security updates on graphite hosts in eqiad
  • 10:39 moritzm: installing uwsgi/Django security updates on graphite2001
  • 09:53 moritzm: reimaging mwdebug1001 to stretch
  • 08:58 elukey: reimage analytics10[51,53] to Debian Stretch
  • 08:46 moritzm: installing mysql 5.5 security update (distro-packaged version) on trusty
  • 08:14 moritzm: reimaging mwdebug2001 to stretch
  • 07:32 godog: swift eqiad-prod more weight to ms-be104[0-3] - T190081
  • 05:31 marostegui: Deploy schema change on db1105:3312 - T191519 T188299 T190148
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 00m 59s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113 after alter table (duration: 01m 10s)
  • 05:08 cwd: killed some dedupe queries on staging that were causing alerts

2018-04-26

  • 23:31 reedy@tin: Synchronized php-1.32.0-wmf.1/extensions/PdfHandler/: (no justification provided) (duration: 01m 00s)
  • 23:16 reedy@tin: Synchronized php-1.32.0-wmf.1/extensions/UploadWizard/: (no justification provided) (duration: 01m 00s)
  • 23:10 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 00s)
  • 22:44 ebernhardson: start test measuring elasticsearch master mutation latency in codfw
  • 22:38 Jeff_Green: deployed DNS update for frbast1001.wikimedia.org
  • 22:21 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/429100/ (duration: 01m 00s)
  • 22:11 maxsem@tin: Finished scap: Deploy ACW to test wikis, https://gerrit.wikimedia.org/r/429017 / T192455 (duration: 57m 06s)
  • 21:14 maxsem@tin: Started scap: Deploy ACW to test wikis, https://gerrit.wikimedia.org/r/429017 / T192455
  • 21:13 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/429017 (duration: 00m 59s)
  • 21:05 maxsem@tin: Synchronized php-1.32.0-wmf.1/extensions/ArticleCreationWorkflow/: https://gerrit.wikimedia.org/r/#/c/429111/ (duration: 01m 00s)
  • 20:29 hashar: contint1001: cleaned up old Docker images produced by docker-pkg
  • 20:09 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.1
  • 18:12 ottomata: reimaging (some?) kafka200* codfw main kafka nodes to stretch T192832
  • 17:27 awight@tin: Finished deploy [ores/deploy@5b27205]: ORES: update to revscoring 2.2.2, T192917 (duration: 21m 20s)
  • 17:09 ottomata: applying compression_type=snappy to eventbus service kafka producer
  • 17:05 awight@tin: Started deploy [ores/deploy@5b27205]: ORES: update to revscoring 2.2.2, T192917
  • 17:00 moritzm: installing systemd SUA update for stretch
  • 16:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Fix comment, test scap (duration: 01m 12s)
  • 16:03 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Use EventBus for most jobs for test wikis - T190327 (duration: 01m 15s)
  • 16:03 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bf34e00]: Enable all jobs for test, test2, testwikidata and mediawiki. T190327 (duration: 00m 51s)
  • 16:02 ppchelko@tin: Started deploy [cpjobqueue/deploy@bf34e00]: Enable all jobs for test, test2, testwikidata and mediawiki. T190327
  • 15:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1090 as multiinstance (duration: 01m 16s)
  • 15:36 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1090 as multiinstance (duration: 01m 17s)
  • 15:18 mutante: added LDAP user tschumann to "nda" group (T192549)
  • 14:53 ppchelko@tin: Finished deploy [changeprop/deploy@f2f7a84]: Commit offsets for non matched messages from time to time. (duration: 01m 26s)
  • 14:51 ppchelko@tin: Started deploy [changeprop/deploy@f2f7a84]: Commit offsets for non matched messages from time to time.
  • 14:26 anomie: Running populateRevisionLength.php on group 1 for T192189
  • 14:25 jynus: stop db1069 for cloning it away
  • 13:58 marostegui: Compress enwiki on db1116:3311 - T190704
  • 13:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069, repool db1086 (duration: 01m 16s)
  • 13:35 zeljkof: EU SWAT finished
  • 13:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change chapcomwikis logo, add HD logo for chapcomwiki (T193024) (duration: 01m 16s)
  • 13:30 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change chapcomwikis logo, add HD logo for chapcomwiki (T193024) (duration: 01m 16s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add all Hindi projects plus meta as import sources for hiwikimedia (T188366) (duration: 01m 17s)
  • 13:09 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Fix pixelization of new wiki logos (T193028) (duration: 01m 17s)
  • 12:53 marostegui: Deploy schema change on db1113:3312 - T191519 T188299 T190148
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113 for alter table (duration: 01m 33s)
  • 12:51 gehel: reindexing lost updates on elasticsearch - T193112
  • 12:04 mobrovac@tin: Finished deploy [cpjobqueue/deploy@7fbb152]: Support the exclude_topics config stanza (duration: 01m 12s)
  • 12:03 mobrovac@tin: Started deploy [cpjobqueue/deploy@7fbb152]: Support the exclude_topics config stanza
  • 10:35 moritzm: reimaging mw1312 mw1317, mw1339 (API servers) to stretch
  • 10:29 moritzm: reimaging mw1269, mw1323, mw1324 (app servers) to stretch
  • 09:57 marostegui: Drop prefswitch_survey on s1 - T173439
  • 09:50 godog: eqiad-prod: more weight to ms-be104[0-3] for container/account - T190081
  • 09:45 marostegui: Drop prefswitch_survey on s3 - T173439
  • 09:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 with low load (duration: 01m 16s)
  • 09:30 marostegui: Drop prefswitch_survey on s7 - T173439
  • 09:16 marostegui: Drop prefswitch_survey on s2 - T173439
  • 09:15 mark: Temp disabling cr1-ulsfo:xe-1/2/0 (Chicago transport) due to stability issues
  • 09:13 marostegui: Drop prefswitch_survey on s4 - T173439
  • 09:02 marostegui: Drop prefswitch_survey on s5 and s6 - T173439
  • 09:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 (duration: 01m 16s)
  • 08:51 moritzm: reimaging mw1320, mw1321, mw1322 (app servers) to stretch
  • 08:32 moritzm: re-attempt reimage of mw1246 (failed yesterday with an error on the puppetmaster, testing whether this can be reproduced)
  • 08:24 jynus: stop and upgrade db1109
  • 07:58 marostegui: Deploy schema change on db1090 - T191519 T188299 T190148
  • 07:45 jynus: stopping db1090 mariadb instance to move its path, port and socket
  • 07:21 gehel: restarting redis masters in codfw - T193112
  • 07:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090, pool db1122 with full weight (duration: 01m 23s)
  • 07:16 gehel: re-enabling puppet on rdb2* - T193112
  • 06:19 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=elasticsearch
  • 05:18 marostegui: Deploy schema change on dbstore1002:s2 - T191519 T188299 T190148
  • 04:39 ebernhardson: unfreeze writes to elasticsearch codfw cluster
  • 03:54 _joe_: stopping redis replication from eqiad to codfw for the jobqueue cluster, we have an issue ongoing with CirrusSearch jobs and replication is broken
  • 03:41 ejegg: re-enabled ingenico recurring charge job
  • 02:05 mutante: mw2163 through mw2166: since the wmf-auto-reimage failed after OS but before puppet run due to "Failed to puppet_generate_certs" I manually logged in with install-console and signed puppet certs (T174431)

2018-04-25

  • 22:55 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Undeploy GlobalPreferences T184121 (duration: 01m 16s)
  • 22:21 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalPreferences T189806 (duration: 01m 18s)
  • 21:03 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.1
  • 21:01 demon@tin: Synchronized php: symlink bump (duration: 01m 16s)
  • 20:58 hasharAway: on tin: rebased php-1.31.0-wmf.30 for https://gerrit.wikimedia.org/r/#/c/429018/
  • 20:21 XioNoX: remove test VIP for eqiad ping offload server - T190090
  • 20:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@5a4a282]: Config: Start up to 4 workers in parallel during start-up (duration: 06m 48s)
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@5a4a282]: Config: Start up to 4 workers in parallel during start-up
  • 19:39 otto@tin: Finished deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/ (duration: 01m 45s)
  • 19:37 otto@tin: Started deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 19:12 urandom: altering timeline tables for 6 month TTL -- T192689
  • 19:11 otto@tin: Finished deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/ (duration: 00m 11s)
  • 19:11 otto@tin: Started deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 19:09 otto@tin: Started deploy [eventlogging/eventbus@f562c1b]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 18:55 imarlier@tin: Finished deploy [performance/coal@1e79c79]: deploy fix for coal-web (duration: 00m 06s)
  • 18:55 imarlier@tin: Started deploy [performance/coal@1e79c79]: deploy fix for coal-web
  • 18:16 ejegg: updated CiviCRM from 219798b2c5 to 47197006d5
  • 17:35 urandom: starting cleanups on row 'a' Cassandra nodes -- T189822
  • 17:33 mepps: update civicrm from 6ddeb167ec to 219798b2c5
  • 17:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change fawiki uca to the right one (duration: 01m 17s)
  • 17:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on frwikiquote T192301 (duration: 01m 17s)
  • 17:00 mutante: powercycling wdqs1004
  • 16:09 mutante: re-imaging mw2258, mw2163, mw2164 ff.
  • 15:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1122, db1090 with low load (duration: 01m 14s)
  • 15:22 anomie: Running populateRevisionLength.php on group 0 for T192189
  • 15:05 ottomata: temp disabling puppet, applying ipv6 mapped on kafka200*
  • 15:04 andrewbogott: adding labvirt1016 to the nova-compute scheduling pool
  • 14:37 elukey: restart hive-server2 on analytics1003 to pick up settings in https://gerrit.wikimedia.org/r/428919
  • 14:34 akosiaris: reboot bohrium T150532
  • 14:33 ema: cp3030: upgrade varnish to 5.1.3-1wm7 T192368
  • 14:12 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 T193025 (duration: 01m 16s)
  • 13:57 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: pool poolcounter1003 T187297 (duration: 01m 16s)
  • 13:53 Amir1: EU SWAT is done!
  • 13:53 ladsgroup@tin: Synchronized php-1.32.0-wmf.1/extensions/Wikibase/lib/includes/Changes: Make sure statements in EntityDiffChangedAspects are not passed around as stdClass (T192085) (duration: 01m 16s)
  • 13:49 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1001 T150532 (duration: 01m 16s)
  • 13:43 ladsgroup@tin: Synchronized php-1.31.0-wmf.30/extensions/Wikibase/lib/includes/Changes: Make sure statements in EntityDiffChangedAspects are not passed around as stdClass (T192085) (duration: 01m 17s)
  • 13:40 akosiaris: reboot poolcounter1001 for T150532
  • 13:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Mapframe for bgwiki (T192895) (duration: 01m 15s)
  • 13:23 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for cswiki Wikipedia event (T192898) (duration: 01m 16s)
  • 13:19 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1001 T150532 (duration: 01m 17s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for cswiki Wikipedia event (T192898) (duration: 01m 16s)
  • 13:12 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Remove xx-uca-fa for Persian Wikis except Wikipedia (duration: 01m 17s)
  • 13:06 marostegui: Deploy schema change on s2 codfw master (db2035) - this will generate lag on codfw - T191519 T188299 T190148
  • 12:55 gehel: starting elasticsearch codfw rolling restart for plugin update and NUMA config - T191543 / T191236
  • 12:47 akosiaris: reboot puppetdb1001 for T150532
  • 12:08 moritzm: reimaging mw1251, mw1252, mw1253 (app servers) to stretch
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1122 (duration: 01m 16s)
  • 11:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1122 (duration: 03m 24s)
  • 11:19 moritzm: reimaging mw1228, mw1229, mw1230 (API servers) to stretch
  • 10:29 jynus: stopping replication, running optimize table on dbstore2001:s8
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 (duration: 01m 16s)
  • 09:58 elukey: reimage analytics106[1,2] to Debian Stretch
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 after alter table (duration: 01m 30s)
  • 09:09 jynus: stopping db1090 for maintenance
  • 08:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 01m 17s)
  • 08:38 marostegui: Drop user_old and user_temp tables from s3 - T172664
  • 08:23 godog: eqiad-prod: add ms-be104[0-3] with minimal weight - T190081
  • 08:23 moritzm: reimaging mw1247, mw1248, mw1249 (app servers) to stretch
  • 07:35 marostegui: Deploy schema change on db1085 with replication (this will generate lag on labsdb hosts on s6) - T191519 T188299 T190148
  • 07:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 01m 16s)
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 after alter table (duration: 01m 16s)
  • 07:05 akosiaris: starting a very slow rolling reboot of all VMs on codfw ganeti cluster, row_C nodegroup, excluding poolcounter1001 and puppetdb1001. T150532
  • 06:53 moritzm: reimaging mw1314, mw1315, mw1316 (API servers) to stretch
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 21s)
  • afk: disabled ingenico recurring donation charge job
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.30) (duration: 07m 23s)
  • 02:52 ejegg: turned fundraising queue consumers back on
  • 01:31 ejegg: disabled fundraising queue consumer jobs
  • 00:31 demon@tin: Synchronized multiversion/defines.php: rm unused defines (duration: 01m 16s)

2018-04-24

  • 23:33 legoktm@tin: Synchronized php-1.32.0-wmf.1/extensions/Kartographer/includes/Tag/MapFrame.php: MapFrame: Allow lang="local" to be passed (duration: 01m 17s)
  • 23:29 urandom: starting Cassandra bootstrap, restbase1010-c -- T189822
  • 23:08 mutante: mw2242.codfw , mw2255.codfw et al.. more stretch reinstalls going on
  • 23:04 demon@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: unbreak multiversion loading for a totally useless script (duration: 01m 16s)
  • 22:55 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 01m 18s)
  • 22:53 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Fix wgTidyConfig and restore proper tidy & Remex config - T192855 (duration: 01m 16s)
  • 21:56 mutante: adding LDAP user 'bitpogo' to group 'wmde' (T191523)
  • 21:23 ejegg: re-enabled recurring donations queue consumer
  • 20:55 demon@tin: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.1
  • 20:27 urandom: starting Cassandra bootstrap, restbase1010-b -- T189822
  • 20:23 Dereckson: Run namespaceDupes on gorwiki
  • 20:03 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for all wikis but wikitech - T191464 (duration: 01m 26s)
  • 19:53 bblack: prometheus-fail switched to UNKNOWNs for now in https://gerrit.wikimedia.org/r/#/c/428725/ - may want to look at this further later, intent is to reduce odds of debilitating ops spam for the evening.
  • 19:49 elukey: re-enable ircecho
  • 19:40 demon@tin: Finished scap: bootstrap 1.32.0-wmf.1 (duration: 106m 55s)
  • 19:36 elukey: stop ircecho on einsteinium - icinga shower
  • 19:17 jgleeson: Updating civicrm from 142edbb90b to 6ddeb167ec
  • 18:54 ottomata: temp disabling puppet and applying profile::kafka::broker on kafka100* T192831
  • 17:53 demon@tin: Started scap: bootstrap 1.32.0-wmf.1
  • 17:52 gehel: restarting wdqs-updater on all nodes for prometheus jmx exporter update - T192768
  • 17:51 andrew@tin: Synchronized wmf-config/db-eqiad.php: Renaming 'm5' section to 'wikitech' for T189542, two of two (duration: 00m 59s)
  • 17:49 andrew@tin: Synchronized wmf-config/db-codfw.php: Renaming 'm5' section to 'wikitech' for T189542, one of two (duration: 00m 59s)
  • 17:42 ottomata: temp disabling puppet on kafka200* to apply profile::kafka::broker in main-codfw T192831
  • 17:39 demon@tin: Pruned MediaWiki: 1.31.0-wmf.29 [keeping static files] (duration: 06m 28s)
  • 17:35 XioNoX: removing firewall block on cr1/2-codfw - T175361
  • 17:35 XioNoX: removing firewall block on cr1-eqdfw - T175361
  • 17:29 bstorm_: added MCR tables to labsdb1009 (slots, slot_roles, content_models, content)
  • 17:04 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% [deploy to restbase1010] - T192689 (duration: 02m 04s)
  • 17:02 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% [deploy to restbase1010] - T192689
  • 17:01 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689 (duration: 05m 27s)
  • 16:57 urandom: starting Cassandra bootstrap, restbase1010-a -- T189822
  • 16:55 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689
  • 16:52 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689 (duration: 11m 40s)
  • 16:45 marostegui: Deploy schema change on db1113:3316 - T191519 T188299 T190148
  • 16:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 for alter table (duration: 00m 58s)
  • 16:40 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3316 after alter table (duration: 00m 58s)
  • 16:30 elukey: restart hadoop hdfs journalnode on analytics1035/52 to pick up prometheus jmx settings
  • 16:11 elukey: restart hadoop-hdfs-journalnode on analytics1028 to pick up prometheus monitoring
  • 16:10 bstorm_: Added views for new MCR tables on labsdb1011 (slots, slot_roles, content and content_models)
  • 16:08 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011 - https://phabricator.wikimedia.org/T184446
  • 15:59 godog: reimage restbase1010 after ssd swap - T189822
  • 15:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with full weight (duration: 00m 58s)
  • 14:41 elukey: restart hadoop hdfs journalnode on analytics1028 to pick up jmx settings
  • 14:40 sbisson@tin: Finished deploy [kartotherian/deploy@86da82d]: Deploy latest kartotherian with updated fallbacks and support lang=local (duration: 06m 29s)
  • 14:34 sbisson@tin: Started deploy [kartotherian/deploy@86da82d]: Deploy latest kartotherian with updated fallbacks and support lang=local
  • 14:02 Amir1: EU SWAT is done
  • 14:01 hoo@tin: Synchronized wmf-config/abusefilter.php: Grant Meta-Wiki sysops the ability to edit global abusefilter rules (T192722) (duration: 00m 59s)
  • 13:58 hoo@tin: Synchronized wmf-config/: Properly set default for $wmgWikibaseSiteGroup (T188456) (duration: 00m 58s)
  • 13:56 hoo@tin: Synchronized wmf-config/: Properly set default for $wmgWikibaseSiteGroup (T188456) (duration: 01m 00s)
  • 13:43 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Increase the timespan of rate limit in wikidata from 1m to 5m (T192690) (duration: 00m 58s)
  • 13:37 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up old config for logging autopatrol actions (T184485) (duration: 00m 58s)
  • 13:28 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Add badge for good lists (T190976) (duration: 00m 55s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for IndigenizeWikipedia event, clean obsolete rules (T192827) (duration: 00m 58s)
  • 13:06 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Set default for $wmgWikibaseSiteGroup (T188456) (duration: 00m 59s)
  • 12:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 58s)
  • 12:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 load (duration: 00m 58s)
  • 12:28 elukey: cleanup /home/elukey/zookeeper backup files (taken before the 3.4.9 migration) on conf*
  • 12:13 marostegui: Deploy schema change on db1098:3316 - T191519 T188299 T190148
  • 12:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 58s)
  • 12:10 elukey: reimage analytics106[34] to Debian Stretch
  • 12:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 58s)
  • 11:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low load (duration: 00m 59s)
  • 11:44 moritzm: reimaging mw1241, mw1242, mw1243 (app servers) to stretch
  • 10:58 moritzm: reimaging mw1224, mw1225, mw1226 (API servers) to stretch
  • 10:50 elukey: reimage analytics106[56] to Debian Stretch
  • 10:49 arturo: enable puppet in labtestcontrol2001 to sync with repo changes
  • 10:39 akosiaris: starting a very slow rolling reboot of all VMs on codfw ganeti cluster T150532
  • 10:39 akosiaris: upgrade to qemu 2.8 on codfw ganeti cluster. T150532
  • 10:31 jynus: stop and reimage db1110
  • 10:01 apergos: reimaged snapshot1001 for testing with php7/stretch
  • 09:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 58s)
  • 09:28 marostegui: Deploy schema change on db1088 - T191519 T188299 T190148
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 58s)
  • 09:25 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #3 - T192689 T190846 (duration: 04m 30s)
  • 09:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 after alter table (duration: 03m 06s)
  • 09:21 moritzm: reimaging mw1221, mw1222, mw1223 (API servers) to stretch
  • 09:21 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #3 - T192689 T190846
  • 09:21 moritzm: reimaging mw1221, mw1222, mw1223 (app servers) to stretch
  • 09:21 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #2 - T192689 T190846 (duration: 03m 03s)
  • 09:18 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #2 - T192689 T190846
  • 09:17 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points - T192689 T190846 (duration: 13m 13s)
  • 09:12 moritzm: reimaging mw1273, mw1274, mw1275 (app servers) to stretch
  • 09:03 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points - T192689 T190846
  • 08:17 hoo: Finished running populateSitesTable.php for all wikis (T192628, T192632, T192631, T192633)
  • 08:14 elukey: upload druid_0.10.0-3~jessie1 (collection of druid packages) to jessie-wikimedia - T164008
  • 08:05 godog: power off restbase1010 for ssd replacement - T189822
  • 07:50 hoo: Started running populateSitesTable.php for all wikis (T192628, T192632, T192631, T192633)
  • 07:39 marostegui: Rename user_old and user_temp tables on db1077 - T172664
  • 07:28 gehel: restarting blazegraph on wdqs1004 for jvm upgrade
  • 07:23 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - T184446
  • 07:16 vgutierrez: Update puppet compiler facts
  • 06:56 elukey: restart zookeeper on conf200[123] for openjdk upgrades
  • 06:41 moritzm: installing poppler security updates
  • 06:35 marostegui: Deploy schema change on db1093 - T191519 T188299 T190148
  • 06:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 59s)
  • 05:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 after alter table (duration: 00m 59s)
  • 05:03 _joe_: rebuilding the docker base images
  • 04:35 mutante: repooled mw2224, reinstalling mw2225 through mw2228
  • 03:08 mutante: reinstalling mw2224.codfw.wmnet with wmf-auto-reimage
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.30) (duration: 10m 37s)
  • 01:55 cwd: payments, civi, and alerts re-enabled
  • 01:11 ejegg: re-enabled fundraising jobs
  • 01:09 ejegg: updated fundraising python tools from f3ed1d05b8 to 3754f32ab6
  • 00:18 mutante: removing travel@ and travelapproval@ exim aliases, moving to OIT/Google (T127549)

2018-04-23

  • 23:51 eileen: civicrm revision changed from 347e613aa5 to 142edbb90b, config revision is 07dee62bff
  • 23:35 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable non-static internationalized maps on test2wiki (duration: 00m 59s)
  • 23:32 catrope@tin: Synchronized php-1.31.0-wmf.30/extensions/Thanks/includes/EchoCoreThanksPresentationModel.php: Fix fatal error in Thanks notifications (T192711) (duration: 00m 58s)
  • 23:29 eileen: civicrm revision changed from b1e7ccfc4d to 347e613aa5, config revision is 07dee62bff
  • 23:15 XioNoX: changed AMS-IX peering mode to default (filter on radb+rpki)
  • 23:13 cwd: disabled most (all?) frack alerts
  • 23:11 ebernhardson: restart elasticsearch on elastic1031 to apply numa settings
  • 22:56 XioNoX: disabling flapping VCP on asw1-eqsin - T192125
  • 22:37 mutante: phab1001 - deleting duplicate cronjob for public_taskdump.py (the one that did not output to /dev/null) (T188149)
  • 22:21 ebernhardson: restart elasticsearch on elastic1030 to apply numa settings
  • 22:12 ebernhardson: restart elasticsearch on elastic1029 to apply numa settings
  • 21:49 ebernhardson: restart elasticsearch on elastic1028 to apply numa settings
  • 21:40 ejegg: updated fundraising python tools from 7c5c7a5f9e to f3ed1d05b8
  • 21:39 ejegg: updated SmashPig from 1ebee97a45 to a4de12d415
  • 21:36 ebernhardson: restart elasticsearch on elastic1024 to apply numa settings
  • 21:25 ebernhardson: restart elasticsearch on elastic1025 to apply numa settings
  • 20:53 XioNoX: redirect text-lb.eqiad pings to ping1001 on cr1/2-eqiad (24h tests) - T190090
  • 20:47 ppchelko@tin: Finished deploy [restbase/deploy@228caf8]: Log the lack of the index entries, take 2 (duration: 03m 55s)
  • 20:43 ppchelko@tin: Started deploy [restbase/deploy@228caf8]: Log the lack of the index entries, take 2
  • 20:43 ppchelko@tin: Finished deploy [restbase/deploy@228caf8]: Log the lack of the index entries (duration: 14m 19s)
  • 20:40 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.30
  • 20:32 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5650605]: Update mobileapps to b011b2a (duration: 05m 56s)
  • 20:29 ppchelko@tin: Started deploy [restbase/deploy@228caf8]: Log the lack of the index entries
  • 20:26 mholloway-shell@tin: Started deploy [mobileapps/deploy@5650605]: Update mobileapps to b011b2a
  • 20:15 Dereckson: Purged all languages messages from the cache, for gorwiki (rebuildmessages.php, T189127)
  • 19:49 vgutierrez: Repool (Re-enable BGP) in lvs5001 - T191897
  • 19:34 elukey@tin: Finished deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008 (duration: 00m 17s)
  • 19:34 elukey@tin: Started deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008
  • 18:48 catrope@tin: Synchronized dblists/wikidataclient.dblist: Add ruwikimedia to wikidataclient (T188456) (duration: 01m 15s)
  • 18:33 vgutierrez: Depool lvs5001 - T191897
  • 18:33 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Change timezone for napwiki (T192568) (duration: 01m 31s)
  • 18:28 vgutierrez: Repool (Re-enable BGP) lvs5002 - T191897
  • 18:18 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable WikiLove on sawiki (T192212) (duration: 01m 19s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable internationalized maps on testwiki (duration: 01m 17s)
  • 17:52 ariel@tin: Finished deploy [dumps/dumps@02a3e80]: fix up checks for truncated/binary output files (duration: 00m 04s)
  • 17:52 ariel@tin: Started deploy [dumps/dumps@02a3e80]: fix up checks for truncated/binary output files
  • 17:35 XioNoX: pushing firewall block on cr1-eqdfw - T175361
  • 17:24 XioNoX: pushing firewall block on cr1/2-codfw - T175361
  • 17:18 thcipriani@tin: Synchronized php: Group1 to 1.31.0-wmf.30 (duration: 01m 16s)
  • 17:15 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.30
  • 17:02 vgutierrez: Depool and reimage lvs5002 as stretch - T191897
  • 16:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.26 (duration: 03m 28s)
  • 16:07 marostegui: Deploy schema change on db1096:3316 - T191519 T188299 T190148
  • 16:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 01m 16s)
  • 16:03 gehel: restarting wdqs-updater on all nodes
  • 15:55 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 15:53 bstorm_: Added slots, slot_roles, content and content_models to views on labsdb1010
  • 15:36 dereckson@tin: Finished scap: Rebuild localisation cache to add Gorontalo (T189127) (duration: 08m 29s)
  • 15:28 dereckson@tin: Started scap: Rebuild localisation cache to add Gorontalo (T189127)
  • 15:28 dereckson@tin: scap aborted: Rebuild localisation cache to add Gorontalo (T189127)z (duration: 00m 01s)
  • 15:28 dereckson@tin: Started scap: Rebuild localisation cache to add Gorontalo (T189127)z
  • 15:23 dereckson@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 00m 46s)
  • 15:20 dereckson@tin: Synchronized php-1.31.0-wmf.30/languages/messages/MessagesGor.php: Localisation for MediaWiki in Gorontalo (T189127) (duration: 01m 16s)
  • 15:13 dereckson@tin: Synchronized php-1.31.0-wmf.29/languages/messages/MessagesGor.php: Localisation for MediaWiki in Gorontalo (T189127) (duration: 01m 18s)
  • 14:10 ottomata: switching main -> analytics MirrorMaker to --new.consumer (temporarily stopping puppet on kafka101[234]) https://phabricator.wikimedia.org/T192387
  • 14:02 zeljkof: EU SWAT finished
  • 13:57 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: lfnwiki: add logo path and missing namespace names (T183561) (duration: 01m 15s)
  • 13:55 elukey: reimage analytics1067 to Debian Stretch - T192557
  • 13:53 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 13:50 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 13:43 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: euwikisource: add missing $wgMetaNamespace (T189465) (duration: 01m 16s)
  • 13:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: gorwiki: add missing namespaces (T189109) (duration: 01m 17s)
  • 13:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add logos for gorwiki (T192669) (duration: 01m 14s)
  • 13:27 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add logos for gorwiki (T192669) (duration: 01m 16s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Temp rate limit for arwiki due to mass vandalism (T192668) (duration: 01m 17s)
  • 13:12 jynus: restarting es2003 to test gerrit:427902
  • 12:59 marostegui: Deploy schema change on dbstore1002 s6 - T191519 T188299 T190148
  • 12:58 jynus: disabling puppet on several mysql hosts before deploying gerrit:427902
  • 12:40 sbisson@tin: Finished deploy [kartotherian/deploy@2195dde]: Deploy kartotherian with new babel fallback rules (duration: 04m 52s)
  • 12:35 sbisson@tin: Started deploy [kartotherian/deploy@2195dde]: Deploy kartotherian with new babel fallback rules
  • 11:50 moritzm: reimaging mw1238,mw1239,mw1240 (app servers) to stretch
  • 11:46 moritzm: reimaging mw1285 (previous attempt had a hardware problem which failed to trigger the reboot via IPMI), mw1287, mw1288 (API servers) to stretch
  • 11:41 moritzm: installing poppler security updates
  • 11:25 mobrovac@tin: Finished deploy [citoid/deploy@b3c0818]: Add support for restful crossRef API and Wikidata QIDs - T108175 T176411 (duration: 03m 36s)
  • 11:22 mobrovac@tin: Started deploy [citoid/deploy@b3c0818]: Add support for restful crossRef API and Wikidata QIDs - T108175 T176411
  • 11:17 mobrovac@tin: Finished deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource, take #2 - T192678 (duration: 07m 21s)
  • 11:10 mobrovac@tin: Started deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource, take #2 - T192678
  • 11:09 mobrovac@tin: Finished deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource - T192678 (duration: 11m 47s)
  • 11:00 gehel: restarting wdqs updater on all wdqs nodes
  • 10:57 mobrovac@tin: Started deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource - T192678
  • 10:26 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 10:25 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 17s)
  • 09:56 _joe_: restarting memcached on mc1020-1036 at 1 hour intervals - T184854
  • 09:13 godog: Flashing Smart Array P840 in Slot 3 [ 4.52 -> 6.30 ] on ms-be2034 - T192721 T141756
  • 09:05 _joe_: AMEND: restart memcached on mc1019 (T184854)
  • 09:05 _joe_: restart memcached on mw1019 (Ttail -f /var/log/etcdmirror-conftool-eqiad-wmnet/syslog.log
  • 09:05 vgutierrez: restarting pybal on lvs1006
  • 09:02 _joe_: restarting etcdmirror on conf2002 after restarting nginx on conf1001
  • 08:59 moritzm: reimaging mw1283,mw1285,mw1286 (API servers) to stretch
  • 08:57 marostegui: Deploy schema change on s6 codfw master (db2039) - this will generate lag on codfw - T191519 T188299 T190148
  • 08:56 gehel: rolling restart of blazegraph on wdqs1004, 2004 and 2005 for JVM upgrade
  • 08:55 moritzm: reimaging mw1270,mw1271,mw1272 (app servers) to stretch
  • 08:52 vgutierrez: restarting pybal on esams cluster
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 (duration: 01m 16s)
  • 08:48 _joe_: upgrading nginx on the config cluster in eqiad (T164456)
  • 08:47 marostegui: Drop table logging_pre_1_10 in s5 - T118859
  • 08:47 marostegui: Dropped table logging_pre_1_10 in s3 - T118859
  • 08:42 vgutierrez: restarting pybal on lvs4006
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 (duration: 01m 18s)
  • 08:36 vgutierrez: restarting pybal on codfw (one at a time)
  • 08:33 vgutierrez: restart pybal on lvs4007
  • 08:31 vgutierrez: restarting pybal on lvs5002
  • 08:30 vgutierrez: restarting pybal on lvs5001
  • 08:30 marostegui: Drop table logging_pre_1_10 in s4 - T118859
  • 08:27 vgutierrez: restarting pybal on lvs4005
  • 08:27 _joe_: restarting pybal on lvs5003
  • 08:17 _joe_: upgrading nginx on the config cluster in codfw (T164456)
  • 08:13 marostegui: Drop table logging_pre_1_10 in s7 - T118859
  • 08:08 _joe_: restarting memcached in codfw (T184854)
  • 08:08 gehel: restarting blazegraph on wdqs1003 (crazy number of java threads)
  • 08:04 moritzm: upgrading terbium to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:58 ema: cp-misc: upgrade varnish to 5.1.3-1wm7
  • 07:55 marostegui: reload haproxy on dbproxy1010 to depool labsdb1010
  • 07:55 marostegui: Depool labsdb1010 - T184446
  • 07:50 marostegui: Drop table logging_pre_1_10 in s2 - T118859
  • 07:47 marostegui: Drop table logging_pre_1_10 in s6 - T118859
  • 07:36 moritzm: upgrading remaining API servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:35 elukey: reboot ms-be2034 - stuck in com2 console with "sd 0:1:0:1: rejecting I/O to offline device", not responsive to ssh
  • 07:00 moritzm: upgrading remaining app servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 06:26 marostegui: Remove logging_pre_1_10 from codfw - T118859
  • 05:28 marostegui: flow_subscription empty table from officewiki - T149936
  • 05:17 marostegui: Deploy schema change on db1070 (s5 primary master) - T191519 T188299 T190148
  • 02:40 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 56s)

2018-04-22

  • 16:29 ariel@tin: Finished deploy [dumps/dumps@bb7ae96]: createdirs date fixup, rerun only missing stub type (duration: 00m 03s)
  • 16:29 ariel@tin: Started deploy [dumps/dumps@bb7ae96]: createdirs date fixup, rerun only missing stub type

2018-04-21

2018-04-20

  • 20:45 andrewbogott: re-imaging labvirt1021 and 1022 as Jessie
  • 20:23 ejegg: updated fundraising python tools from 0c50f9e38f to 7c5c7a5f9e
  • 18:23 mutante: add LDAP user "tieu" to group "wmde" (T192256)
  • 17:42 imarlier@tin: Finished deploy [performance/coal@99db58f]: coal - update to submit via graphite. Not yet active, requires puppet changes (duration: 00m 04s)
  • 17:41 imarlier@tin: Started deploy [performance/coal@99db58f]: coal - update to submit via graphite. Not yet active, requires puppet changes
  • 17:35 no_justification: gerrit: update mysql-client and deps 5.5.59 -> 5.5.60
  • 17:28 mutante: phabricator - restarted apache
  • 17:26 mutante: phabricator (phab1001) - upgrading Apache, openssl, mysql-common
  • 17:17 mutante: phab2001 - upgrading apache, openssl, mysql-common
  • 17:04 andrewbogott: rebooting labvirt1021 and 1022
  • 16:44 dcausse@tin: Synchronized php-1.31.0-wmf.30/extensions/CirrusSearch/: T192609: Do not propagate Elastica doc modifications out of DataSender (duration: 01m 34s)
  • 15:07 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2087 (duration: 01m 16s)
  • 14:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2086, depool db2087 (duration: 01m 16s)
  • 14:16 andrew@tin: Synchronized dblists: Purging obsolete silver.dblist (duration: 01m 17s)
  • 14:02 moritzm: upgrading labweb* servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 14:00 jynus: upgrade and restart db2086
  • 13:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2086 (duration: 01m 13s)
  • 13:35 anomie: (re-)creating `slots` table on all wikis, following up T190153 and T184446#4143097
  • 13:25 moritzm: upgrading mysql (as shipped in Debian) on bohrium
  • 13:00 moritzm: installing zsh security updates on trusty servers
  • 12:25 moritzm: upgrading apache on auth* servers
  • 12:18 jynus: upgrading and restarting dbstore2002
  • 12:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 01m 17s)
  • 12:06 moritzm: installing apache security updates on video scalers
  • 12:05 moritzm: upgrading apache on einsteinium/icinga.wikimedia.org
  • 11:53 moritzm: installing apache security updates on netmon1002/2001
  • 11:27 elukey: reimage analytics1068 to Debian Stretch - T192557
  • 11:06 moritzm: installing tiff security updates on trusty
  • 09:58 godog: upload scap 3.8.0-2 - T192124
  • 09:51 moritzm: upgrading deployment servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 09:41 jynus: starting reimage of db2070
  • 09:41 moritzm: upgrading mwdebug servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 09:33 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2071, depool db2070 (duration: 01m 16s)
  • 09:12 elukey: restart of mw apis showing ~50% cpu utilization as precaution before the weekend - mw[1224,1225,1228,1230,1231,1233-1235,1276-1283,1286,1312,1313,1315,1316,1341,1343,1344,1347,1348]*
  • 09:06 moritzm: upgrading video scalers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 08:41 moritzm: upgrading job runners in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 08:39 marostegui: Going to sanitize gorwiki euwikisource romdwikimedia inhwiki on db1095 - T189112 T189466 T187774 T184375
  • 08:39 elukey: restart hhvm on mw[1226,1232].eqiad.wmnet - high load
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 01m 16s)
  • 07:57 jynus: starting reimage of db2071
  • 07:52 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 01m 16s)
  • 07:48 moritzm: upgrading app servers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 01m 17s)
  • 07:38 ema: cp3041: restart varnish-be due to mbox lag
  • 07:37 akosiaris: upgrade qemu on ganeti2006 to 1:2.8+dfsg-3~bpo8+1 and migrate mwdebug2001 to it T150532
  • 07:32 ema: cp3030: restart varnish-be due to mbox lag
  • 07:30 _joe_: upgrading hhvm on all jobrunners in eqiad
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 01m 15s)
  • 07:09 ema: cp3032/cp3043: restart varnish-be due to mbox lag
  • 07:08 moritzm: upgrading API servers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 after alter table (duration: 01m 16s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 01m 15s)
  • 06:26 ema: kafka::analytics remove strongswan leftovers T185136
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 01m 15s)
  • 06:07 marostegui: Stop mysql db1114 for a reboot
  • 06:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 01m 16s)
  • 05:55 _joe_: depooling mw1227 from live traffic for investigation
  • 05:31 marostegui: Start atop on db1114 with "-R" option enabled - T192551
  • 05:31 marostegui: Deploy schema change on db1110 - T191519 T188299 T190148
  • 05:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 for alter table (duration: 01m 17s)
  • 05:21 ariel@tin: Finished deploy [dumps/dumps@c2d3bb4]: keep completed stubs/abstracts/logs files around for retries (duration: 00m 04s)
  • 05:20 ariel@tin: Started deploy [dumps/dumps@c2d3bb4]: keep completed stubs/abstracts/logs files around for retries
  • 01:50 krinkle@tin: Synchronized wmf-config/CommonSettings.php: If8fdce707d (duration: 01m 17s)

2018-04-19

  • 23:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.29/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off cirrus ab test (duration: 01m 18s)
  • 23:13 ebernhardson@tin: Synchronized php-1.31.0-wmf.30/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off cirrus ab test (duration: 01m 17s)
  • 23:04 thcipriani@tin: Synchronized php: complete group1 and group2 wikis back to 1.31.0-wmf.29 (duration: 01m 16s)
  • 22:30 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 and group2 wikis back to 1.31.0-wmf.29
  • 21:41 urandom: Start cleanup, restbase10{07,11,16}-c -- T189822
  • 21:22 urandom: Start cleanup, restbase10{07,11,16}-b -- T189822
  • 21:15 urandom: Start cleanup, restbase10{07,11,16}-a -- T189822
  • 21:12 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter, restbase1010-c -- T189822, T192456
  • 21:00 ebernhardson: issue move of enwiki_content shard 2 from overloaded elastic1027 to elastic1017
  • 20:48 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter, restbase1010-a -- T189822, T192456
  • 20:48 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter -- T189822, T192456
  • 20:32 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.30
  • 20:27 milimetric@tin: Finished deploy [analytics/refinery@c1c9885]: Correcting hql from last deployment (duration: 05m 09s)
  • 20:22 milimetric@tin: Started deploy [analytics/refinery@c1c9885]: Correcting hql from last deployment
  • 19:53 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.30 (duration: 01m 15s)
  • 19:45 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.30
  • 19:35 thcipriani@tin: Synchronized php-1.31.0-wmf.30/includes/page/Article.php: Do not pass USE INDEX to a $dbType parameter T192584 (duration: 01m 17s)
  • 19:33 ejegg: updated fundraising python tools from 626fe02a9f to 0c50f9e38f
  • 19:22 no_justification: gerrit: restarting services to pick up gc & indexing changes
  • 18:32 thcipriani@tin: Synchronized php-1.31.0-wmf.30/resources/src/jquery: jquery.makeCollapsible: Only add "[" "]" to autogenerated toggles T192140 (duration: 01m 17s)
  • 17:21 andrew@tin: Synchronized wmf-config/db-eqiad.php: Moving labtestwikitech to m5, step 3 (duration: 01m 16s)
  • 17:20 andrew@tin: Synchronized wmf-config/db-codfw.php: Moving labtestwikitech to m5, step 2 (duration: 01m 16s)
  • 17:18 andrew@tin: Synchronized docroot/noc/db.php: Moving labtestwikitech to m5, step 1 (duration: 01m 16s)
  • 16:56 ejegg: re-enabled banner impressions loader
  • 16:50 moritzm: uploaded tidy-0.99 to component/ci for apt.wikimedia.org/stretch-wikimedia (T191771)
  • 16:46 ejegg: disabled banner impressions loader in order to run backfill mode
  • 16:28 gehel: restarting tilerator on maps[12].* - T191655
  • 16:20 gehel: shutting down tilerator on maps[12].* for maintenance - T191655
  • 15:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 after alter table (duration: 01m 16s)
  • 15:50 fdans@tin: Finished deploy [analytics/refinery@5d0f63f]: deploying to launch page preview job (duration: 06m 34s)
  • 15:48 marostegui: Deploy schema change on dbstore1002 (s5) - T191519 T188299 T190148
  • 15:44 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2074 (duration: 01m 17s)
  • 15:44 fdans@tin: Started deploy [analytics/refinery@5d0f63f]: deploying to launch page preview job
  • 15:42 sbisson@tin: Finished deploy [kartotherian/deploy@0a5a3ef]: Deploy latest kartotherian with new i18n sources (take 3) (duration: 05m 22s)
  • 15:37 sbisson@tin: Started deploy [kartotherian/deploy@0a5a3ef]: Deploy latest kartotherian with new i18n sources (take 3)
  • 15:36 sbisson@tin: Finished deploy [kartotherian/deploy@89c4ca9]: Deploy latest kartotherian with new i18n sources (take 2) (duration: 03m 05s)
  • 15:33 sbisson@tin: Started deploy [kartotherian/deploy@89c4ca9]: Deploy latest kartotherian with new i18n sources (take 2)
  • 15:16 sbisson@tin: Finished deploy [kartotherian/deploy@74121d5]: Deploy latest kartotherian with new i18n sources (duration: 05m 19s)
  • 15:11 sbisson@tin: Started deploy [kartotherian/deploy@74121d5]: Deploy latest kartotherian with new i18n sources
  • 14:48 Dereckson: Erratum: read "User:Andrei Stroe" and not "User:Anderi Store" for the previous entry (T187184)
  • 14:47 Dereckson: Create bureaucrat account for User:Anderi Store on romd.wikimedia (T187184)
  • 14:30 marostegui: Start atop on db1114 without "-R" - T192551
  • 14:29 marostegui: Deploy schema change on db1082 (this will generate lag on s5 on labs hosts) - T191519 T188299 T190148
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 for alter table (duration: 01m 13s)
  • 14:19 ejegg: re-enabled queue jobs
  • 14:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 after alter table (duration: 01m 16s)
  • 14:12 jynus: starting reimage of db2074
  • 13:56 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2074 (duration: 01m 16s)
  • 13:39 marostegui: Stop atop on db1114 - T191996
  • 13:33 marostegui: Start atop on db1114 - T191996
  • 13:30 Trey314159: reindexing serbian wikis on elastic@eqiad (T189265)
  • 13:30 moritzm: upgrading mw1334-mw1337 (job runners) to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 13:14 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: T192427 T189277 (duration: 01m 17s)
  • 12:58 jynus: starting reimage of db2075
  • 12:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2075 (duration: 01m 16s)
  • 11:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2076 (duration: 01m 16s)
  • 11:39 moritzm: upgrading eqiad video scalers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 11:24 marostegui: Run check_private_data on labsdb - T183566
  • 11:21 marostegui: Sanitize lfnwiki - T183566
  • 11:20 moritzm: upgrading app servers mw1238-mw1258 to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 11:14 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: T181121 (duration: 01m 16s)
  • 11:09 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 01m 17s)
  • 11:05 marostegui: Deploy schema change on db1113:3315 - T191519 T188299 T190148
  • 11:03 jynus: starting reimage of db2076
  • 11:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 for alter table (duration: 01m 16s)
  • 11:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2076 (duration: 01m 18s)
  • 10:34 moritzm: upgrading API servers mw1221-mw1235 to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 10:27 vgutierrez: Repool (Re-enable BGP) lvs4005 - T191897
  • 09:59 elukey: complete migration of zookeeper on conf100[123]
  • 09:55 akosiaris: reboot ganeti VMs on row_B in codfw for cache=none setting. T181121
  • 09:54 vgutierrez: Updating puppet compiler facts
  • 09:51 moritzm: rolling restart of Cassandra on maps completed
  • 09:33 elukey: upgrade zookeeper on conf100[123] from 3.4.5 to 3.4.9 - T182924
  • 09:31 akosiaris: start a force puppet run in all of eqiad with a batch size of 30
  • 09:29 akosiaris: stop ircecho for a while, puppetdb1001 reboot was eventful
  • 09:17 akosiaris: reboot puppetdb1001 for cache=none setting apply. T181121
  • 09:14 moritzm: installing Java security updates on maps* plus rolling restart of Cassandra to pick up new JRE
  • 09:06 vgutierrez: Depool and reimage lvs4005 as stretch - T191897
  • 09:03 moritzm: upgrading API server canaries to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build (T184854)
  • 08:40 vgutierrez: Repool (Re-enable BGP) lvs4006 - T191897
  • 08:14 ema: reboot deploy1001 and arm keyholder T175288
  • 08:14 moritzm: upgrading app server canaries to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build (T184854)
  • 07:47 akosiaris: set cache=none for ganeti VMs in codfw cluster configuration. VM reboots to follow T181121
  • 07:32 vgutierrez: Depool and reimage lvs4006 - T191897
  • 07:24 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 after alter table (duration: 01m 17s)
  • 05:36 marostegui: Kill atop on db1114 - T191996
  • 05:33 marostegui: Revert RX buffer changes on db1114 - T191996
  • 05:27 marostegui: Deploy schema change on db1097:3315 - T191519 T188299 T190148
  • 05:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 for alter table (duration: 01m 33s)
  • 03:18 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 52s)
  • 01:15 eileen: civicrm revision changed from 0ac27e7c0d to b1e7ccfc4d, config revision is 49f5ba45e8
  • 00:12 Dereckson: Wikis creation done
  • 00:12 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set project namespace for hi.wikimedia (T188366) (duration: 01m 16s)
  • 00:04 Dereckson: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Fix path to hi.wikimedia.org 1x logo (Gerrit:427567)

2018-04-18

  • 23:44 Dereckson: Created bureaucrat account for Suyash.dwivedi at hi.wikimedia (T188366)
  • 23:35 dereckson@tin: Synchronized wmf-config/interwiki.php: New interwiki map for the six newest wikis (duration: 01m 17s)
  • 23:22 Dereckson: HTCP purge for https://hi.wikimedia.org and https://hi.wikimedia.org/
  • 23:19 Dereckson: Create tables for Translate extension on hiwikimedia
  • 23:13 Dereckson: HTCP purge for eu.wikisource logos
  • 23:10 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: +hi.wikimedia.org +romd.wikimedia.org (duration: 01m 15s)
  • 23:05 dereckson@tin: Synchronized langlist: New languages: gor, inh, lfn (duration: 01m 17s)
  • 23:04 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for six wikis (duration: 01m 16s)
  • 23:03 dereckson@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 23:02 dereckson@tin: Synchronized dblists: (no justification provided) (duration: 01m 15s)
  • 23:00 Dereckson: Starting syncing to production sequence for six wiki creation
  • 22:58 dereckson@tin: Synchronized static/images/project-logos/: Logos for eu.wikisource (T189465) (duration: 01m 12s)
  • 22:58 Dereckson: Created database and set initial stuff for hi.wikimedia.org (T188366)
  • 22:57 Dereckson: Created database and set initial stuff for romd.wikimedia.org
  • 22:31 Dereckson: Created database and set initial stuff for eu.wikisource.org (T189465)
  • 22:28 Dereckson: Created database and set initial stuff for gor.wikipedia.org (T189109)
  • 22:27 Dereckson: Created database and set initial stuff for inh.wikipedia.org (T184374)
  • 22:24 dereckson@tin: Synchronized php-1.31.0-wmf.29/extensions/WikimediaMaintenance/addWiki.php: Fix MassMessage fatal error (T192468) (duration: 01m 17s)
  • 22:17 Dereckson: Created database for lfn.wikipedia.org (T183561)
  • 21:57 eileen: civicrm revision changed from 00870af548 to 0ac27e7c0d, config revision is 853fcc9111
  • 21:53 ebernhardson: restart elasticsearch on elastic1022 with numa interleave
  • 21:17 eileen: civicrm revision changed from cddfe9416c to 00870af548, config revision is 853fcc9111
  • 20:52 ebernhardson: restart elasticsearch on elastic1020 with numa interleave
  • 20:13 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9328a7d]: Update mobileapps to fb161d7 (duration: 05m 56s)
  • 20:10 ebernhardson: restart elasticsearch on elastic1019 with numa interleave
  • 20:07 mholloway-shell@tin: Started deploy [mobileapps/deploy@9328a7d]: Update mobileapps to fb161d7
  • 19:55 thcipriani@tin: Finished scap: rebuild l10n cache (duration: 58m 57s)
  • 19:28 ppchelko@tin: Finished deploy [restbase/deploy@8d8f1df]: Test concurrent worker startups (duration: 15m 23s)
  • 19:15 ebernhardson: restart elasticsearch on elastic1018 with numa interleave
  • 19:13 ppchelko@tin: Started deploy [restbase/deploy@8d8f1df]: Test concurrent worker startups
  • 18:56 thcipriani@tin: Started scap: rebuild l10n cache
  • 18:35 dereckson@tin: Synchronized php-1.31.0-wmf.30/extensions/CentralNotice: Emit CSP headers on banner preview (duration: 01m 18s)
  • 18:33 ppchelko@tin: Finished deploy [changeprop/deploy@d83fad3]: Support multi-topic rules, rename metrics, update dependencies (duration: 01m 14s)
  • 18:32 ppchelko@tin: Started deploy [changeprop/deploy@d83fad3]: Support multi-topic rules, rename metrics, update dependencies
  • 18:25 imarlier@tin: Finished deploy [performance/coal@3c0ef36]: coal: typoed the run file (duration: 00m 04s)
  • 18:25 imarlier@tin: Started deploy [performance/coal@3c0ef36]: coal: typoed the run file
  • 18:17 imarlier@tin: Finished deploy [performance/coal@f1ca191]: Deploying coal version that includes a runner for service use (duration: 00m 04s)
  • 18:17 imarlier@tin: Started deploy [performance/coal@f1ca191]: Deploying coal version that includes a runner for service use
  • 17:53 ebernhardson: restart elasticsearch on elastic1017
  • 17:35 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Emit CSP headers on banner previews (T190100, no-op for now) (duration: 01m 16s)
  • 17:19 ejegg: updated CiviCRM from 64b26ad377 to cddfe9416c
  • 16:47 andrewbogott: deleted lots of log files (mostly nova-api logs) on labtestnet2001
  • 16:42 reedy@tin: Synchronized wmf-config/interwiki.php: sync! (duration: 01m 15s)
  • 16:32 reedy@tin: Synchronized php-1.31.0-wmf.30/extensions/WikimediaMaintenance: fix addwiki.php (duration: 01m 18s)
  • 16:30 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: translatew for advisorswiki (duration: 01m 16s)
  • 16:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: advisorswiki (duration: 01m 15s)
  • 16:24 reedy@tin: rebuilt and synchronized wikiversions files: advisorswiki
  • 16:21 reedy@tin: Synchronized dblists/: advisorswiki (duration: 01m 16s)
  • 16:11 ppchelko@tin: Finished deploy [cpjobqueue/deploy@749ae82]: Update dependencies and reduce dedupe logging rate (duration: 00m 43s)
  • 16:10 ppchelko@tin: Started deploy [cpjobqueue/deploy@749ae82]: Update dependencies and reduce dedupe logging rate
  • 15:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2077 (duration: 01m 16s)
  • 15:33 _joe_: depooling mw1227 for investigation in high load
  • 15:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 after alter table (duration: 01m 15s)
  • 15:09 urandom: decommissioning Cassandra, restbase1010-b -- T189822
  • 15:08 dcausse: reindexing serbian wikis on elastic@eqiad (T189265)
  • 14:55 urandom: restarting Cassandra, restbase1011-a to test v 0.8 of Prometheus JMX exporter -- T192456
  • 14:51 jynus: starting reimage of db2077
  • 14:37 urandom: restarting Cassandra, restbase1011-a -- T192456
  • 14:35 marostegui: Disable puppet on db1114 - T191996
  • 14:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2080, depool db2077 (duration: 01m 16s)
  • 14:04 gehel: powercycle unresponsive maps-test2001
  • 14:00 elukey: restart kafka on kafka1001 and kafka2001 (jobqueues,eventbus) for openjdk-7 upgrades
  • 13:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 for alter table (duration: 01m 16s)
  • 13:49 marostegui: Deploy schema change on db1100 - T191519 T188299 T190148
  • 13:44 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf7+icu57 to apt.wikimedia.org/jessie-wikimedia (includes a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854))
  • 13:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 after alter table (duration: 01m 15s)
  • 13:17 Amir1: EU SWAT is done
  • 13:17 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf7+deb9u1 to apt.wikimedia.org/stretch-wikimedia (includes a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854))
  • 13:16 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Limit page creation and edit rate on Wikidata (T184948) (duration: 01m 17s)
  • 13:00 jynus: starting reimage of db2080
  • 12:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2081, depool db2080 (duration: 01m 16s)
  • 11:20 vgutierrez: Repool (Re-enable BGP) lvs2004 - T191897
  • 11:02 jynus: starting reimage of db2081
  • 10:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2081, repool db2082, es2013 (duration: 01m 15s)
  • 10:45 vgutierrez: Depool and reimage lvs2004 - T191897
  • 10:27 vgutierrez: Repool (Re-enable BGP) in lvs2005 - T191897
  • 09:49 hoo: Ran scap pull on mwdebug1001 after checking https://gerrit.wikimedia.org/r/427156
  • 09:49 jynus: starting reimage of db2082
  • 09:46 Amir1: start of deleting auto patrol actions in small wikis (T184485)
  • 09:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2082 (duration: 01m 15s)
  • 09:37 moritzm: strip apache/nginx/nutcracker/hhvm from former image scaler (now spares)
  • 09:32 vgutierrez: Depool and reimage lvs2005 - T191897
  • 09:30 marostegui: Deploy schema change on db1096:3315 - T191519 T188299 T190148
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 for alter table (duration: 01m 22s)
  • 09:27 godog: reenable puppet fleetwide after https://gerrit.wikimedia.org/r/c/421860
  • 09:16 moritzm: imported lz4 0.0~r131-2~wmf1+trusty1 for trusty-wikimedia to apt.wikimedia.org (needed to build HHVM 3.18 for trusty)
  • 09:09 godog: stop puppet agent fleetwide before applying https://gerrit.wikimedia.org/r/c/421860/
  • 09:08 moritzm: reimaging mw1281 to stretch
  • 09:04 _joe_: restart HHVM on mw1223,mw1224, also repool them after investigation in crashes
  • 08:59 vgutierrez: Repool (Re-enable BGP) in lvs3003 - T191897
  • 08:44 elukey: execute cumin 'analytics10[28-69]*' 'rm /etc/apt/preferences.d/r_* && apt-get update' to clear jessie backports apt config - T192348
  • 07:39 vgutierrez: Depool and reimage lvs3003 as stretch - T191897
  • 06:49 marostegui: Deploy schema change on s5 codfw master (db2052) this will generate lag in codfw - T191519 T188299 T190148
  • 06:43 moritzm: installing ruby security updates for trusty
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after changing RX buffers - T191996 (duration: 01m 09s)
  • 05:20 marostegui: Change RX buffers on db1114 - T191996
  • 05:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 01m 15s)
  • 05:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 after alter table (duration: 01m 16s)
  • 05:02 marostegui: Deploy schema change on db1071 (s8 primary master) - T185128 T153182
  • 02:51 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 55s)
  • 00:05 aaron@tin: Synchronized wmf-config/mc-labs.php: 8ad186728d: use mcrouter key prefixes (deployment-prep only) (duration: 01m 15s)

2018-04-17

  • 23:31 ebernhardson@tin: Synchronized wmf-config/CommonSettings-labs.php: labs config noop (duration: 01m 15s)
  • 23:17 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T191236: Shift search traffic back to eqiad (duration: 01m 17s)
  • 23:08 gilles: Private wiki thumbnail traffic now going to eqiad T191643
  • 23:07 gilles@tin: Synchronized wmf-config/filebackend.php: Fix private wiki DC configuration: Serve private wiki thumbnails with Thumbor (T191643) (duration: 01m 18s)
  • 21:34 demon@tin: Synchronized wmf-config/CommonSettings.php: ext-dist config changes for rel1_31 (duration: 01m 16s)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.30
  • 19:59 smalyshev@tin: Started deploy [wdqs/wdqs@f08fbcc]: GUI update
  • 19:48 demon@tin: Finished scap: bootstrap wmf.30 (duration: 112m 35s)
  • 19:01 imarlier@tin: Finished deploy [performance/navtiming@22483a4]: Navtiming refactor for increased testability, and to add wrapper for easy service use (duration: 00m 02s)
  • 19:01 imarlier@tin: Started deploy [performance/navtiming@22483a4]: Navtiming refactor for increased testability, and to add wrapper for easy service use
  • 18:52 urandom: rebooting restbase-dev1006 (kernel oom killer misbehaving)
  • 18:45 urandom: rebooting restbase-dev1005 (kernel oom killer misbehaving)
  • 18:41 urandom: rebooting restbase-dev1004 (kernel oom killer misbehaving)
  • 17:56 demon@tin: Started scap: bootstrap wmf.30
  • 17:27 ejegg: updated payments-wiki from 320a6c2600 to 4a8aada491
  • 17:16 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 100% of anons for enwiki - T191101 (duration: 00m 59s)
  • 16:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1017 fully (duration: 01m 16s)
  • 16:37 elukey: incremental rollout of the new zookeeper jmx config to druid1* and conf*
  • 16:34 urandom: decommissioning Cassandra, restbase1010-a -- T189822
  • 16:02 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 75% of anons for enwiki - T191101 (duration: 00m 58s)
  • 15:50 arturo: enable puppet in labstore1004
  • 15:37 vgutierrez: Repool (Enable BGP) on lvs3004 - T191897
  • 15:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2048 IP - T191193 (duration: 00m 58s)
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2048 IP - T191193 (duration: 00m 58s)
  • 15:23 marostegui: Stopping mysql on db2048 will break replication on codfw s1 slaves
  • 15:23 marostegui: Stop MySQL on db2048 for rack movement - T191193
  • 15:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067, es1017 with low load (duration: 01m 02s)
  • 14:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after changing the network cable - T191996 (duration: 01m 02s)
  • 14:55 gehel: starting data reimport after re-image for wdqs2001 - T189192
  • 14:53 marostegui: Stop MySQL on db2042 to move it to another rack - https://phabricator.wikimedia.org/T191193
  • 14:36 ariel@tin: Finished deploy [dumps/dumps@1073d75]: more exception logging from xmlstream (duration: 00m 03s)
  • 14:36 ariel@tin: Started deploy [dumps/dumps@1073d75]: more exception logging from xmlstream
  • 14:30 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 50% of anons for enwiki - T191101 (duration: 00m 58s)
  • 14:25 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Support per-event dispatch of events, file 3/3 - T191464 (duration: 03m 07s)
  • 14:23 jynus: start es1017 reimage
  • 14:22 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Support per-event dispatch of events, file 2/3 - T191464 (duration: 03m 06s)
  • 14:16 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/extension.json: Support per-event dispatch of events, file 1/3 - T191464 (duration: 03m 00s)
  • 14:08 vgutierrez: Depool and reimage lvs3004 as stretch - T191897
  • 13:42 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/extension.json: Support per-event dispatch of events, file 1/3 - T191464 (duration: 03m 07s)
  • 13:33 moritzm: removed role::mediawiki::imagescaler from deployment-mediawiki05, per watroles the only use of that role in WMCS
  • 13:32 moritzm: removed role::mediawiki::imagescaler from deployment-prep, per watroles the only use of that role in WMCS
  • 13:30 jynus: starting backup from db1067, may generate some lag
  • 13:26 volans: updating puppet compiler facts
  • 13:25 elukey: completed migration of zookeeper on conf200[123]
  • 13:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 (duration: 00m 58s)
  • 13:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 to get it ready for a network cable change (duration: 00m 58s)
  • 13:00 elukey: upgrade zookeeper on conf200[123] to 3.4.9~jessie - T182924
  • 12:31 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 25% of anons on enwiki, take #2 - T191101 (duration: 00m 58s)
  • 12:04 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 25% of anons on enwiki - T191101 (duration: 01m 03s)
  • 10:52 ema: lvs100[63] restart pybal to apply https://gerrit.wikimedia.org/r/424553 T188062
  • 10:39 ema: lvs200[63]: restart pybal to apply https://gerrit.wikimedia.org/r/424553 T188062
  • 10:03 mobrovac@tin: Finished deploy [restbase/deploy@e463fcf]: Use keep-alive for connections to AQS (duration: 20m 17s)
  • 09:43 mobrovac@tin: Started deploy [restbase/deploy@e463fcf]: Use keep-alive for connections to AQS
  • 09:37 moritzm: reimaging mw1280, mw1281, mw1282 (API servers) to stretch
  • 09:36 moritzm: reimaging mw1266, mw1267, mw1268 (app servers) to stretch
  • 09:17 godog: restart xenon-log on mwlog* - T169249
  • 08:46 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 08:19 elukey: restart nrpe-server on kafka2001 (kafka check not defined)
  • 08:01 moritzm: rolling restart of HHVM on video scalers to pick up ICU security update
  • 07:42 moritzm: installing ICU security updates
  • 07:27 jynus: restarting dbstore2001
  • 07:14 moritzm: installing perl security updates on trusty
  • 06:48 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 06:47 vgutierrez: Depool and reimage chromium as stretch - T187090
  • 06:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 00m 58s)
  • 05:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 58s)
  • 05:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 00m 58s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 05:21 marostegui: Deploy schema change on db1092 - T187089 T185128 T153182
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 for alter table (duration: 00m 58s)
  • 05:11 marostegui: Stop MySQL and reboot db1114 to boot up with the new kernel
  • 05:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 27s)
  • 01:09 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op Ib39022 (duration: 01m 00s)

2018-04-16

  • 23:57 eileen: update civicrm revision changed from b3326dbf70 to 64b26ad377, config revision is 853fcc9111
  • 21:03 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the correct way of calculating the domain from the wiki, file 2/2 - T192198 (duration: 00m 58s)
  • 21:02 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the correct way of calculating the domain from the wiki, file 1/2 - T192198 (duration: 00m 59s)
  • 20:34 imarlier@tin: Finished deploy [performance/navtiming@64d9c90]: null deploy (duration: 00m 02s)
  • 20:33 imarlier@tin: Started deploy [performance/navtiming@64d9c90]: null deploy
  • 20:13 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Revert using the wiki of the job runner, file 2/2 (duration: 00m 58s)
  • 20:12 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Revert using the wiki of the job runner, file 1/2 (duration: 00m 58s)
  • 19:47 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the wiki set in the JobQueue when creating the event, file 2/2 - T192198 (duration: 00m 59s)
  • 19:46 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the wiki set in the JobQueue when creating the event, file 1/2 - T192198 (duration: 01m 00s)
  • 18:28 ottomata: temporarily stopping puppet on kafka200[123] to apply MirrorMaker --new.consumer https://gerrit.wikimedia.org/r/#/c/424344/ T190940
  • 18:03 ottomata: restarting main <-> main DC kafka mirror maker instances to blacklist job and cp topics T190940 T167039
  • 17:11 moritzm: upgraded HHVM on mediawiki-jobrunner03 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 15:53 akosiaris: restart hhvm on mw2252
  • 15:29 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2a720fc]: Log HTML for PHP fatal errors from MW (duration: 01m 01s)
  • 15:28 ppchelko@tin: Started deploy [cpjobqueue/deploy@2a720fc]: Log HTML for PHP fatal errors from MW
  • 15:25 moritzm: upgraded HHVM on mediawiki-deployment-07 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 15:07 jynus: start reimage of es3-codfw master, es2017
  • 15:01 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 14:53 vgutierrez: restart pybal on lvs1003 - T187766
  • 14:49 vgutierrez: restart pybal on lvs2003 - T187766
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 00m 58s)
  • 14:42 vgutierrez: restart pybal on lvs1006 - T187766
  • 14:39 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=wdqs-internal
  • 14:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 57s)
  • 14:25 vgutierrez: restarting pybal on lvs2006 - T187766
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 14:12 moritzm: upgraded HHVM on mediawiki-deployment-09 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 14:06 jynus: start reimage of es2-codfw master, es2016
  • 14:05 hashar: restarted Jenkins for plugin upgrade T192261
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 00m 58s)
  • 13:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 13:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 00m 58s)
  • 13:31 marostegui: Stop MySQL on db1114 to reboot with another kernel - T191996
  • 13:30 godog: roll-restart swift-proxy in codfw and eqiad - T188062
  • 13:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 54s)
  • 13:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 (duration: 00m 59s)
  • 12:12 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 12:11 vgutierrez: Depool and reimage hydrogen as stretch - T187090
  • 11:50 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 11:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1114 original weight (duration: 00m 59s)
  • 10:50 moritzm: reimaging mw1299 (job runner) to stretch
  • 10:23 ariel@tin: Finished deploy [dumps/dumps@4706d30]: show full stacktrace for dump job errors (duration: 00m 04s)
  • 10:23 ariel@tin: Started deploy [dumps/dumps@4706d30]: show full stacktrace for dump job errors
  • 10:18 godog: upload prometheus-memcached-exporter to stretch-wikimedia - T189056
  • 10:17 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 10:16 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 58s)
  • 09:50 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 09:49 vgutierrez: Depool and reimage acamar as stretch - T187090
  • 09:43 gehel: rolling restart of wdqs100[35] and wdqs200[123] for kernel upgrade completed
  • 09:40 jynus: restarting dbstore2001:s8 to increase the number of purge threads
  • 09:23 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=achernar.wikimedia.org,service=pdns_recursor
  • 09:07 gehel: starting rolling restart of wdqs100[35] and wdqs200[123] for kernel upgrade
  • 09:05 moritzm: pooled mw1276-mw1278 (API app server canaries running stretch)
  • 08:49 gehel: first manual run of populate_admin() for maps[12]001 - T190605
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1114 original main traffic weight (duration: 00m 58s)
  • 08:41 moritzm: pooled mw1261-mw1264 (app server canaries running stretch)
  • 08:29 joal@tin: Finished deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy (duration: 05m 27s)
  • 08:25 _joe_: depooling mw1223 for investigation too
  • 08:23 joal@tin: Started deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 08:04 elukey: restart hhvm on mw[1228,1234,1281-1287,1289,1290,1312-1314,1317,1339,1343,1345,1346,1348] - more than 50% cpu usage, prevention scheme for current high load
  • 08:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 07:49 marostegui: Stop MySQL and reboot db1114 - T191996
  • 07:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)
  • 07:40 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=achernar.wikimedia.org,service=pdns_recursor
  • 07:39 vgutierrez: Depool and reimage achernar.wikimedia.org - T187090
  • 07:27 moritzm: installing perl security updates on Debian systems
  • 06:45 TimStarling: depooled mw1230
  • 06:38 _joe_: repooling mw1230
  • 06:20 marostegui: Drop table flow_subscription from x1 - T149936
  • 05:59 elukey: restart hhvm on mw[1221,1233,1280,1347] - high load
  • 05:55 elukey: repool mw1341 after investigation
  • 05:48 elukey: restart hhvm on mw1225, 1315, 1316, 1340, 1341, 1342, 1347 - high load
  • 05:42 marostegui: Reload haproxy on dbproxy1010
  • 05:36 elukey: restart hhvm on mw1226,27,32,88 - high load
  • 05:35 _joe_: depooling mw1341 to further debug the API issue
  • 05:33 marostegui: Deploy schema change on db1087 with replication (this will generate lag in labs) - T187089 T185128 T153182
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 (duration: 00m 59s)
  • 03:02 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 11m 09s)

2018-04-15

  • 22:09 ema: cp3037: restart varnish-be
  • 21:45 ema: cp3039: restart varnish-be
  • 21:42 elukey: restart hhvm on mw1286,1317,1339 - high load
  • 21:31 ema: cp3038: restart varnish-be
  • 21:30 ema: cp3036: restart varnish-be
  • 20:52 elukey: restart hhvm on mw13[43,45,46,48] - high load
  • 20:48 elukey: restart hhvm on mw13[12-14] - high load
  • 20:45 elukey: restart hhvm on mw[1285,1287,1289-1290] - high load
  • 20:40 _joe_: restart mw1344, high load
  • 20:38 elukey: restart hhvm on mw12[22,79,82] - high load
  • 20:32 elukey: restart hhvm on mw12[32-35] - high load
  • 20:24 elukey: restart hhvm on mw1229-31 - high load
  • 20:24 _joe_: restarted mw1280-4, high load
  • 20:17 elukey: restart hhvm on mw122[6-8] - high load
  • 20:05 elukey: restart hhvm on mw122[3,4] - high load
  • 13:42 elukey: restart hhvm on mw1227 due to high load (hhvm dump debug in /tmp/hhvm.44071.bt)
  • 10:53 elukey: powercycle mw1272 - not responsive to ssh, mgmt com2 console showing "[OK" and no tty

2018-04-13

  • 20:44 imarlier@tin: Finished deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active) (duration: 00m 02s)
  • 20:44 imarlier@tin: Started deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active)
  • 20:00 demon@tin: Pruned MediaWiki: 1.31.0-wmf.28 [keeping static files] (duration: 01m 34s)
  • 19:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.25 (duration: 05m 03s)
  • 17:17 andrewbogott: upgraded packages on all labvirts and restarted nova-compute
  • 16:55 arturo: enable puppet in labstore1005
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give db1104 original main traffic weight (duration: 01m 00s)
  • 16:34 andrewbogott: upgrading packages on labvirt1016 and rebooting (1016 is a spare server that won't affect VPS users)
  • 16:26 arturo: disable puppet in labstore1005 to hot-test https://gerrit.wikimedia.org/r/#/c/426103/
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give db1104 some main traffic - T191996 (duration: 01m 00s)
  • 16:04 hashar: cleaning up lost instances in nodepool (nodepool delete XXXXX)
  • 15:50 andrewbogott: upgrading lots of packages and rebooting labservices1002 and 1002
  • 15:43 andrewbogott: restarting nodepool on labnodepool1001
  • 15:27 andrewbogott: rebooting lots of packages on labnet1001 and labnet1002 for T145919
  • 15:14 bd808: wiki replicas: added page_assessments views for frwiki & huwiki
  • 15:09 chasemp: labstore1004 stop nfs-exportd, cp export.bak to export.d, exportfs -ra (all exports were wiped out)
  • 14:59 andrewbogott: rebooting labcontrol1001
  • 14:42 andrewbogott: upgrading lots of packages on labcontrol1001 and 1002 and rebooting. T145919
  • 14:38 andrewbogott: stopping puppet and nodepool on labnodepool1001
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 - T191996 (duration: 01m 07s)
  • 14:22 XioNoX: enable flow control on db1114's switch port - T191996
  • 14:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T191996 (duration: 00m 59s)
  • 14:13 andrewbogott: disabling puppet on labcontrol*, labnet*, labservices*, labvirt* before beginning T145919
  • 14:13 moritzm: installing apache security updates on contint1001
  • 14:09 andrewbogott: silencing alerts for labcontrol*, labnet*, labservices*, labvirt* before beginning T145919
  • 14:06 moritzm: uploaded ivy-debian-helper to apt.wikimedia.org/jessie (needed for zookeeper backport)
  • 13:52 elukey: roll restart druid + zookeeper daemons on druid100[123] for openjdk-7 updates
  • 13:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013 with full weight (duration: 01m 00s)
  • 13:32 elukey: restart druid and zookeeper daemons on druid100[456] for openjdk-7 updates
  • 13:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 after alter table (duration: 01m 02s)
  • 13:19 urandom: increasing heap size to 16G -- T186751
  • 12:37 moritzm: installing apache security updates on mendelevium (otrs)
  • 12:36 moritzm: installing apache security updates on bohrium (piwik)
  • 11:58 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 11:56 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013 with low load (duration: 01m 04s)
  • 10:59 moritzm: reimaging mw1261-mw1264 to stretch (T174431)
  • 10:40 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 10:38 vgutierrez: Depool and reimage maerlant.wikimedia.org as stretch
  • 10:16 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=nescio.wikimedia.org,service=pdns_recursor
  • 10:01 moritzm: installing java security updates on meitnerium/archive.wikimedia.org
  • 09:33 jynus: start reimage of es1013
  • 09:03 moritzm: reimaging mw1276-mw1278 to stretch (T174431)
  • 08:53 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=nescio.wikimedia.org,service=pdns_recursor
  • 08:52 vgutierrez: depool and reimage nescio.wikimedia.org as stretch
  • 08:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 in API - T191996 (duration: 01m 00s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully depool db1114 - T191996 (duration: 01m 00s)
  • 07:58 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Kick Electron, hanging, take 2 - T174916
  • 07:52 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Kick Electron, hanging - T174916
  • 07:22 legoktm: restarting jenkins
  • 07:15 moritzm: pooling mw1265 and mw1279 for production traffic
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from main traffic - T191996 (duration: 01m 00s)
  • 05:37 marostegui: Deploy schema change on db1104 - T187089 T185128 T153182
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 for alter table (duration: 01m 00s)
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after alter table (duration: 01m 01s)
  • 05:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 after alter table (duration: 01m 01s)

2018-04-12

  • 23:33 awight@tin: Finished deploy [ores/deploy@543901a]: Restore ores1001 canary to master branch (duration: 03m 24s)
  • 23:30 awight@tin: Started deploy [ores/deploy@543901a]: Restore ores1001 canary to master branch
  • 23:25 awight@tin: Finished deploy [ores/deploy@a5cec53]: Canary ores1001 only: Limited test of git-lfs for ORES (duration: 02m 31s)
  • 23:22 awight@tin: Started deploy [ores/deploy@a5cec53]: Canary ores1001 only: Limited test of git-lfs for ORES
  • 23:09 dereckson@tin: Synchronized tests/: Update PHPUnit tests to use PHPUnit\Framework\TestCase (no-op) (duration: 01m 01s)
  • 22:07 urandom: restarting Cassandra, restbase2003 -- T192112
  • 21:07 urandom: restarting Cassandra, restbase1010 -- T192112
  • 21:03 urandom: temporarily disabling puppet to make (ephemeral) change to GC settings, restbase1010 -- T192112
  • 20:37 urandom: increase change-prop sample rate in dev env to 100% (from 80) -- T186751
  • 20:34 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bd772eb]: Revert switching TranslationUpdateJob T192107 (duration: 00m 39s)
  • 20:33 ppchelko@tin: Started deploy [cpjobqueue/deploy@bd772eb]: Revert switching TranslationUpdateJob T192107
  • 20:32 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch TranslateUpdateJob back to the Redis-based queue as it is using PHP serialisation - T192107 (duration: 01m 00s)
  • 20:04 XioNoX: all good, revert routing ns1 to radon
  • 19:54 ema: reboot baham for kernel upgrade T188092
  • 19:51 XioNoX: routing ns1 to radon
  • 19:46 XioNoX: all good, revert routing ns0 to baham
  • 19:41 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.29
  • 19:40 ema: reboot radon for kernel upgrade T188092
  • 19:37 XioNoX: routing ns0 to baham
  • 18:02 arlolra@tin: Finished deploy [parsoid/deploy@1807a38]: Updating Parsoid to 322b6e8 (duration: 15m 09s)
  • 17:47 arlolra@tin: Started deploy [parsoid/deploy@1807a38]: Updating Parsoid to 322b6e8
  • 17:38 herron: puppet master updates complete — re-enabling puppet agents
  • 17:35 moritzm: installing apache security updates on hafnium
  • 17:31 herron: temporarily disabling puppet agents for openssl updates and apache restarts on puppet masters
  • 17:27 moritzm: installing apache security updates on krypton
  • 17:17 moritzm: installing patch security updates on trusty
  • 16:59 urandom: increase change-prop sample rate in dev env to 80% (from 60) -- T186751
  • 16:21 marostegui: Deploy schema change on db1066 - T132416
  • 16:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 to main traffic and depool db1066 for alter table - T191996 (duration: 01m 17s)
  • 16:07 marostegui: Reboot es2013 - T191977
  • 15:27 gehel: rolling restart of elasticsearch cirrus / eqiad for jvm upgrade completed
  • 15:06 moritzm: installing django/apache security updates on labmon*
  • 15:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2013 (duration: 01m 17s)
  • 14:59 jynus: shutting down es2013's mariadb
  • 14:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: No-op: Clean up an unused global var for the EventBus-based JobQueue (duration: 01m 17s)
  • 14:44 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the second bulk of low-traffic jobs for all wikis - T190327 (duration: 01m 16s)
  • 14:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327 (duration: 00m 35s)
  • 14:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327
  • 14:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from main traffic - T191996 (duration: 01m 18s)
  • 14:21 vgutierrez: Reimage lvs2006 as stretch
  • 14:11 moritzm: pooling mw1265 (app server) temporarily for production traffic
  • 14:03 urandom: increase change-prop sample rate in dev env to 60% (from 40) -- T186751
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 into API - T191996 (duration: 01m 17s)
  • 13:47 herron: updated puppet-run script to log using syslog and updated rsyslog config to direct puppet-agent logs to /var/log/puppet.log https://gerrit.wikimedia.org/r/425538
  • 13:44 sbisson@tin: Finished deploy [tilerator/deploy@46cc948]: Deploying tilerator@i18n everywhere (duration: 02m 04s)
  • 13:44 marostegui: Deploy schema change on db1101:3318 - T187089 T185128 T153182
  • 13:42 sbisson@tin: Started deploy [tilerator/deploy@46cc948]: Deploying tilerator@i18n everywhere
  • 13:40 gehel: dropping leftover keyspace v2 and v5 on maps / eqiad - T191655
  • 13:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 for alter table (duration: 01m 17s)
  • 13:31 moritzm: installing openssl updates
  • 13:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 (duration: 01m 17s)
  • 13:22 gehel: i18n maps will not be available yet, this is only preliminary work
  • 13:22 gehel: deploying maps internationalization, including new keyspace and generating new tiles - T191655
  • 13:18 zeljkof: EU SWAT finished
  • 13:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Page Previews for 10% enwiki anon users (T189906) (duration: 01m 18s)
  • 13:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 with full weight (duration: 01m 17s)
  • 12:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 after alter table (duration: 01m 17s)
  • 12:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from API - T191996 (duration: 01m 17s)
  • 12:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 with low weight (duration: 01m 19s)
  • 12:13 marostegui: Deploy schema change on s8 dbstore1002 - T187089 T185128 T153182
  • 11:59 moritzm: pooling mw1279 for some brief test production traffic
  • 09:58 jynus: reimage es1012, take 2
  • 08:12 marostegui: Drop table linkscc from s3 codfw primary master
  • 08:11 marostegui: Drop table linkscc from s1
  • 07:55 marostegui: Drop table linkscc from s2 and s7
  • 07:50 marostegui: Drop table linkscc from s4, s5 and s6
  • 07:41 jynus: reimage es1012
  • 07:40 moritzm: enabling production traffic for mw1265
  • 07:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 after alter table - T190780 (duration: 01m 16s)
  • 07:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table - T190780 (duration: 01m 17s)
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 after alter table - T190780 (duration: 01m 17s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table - T190780 (duration: 01m 17s)
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 after alter table - T190780 (duration: 01m 16s)
  • 06:42 marostegui: Deploy schema change on db1072 (sanitarium master for s3) - this will generate lag on s3 labsdb - T190780
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 for alter table - T190780 (duration: 01m 18s)
  • 06:27 marostegui: Deploy schema change on s3 codfw master (db2043) - this will generate lag on s3 codfw - T190780
  • 06:24 marostegui: Deploy schema change on s1 primary master (db1052) - T190780
  • 06:11 marostegui: Deploy schema change on s7 primary master (db1062) - T190780
  • 06:08 elukey: force kill of fuse_dfs (handling /mnt/hdfs) on stat1004, apparently causing a huge load
  • 06:05 elukey: force kill of fuse_dfs (handling /mnt/hdfs) on stat1005, apparently causing a huge load
  • 05:52 marostegui: Deploy schema change on s2 primary master (db1054) - T190780
  • 05:49 marostegui: Deploy schema change on s8 primary master (db1071) - T190780
  • 05:45 marostegui: Deploy schema change on s4 primary master (db1068) - T190780
  • 05:39 marostegui: Deploy schema change on s6 primary master (db1061) - T190780
  • 05:34 marostegui: Deploy schema change on s5 primary master (db1070) - T190780
  • 05:27 marostegui: Deploy schema change on db1109 - T187089 T185128 T153182
  • 05:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 for alter table (duration: 01m 17s)
  • 05:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 after alter table (duration: 01m 18s)
  • 05:11 marostegui: Reload haproxy on dbproxy1011 to repool labsdb1009
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 07m 20s)
  • 01:34 eileen: civicrm revision changed from 07bade75a2 to b3326dbf70, config revision is 853fcc9111 (deploy wmffraud report)
  • 00:44 twentyafterfour: The hotfix that I deployed for phabricator: https://phabricator.wikimedia.org/rPHEX7801b519442eea2bfd47a272ba36959b487ae7d7
  • 00:33 twentyafterfour: phabricator: hotfixing DeadlineEditEngineSubtype.php
  • 00:23 twentyafterfour: phabricator is back
  • 00:18 twentyafterfour: phabricator will be offline for just a moment while I run the upgrade script.
  • 00:15 twentyafterfour: preparing to deploy phabricator rPHDEP/release/2018-04-12/1 https://phabricator.wikimedia.org/project/view/3335/
  • 00:09 mutante: jerkins-bot tests all return -1 due to operations-mw-config-php55lint failing which says it can't clone on integration-slave-jessie-1003, which is out of disk space in /srv as reported by shinken. it's mostly all /srv/pbuilder
  • 00:08 twentyafterfour: phabricator update will begin shortly, running a bit behind due to a massive upstream merge which will have to wait until a later date.
  • 00:08 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/425723/ (duration: 01m 18s)

2018-04-11

  • 23:48 ejegg: enabled new civicrm contact de-dupe job
  • 23:19 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow sysops to create Flow boards on euwiki (T190500) (duration: 01m 17s)
  • 23:09 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Stop logging autopatrol actions everywhere (T184485) (duration: 01m 18s)
  • 22:47 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalPreferences T184121 (duration: 01m 17s)
  • 22:47 mutante: ores2* - puppet ran to change venv config, then 'rm -rf /srv/deployment/ores/venv/' via cumin to clean-up (T181071)
  • 22:41 mutante: ores1002-1009 - deleting old venv dir - rm -f /srv/deployment/ores/venv (T181071)
  • 22:37 mutante: ores1001 - rm -rf /srv/deployment/ores/venv/
  • 22:37 mutante: ores - same for codfw instances, change of venv path to /srv/deployment/ores/deploy/venv/
  • 22:30 mutante: ores - all eqiad instances are being restarted by puppet after config change
  • 22:28 mutante: ores - running puppet on all instances to apply venv path change for T181071
  • 22:24 musikanimal@tin: Synchronized wmf-config/InitialiseSettings.php: Enabling PageAssessments on huwiki (T191697) (duration: 01m 17s)
  • 22:23 bstorm_: views updated on labsdb1009
  • 22:13 musikanimal@tin: Synchronized wmf-config/InitialiseSettings.php: Enabling PageAssessments on frwiki (T153393) (duration: 01m 26s)
  • 20:36 urandom: increase change-prop sample rate in dev env to 40% (from 20) -- T186751
  • 20:20 awight@tin: Finished deploy [ores/deploy@b6deb5d]: Transitional virtualenv for ORES (take 2), T181071 (duration: 18m 34s)
  • 20:02 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.29 (duration: 01m 16s)
  • 20:02 awight@tin: Started deploy [ores/deploy@b6deb5d]: Transitional virtualenv for ORES (take 2), T181071
  • 20:00 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.29
  • 19:23 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.29
  • 19:11 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki back to 1.31.0-wmf.29
  • 19:09 thcipriani@tin: Synchronized php-1.31.0-wmf.29/includes/libs/rdbms/database: rdbms: fix transaction flushing in Database::close T191916 (duration: 01m 01s)
  • 18:47 urandom: restarting cassandra, dev environment (set -XX:+PerfDisableSharedMem) -- T186751
  • 18:11 mutante: deploy1001 is back on stretch once again - it has been removed from scap hosts though (T175288 T185275)
  • 17:40 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy page previews for anons on dewiki T191966 (duration: 00m 54s)
  • 17:30 sbisson@tin: Finished deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 everywhere (duration: 02m 27s)
  • 17:29 Krinkle: actually re-enabled puppet on graphite2001
  • 17:28 sbisson@tin: Started deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 everywhere
  • 17:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on wikis with <50 issues in high priority linter cats T190731 (duration: 00m 59s)
  • 16:53 sbisson@tin: Finished deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 to maps-test* (duration: 01m 16s)
  • 16:51 sbisson@tin: Started deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 to maps-test*
  • 16:44 elukey: restart hadoop hdfs namenodes on analytics100[12] to pick up HDFS Trash settings - T189051
  • 16:35 robh: cp2018 returned to service
  • 16:33 foks: See T191887
  • 16:24 robh: cp2011 returned to service
  • 16:23 marostegui: Reload haproxy on dbproxy1011 to depool labsdb1009
  • 16:14 elukey: reboot notebook1001 for kernel updates
  • 16:11 urandom: restarting cassandra, dev environment (testing default GC settings) -- T186751
  • 15:58 Krinkle: Re-enabled puppet and coal on graphite2001
  • 15:43 robh: cp2008 repooled after memory swap
  • 15:20 Krinkle: disabling coal service on graphite2001 and disabling puppet – T191239
  • 15:19 jynus: fixing grant issue on db1114
  • 15:14 ema: restart pybal on lvs1003 for logstash-{json,syslog} UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425253/
  • 15:08 ema: restart pybal on lvs1006 for logstash-{json,syslog} UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425253/
  • 15:06 robh: shutting down cp2008, cp2011, and cp2018 for onsite work
  • 15:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 (duration: 01m 00s)
  • 15:01 marlier: Stopping coal on graphite2001.codfw.wmnet for data replay
  • 14:54 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 01m 00s)
  • 14:54 gehel: starting rolling restart of elasticsearch cirrus / eqiad for jvm upgrade
  • 14:39 moritzm: rolling restart of restbase in eqiad to pick up openssl update
  • 14:38 Krinkle: Turned regular coal back on (T191239)
  • 14:37 ppchelko@tin: Finished deploy [cpjobqueue/deploy@a090a3c]: Fix the low priority jobs topic names (duration: 00m 38s)
  • 14:36 ppchelko@tin: Started deploy [cpjobqueue/deploy@a090a3c]: Fix the low priority jobs topic names
  • 14:15 jynus: start reimage of es2013
  • 14:14 marostegui: Deploy schema change on db1099:3318 - T187089 T185128 T153182
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 for alter table (duration: 01m 00s)
  • 14:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2013 (duration: 01m 00s)
  • 13:44 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3ba6580]: Enable second bulk of low-traffic jobs T190327 take 2 (duration: 00m 49s)
  • 13:44 ppchelko@tin: Started deploy [cpjobqueue/deploy@3ba6580]: Enable second bulk of low-traffic jobs T190327 take 2
  • 13:41 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327 (duration: 08m 27s)
  • 13:37 moritzm: rolling restart of restbase in codfw to pick up openssl update
  • 13:33 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 2/2 - T190327 (duration: 01m 00s)
  • 13:32 ppchelko@tin: Started deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327
  • 13:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 (duration: 01m 07s)
  • 13:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327
  • 13:27 marostegui: Drop prefstats table on s3 sanitarium master (db1072) this might cause lag on labs - T154490
  • 13:26 moritzm: installing java security updates on kafka/main cluster
  • 13:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 01m 00s)
  • 13:13 marostegui: Drop prefstats table on s1 codfw master - db2048 (this might generate lag on codfw) - T154490
  • 13:12 elukey: restart kafka brokers on kafka1012->23 for openjdk-7 upgrades
  • 13:09 marostegui: Drop prefstats table on s3 codfw master - db2043 (this might generate lag on codfw) - T154490
  • 13:01 vgutierrez: Reimage lvs4007 as stretch
  • 13:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2012 (duration: 01m 00s)
  • 12:39 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry #2) (duration: 01m 01s)
  • 12:32 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry) - T190327 (duration: 01m 00s)
  • 12:21 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 - T190327 (duration: 01m 01s)
  • 12:21 moritzm: enable production traffic for mw1265 (stretch app server) for a brief test period
  • 12:09 jynus: start reimage of es2012
  • 12:05 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2012 (duration: 01m 01s)
  • 11:47 jynus: start reimage of es2011
  • 11:09 ema: start pybal on lvs5001, test completed on lvs5003
  • 11:04 marostegui: Drop table prefstats in s7 - T154490
  • 10:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2011 (duration: 00m 59s)
  • 10:56 ema: stop pybal on lvs5001 to test requests through lvs5003, reimaged as stretch T191897
  • 10:50 moritzm: installing openssl updates
  • 10:43 marostegui: Drop table prefstats in s2 - T154490
  • 10:33 marostegui: Drop table prefstats in s4 - T154490
  • 10:31 marostegui: Drop table prefstats in s6 - T154490
  • 10:28 marostegui: Drop table prefstats in s5 - T154490
  • 10:04 jynus: start reimage of es2015
  • 10:00 moritzm: installing java security updates on kafka/jumbo cluster
  • 09:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2015 (duration: 01m 02s)
  • 09:52 moritzm: installing java security updates on kafka/analytics cluster
  • 09:29 arturo: doing some testing in labtestvirt2001 mounting instance's qcow2 files into /home/aborrero/mnt
  • 09:17 jynus: start reimage of es2014
  • 09:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 (duration: 01m 03s)
  • 09:03 ema: restart pybal on lvs1003 for UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425251/
  • 08:59 moritzm: reimaging mw1265 to stretch (T174431)
  • 08:18 jynus: rerunning eqiad misc backups
  • 08:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 as candidate master for x1 - T191275 (duration: 01m 03s)
  • 07:45 ema: cp2022: restart varnish-be due to child process crash https://phabricator.wikimedia.org/P6979 T191229
  • 07:27 marostegui: Stop MySQL on db2033 to copy its data away before reimaging - T191275
  • 07:08 vgutierrez: Reimaging lvs5003.eqsin as stretch (2nd attempt)
  • 06:49 elukey: restart Yarn Resource Manager daemons on analytics100[12] to pick up the new Prometheus configuration file
  • 06:20 marostegui: Stop MySQL on db2033 to clone db2069 - T191275
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 03s)
  • 06:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 01s)
  • 05:28 Krinkle: manual coal back-fill still running with the normal coal disabled via systemd. Will restore normal coal when I wake up.
  • 05:22 marostegui: Deploy schema change on codfw s8 master (db2045) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 05:17 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 41s)
  • 00:12 bstorm_: Updated views and indexes on labsdb1011

2018-04-10

  • 23:32 XioNoX: depooled eqsin due to router issue
  • 23:04 Krinkle: Seemingly from 22:53 - 23:03 global traffic dropped by 30-60%, presumably due to issues in eqiad where 10 Gbits dropped to 3 Gbits sharper than ever before.
  • 22:49 joal@tin: Finished deploy [analytics/refinery@33448cd]: Deploying fixes after todays deploy errors (duration: 04m 46s)
  • 22:45 joal@tin: Started deploy [analytics/refinery@33448cd]: Deploying fixes after todays deploy errors
  • 21:18 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35 (duration: 06m 27s)
  • 21:12 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35
  • 20:41 sbisson@tin: Finished deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot) (duration: 03m 45s)
  • 20:37 sbisson@tin: Started deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot)
  • 20:30 mutante: deploy1001 - reinstalled with stretch - re-adding to puppet (T175288)
  • 20:30 mutante: deploy1001 - reinstalled with jessie - re-adding to puppet (T175288)
  • 20:13 urandom: increasing change-prop sample rate to 20% (from 10) in dev environment -- T186751
  • 20:06 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki back to 1.31.0-wmf.28
  • 20:02 sbisson@tin: Finished deploy [kartotherian/deploy@6e4d666]: Deploying kartotherian pre-i18n everywhere (duration: 04m 34s)
  • 19:58 sbisson@tin: Started deploy [kartotherian/deploy@6e4d666]: Deploying kartotherian pre-i18n everywhere
  • 19:57 sbisson@tin: Finished deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n everywhere (duration: 00m 48s)
  • 19:56 sbisson@tin: Started deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n everywhere
  • 19:48 sbisson@tin: Finished deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n to maps-test* (duration: 00m 27s)
  • 19:48 sbisson@tin: Started deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n to maps-test*
  • 19:16 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.29 and rebuild l10n cache (duration: 66m 28s)
  • 18:10 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.29 and rebuild l10n cache
  • 18:07 Krinkle: Stopping coal on graphite1001 to manually repopulate for T191239
  • 18:04 otto@tin: Finished deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 3 (duration: 04m 54s)
  • 17:59 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 3
  • 17:58 otto@tin: Finished deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2 (duration: 01m 50s)
  • 17:56 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2
  • 17:56 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2^
  • 17:49 joal@tin: Finished deploy [analytics/refinery@b8ea97f]: Analytics weekly deploy - Move to spark 2 (duration: 03m 55s)
  • 17:48 joal@tin: (no justification provided)
  • 17:47 joal@tin: (no justification provided)
  • 17:45 joal@tin: Started deploy [analytics/refinery@b8ea97f]: Analytics weekly deploy - Move to spark 2
  • 17:43 chasemp: add static route to neutron poc instance range for codfw 172.16.128.0/21
  • 17:22 papaul: shutting down cp2022 for main board replacement
  • 17:20 awight@tin: Finished deploy [ores/deploy@d35a1e6]: Test deploy virtualenv on ores1001, with logging and forced failure (duration: 02m 44s)
  • 17:17 awight@tin: Started deploy [ores/deploy@d35a1e6]: Test deploy virtualenv on ores1001, with logging and forced failure
  • 17:07 awight@tin: Finished deploy [ores/deploy@1e18fa6]: Test deploy virtualenv on ores1001, with logging (duration: 02m 28s)
  • 17:05 awight@tin: Started deploy [ores/deploy@1e18fa6]: Test deploy virtualenv on ores1001, with logging
  • 16:57 thcipriani: starting branch cut of 1.31.0-wmf.29
  • 16:45 andrew@tin: Synchronized wmf-config/CommonSettings.php: disable new accounts on labtestwikitech (duration: 01m 00s)
  • 16:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2045 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 16:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2045 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 16:21 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011
  • 16:11 marostegui: Stop MySQL on db2045 (s8 codfw master) to move it to another rack, this will break replication on codfw - T191193
  • 16:07 bstorm_: labsdb1010 now has the latest views available, including the comment table
  • 16:05 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 15:42 ottomata: disable puppet on analytics1003 and stop camus crons in preparation for spark 2 upgrade
  • 15:32 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010
  • 15:26 vgutierrez: Reimage lvs5003 as stretch
  • 15:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2040 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 15:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2040 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 15:08 volans: restarting Icinga on einsteinium, command file not working
  • 15:06 bd808: Wiki replicas: ran `sudo maintain-views --table page_assessments --database arwiki` on all 3 servers for T191455
  • 14:46 marostegui: Stop MySQL on db2040 for server move - this is s7 master, so replication will break in codfw T191193
  • 14:23 volans: restarted nsca server on einsteinium
  • 14:21 vgutierrez: re-enable puppet on primary LVS
  • 14:17 moritzm: installing python-crypto security updates on trusty
  • 13:55 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T188198 Enable TemplateStyles on ruwiki (duration: 01m 00s)
  • 13:51 vgutierrez: disable puppet on primary LVS to merge safely gerrit/425040 T177961
  • 13:47 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/: SWAT: Restore subtract method for backward compatibility (T191696) (duration: 01m 01s)
  • 13:41 moritzm: upgraded HHVM on mediawiki-deployment04/05/06 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 13:35 elukey: restart kafka on kafka-jumbo1001 for openjdk upgrades
  • 13:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Update wikis with consolidate editing feedback" (T168886) (duration: 00m 59s)
  • 13:29 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/: SWAT: Disable search for global filters (T191539) (duration: 01m 01s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update wikis with consolidate editing feedback (T168886) (duration: 01m 00s)
  • 13:19 ema: restart pybal on lvs1006 for config changes introduced by https://gerrit.wikimedia.org/r/#/c/425251/
  • 12:02 moritzm: upgrading naos and wasat to ICU57-enabled build of HHVM
  • 12:01 _joe_: uploading mcrouter 0.37.0 to stretch-wikimedia (T190979)
  • 11:59 _joe_: uploading mcrouter 0.37.0 to jessie-wikimedia (T190979)
  • 11:15 mobrovac@tin: Finished deploy [restbase/deploy@29df9db]: Use the MCS-provided content-type in the definition response - T191809 (duration: 24m 19s)
  • 11:07 moritzm: upgrading mwdebug servers in codfw to ICU57-enabled build of HHVM
  • 10:51 mobrovac@tin: Started deploy [restbase/deploy@29df9db]: Use the MCS-provided content-type in the definition response - T191809
  • 10:47 arturo: T188266 reimage labtestservices2002.wikimedia.org
  • 10:23 moritzm: upgrading job runners in codfw to ICU57-enabled build of HHVM
  • 09:29 moritzm: upgrading app servers in codfw to ICU57-enabled build of HHVM
  • 07:52 hoo: Updated operations/dumps/dcat (7ea4e75c..61154ca4) on snapshot1007
  • 07:37 moritzm: upgrading API servers in codfw to ICU57-enabled build of HHVM
  • 05:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2069 from config - T191275 (duration: 00m 58s)
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2069 from config - T191275 (duration: 00m 59s)
  • 05:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 after alter table (duration: 01m 11s)
  • 05:17 marostegui: Deploy alter table on s1 primary master (db1052) - T185128 T153182
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 39s)

2018-04-09

  • 21:11 XioNoX: cr1-eqsin 24h experiment on applying same local-pref to peers and transits - T186835
  • 20:48 arlolra: Updated Parsoid to edeeb60 (T191281, T187386, T185266)
  • 20:38 awight@tin: Finished deploy [ores/deploy@be69c1d]: Transitional virtualenv for ORES, T181071 (duration: 24m 14s)
  • 20:32 arlolra@tin: Finished deploy [parsoid/deploy@447fab2]: Updating Parsoid to edeeb60 (duration: 11m 03s)
  • 20:21 arlolra@tin: Started deploy [parsoid/deploy@447fab2]: Updating Parsoid to edeeb60
  • 20:14 awight@tin: Started deploy [ores/deploy@be69c1d]: Transitional virtualenv for ORES, T181071
  • 20:12 awight@tin: Finished deploy [ores/deploy@b61c338]: Transitional virtualenv for ORES, T181071 (duration: 00m 19s)
  • 20:12 awight@tin: Started deploy [ores/deploy@b61c338]: Transitional virtualenv for ORES, T181071
  • 20:01 herron: repooled rhodium (puppet master backend) https://gerrit.wikimedia.org/r/425078
  • 19:57 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2017.codfw.wmnet
  • 19:26 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Switch SET on frwiktionary to use wikitexteditor by default (T169741) (duration: 01m 00s)
  • 19:17 sbisson@tin: Finished deploy [kartotherian/deploy@a26712b]: Deploying kartotherian i18n to maps-test* (with updated source and style) (duration: 01m 46s)
  • 19:15 sbisson@tin: Started deploy [kartotherian/deploy@a26712b]: Deploying kartotherian i18n to maps-test* (with updated source and style)
  • 18:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 18:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2010.codfw.wmnet
  • 18:58 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable PageAssessments on arwiki (T185023) (duration: 01m 00s)
  • 18:50 papaul: shutting down cp2017 for memory replacement
  • 18:37 papaul: shutting down cp2010 for memory replacement
  • 18:21 papaul: shutting down cp2006 for memory replacement
  • 18:04 gehel@tin: Finished deploy [wdqs/wdqs@7116a56]: new GUI version (duration: 02m 11s)
  • 18:01 gehel@tin: Started deploy [wdqs/wdqs@7116a56]: new GUI version
  • 17:58 papaul: shutting down cp2022 for memory replacement
  • 16:53 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2017.codfw.wmnet
  • 16:53 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2010.codfw.wmnet
  • 16:52 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 15:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:28 dereckson@tin: Synchronized wmf-config/flaggedrevs.php: Always show latest revision even if not reviewed on hu.wikipedia (T121995) (duration: 00m 59s)
  • 14:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:11 marostegui: Deploy schema change on db1067 - T187089 T185128 T153182
  • 14:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 for alter table (duration: 00m 59s)
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 13:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2092 in s1 T170662 (duration: 00m 59s)
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RelatedArticles for vector at hewiki (T191573) (duration: 00m 59s)
  • 13:43 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add adm.dp.gov.ua to wgCopyUploadDomains, change if.gov.ua to www.if.gov.ua (T191692) (duration: 00m 59s)
  • 13:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix broken line that includes a group into a group by mistake (T191719) (duration: 00m 59s)
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable <mapframe> on ku.wikipedia (T190944) (duration: 00m 57s)
  • 13:14 moritzm: upgrading Boost libraries on mwdebug with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 13:14 _joe_: started updateCollation.php maintenance script for the ICU 57 migration (T189295)
  • 13:03 marostegui: Stop MySQL on db1080 for mariadb and kernel upgrade
  • 13:03 _joe_: upgrading HHVM / libboost for ICU 57 upgrade (T189295)
  • 13:01 sbisson@tin: Finished deploy [tilerator/deploy@aef010b]: Deploying tilerator i18n to maps-test* (with updated source and style) (duration: 00m 33s)
  • 13:00 sbisson@tin: Started deploy [tilerator/deploy@aef010b]: Deploying tilerator i18n to maps-test* (with updated source and style)
  • 12:54 moritzm: upgrading Boost libraries on mwdebug with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 12:39 moritzm: upgrading Boost libraries on job runners with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 12:23 _joe_: preparing to run updateCollation from mw1338: stop videoscaler, disable puppet (T189295)
  • 12:05 _joe_: upgrading boost, hhvm on terbium for ICU 57 upgrade (T189295)
  • 12:01 elukey: upgrading Boost libraries on all mediawiki eqiad API server with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 11:50 moritzm: upgrading Boost libraries on remaining app servers with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 11:42 moritzm: removed profile::beta::icu57 from deployment-prep Hiera config now that the component is part of the standard app server manifests
  • 11:04 moritzm: upgrading Boost libraries on API server canaries with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:41 moritzm: upgrading Boost libraries on mw1300 with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:31 moritzm: upgrading Boost libraries on app server canaries with a ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:15 moritzm: upgrading tin/deploy1001 to a ICU 57-enabled HHVM build (T189295)
  • 10:13 elukey: completed upgrade of mw eqiad api appservers to ICU 57-enabled HHVM
  • 10:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 10:09 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 09:54 moritzm: upgrading mwdebug servers in eqiad to ICU 57-enabled HHVM build (T189295)
  • 09:33 _joe_: all eqiad jobrunners migrated to ICU 57 (T189295)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db2092 to the config - T170662 (duration: 00m 59s)
  • 09:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db2092 to the config - T170662 (duration: 00m 58s)
  • 08:45 elukey: upgrading eqiad api appservers to ICU 57-enabled HHVM build (T189295)
  • 08:37 marostegui: Deploy schema change on db1080 - T187089 T185128 T153182
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 for alter table (duration: 00m 59s)
  • 08:35 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 (duration: 00m 59s)
  • 08:32 moritzm: upgrading remaining app servers in eqiad to ICU 57-enabled HHVM build (T189295)
  • 08:32 _joe_: upgrading eqiad jobrunners to ICU 57-enabled HHVM build (T189295)
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 after alter table (duration: 00m 58s)
  • 07:56 marostegui: Remove /var/log/wikidata/rebuildTermSqlIndex.log* as per Amir1's request
  • 07:48 moritzm: upgrading mw1276-1279 (API canaries) to ICU 57-enabled HHVM build (T189295)
  • 07:42 _joe_: repooling mw1300 now with ICU 57-enabled HHVM build (T189295)
  • 07:38 _joe_: upgrading mw1300 to ICU 57-enabled HHVM build (T189295)
  • 07:32 moritzm: upgrading mw1262-1265 to ICU 57-enabled HHVM build (T189295)
  • 07:24 moritzm: repooling mw1261 after upgrade to ICU 57-enabled HHVM build (T189295)
  • 07:17 moritzm: upgrading mw1261 to ICU 57-enabled HHVM build (T189295)
  • 07:09 elukey: upgrade burrow to 1.0 on kafkamon[12]* - T188719
  • 06:58 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=zhwiktionary --check-old --before 20180223210426 --sleep 2 (T184485)
  • 06:43 marostegui: Reboot db2072 for kernel upgrade
  • 06:41 marostegui: Stop MySQL on db2072 to clone db2092 from it - T170662
  • 06:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2072 - T170662 (duration: 00m 59s)
  • 06:24 elukey: upgrade burrow 1.0.0 to stretch/jessie wikimedia
  • 06:21 marostegui: Reboot db2092 for mariadb and kernel upgrade
  • 06:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2079 is now s8 candidate master (duration: 00m 59s)
  • 05:54 marostegui: Stop MySQL on db2079 to change its binlog format
  • 05:34 marostegui: Deploy schema change on db1106 with replication enabled (this will generate lag on labs replicas) - T187089 T185128 T153182
  • 05:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 for alter table (duration: 01m 00s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 57s)

2018-04-07

  • 23:44 Dereckson: OATHAuth disabled for Wikimedia SUL global account Barek (T191708)
  • 07:28 legoktm: disabled and cleaned up spam from @Farjksn on Phabricator
  • 00:14 mutante: bromine - scheduled downtime, reboot for reinstall, upgrade to stretch, misc_static_services switched to codfw (T188163)

2018-04-06

  • 22:35 mutante: rsyncing bugzilla-static raw html from eqiad to codfw VM
  • 19:59 herron: moved rhodium:/var/lib/git/operations/puppet away and triggered puppet agent run to re-create
  • 19:43 ottomata: running puppet-merge on rhodium after clash between puppet-merge and new patch submitted
  • 19:23 demon@tin: Finished scap: Forcing full scap. Mostly no-op, consistency, paranoia, that sort of thing (duration: 11m 51s)
  • 19:13 bd808: wiki replicas: ran maintain-views --database mediawikiwiki --clean on labsdb10{09,10,11} for T191387
  • 19:11 demon@tin: Started scap: Forcing full scap. Mostly no-op, consistency, paranoia, that sort of thing
  • 19:02 demon@tin: scap aborted: Forcing full scap, removed clean plugin updates (duration: 11m 03s)
  • 19:00 herron: depooled rhodium (puppet master backend) again https://gerrit.wikimedia.org/r/#/c/424646/
  • 18:51 demon@tin: Started scap: Forcing full scap, removed clean plugin updates
  • 18:49 demon@tin: scap failed: average error rate on 5/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 18:47 demon@tin: Pruned MediaWiki: 1.31.0-wmf.26 [keeping static files] (duration: 01m 51s)
  • 14:37 herron: repooled rhodium (puppet master backend)
  • 14:08 herron: upgraded apache on fermium for security updates
  • 14:07 anomie: Running populateArchiveRevId.php for group2 for T191307
  • 14:03 herron: apache updated on puppet masters — re-enabling puppet agents
  • 13:55 herron: temporarily disabling puppet agents for apache security update on puppet masters
  • 13:14 moritzm: installing apache security updates on thorium (running several analytics web services)
  • 12:38 moritzm: installing apache security updates on the Kibana nodes of the logstash cluster
  • 11:50 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=fawiki --before 20180223210426 --sleep 2 (T184485)
  • 10:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1114 (duration: 01m 00s)
  • 09:45 moritzm: installing apache security updates on graphite hosts
  • 09:39 marostegui: Deploy test alter table on db2038 to test osc_host.py in core
  • 09:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 09:24 moritzm: installing apache security updates on planet1001/planet.wikimedia.org
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:57 no_justification: gerrit: restarting services to pick up openjdk updates
  • 08:50 moritzm: installing apache security updates on prometheus hosts
  • 08:45 no_justification: installed apache updates to gerrit2001/cobalt
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:41 moritzm: installing apache security updates on mwlog*
  • 08:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:28 moritzm: installing apache security updates on releases.wikimedia.org
  • 08:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 59s)
  • 08:07 elukey: upgrade prometheus-burrow-exporter on kafkamon1001/2001 - T188719
  • 08:07 elukey: upload prometheus-burrow-exporter 0.0.5 to jessie/stretch-wikimedia - T188719
  • 08:00 marostegui: Stop MySQL on db1114 for kernel and mariadb upgrade
  • 07:40 moritzm: removed mediawiki-deployment07 from deployment-prep (T191578)
  • 07:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 after changing binlog format, upgrade mariadb and kernel (duration: 00m 59s)
  • 06:33 marostegui: Stop MySQL on db2047 for binlog format change, upgrade kernel and mariadb
  • 06:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 to change binlog format, upgrade mariadb and kernel (duration: 00m 59s)
  • 06:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046 as candidate master (duration: 00m 59s)
  • 05:59 marostegui: Restart MySQL on db2046 to change its binlog format - T191275
  • 05:44 marostegui: Deploy schema change on db1114 - T187089 T185128 T153182
  • 05:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 for alter table (duration: 00m 53s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after alter table (duration: 00m 55s)

2018-04-05

  • 21:44 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2017.codfw.wmnet
  • 21:44 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet
  • 21:43 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 21:43 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2010.codfw.wmnet
  • 21:34 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet
  • 21:09 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet,service=varnish-be
  • 20:10 twentyafterfour@tin: Synchronized php-1.31.0-wmf.28/extensions/Echo/: Sync https://gerrit.wikimedia.org/r/#/c/424379/ refs T183967 (duration: 01m 05s)
  • 20:07 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/424379/ refs T191335
  • 19:59 herron: added rhodium puppet master backend in offline mode
  • 19:52 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.28 refs T183967
  • 19:51 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet
  • 18:45 catrope@tin: Synchronized wmf-config/Wikibase-production.php: Disable writing wb_terms search fields on Wikidata (T189777) (duration: 01m 16s)
  • 18:25 catrope@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/includes/Views/AbuseFilterViewList.php: Unbreak Special:AbuseFilter (T191512) (duration: 01m 17s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable logging autopatrol actions on commonswiki (T184485) (duration: 01m 17s)
  • 17:56 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --before 20180223210426 --from-id 156008475 (T184485)
  • 17:42 Amir1: finished the script
  • 17:33 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --before 20180223210426 (T184485)
  • 17:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@eed7961]: Update mobileapps to dbc0687 (T187430) (duration: 09m 45s)
  • 17:09 bsitzmann@tin: Started deploy [mobileapps/deploy@eed7961]: Update mobileapps to dbc0687 (T187430)
  • 16:41 robh: cp2008 shutting down for firmware updates
  • 16:09 vgutierrez: updating librdkafka1 to 0.11.3 on cache text
  • 15:54 vgutierrez: updating librdkafka1 to 0.11.3 on cache upload
  • 15:44 vgutierrez: updating librdkafka1 to 0.11.3 on cache misc
  • 15:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2039 IP as it is being moved to a different rack - T191193 (duration: 01m 17s)
  • 15:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2039 IP as it is being moved to a different rack - T191193 (duration: 01m 17s)
  • 15:26 vgutierrez: uploaded pybal 1.15.3 for stretch on apt.w.o
  • 15:17 jynus: stopping mariadb on db2039 T191193
  • 14:59 moritzm: installing apache security updates
  • 14:54 marostegui: Deploy schema change on db1066 - T187089 T185128 T153182
  • 14:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for alter table (duration: 01m 17s)
  • 14:43 moritzm: uploaded apache2 2.4.10-10+deb8u12+wmf1 to apt.wikimedia.org/jessie-wikimedia (rebase of our local patches against the latest DSA)
  • 14:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2053 is no longer a candidate master (duration: 01m 17s)
  • 14:03 andrew@tin: Finished deploy [horizon/deploy@cd1cda6]: Deploying potential fix for T191232 (duration: 03m 17s)
  • 14:00 andrew@tin: Started deploy [horizon/deploy@cd1cda6]: Deploying potential fix for T191232
  • 13:41 anomie: Running populateArchiveRevId.php on group 1 for T191307
  • 13:39 zeljkof: EU SWAT finished
  • 13:32 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter: SWAT: Make $mode optional for checkAllFilters (T191468) (duration: 01m 20s)
  • 13:23 marostegui: Stop MySQL on db2053 for binlog format change
  • 13:09 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Stop logging autopatrol actions in wikidatawiki (T184485) (duration: 01m 16s)
  • 12:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 after alter table (duration: 01m 17s)
  • 12:52 Amir1: finished the script
  • 12:41 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=wikidatawiki --before 20180223210426 (T189596)
  • 12:30 moritzm: installing net-snmp security updates on jessie (stretch not affected)
  • 12:12 ariel@tin: Finished deploy [dumps/dumps@88ca17c]: fix monitor to import status module after refactor (duration: 00m 04s)
  • 12:12 ariel@tin: Started deploy [dumps/dumps@88ca17c]: fix monitor to import status module after refactor
  • 12:04 hoo: Manually back-filled hashes for the Wikidata JSON dumps in https://dumps.wikimedia.org/wikidatawiki/entities/20180402/wikidata-20180402-*sums.txt (T190457)
  • 11:58 vgutierrez: updating libssl1.1 to 1.1.0h on cache text cluster (and nginx restart)
  • 11:36 vgutierrez: updating libssl1.1 to 1.1.0h on cache upload cluster (and nginx restart)
  • 11:22 vgutierrez: updating libssl1.1 to 1.1.0h on cache misc cluster (and nginx restart)
  • 10:57 jynus: restart dbstore1001 for RAID re-setup and reimage
  • 10:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Specify that db1106 is sanitarium's master (duration: 01m 16s)
  • 10:33 marostegui: Deploy schema change on db1083 - T187089 T185128 T153182
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 for alter table (duration: 01m 17s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 after alter table (duration: 01m 16s)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight (duration: 01m 17s)
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 17s)
  • 08:30 jynus: starting backup of es2019, it may create lag T153440
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 16s)
  • 08:23 moritzm: installing net-snmp security updates on jessie (stretch not affected)
  • 08:16 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2019 (duration: 01m 16s)
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 16s)
  • 07:52 moritzm: removed unused/defunct deployment-videoscaler01 from deployment-prep (T191293)
  • 07:51 moritzm: removed unused/defunct deployment-tmh01 from deployment-prep (T191293)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 after alter table, mariadb and kernel upgrade (duration: 01m 16s)
  • 07:44 moritzm: upgrading openjdk-7 on contint*
  • 07:36 marostegui: Stop MySQL on db1089 for kernel and mariadb upgrade
  • 07:33 marostegui: Deploy schema change on db1105:3311 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 for alter table (duration: 01m 16s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2053 as candidate master (duration: 01m 09s)
  • 07:05 marostegui: Restart MySQL on db2053 for binlog format change
  • 06:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 (duration: 01m 13s)
  • 06:43 marostegui: Stop MySQL on db2038 to change binlog format, upgrade mariadb and kernel
  • 06:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2038 (duration: 01m 17s)
  • 06:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2058 is now a candidate master for s4 - T191275 (duration: 01m 16s)
  • 05:58 marostegui: Restart MySQL on db2058 to change its binlog to STATEMENT - T191275
  • 05:52 marostegui: Deploy schema change on db1089 - T187089 T185128 T153182
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 for alter table (duration: 01m 16s)
  • 05:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 after alter table (duration: 01m 18s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 07m 06s)

2018-04-04

  • 23:53 andrew@tin: Finished deploy [horizon/deploy@2c55bd5]: (no justification provided) (duration: 03m 10s)
  • 23:50 andrew@tin: Started deploy [horizon/deploy@2c55bd5]: (no justification provided)
  • 23:42 catrope@tin: Synchronized php-1.31.0-wmf.27/extensions/VisualEditor/lib/ve: Fix VE drag-and-drop bugs (T191103) (duration: 01m 17s)
  • 23:36 catrope@tin: Synchronized php-1.31.0-wmf.28/resources/src/mediawiki.rcfilters/: Fix missing bookmark icon (T191366) (duration: 01m 16s)
  • 23:12 catrope@tin: Synchronized wmf-config/CommonSettings.php: Set $wgVisualEditorSourceFeedbackTitle (no-op until later) (T157953) (duration: 01m 16s)
  • 23:09 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add Txikipedia namespace on euwiki (T191396) (duration: 01m 18s)
  • 22:54 akosiaris: increase the number of mathoid pods to 16 from 4
  • 21:53 bd808: Wiki replicas: ran `sudo maintain-views --table page_assessments --database trwiki` on all 3 servers for T191455
  • 20:27 arlolra: Updated Parsoid to d887aff (T177102, T189474)
  • 20:22 twentyafterfour@tin: Synchronized php-1.31.0-wmf.28/skins/MonoBook: sync https://gerrit.wikimedia.org/r/#/c/424041/ (duration: 01m 16s)
  • 20:22 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0460519]: Update mobileapps to 2d5ab5b (duration: 05m 58s)
  • 20:18 arlolra@tin: Finished deploy [parsoid/deploy@a8e759f]: Updating Parsoid to d887aff (duration: 11m 58s)
  • 20:16 mholloway-shell@tin: Started deploy [mobileapps/deploy@0460519]: Update mobileapps to 2d5ab5b
  • 20:15 mholloway-shell@tin: Started deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88
  • 20:06 arlolra@tin: Started deploy [parsoid/deploy@a8e759f]: Updating Parsoid to d887aff
  • 19:19 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.28 refs T183967 (duration: 01m 16s)
  • 19:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.28 refs T183967
  • food: re-enabled thank you mailer
  • 19:03 hasharAway: upgraded blubber 0.2.0-1 -> 0.3.0-1 on contint1001 and contint2001
  • 18:17 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0125bc4]: Fixed the new metrics names. Again (duration: 00m 37s)
  • 18:17 ppchelko@tin: Started deploy [cpjobqueue/deploy@0125bc4]: Fixed the new metrics names. Again
  • 18:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0185e74]: Fix the metric names and support multi-topic rules (duration: 00m 35s)
  • 18:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@0185e74]: Fix the metric names and support multi-topic rules
  • 17:54 madhuvishy: Reset ttl for dumps.wikimedia.org CNAME to 1H post switchover to labstore1007 T188646
  • 17:26 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:422414 Enable TemplateStyles on dewiki T190910 (duration: 01m 17s)
  • 17:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on all wikiquotes except frwikiquote T190726 (duration: 01m 17s)
  • 17:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on all wikimedia wikis T188881 (duration: 01m 18s)
  • 16:58 ppchelko@tin: Finished deploy [cpjobqueue/deploy@60a2292]: Revert: Support multi-topic rules (duration: 00m 21s)
  • 16:58 ppchelko@tin: Started deploy [cpjobqueue/deploy@60a2292]: Revert: Support multi-topic rules
  • 16:55 robh: dbstore1001 rebooting for bios firmware update
  • 16:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d4a84ae]: Support multi-topic rules T191238 (duration: 00m 42s)
  • 16:47 ppchelko@tin: Started deploy [cpjobqueue/deploy@d4a84ae]: Support multi-topic rules T191238
  • 16:26 madhuvishy: Move cert for dumps.wikimedia.org to labstore1007 (do_acme: true) T188646
  • 16:22 madhuvishy: Change CNAME for dumps.wikimedia.org to labstore1007 T188646
  • 15:44 jynus: starting backup from es2015 (will create lag)
  • 15:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 (duration: 01m 17s)
  • 15:20 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Clean up config for the rest of high-traffic jobs after the switch - T190327 (duration: 01m 16s)
  • 15:14 madhuvishy: Update ttl for dumps.wikimedia.org CNAME to 1M in prep for switchover to labstore1007 T188646
  • 15:07 mobrovac@tin: Started restart [restbase/deploy@f3a53b6]: Pick up the net.ipv4.tcp_tw_reuse flag change - T190213
  • 15:06 elukey: delete /srv/deployment/prometheus from restbase* as clean up step for T181728
  • 14:30 anomie: Running populateArchiveRevId.php on group0 wikis for T191307
  • 14:20 elukey: apply net.ipv4.tcp_tw_reuse=1 to restbase* via https://gerrit.wikimedia.org/r/#/c/421901 - T190213
  • 14:15 moritzm: updating deployment-prep to HHVM 3.18.5+wmf6
  • 14:11 godog: purge cron smart-data-dump from lvs100[1-6]
  • 14:09 marostegui: Deploy schema change on db1099:3311 - T187089 T185128 T153182
  • 14:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 for alter table (duration: 01m 16s)
  • 14:08 moritzm: uploaded HHVM 3.18.5+wmf6 to component/icu57 for jessie-wikimedia (updated build with the security fix for CVE-2018-6334)
  • 13:59 marostegui: Deploy schema change on dbstore1002:s1 - T187089 T185128 T153182
  • 13:56 godog: rollout https://gerrit.wikimedia.org/r/c/423852 across ms-fe machines - T183902
  • 13:32 zeljkof: EU SWAT finished
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add namespace to euwiki" (T191396) (duration: 01m 14s)
  • 13:08 godog: upgrade smartmontools to -backports version after https://gerrit.wikimedia.org/r/c/423871/
  • 12:02 elukey: removing /srv/deployment/prometheus from restbase2001/1007 - T181728
  • 12:00 akosiaris: revert scb hosts to apertium-fra-cat_1.2.0~r78602-1+wmf2
  • 11:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2057 is now a candidate master for s3 - T191275 (duration: 01m 17s)
  • 11:13 akosiaris: upgrade apertium on all scb hosts. Rolling update in groups of 2 hosts with a 30-second delay
  • 11:06 marostegui: Stop MySQL on db2057 for binlog format change, mariadb and kernel upgrade
  • 11:02 akosiaris: upgrade apertium on scb1001
  • 09:46 marostegui: Deploy schema change on s1 codfw master db2048 (this will generate lag on codfw) - T187089 T185128 T153182
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1077 (duration: 01m 16s)
  • 09:25 Amir1: end of the deleteAutoPatrolLogs.php script on mediawikiwiki (T184485)
  • 09:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2041 is now a candidate master for s2 - T191275 (duration: 01m 16s)
  • 09:16 elukey: executed systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka1020
  • 09:02 Amir1: start of mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --before 20180223210426 --sleep 2 (T184485)
  • 09:02 marostegui: Stop MySQL on db2041 for binlog format change and kernel upgrade
  • 09:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2041 (duration: 01m 17s)
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1072 (duration: 01m 17s)
  • 08:19 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --check-old --before 20160423210426 (T184485)
  • 08:17 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --dry-run --check-old --before 20160423210426
  • 08:08 marostegui: Deploy schema change on s3 primary master (db1075) - T153182 T185128
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1072 (duration: 01m 17s)
  • 07:59 godog: depool ms-fe2005 to test rewrite.py - T183902
  • 07:53 marostegui: Drop flaggedrevs from s3 mediawikiwiki - T186865
  • 07:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2055 is now a candidate master - T191275 (duration: 01m 16s)
  • 07:37 moritzm: running some apache/stretch tests on mw2261
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083 - T188279 (duration: 01m 17s)
  • 07:30 ema: finish up cache@eqiad reboots for retpoline kernel updates T188092
  • 07:26 marostegui: Restart MySQL on db2055 to change its binlog to STATEMENT - T191275
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db2083 - T188279 (duration: 01m 17s)
  • 05:48 marostegui: Deploy schema change on db1072 - s3 - with replication. This will generate lag on labs T187089 T185128 T153182
  • 05:43 marostegui: Drop click_tracking_events table from where it still exists - T115982
  • 05:21 marostegui: Stop mariadb for upgrade and kernel upgrade on db1072 - this will generate lag on s3 labs
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 for alter table, kernel and mariadb upgrade (duration: 01m 17s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 05m 31s)
  • 01:02 eileen: update civicrm - civicrm revision changed from d6855cd281 to 7010f0f5d6, config revision is 3b900436c9

2018-04-03

  • 23:55 XioNoX: re-activating graceful-switchover on cr1-codfw - T189588
  • 23:16 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Make a note about the loading order of GlobalPreferences and Echo (Gerrit:422642) (no-op) (duration: 01m 17s)
  • 23:10 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Rollout VirtualPageViews (final stage) (T189906) (duration: 01m 19s)
  • 22:34 mutante: cobalt - puppet disabled temporarily to apply fix to "simplify directory structure" change .. on gerrit2001 first
  • 22:25 mutante: restarting Apache on phab1001 - T182832
  • 22:14 twentyafterfour: Finished MediaWiki Train for group0, 1.31.0-wmf.28 refs T183967
  • 22:12 twentyafterfour@tin: Pruned MediaWiki: 1.31.0-wmf.25 [keeping static files] (duration: 01m 55s)
  • 22:10 twentyafterfour@tin: Pruned MediaWiki: 1.31.0-wmf.24 [keeping static files] (duration: 04m 18s)
  • 21:30 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.28 refs T183967
  • 21:15 twentyafterfour@tin: Finished scap: testwikis wikis to 1.31.0-wmf.28 refs T183967 (duration: 46m 38s)
  • 21:13 urandom: (re)starting restbase-dev1004-{a,b} (ooms), and enabling alternately patched cassandra 3.11.2 build - T186751
  • 20:29 twentyafterfour@tin: Started scap: testwikis wikis to 1.31.0-wmf.28 refs T183967
  • 20:22 ejegg: disabled thank you mail sender
  • 19:46 urandom: restarting restbase-dev1004-{a,b} to enable patched cassandra 3.11.2 build - T186751
  • 19:07 twentyafterfour: Preparing to deploy 1.31.0-wmf.28 refs T183967
  • 18:25 urandom: upgrading restbase-dev1006-b to cassandra 3.11.2 - T186751
  • 18:23 urandom: upgrading restbase-dev1006-a to cassandra 3.11.2 - T186751
  • 18:20 urandom: upgrading restbase-dev1005-b to cassandra 3.11.2 - T186751
  • 18:18 urandom: upgrading restbase-dev1005-a to cassandra 3.11.2 - T186751
  • 18:15 urandom: upgrading restbase-dev1004-b to cassandra 3.11.2 - T186751
  • 18:13 urandom: upgrading restbase-dev1004-a to cassandra 3.11.2 - T186751
  • 18:05 mutante: rhodium - closing idle screen session from maintenance work on puppetmasters
  • 18:03 mutante: elnath - fixing and re-enabling Icinga alert about screens, none are running, spare hosts should not have these
  • 17:59 mutante: restarting ferm on bromine
  • 17:40 elukey: manually set net.ipv4.tcp_tw_reuse=1 on restbase1007 as test for T190213
  • 17:35 sbisson@tin: Finished deploy [tilerator/deploy@8e68cb8]: Deploying tilerator i18n to maps-test* (take 2) (duration: 00m 25s)
  • 17:35 sbisson@tin: Started deploy [tilerator/deploy@8e68cb8]: Deploying tilerator i18n to maps-test* (take 2)
  • 17:28 awight@tin: Finished deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071 (duration: 01m 27s)
  • 17:28 sbisson@tin: Finished deploy [tilerator/deploy@03add2d]: Deploying tilerator i18n to maps-test* (duration: 04m 09s)
  • 17:27 awight@tin: Started deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071
  • 17:25 awight@tin: Finished deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071 (duration: 00m 27s)
  • 17:25 awight@tin: Started deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071
  • 17:24 sbisson@tin: Started deploy [tilerator/deploy@03add2d]: Deploying tilerator i18n to maps-test*
  • 17:24 moritzm: upgrading HHVM on labweb*
  • 17:18 jynus: reloading labsdb proxy configuration
  • 17:08 elukey: manually set net.ipv4.tcp_tw_reuse=1 on restbase2001 as test for T190213
  • 16:53 demon@tin: Finished deploy [gerrit/gerrit@aa1a1a0]: no-op, pushing empty motd.config file (duration: 00m 11s)
  • 16:53 demon@tin: Started deploy [gerrit/gerrit@aa1a1a0]: no-op, pushing empty motd.config file
  • 16:33 urandom: rebooting restbase-dev1006 - T186751
  • 16:10 urandom: rebooting restbase-dev1004 - T186751
  • 16:04 ariel@tin: Finished deploy [dumps/dumps@77dc467]: split up some large modules, prep work for prefetch changes (duration: 00m 04s)
  • 16:04 ariel@tin: Started deploy [dumps/dumps@77dc467]: split up some large modules, prep work for prefetch changes
  • 15:39 elukey: roll restart of zookeeper on conf100[123] to pick up prometheus monitoring
  • 15:09 godog: depool ms-fe2005 to test rewrite.py - T183902
  • 14:40 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch the remaining high-traffic jobs for all wikis, file 2/2 - T190327 (duration: 00m 59s)
  • 14:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@60a2292]: Switch all high traffic jobs to kafka T190327 (duration: 00m 44s)
  • 14:39 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the remaining high-traffic jobs for all wikis, file 1/2 - T190327 (duration: 00m 59s)
  • 14:38 ppchelko@tin: Started deploy [cpjobqueue/deploy@60a2292]: Switch all high traffic jobs to kafka T190327
  • 13:48 anomie@tin: Synchronized php-1.31.0-wmf.27/extensions/intersection/DynamicPageList.hooks.php: Backporting fix for T191116 (gerrit:423689) (duration: 00m 58s)
  • 13:47 anomie@tin: Synchronized php-1.31.0-wmf.27/includes/specials/SpecialWhatlinkshere.php: Backporting fix for T191116 (gerrit:423688) (duration: 00m 58s)
  • 13:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1077 after alter table (duration: 00m 58s)
  • 13:21 marostegui: Reimport s51541_sulwatcher.logging from master to slave - T191020
  • 13:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1077 after alter table (duration: 00m 58s)
  • 13:18 elukey: roll restart of zookeeper on conf200[123] to pick up prometheus monitoring settings
  • 12:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1077 after alter table (duration: 00m 59s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1077 after alter table (duration: 00m 58s)
  • 11:16 godog: deploy thumbor 1.16 in codfw
  • 11:06 moritzm: installing libdatetime-timezone-perl update from Debian SUA
  • 09:51 godog: deploy thumbor 1.16 in codfw and eqiad - T186528 T179200 T189647 T191028
  • 08:46 marostegui: Deploy schema change on db1077 - s3 - T187089 T185128 T153182
  • 08:41 moritzm: upgrading HHVM on video scalers
  • 08:40 volans: temporarily disabled puppet (and re-enabling it one-by-one) on all prod puppetmasters to deploy g/422907 - T190918
  • 08:36 marostegui: Stop MySQL on db1077 for mysql and kernel upgrade
  • 08:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table (duration: 00m 59s)
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1078 (duration: 00m 58s)
  • 08:29 godog: codfw-prod: more weight to ms-be204[0-3] - T189633
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 08:01 elukey: restart of druid-(overlord|middlemanager) on druid1004[456] as precautionary measure after zk restart
  • 08:01 moritzm: uploaded HHVM 3.18.5-dfsg-1+wmf5+deb9u1 for stretch-security to apt.wikimedia.org
  • 08:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 07:50 elukey: roll restart zookeeper on druid100[456] to enable prometheus monitoring
  • 07:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 with low weight (duration: 00m 58s)
  • 07:12 jynus: upgrade and restart of labsdb1010
  • 07:10 marostegui: Stop MySQL on db1078 for mariadb and kernel upgrade
  • 06:43 elukey: execute systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka102[23]
  • 06:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2035 rack comment (duration: 00m 58s)
  • 05:37 marostegui: Deploy schema change on db1078 - s3 - T187089 T185128 T153182
  • 05:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table (duration: 00m 59s)
  • 05:18 marostegui: Enable back gtid on db2035 - T191193
  • 02:44 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 19m 03s)
  • 00:11 Amir1: Evening SWAT is done

2018-04-02

  • 23:56 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add several domains of Ukraine government to wgCopyUploadsDomains (T185399) (duration: 00m 59s)
  • 23:45 ladsgroup@tin: Synchronized tests/cirrusTest.php: Shift all search traffic to codfw, part II (T191236) (duration: 00m 58s)
  • 23:44 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Shift all search traffic to codfw (T191236) (duration: 00m 59s)
  • 23:29 Amir1: Persian Wikipedia logos have been purged using purgeList.php on terbium
  • 23:26 ladsgroup@tin: Synchronized static/images/project-logos: Update logo for the Persian Wikipedia (T191174) (duration: 00m 59s)
  • 22:41 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/CSSMin.php: sync https://gerrit.wikimedia.org/r/423574 (duration: 00m 58s)
  • 22:11 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.27
  • 21:59 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/CSSMin.php: (no justification provided) (duration: 01m 16s)
  • 21:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 14s)
  • 21:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 21:22 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/: Revert ceb7d61 refs T183966 T190960 (duration: 00m 59s)
  • 21:09 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.26
  • 20:59 twentyafterfour: MediaWiki Train: rolling back to 1.31.0-wmf.26 refs T183966, T190960
  • 20:38 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqlBase.php: sync 36c5235 refs T190960 (duration: 01m 16s)
  • 20:22 mholloway-shell@tin: Finished deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88 (duration: 05m 52s)
  • 20:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88
  • 19:56 herron: puppetdb postgres update complete — puppet agents re-enabled
  • 19:46 herron: temporarily disabling puppet agents for puppetdb postgres security update
  • 19:41 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqlBase.php: sync 779e7fd refs T190960 (duration: 01m 16s)
  • 19:17 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 15s)
  • 19:16 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/: sync I57dd8d refs T183966 T190960 (duration: 01m 19s)
  • 19:06 twentyafterfour: sync rdbms: avoid lag estimates in getLagFromPtHeartbeat ruined by snapshots Bug: T190960 Change-Id: I57dd8d
  • 19:04 twentyafterfour: Getting the train back on track: deploying 1.31.0-wmf.27 to Group0
  • 17:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the remaining high-traffic jobs to EventBus, test wikis only, file 2/2 - T190327 (duration: 01m 15s)
  • 17:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9e1b203]: Switch remaining high traffic jobs for test wikis. T190327 (duration: 00m 43s)
  • 17:47 ppchelko@tin: Started deploy [cpjobqueue/deploy@9e1b203]: Switch remaining high traffic jobs for test wikis. T190327
  • 17:47 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch the remaining high-traffic jobs to EventBus, test wikis only, file 1/2 - T190327 (duration: 01m 16s)
  • 17:36 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Shift search traffic for enwiki to codfw (duration: 01m 17s)
  • 17:21 smalyshev@tin: Finished deploy [wdqs/wdqs@49f4eed]: GUI update (duration: 09m 49s)
  • 17:11 smalyshev@tin: Started deploy [wdqs/wdqs@49f4eed]: GUI update
  • 16:37 madhuvishy: Rolling out new symlinks to /public/dumps for labstore1006 dumps nfs mount T188643
  • 15:59 madhuvishy: Absenting /public/dumps mount from labstore1003 across the VPS fleet T188643
  • 15:56 ebernhardson: restart elasticsearch on elastic1024, been stuck at 100% cpu for 3+ hours
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2035 IP - T191193 (duration: 01m 15s)
  • 15:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2035 IP - T191193 (duration: 01m 15s)
  • 15:28 marostegui: Stop MySQL and power off db2035 (s2 codfw master - this will stop replication on s2 codfw slaves) for rack change - T191193
  • 15:06 madhuvishy: Reenabled puppet and rolled out mounting new dumps NFS shares from labstore1006|7 on VPS instances T188643
  • 14:40 cmjohnson1: disabling puppet on decom host db1020
  • 14:28 madhuvishy: Disabling puppet across VPS instances with dumps mounted (https://phabricator.wikimedia.org/P6921) T188643
  • 14:22 marostegui: Drop contest* tables from s3 - T186867
  • 14:12 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=recommendation-api,cluster=scb,name=scb1003.*
  • 14:12 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=recommendation-api,cluster=scb,name=scb1004.*
  • 14:10 akosiaris: lower weight for scb1001, scb1002 from 10 to 8 for all services. T191199. scb1003, scb1004 have a weight of 15 already
  • 14:09 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,cluster=scb,name=scb1002.*
  • 14:09 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,cluster=scb,name=scb1001.*
  • 13:54 ariel@tin: Finished deploy [dumps/dumps@0363d50]: add check that xml files don't have binary corruption (nulls) after the header (duration: 00m 04s)
  • 13:54 ariel@tin: Started deploy [dumps/dumps@0363d50]: add check that xml files don't have binary corruption (nulls) after the header
  • 13:48 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Sync initializesettings for T190445 (duration: 01m 16s)
  • 13:36 twentyafterfour@tin: Synchronized wmf-config/throttle.php: SWAT: Sync throttle rules for T191187 (duration: 01m 15s)
  • 13:30 twentyafterfour@tin: Synchronized wmf-config/throttle.php: SWAT: Sync throttle rules for T191168 (duration: 01m 16s)
  • 13:27 jynus: restarting pdfrender on scb1003 (Socket timeout)
  • 12:49 akosiaris: upgrade mediawiki servers for hhvm upgrade
  • 12:06 marostegui: Deploy schema change on dbstore1002 - s3 - T187089 T185128 T153182
  • 11:51 akosiaris: repool mediawiki canary servers after hhvm upgrade
  • 11:44 akosiaris: depool mediawiki canary servers for hhvm upgrade
  • 10:16 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 10:15 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 09:13 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove references to virt1000 (duration: 01m 16s)
  • 09:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove references to virt1000 (duration: 01m 16s)
  • 08:50 marostegui: Deploy schema change on s3 codfw master db2043 (this will generate lag on codfw) - T187089 T185128 T153182
  • 08:21 jynus: stop mariadb at labsdb1009 and labsdb1010
  • 08:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Specify current m5 codfw master (duration: 01m 17s)
  • 08:11 jynus: depool labsdb1011 from web wikireplicas
  • 07:21 apergos: restarted pdfrender on scb1004 after poking around there a bit
  • 07:01 apergos: restarted pdfrender on scb1001,2, service paged and no jobs were being processed
  • 06:06 marostegui: Drop localisation table from the hosts where it still existed - T119811
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.26) (duration: 12m 53s)

2018-03-31

  • 21:15 mutante: bast1001 has been shutdown and decom'ed as planned. if you have any issues with shell access make sure you have replaced with bast1002 or any other bast host
  • 11:26 urandom: removing corrupt commitlog segment, restbase1009-c
  • 11:25 urandom: removing corrupt commitlog segment, restbase1009-b
  • 11:19 urandom: starting restbase1009-c
  • 11:18 urandom: truncating hints, restbase1009-a
  • 11:14 urandom: restarting restbase1009-b
  • 11:13 urandom: stopping restbase1009-a (high hints storage)

2018-03-30

  • 14:16 akosiaris: T189076 upload apertium-fra-cat to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189076 upload apertium-cat to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189075 upload apertium-lex-tools to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189075 upload apertium-separable to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189076 upload apertium-fra to apt.wikimedia.org/jessie-wikimedia/main
  • 11:44 dcausse: running forceSearchIndex from terbium to cleanup elastic indices for (testwiki, mediawikiwiki, labswiki, labtestwiki, svwiki) (T189694)
  • 11:40 dcausse: elastic@codfw cluster restarts complete (T189239)
  • 10:55 dcausse: resuming elastic@codfw cluster restarts
  • 10:17 elukey: roll restart of zookeeper daemons on druid100[123] (Druid analytics cluster) to pick up the new prometheus jmx agent
  • 09:31 elukey: restart oozie/hive daemons on an1003 for openjdk-8 upgrades
  • 08:38 elukey: rolling restart of hadoop-hdfs-datanode on all the hadoop worker nodes after https://gerrit.wikimedia.org/r/423000
  • 07:39 elukey: rolling restart of yarn-hadoop-nodemanagers on all the hadoop worker nodes after https://gerrit.wikimedia.org/r/423000

2018-03-29

  • 23:47 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T189252: Enable perf oversampling for remaining countries in Asia (duration: 01m 16s)
  • 23:40 ebernhardson@tin: Synchronized php-1.31.0-wmf.27/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Start cirrus AB test (duration: 01m 16s)
  • 23:37 ebernhardson@tin: Synchronized php-1.31.0-wmf.26/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Start cirrus AB test (duration: 01m 16s)
  • 23:12 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Configure 5 buckets for cirrus AB test (duration: 01m 17s)
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@14d3e7d]: Updating Horizon with possible fix for T189706 (duration: 03m 16s)
  • 22:06 andrew@tin: Started deploy [horizon/deploy@14d3e7d]: Updating Horizon with possible fix for T189706
  • 20:07 robh: shutdown cp2022 for hw testing
  • 18:49 maxsem@tin: Synchronized php-1.31.0-wmf.27/skins/MinervaNeue: https://gerrit.wikimedia.org/r/#/c/423012/ (duration: 01m 17s)
  • 18:27 maxsem@tin: Synchronized php-1.31.0-wmf.26/includes/: Shorten summary length to 500 (duration: 02m 06s)
  • 18:22 maxsem@tin: Synchronized php-1.31.0-wmf.27/includes/: Shorten summary length to 500 (duration: 02m 14s)
  • 17:55 dcausse: pausing restarts of elastic@codfw (6 nodes left)
  • 17:35 mobrovac@tin: Finished deploy [restbase/deploy@af592d6]: Add bawikibooks - T191033 (duration: 30m 35s)
  • 17:30 demon@tin: Synchronized docroot/wwwportal/w/search-redirect.php: removing symlink indirection (duration: 01m 16s)
  • 17:05 mobrovac@tin: Started deploy [restbase/deploy@af592d6]: Add bawikibooks - T191033
  • 14:54 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Cleanup: Use only EventBus for refreshLinks - T185052 (duration: 01m 18s)
  • 14:00 moritzm: restarting parsoid and related service on ruthenium to pick up openssl update
  • 13:52 dcausse: reverted and rebased tin for undeployed patch due to scap issues (https://gerrit.wikimedia.org/r/#/c/422906/ https://gerrit.wikimedia.org/r/#/c/422929/)
  • 13:34 dcausse: aborted scap sync-dir php-1.31.0-wmf.27/extensions/CirrusSearch/ (was taking too much time at: waiting on sync-masters, ok: 1, left: 1)
  • 12:54 moritzm: installing ICU security updates on trusty
  • 12:29 dcausse: recreating replicas for skwiki_content in elastic@codfw due to stalled shard recovery
  • 12:18 ariel@tin: Finished deploy [dumps/dumps@982cebd]: ability to configure production of recombined metacurrent page content file (duration: 00m 02s)
  • 12:18 ariel@tin: Started deploy [dumps/dumps@982cebd]: ability to configure production of recombined metacurrent page content file
  • 11:02 ariel@tin: Finished deploy [dumps/dumps@96ba844]: cleanup 'latest' links, rss files from old runs (duration: 00m 04s)
  • 11:02 ariel@tin: Started deploy [dumps/dumps@96ba844]: cleanup 'latest' links, rss files from old runs
  • 10:50 dcausse: restarting elastic@codfw for JVM and plugin upgrade (T189239)
  • 09:16 elukey: roll restart aqs on aqs100* for icu/openssl upgrades
  • 08:18 akosiaris: T189075 upload apertium_3.5.1-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 08:18 moritzm: installing OpenJDK security updates on elastic* hosts (along with current version of the search plugins package)
  • 08:07 elukey: roll restart of cassandra on aqs* for openjdk-8 upgrades
  • 07:20 moritzm: installing openssl security updates
  • 07:18 ema: reboot cache@eqiad for retpoline kernel updates: T188092
  • 04:35 twentyafterfour: ran scap pull on deploy1001
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-28

  • 23:50 eileen: update civicrm revision changed from 9478ca39f1 to d6855cd281 (further security module updates, engage import dedupe)
  • 23:38 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Configure next Cirrus AB test (duration: 01m 16s)
  • 23:18 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T184969: Enable PageAssessments on trwiki (duration: 01m 09s)
  • 23:13 MaxSem: created PageAssessments tables on trwiki
  • 22:17 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.26 (duration: 01m 18s)
  • 22:16 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.26
  • 22:13 twentyafterfour: deploy of 1.31.0-wmf.27 resulted in a lot of SlowTimer errors for SlowTimer [10000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT(...)
  • 22:12 eileen: civicrm revision changed from 3f6028b24f to 9478ca39f1 (drupal security update)
  • 22:09 twentyafterfour@tin: rebuilt and synchronized wikiversions files: sync https://gerrit.wikimedia.org/r/#/c/422563/ group1 wikis to 1.31.0-wmf.27 refs T183966 T190960
  • 22:08 twentyafterfour: rolling forward group1 to 1.31.0-wmf.27 refs T183966 T190960
  • 22:05 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/: sync https://gerrit.wikimedia.org/r/#/c/422565/ (duration: 02m 15s)
  • 22:03 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/422565/ refs T190960 T183966
  • 21:53 mutante: deploy1001 - revoking old puppet certs and signing new ones
  • 21:42 twentyafterfour: getting the train back on track, group1 wikis to 1.31.0-wmf.27
  • 20:51 XenoRyet: updated civicrm from 85c89c7d0a to 3f6028b24f
  • 20:50 bsitzmann@tin: Finished deploy [mobileapps/deploy@6a0d877]: Update mobileapps to a5833a0 (duration: 05m 36s)
  • 20:44 bsitzmann@tin: Started deploy [mobileapps/deploy@6a0d877]: Update mobileapps to a5833a0
  • 20:12 mlitn@tin: Finished deploy [3d2png/deploy@c447488]: Updating 3d2png (duration: 02m 26s)
  • 20:09 mlitn@tin: Started deploy [3d2png/deploy@c447488]: Updating 3d2png
  • 19:54 mutante: deploy1001 - schedule downtime for reinstall with jessie, reinstalling (T175288)
  • 19:24 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.26 (duration: 01m 17s)
  • 19:22 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.26
  • 19:20 twentyafterfour: Rolling back to wmf.26 due to increase in fatals: "Replication wait failed: lost connection to MySQL server during query"
  • 19:12 milimetric@tin: Finished deploy [analytics/refinery@c22fd1e]: Fixing python import bug (duration: 02m 48s)
  • 19:09 milimetric@tin: Started deploy [analytics/refinery@c22fd1e]: Fixing python import bug
  • 19:09 milimetric@tin: Started deploy [analytics/refinery@c22fd1e]: (no justification provided)
  • 19:06 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 17s)
  • 19:05 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 19:02 ebernhardson: restore elasticsearch eqiad disk high/low watermarks to 75/80% with all large reindexes complete
  • 18:52 urandom: upgrading restbase-dev1005-{a,b} to cassandra 3.11.2 -- T178905
  • 18:17 urandom: upgrading restbase-dev1004-b to cassandra 3.11.2 (canary) -- T178905
  • 18:12 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.27
  • 18:12 urandom: upgrading restbase-dev1004-a to cassandra 3.11.2 (canary) -- T178905
  • 18:03 twentyafterfour: deploying 1.31.0-wmf.27 to group0. group1 in an hour. See T183966 for blockers.
  • 17:38 joal@tin: Finished deploy [analytics/refinery@7135d44]: Regular weekly analytics deploy - Scheduled hadoop jobs updates (duration: 05m 21s)
  • 17:32 joal@tin: Started deploy [analytics/refinery@7135d44]: Regular weekly analytics deploy - Scheduled hadoop jobs updates
  • 16:37 akosiaris: T189075 upload lttoolbox_3.4.0~r84331-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 15:37 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable oversampling for IN, GU, MP in preparation for eqsin (T189252) (duration: 01m 18s)
  • 15:13 andrewbogott: restarting nodepool on labnodepool1001 (cleanup from T189115)
  • 15:08 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:07 andrewbogott: restarting nova-network on labnet1001 in case it's upset by the rabbit outage
  • 15:02 andrewbogott: rebooting labservices1001 and labcontrol1001 for T189115
  • 15:00 andrewbogott: stopping nova-fullstack on labnet1001 for T189115
  • 15:00 andrewbogott: stopping nodepool on labnodepool1001
  • 14:58 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Disable redis queue for cirrusSearch jobs for test wikis, file 2/2 - T189137 (duration: 01m 17s)
  • 14:56 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Disable redis queue for cirrusSearch jobs for test wikis, file 1/2 - T189137 (duration: 01m 17s)
  • 14:54 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c84880a]: Switch CirrusSearch jobs to kafka for test wikis (duration: 00m 44s)
  • 14:54 ppchelko@tin: Started deploy [cpjobqueue/deploy@c84880a]: Switch CirrusSearch jobs to kafka for test wikis
  • 13:51 elukey: reduced number of jobrunner runners on the videoscalers after the last burst of jobs that maxed out the cluster
  • 13:51 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on all Wikivoyages (T189838) (duration: 01m 17s)
  • 13:42 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata description override on enwiki (T184000) (duration: 01m 18s)
  • 13:36 catrope@tin: Synchronized php-1.31.0-wmf.27/extensions/Echo/modules/nojs/mw.echo.badge.less: Prevent FOUC when loading notification badges (duration: 01m 20s)
  • 13:35 jynus: upgrade mariadb client on sarin, neodymium, terbium and wasat
  • 13:18 catrope@tin: Synchronized dblists/flow.dblist: Enable Flow on euwiki (T190500) (duration: 01m 17s)
  • 13:07 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Translate extension on amwikimedia (T180879) (duration: 01m 22s)
  • 12:35 twentyafterfour@tin: Finished scap: test running full scap sync from tin (duration: 46m 05s)
  • 11:49 twentyafterfour@tin: Started scap: test running full scap sync from tin
  • 11:48 twentyafterfour@tin: Synchronized README: test deploy from tin.eqiad.wmnet (duration: 03m 35s)
  • 10:59 volans: performing a few minutes live test of reporting Puppet reports to puppetdb too on puppetmaster1001 - T190918
  • 10:27 godog: reload icinga on einsteinium after https://gerrit.wikimedia.org/r/c/413142
  • 10:05 jynus: upgrade and restart db2093
  • 09:25 godog: disable puppet on icinga servers before merging https://gerrit.wikimedia.org/r/c/413142/
  • 08:25 arturo: reboot labstore200[2,3,4] for T189115
  • 08:25 godog: add more weight to ms-be204[0-3] - T189633
  • 08:18 arturo: reboot labstore2001 for T189115
  • 08:17 arturo: reboot labstore1002 for T189115
  • 08:15 arturo: reboot labstore1001 for T189115
  • 07:49 moritzm: uploaded openssl 1.0.2o to apt.wikimedia.org/jessie-wikimedia
  • 06:51 moritzm: installing remaining ICU security updates
  • 02:28 l10nupdate@deploy1001: scap sync-l10n completed (1.31.0-wmf.26) (duration: 13m 33s)

2018-03-27

  • 23:18 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T189906: (duration: 00m 55s)
  • 23:08 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Update enwiki search ranking model (duration: 00m 54s)
  • 22:56 twentyafterfour@deploy1001: Finished scap: Deploy 1.31.0-wmf.27 to test wikis (duration: 41m 00s)
  • 22:28 mutante: DNS - switching deployment service name to deploy1001 (T175288)
  • 22:15 twentyafterfour@deploy1001: Started scap: Deploy 1.31.0-wmf.27 to test wikis
  • 22:14 demon@deploy1001: Synchronized wmf-config/abusefilter.php: beta-only sync (duration: 00m 53s)
  • 22:12 demon@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: beta-only sync (duration: 02m 32s)
  • 21:26 twentyafterfour@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.738JVwJRDN" ' returned non-zero exit status 127 (duration: 00m 43s)
  • 21:26 twentyafterfour@deploy1001: Started scap: Deploy 1.31.0-wmf.27 to test wikis
  • 21:25 mutante: deploy1001 rm /var/lock/scap-global-lock to switch to active server, puppet code only adds lock file to inactive servers (T175288)
  • 21:22 twentyafterfour@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use tin.eqiad.wmnet" (duration: 00m 00s)
  • 21:22 mutante: deployment_server has been switched to deploy1001.eqiad.wmnet. tin is not the active server anymore as of right now
  • 20:55 twentyafterfour@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use tin.eqiad.wmnet" (duration: 00m 00s)
  • 20:47 twentyafterfour@tin: Finished scap: 2nd Sync to co-masters to initialize deploy1001.eqiad.wmnet (duration: 12m 50s)
  • 20:34 twentyafterfour@tin: Started scap: 2nd Sync to co-masters to initialize deploy1001.eqiad.wmnet
  • 20:32 twentyafterfour@tin: Finished scap: Sync to co-masters to initialize deploy1001.eqiad.wmnet (duration: 21m 30s)
  • 20:11 twentyafterfour@tin: Started scap: Sync to co-masters to initialize deploy1001.eqiad.wmnet
  • 20:09 twentyafterfour@tin: Synchronized README: (no justification provided) (duration: 00m 52s)
  • 19:41 mutante: deploy1001 - deleting /srv and letting puppet recreate it, so _not_ rsyncing manually from tin but just a clean version of what puppet pulls in (T175288)
  • 18:42 twentyafterfour: branching 1.31.0-wmf.27
  • 18:03 andrewbogott: rebooting labsdb1007 for T189115
  • 17:59 demon@tin: Finished deploy [gerrit/gerrit@4910e7c]: motd plugin (duration: 00m 11s)
  • 17:59 demon@tin: Started deploy [gerrit/gerrit@4910e7c]: motd plugin
  • 17:55 andrewbogott: rebooting labsdb1006 for T189115
  • 17:51 foks: disable 2FA from User:Céréales Killer
  • 16:51 madhuvishy: Running rsync catch up job for dumps from ms1001 to labstore1007
  • 16:43 moritzm: uploaded openssl 1.1.0h for jessie-wikimedia to apt.wikimedia.org
  • 16:18 godog: point eqiad puppet traffic to eqiad
  • 15:58 godog: point esams puppet agent traffic to eqiad
  • 15:35 hashar: Bumping operations-puppet-tests-docker job to docker-registry.wikimedia.org/releng/operations-puppet:0.3.1 | https://gerrit.wikimedia.org/r/#/c/422169/ | ping vgutierrez
  • 15:23 godog: reenable puppet fleetwide for CA failover - T189891
  • 15:10 godog: stop puppet fleetwide for CA failover - T189891
  • 14:45 andrewbogott: rebooting labpuppetmaster1001 for T189115
  • 14:36 andrewbogott: rebooting labpuppetmaster1002 for T189115
  • 14:12 ppchelko@tin: Finished deploy [restbase/deploy@e19bad9]: Deploy without feed check to verify that mysterious deploy timeouts still happen (duration: 10m 52s)
  • 14:04 zeljkof: EU SWAT finished
  • 14:01 ppchelko@tin: Started deploy [restbase/deploy@e19bad9]: Deploy without feed check to verify that mysterious deploy timeouts still happen
  • 13:54 ppchelko@tin: Started restart [restbase/deploy@e19bad9]: Restart to verify that mysterious deploy timeouts still happen
  • 13:37 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Change wording for AbuseFilter global block durations (T190602) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable $wgAbuseFilterProfile on itwiki (T190137) (duration: 00m 57s)
  • 13:30 godog: deactivate/clean iridium.eqiad.wmnet -- decom'd
  • 13:24 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AbuseFilter runtime profile on more Wikis (T175954) (duration: 00m 58s)
  • 11:36 moritzm: installing ICU security updates
  • 10:50 arturo: reboot labtestvirt2002 to test if it would boot or not
  • 09:44 elukey: reboot aqs1009 for kernel + cassandra upgrades
  • 09:28 elukey: reboot aqs1008 for kernel + cassandra upgrades
  • 09:25 vgutierrez: uploaded mtail-3.0.0~rc5-1 to apt.w.o for jessie-wikimedia
  • 09:09 elukey: reboot aqs1007 for kernel + cassandra upgrades
  • 08:36 kartik@tin: Finished deploy [cxserver/deploy@a6b029f]: Update cxserver to 9e8ebda (Fix etag parsing and T188403) (duration: 03m 09s)
  • 08:33 kartik@tin: Started deploy [cxserver/deploy@a6b029f]: Update cxserver to 9e8ebda (Fix etag parsing and T188403)
  • 08:33 elukey: reboot aqs1006 for kernel + openjdk-8 + cassandra upgrade
  • 08:29 godog: add more weight to ms-be204[0-3] - T189633
  • 08:15 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1005.eqiad.wmnet
  • 08:11 elukey: reboot aqs1005 for kernel + openjdk-8 + cassandra upgrade
  • 06:59 elukey: powercycle restbase2007 (no ssh, vsp not available via mgmt console)
  • 05:59 Krinkle: Manually purge https://en.wikipedia.org/static/images/project-logos/nds_nlwiki-2x.png – T190051
  • 05:59 Krinkle: Manually purge https://en.wikipedia.org/static/images/project-logos/nds_nlwiki-1.5x.png – T190051
  • 02:57 Krinkle: Fix retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/ve/*)
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-26

  • 23:41 niharika29@tin: Synchronized static/images/project-logos/: Correct high-density logos for the Dutch Low Saxon Wikipedia T190051 (duration: 00m 59s)
  • 22:38 mutante: syncing /srv from tin.eqiad to deploy1001.eqiad (T175288)
  • 22:09 demon@tin: Finished deploy [gerrit/gerrit@b14b43b]: wikimedia plugin (duration: 00m 10s)
  • 22:09 demon@tin: Started deploy [gerrit/gerrit@b14b43b]: wikimedia plugin
  • 21:43 urandom: rolling restart of restbase dev environment
  • 20:50 demon@tin: Pruned MediaWiki: 1.31.0-wmf.25 [keeping static files] (duration: 01m 26s)
  • 20:46 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e223f51]: Update mobileapps to 534f95d (duration: 05m 23s)
  • 20:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@e223f51]: Update mobileapps to 534f95d
  • 20:29 no_justification: gerrit: restarting services to pick up bugfix
  • 20:26 demon@tin: Finished deploy [gerrit/gerrit@f6c5350]: update to 2.14.7-9-g0f04397dbd (duration: 00m 10s)
  • 20:25 demon@tin: Started deploy [gerrit/gerrit@f6c5350]: update to 2.14.7-9-g0f04397dbd
  • 19:55 andrew@tin: Finished deploy [horizon/deploy@99153e4]: Rolling out fix for security groups, 421983 (duration: 03m 12s)
  • 19:52 andrew@tin: Started deploy [horizon/deploy@99153e4]: Rolling out fix for security groups, 421983
  • 19:44 ejegg: updated payments-wiki from 9e83e7f7a0 to 320a6c2600
  • 19:23 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 01m 22s)
  • 19:21 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:21 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 04m 16s)
  • 19:17 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:17 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 03m 29s)
  • 19:13 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:12 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 03m 49s)
  • 19:09 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:06 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 19m 28s)
  • 18:46 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 18:28 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mobile-only Mediawiki:MainPageCss styles for Hindi wiki T190101 (duration: 00m 58s)
  • 17:26 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2010.codfw.wmnet
  • 17:26 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 16:38 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 (duration: 05m 03s)
  • 13:55 hashar: restarting CI Jenkins. Upgrades Mail plugin from 1.20 to 1.21 | T190393
  • 13:30 moritzm: restarting HHVM on app server canaries to pick up ICU security update (not rebooting as logged before)
  • 13:30 moritzm: rebooting app server canaries to pick up ICU security update
  • 13:27 zeljkof: EU SWAT finished
  • 13:26 zfilipin@tin: Synchronized php-1.31.0-wmf.26/extensions/MobileFrontend/: SWAT: Squash: Hygiene: Auto namespace ResourceLoader modules and Add $wgMFMobileMainPageCss config flag; Hygiene: Auto namespace ResourceLoader modules; Add $wgMFMobileMainPageCss config flag (T190101) (duration: 01m 01s)
  • 13:23 ottomata: temporarily stopping puppet on kafka102[023] to use --new.consumer mirrormaker consuming from end
  • 13:21 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable AbuseFilter profiler at zh.wikipedia (T190663) (duration: 01m 00s)
  • 13:13 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add tboverride to engineer at ruwiki (T190619) (duration: 01m 01s)
  • 12:47 godog: add ms-be204[0-3] with minimal weight - T189633
  • 12:40 arturo: reboot labservices1002 for T189115
  • 12:30 arturo: reboot labnet100[2,3,4]* for T189115
  • 12:30 arturo: reboot labbwr100[2,3,4] for T189115
  • 12:00 arturo: reboot labmon100[1,2] for T189115
  • 12:00 moritzm: restarting HHVM on mediawiki canaries to pick up ICU security update
  • 11:47 arturo: reboot labcontrol100[3,4] for T189115
  • 11:31 arturo: reboot labcontrol1002 for T189115
  • 11:16 akosiaris: depool scb hosts for mathoid service. T184919
  • 11:16 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: service=mathoid,cluster=scb,name=scb.*
  • 10:56 moritzm: installing ICU security updates for jessie/stretch
  • 10:39 arturo: reboot silver for T189115
  • 10:34 arturo: reboot californium for T189115
  • 10:26 moritzm: upgrading debdeploy across the fleet to 0.0.99.4
  • 10:23 moritzm: uploaded debdeploy 0.0.99.4 to apt.wikimedia (for trusty/jessie/stretch)
  • 08:17 moritzm: upgrading debdeploy across the fleet to latest release
  • 07:33 elukey: stop eventlogging zmq-forwarder on eventlog1001 as part of decom process - T189566
  • 05:39 _joe_: restarting pdfrenderer on scb1001,1003
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-24

  • 20:22 foks: rm 2fa from Awight@officewiki
  • 15:00 elukey: rm -rf /srv/mediawiki/core on stat100[456] and force puppet run (git pull returned fatal: protocol error: bad pack header)
  • 02:33 bblack: powercycle cp3048
  • 02:31 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3048.esams.wmnet
  • 01:27 Krinkle: Correct retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/VisualEditor/*)
  • 00:39 Krinkle: Correct retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/mw/*)

2018-03-23

  • 21:35 ebernhardson: delete indices for deleted wikis (from deleted.dblist) in eqiad and codfw elasticsearch clusters: alswikiquote, alswiktionary, mowiki, mowiktionary, ukwikimedia
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 4) (duration: 06m 58s)
  • 19:17 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 4)
  • 19:11 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 3) (duration: 04m 19s)
  • 19:07 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 3)
  • 19:06 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 2) (duration: 06m 23s)
  • 18:59 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 2)
  • 18:28 sbisson@tin: Finished deploy [kartotherian/deploy@a66ff1d]: Deploying i18n feature to maps-test* (duration: 00m 29s)
  • 18:28 sbisson@tin: Started deploy [kartotherian/deploy@a66ff1d]: Deploying i18n feature to maps-test*
  • 15:43 moritzm: uploaded debdeploy 0.0.99.3 to apt.wikimedia.org (now based on Python 3 for the clients)
  • 15:08 ema: cache_codfw: begin reboots for retpoline kernel upgrades T188092
  • 15:02 bawolff@tin: Synchronized php-1.31.0-wmf.26/includes/api/ApiQueryUserContributions.php: T190507 (duration: 00m 59s)
  • 13:24 moritzm: installing postgres security updates on rhenium
  • 12:51 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver,service=apache2
  • 12:48 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=api_appserver,service=apache2
  • 11:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1072 weight (duration: 00m 59s)
  • 11:19 moritzm: installing libvorbis security updates on trusty (Debian already fixed)
  • 11:09 elukey: restarting jvm daemons on analytics100[12] (Hadoop Masters) for openjdk-8 upgrade
  • 10:59 jynus: deployed new replication filter for labsdb1004 on u2815__p.all_articles T190488
  • 10:49 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable reading wb_terms search fields on wikidata (T189777) (duration: 00m 59s)
  • 10:36 elukey: upload cassandra2.2.6-wmf3 to jessie/stretch-wikimedia -C component/cassandra22 - T189529
  • 10:22 moritzm: restarting apache on krypton to pick up curl security update
  • 10:00 moritzm: installing plexus-utils2 security updates
  • 09:49 moritzm: armed keyholder on deploy1001
  • 08:19 elukey: reboot eventlog1001 for kernel upgrades

2018-03-22

  • 23:40 Amir1: Evening SWAT is done
  • 23:40 Amir1: Just to note, if you are seeing any performance regression (especially database-wise) 421333 might be the reason
  • 23:39 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable reading wb_terms search fields on wikidata (T189777) (duration: 00m 58s)
  • 23:29 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 58s)
  • 23:27 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-2x.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 56s)
  • 23:26 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-1.5x.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 58s)
  • 23:23 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-1.5x.png: static/images/project-logos/nds_nlwiki-2x.png static/images/project-logos/nds_nlwiki.png Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 59s)
  • 22:47 mutante: restarting Gerrit to apply config changes gerrit:406145 and gerrit:410474
  • 22:25 mutante: icinga - re-enabling notifications for a LOT of "systemd checks" that had all been OK for a long time but had not been re-enabled after some maintenance
  • 20:18 andrewbogott: reimaged labtestvirt2002
  • 19:52 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.26
  • 19:34 cmjohnson1: db1052 replacing disk slot 8
  • 18:52 XioNoX: done with the asw-a/b/c-eqiad switches uplink work
  • 18:43 Amir1: Morning SWAT is done
  • 18:41 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable VirtualPageViews on s6 (ja,ru,fr) wikis (T189906) (duration: 01m 16s)
  • 17:59 ppchelko@tin: Finished deploy [changeprop/deploy@4f9fbe4]: Purge page metadata and references on html change and page deletion. (duration: 01m 16s)
  • 17:57 ppchelko@tin: Started deploy [changeprop/deploy@4f9fbe4]: Purge page metadata and references on html change and page deletion.
  • 17:44 mutante: install1002 - restarted dhcp server to confirm there was no syntax error
  • 17:21 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, with increased timeout (duration: 03m 15s)
  • 17:18 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, with increased timeout
  • 17:14 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 3 (duration: 02m 54s)
  • 17:11 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 3
  • 17:10 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 2 (duration: 03m 00s)
  • 17:07 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 2
  • 17:03 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints (duration: 08m 39s)
  • 16:55 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints
  • 16:42 moritzm: installing postgres security updates on netmon*
  • 16:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Redeploy GlobalPreferences to test wikis and mw.org" (T189806) (duration: 01m 14s)
  • 16:28 moritzm: restarting graphite on labmon1001 to pick up uwsgi security update
  • 16:04 XioNoX: starting the asw-a/b/c-eqiad switches uplink work
  • 15:43 sbisson@tin: Finished deploy [tilerator/deploy@e259530]: Weekly progress to production (duration: 00m 43s)
  • 15:42 sbisson@tin: Started deploy [tilerator/deploy@e259530]: Weekly progress to production
  • 15:37 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Weekly progress to production (duration: 02m 27s)
  • 15:35 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Weekly progress to production
  • 15:23 sbisson@tin: Finished deploy [tilerator/deploy@e259530]: Deploying weekly progress to maps-test* (duration: 00m 26s)
  • 15:23 ottomata: ran puppet-merge on puppetmaster2001, got ssh: connect to host puppetmaster1001.eqiad.wmnet port 22: Connection timed out, hope all is ok. T189891
  • 15:23 sbisson@tin: Started deploy [tilerator/deploy@e259530]: Deploying weekly progress to maps-test*
  • 15:17 moritzm: installing openssh updates from stretch point release
  • 15:14 cmjohnson1: db1054 replacing disk at slot 1
  • 15:10 cmjohnson1: replacing disk slot 11 db1061
  • 15:09 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Deploying weekly progress to maps-test* (duration: 01m 59s)
  • 15:08 moritzm: installing java-atk-wrapper updates from stretch point release
  • 15:07 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Deploying weekly progress to maps-test*
  • 14:57 moritzm: installing cups update from stretch point release (we only install the client libs)
  • 14:24 jynus: killing ongoing truncate to investigate s3 issues
  • 14:16 elukey: rolling restart of the three hadoop hdfs journal nodes (an1028/35/52) for openjdk-8 upgrades
  • 14:00 godog: reimage puppetmaster1001 - T184562
  • 13:57 zeljkof: EU SWAT finished
  • 13:55 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Properly setup ProofreadPage namespaces for cywikisource (T181406) (duration: 01m 16s)
  • 13:38 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Make eswikibooks logo normal size (T190366) (duration: 01m 16s)
  • 13:29 mobrovac@tin: Finished deploy [zotero/translators@1c30955]: Update translators - T188893 (duration: 00m 08s)
  • 13:29 mobrovac@tin: Started deploy [zotero/translators@1c30955]: Update translators - T188893
  • 13:27 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change bewikibooks logo (T189218) (duration: 01m 15s)
  • 13:25 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change bewikibooks logo (T189218) (duration: 01m 16s)
  • 13:23 godog: reenabling puppet fleetwide to enable CA switch - T189891
  • 13:11 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Remove forceWriteTermsTableSearchFields from testwikidatawiki, part II (T189776) (duration: 01m 15s)
  • 13:09 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Remove forceWriteTermsTableSearchFields from testwikidatawiki, part I (T189776) (duration: 01m 16s)
  • 13:05 godog: stop rsync of ca/volatile on puppetmaster1001
  • 12:31 godog: chown puppet:puppet /var/lib/puppet/server/ssl/ca on puppetmaster2001
  • 12:20 godog: running puppet on puppetmaster[21]001 - T189891
  • 12:12 godog: stopping puppet fleetwide for ca migration - T189891
  • 11:20 elukey: rolling restart of the hadoop hdfs datanode daemons on all the analytics hadoop workers for openjdk-8 upgrade
  • 11:18 apergos: and a third time to try updating the puppet compiler facts, this time using puppetmaster2001
  • 11:09 arturo: T189722 reboot labtestvirt2002 to downgrade kernel
  • 11:02 moritzm: installing plexus-utils security updates
  • 11:01 arturo: T189722 reboot labtestvirt2001 to downgrade kernel
  • 10:53 apergos: due to miscommunication, second update of puppet compiler facts happening now. oh well
  • 10:42 elukey: update puppet compiler's facts
  • 10:28 ema: cp-upload_esams: carry on with reboots for retpoline kernel updates T188092
  • 10:10 ema: repool cp3010
  • 09:55 elukey: rolling restart of yarn nodemanagers on the analytics hadoop workers for openjdk-8 upgrade
  • 09:21 marostegui: Truncate updatelog on s3 - T174804
  • 09:19 marostegui: Truncate updatelog on s1 - T174804
  • 09:04 marostegui: Truncate updatelog on s7 - T174804
  • 08:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 (duration: 01m 15s)
  • 08:45 marostegui: Truncate updatelog on s2 - T174804
  • 08:30 marostegui: Truncate updatelog on s4,s5,s6,s8 - T174804
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1006 after kernel, mariadb and socket location upgrade (duration: 01m 11s)
  • 08:21 jynus: upgrade and restart db1060
  • 08:17 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 01m 15s)
  • 08:06 marostegui: Restart pt-heartbeat on pc2006
  • 08:05 marostegui: Restart pt-heartbeat on pc2004 and pc2005
  • 08:04 marostegui: Restart pt-heartbeat on pc1004 and pc1005
  • 07:59 marostegui: Stop MySQL on pc1006 for kernel, mariadb and socket path upgrade
  • 07:58 elukey: depool cp3010 + powercycle (no ssh access, mgmt console frozen)
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1006 for kernel, mariadb and socket location upgrade (duration: 01m 16s)
  • 06:25 marostegui: Remove db1001 from tendril - T190262
  • 06:25 marostegui: Stop MySQL on db1001 to get ready to decommission it - T190262
  • 06:16 marostegui: Reload dbproxy1006 to pick up the new standby host - T183469
  • 06:16 marostegui: Reload dbproxy1001 to pick up the new standby host - T183469
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 07m 46s)
  • 01:52 ebernhardson: increase cluster.routing.allocation.disk.watermark.low to 80% on eqiad elasticsearch due to shards not allocating during reindex
  • 01:10 ebernhardson: started in-place reindex of all wikis on both elasticsearch clusters
  • 00:02 andrewbogott: restarted nova-network on labnet1001 and nova-compute on labvirt1015 as part of debugging T190367
  • 00:00 Amir1: Evening SWAT is done
  • 00:00 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: guwiki: fix rollback -> rollbacker (group) (T190370) (duration: 01m 16s)

2018-03-21

  • 23:53 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Migrate $wgOresModels to the new config system (T189948) (duration: 01m 16s)
  • 23:41 ladsgroup@tin: Synchronized wmf-config/throttle.php: Add new throttle rule and add task for one in comment (duration: 01m 16s)
  • 23:36 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: guwiki: clean up $wg{Add,Remove}Groups configuration (duration: 01m 16s)
  • 23:21 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgAbuseFilterProfile & $wgAbuseFilterRuntimeProfile on eswikibooks, part II (T190264) (duration: 01m 15s)
  • 23:19 ladsgroup@tin: Synchronized wmf-config/abusefilter.php: Enable $wgAbuseFilterProfile & $wgAbuseFilterRuntimeProfile on eswikibooks, part I (T190264) (duration: 01m 15s)
  • 22:33 eileen: civicrm revision changed from 3291ad35c9 to 85c89c7d0a, config revision is 03511638ed
  • 22:32 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Revert global prefs (duration: 01m 15s)
  • 22:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/421185/ (duration: 01m 15s)
  • 21:55 andrew@tin: Synchronized wmf-config/CommonSettings.php: turning off wgReadOnly on labtestwikitech (duration: 01m 16s)
  • 20:34 mlitn@tin: Finished deploy [3d2png/deploy@812a68a]: Updating 3d2png (duration: 02m 57s)
  • 20:31 mlitn@tin: Started deploy [3d2png/deploy@812a68a]: Updating 3d2png
  • 20:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@675837f]: Update mobileapps to e6b50a0 (duration: 05m 33s)
  • 20:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@675837f]: Update mobileapps to e6b50a0
  • 19:12 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.26
  • 19:09 demon@tin: Synchronized php: symlink bump (duration: 01m 15s)
  • 19:05 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: rvv (duration: 01m 15s)
  • 19:03 anomie: Deleted some 12-year-old open proxy blocks to resolve T189840.
  • 18:36 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only (duration: 01m 16s)
  • 18:34 demon@tin: Synchronized scap/plugins/prep.py: consistency (duration: 01m 17s)
  • 18:09 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: name=mw22(59|[6-9][0-9])\.codfw\.wmnet
  • 18:08 _joe_: pooling all the new codfw appservers
  • 18:05 maxsem@tin: Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/420397/ (duration: 01m 15s)
  • 18:02 godog: delete obsolete metrics from prometheus following https://gerrit.wikimedia.org/r/c/421086
  • 17:46 maxsem@tin: Synchronized wmf-config/Wikibase.php: https://gerrit.wikimedia.org/r/#/c/420336/ (duration: 01m 15s)
  • 17:43 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/421046/ (duration: 01m 15s)
  • 17:35 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/420947/ (duration: 01m 15s)
  • 17:30 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/420910/ (duration: 01m 16s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/419528/ (duration: 01m 15s)
  • 17:22 volans@tin: Finished deploy [puppetboard/deploy@d6514d6]: Adjust wsgi config - T184563 (duration: 00m 06s)
  • 17:22 volans@tin: Started deploy [puppetboard/deploy@d6514d6]: Adjust wsgi config - T184563
  • 17:17 maxsem@tin: Synchronized dblists/flow.dblist: https://gerrit.wikimedia.org/r/#/c/420799/ (duration: 01m 12s)
  • 17:11 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/419611/ (duration: 01m 15s)
  • 17:07 ppchelko@tin: Finished deploy [cpjobqueue/deploy@545cb61]: Increase refreshLinks concurrency to 20 per partition (duration: 00m 37s)
  • 17:06 ppchelko@tin: Started deploy [cpjobqueue/deploy@545cb61]: Increase refreshLinks concurrency to 20 per partition
  • 16:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 fully, post-silver cleanup (duration: 01m 14s)
  • 16:53 _joe_: running systemd-tmpfiles --create on the new appservers
  • 16:51 jynus@tin: Synchronized wmf-config/db-codfw.php: Post-silver cleanup (duration: 01m 03s)
  • 16:48 andrew@tin: Synchronized wmf-config/CommonSettings.php: one of many wikitech cleanups (duration: 01m 38s)
  • 16:46 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: one of many wikitech cleanups (duration: 03m 12s)
  • 16:42 andrew@tin: Synchronized wmf-config/wikitech.php: first of many wikitech cleanups (duration: 03m 16s)
  • 16:12 andrew@tin: Synchronized wmf-config/filebackend.php: labtestwikitech -> swift (duration: 01m 14s)
  • 16:10 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: labtestwikitech -> swift (duration: 01m 15s)
  • 16:07 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: name=mw22(59|[6-9][0-9])\.codfw\.wmnet
  • 15:53 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2 (duration: 00m 40s)
  • 15:53 ppchelko@tin: Started deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2
  • 15:51 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738 (duration: 03m 03s)
  • 15:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738
  • 15:28 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 15:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1019 after socket location upgrade (duration: 01m 12s)
  • 15:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 with low load (duration: 01m 15s)
  • 15:11 volans@tin: Finished deploy [puppetboard/deploy@81cd93a]: Adjust wsgi config - T184563 (duration: 00m 06s)
  • 15:11 volans@tin: Started deploy [puppetboard/deploy@81cd93a]: Adjust wsgi config - T184563
  • 15:05 jynus: stop, upgrade and restart db1079
  • 15:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 (duration: 01m 15s)
  • 13:39 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 13:23 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 13:20 zeljkof: EU SWAT finished
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: config: Enable testwiki NavTiming oversample for a bunch more countries (T190229) (duration: 01m 15s)
  • 13:12 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T189778) (duration: 01m 16s)
  • 11:33 moritzm: rolling restart of Kibana/Logstash to pick up OpenJDK security update
  • 11:32 ema: cache_misc@esams: upgrade varnish to 5.1.3-1wm7
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Cleanup old hosts (duration: 01m 18s)
  • 11:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Cleanup old hosts (duration: 01m 13s)
  • 11:17 ema: varnish 5.1.3-1wm7 uploaded to apt.w.o
  • 10:51 marostegui: Stop MySQL on db1016 to clone db1065 - T183469
  • 10:47 moritzm: rolling restart of elasticsearch on logstash to pick up OpenJDK security update
  • 10:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1065 from config - T183469 (duration: 01m 15s)
  • 10:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1065 from config - T183469 (duration: 01m 15s)
  • 10:37 moritzm: rolling restart of elasticsearch on relforge to pick up OpenJDK security update
  • 10:16 volans: re-enabling puppet on einsteinium (icinga host) see T177253#4067901
  • 09:57 moritzm: installing php5 security updates on trusty (jessie already fixed)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T183469 (duration: 01m 15s)
  • 09:47 moritzm: installing tiff security updates on trusty
  • 09:40 marostegui: Stop db1065 and db1106 in sync - this will generate lag on labs
  • 09:23 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 09:11 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 09:11 marostegui: Stop mysql on db2078 for new socket config
  • 08:56 marostegui: Stop mysql on db2037 for new socket config
  • 08:46 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:46 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3030.esams.wmnet,service=varnish-be
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 01m 14s)
  • 08:35 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3030.esams.wmnet,service=varnish-be
  • 08:35 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1005 after kernel, mariadb and socket location upgrade (duration: 01m 15s)
  • 08:20 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:19 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033,service=varnish-be
  • 08:10 hashar: contint1001: deleting some old docker images
  • 08:09 hashar: contint1001: docker image prune ; docker container prune # T178663
  • 08:09 hashar: contint1001: docker image prune ; docker container prune
  • 08:08 marostegui: Stop MySQL on pc1005 for kernel, mariadb and socket path upgrade
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1005 for kernel, mariadb and socket location upgrade (duration: 01m 15s)
  • 07:07 marostegui: Remove db1020 from tendril - T189773
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1020 from config - T189773 (duration: 01m 15s)
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1020 from config - T189773 (duration: 01m 13s)
  • 06:50 marostegui: Stop MySQL on db1020 - T189773
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1019 after socket location upgrade (duration: 01m 14s)
  • 06:29 marostegui: Stop MySQL on es1019 to upgrade socket path
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 - socket location upgrade (duration: 01m 21s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 06m 18s)
  • 01:51 herron: codfw puppetdb upgrade complete. eqiad puppetmaster remains depooled T177253

2018-03-20

  • 23:41 Krinkle: Mass no-op resizing of Whisper files on graphite2001 and graphite1001 for T179622 (webpagetest.* namespace)
  • 23:01 MaxSem: Cleaned centralauth.global_preferences after testing
  • 22:58 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Revert GlobalPreferences (duration: 01m 17s)
  • 22:55 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 05s)
  • 22:55 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 22:54 maxsem@tin: Finished scap: Test deployment of GlobalPreferences (duration: 39m 31s)
  • 22:41 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 07s)
  • 22:41 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 22:14 maxsem@tin: Started scap: Test deployment of GlobalPreferences
  • 21:02 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 26s)
  • 21:02 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:55 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 09s)
  • 20:55 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:44 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 02m 19s)
  • 20:42 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:28 papaul: OS install on mw2259-mw2290
  • 19:36 herron: temporarily disabling puppet agents for puppetdb upgrade
  • 19:28 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.26
  • 19:23 ejegg: updated payments-wiki from 30f5f3edfb to 9e83e7f7a0
  • 18:54 demon@tin: Finished scap: bootstrap wmf.26 (duration: 42m 16s)
  • 18:33 ema: varnish 5.1.3-1wm6 uploaded to apt.w.o
  • 18:30 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751) (duration: 02m 43s)
  • 18:27 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751)
  • 18:24 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751) (duration: 05m 58s)
  • 18:18 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751)
  • 18:12 demon@tin: Started scap: bootstrap wmf.26
  • 18:10 demon@tin: Synchronized wmf-config/CommonSettings.php: instantcommons for labstestwiki (duration: 01m 58s)
  • 17:30 mholloway-shell@tin: Finished deploy [mobileapps/deploy@fad1009]: Update mobileapps to 634a15f (duration: 05m 34s)
  • 17:29 elukey: test a depool/repool action for kafka1001 (eventbus/jobqueue) - part of an investigation to figure out where timeouts come from
  • 17:24 mholloway-shell@tin: Started deploy [mobileapps/deploy@fad1009]: Update mobileapps to 634a15f
  • 17:06 demon@tin: Pruned MediaWiki: 1.31.0-wmf.24 [keeping static files] (duration: 01m 23s)
  • 17:04 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 (duration: 02m 57s)
  • 16:38 jynus: running reset slave all on db1063 T189655
  • 16:16 akosiaris: restart bacula-dir T189655
  • 16:14 akosiaris: restart etherpad T189655
  • 16:13 jynus: db1063 in read-write (m1) again
  • 16:10 jynus: set m1 in read only
  • 16:09 jynus: heartbeat killed on m1-master
  • 16:02 herron: restarted apache2 on puppetmaster1001
  • 16:00 jynus: disable puppet on db1063, db1016
  • 15:57 jynus: changing replication topology of m1
  • 15:51 no_justification: gerrit: restarting services to pick up 2.14.6 -> 2.14.7 upgrade
  • 15:49 demon@tin: Finished deploy [gerrit/gerrit@09534cb]: gerrit 2.14.7 (duration: 00m 12s)
  • 15:49 demon@tin: Started deploy [gerrit/gerrit@09534cb]: gerrit 2.14.7
  • 15:20 marostegui: Drop empty (confirmed) table slots from s3 - T190153
  • 14:59 herron: codfw puppet masters upgraded to puppetdb4. placing puppet agents into icinga downtime and beginning puppet --noop runs (to send facts to new puppetdb) T177253
  • 14:58 marostegui: Drop empty (confirmed) table slots from s7 - T190153
  • 14:55 marostegui: Drop empty (confirmed) table slots from s6 - T190153
  • 14:53 twentyafterfour@tin: testing scap on tin
  • 14:53 marostegui: Drop empty (confirmed) table slots from s8 - T190153
  • 14:52 marostegui: Drop empty (confirmed) table slots from s5 - T190153
  • 14:52 marostegui: Drop empty (confirmed) table slots from s4 - T190153
  • 14:47 godog: upload scap 3.7.7-1 - T189306
  • 14:42 marostegui: Drop empty (confirmed) table slots from s2 - T190153
  • 14:40 marostegui: Drop empty (confirmed) table slots from s1 - T190153
  • 14:14 moritzm: rolling restart of elasticsearch in deployment-prep for new Java update
  • 14:03 ema: cp3007: upgrade varnish to 5.1.3-1wm5
  • 14:00 ema: upload varnish_5.1.3-1wm5 to apt.w.o
  • 13:59 ayounsi@tin: Finished deploy [netbox/deploy@7e29963]: Fixing netbox typo LDAP (duration: 00m 28s)
  • 13:59 ayounsi@tin: Started deploy [netbox/deploy@7e29963]: Fixing netbox typo LDAP
  • 13:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065, give main traffic to db1106 - T183469 (duration: 00m 58s)
  • 13:29 herron: depooling codfw puppet masters via dns T177253
  • 12:59 moritzm: restarting apache on bohrium/piwik to pick up curl security update
  • 12:53 jynus: applying schema change to wikishared.cx_translations T190133
  • 12:50 arturo: reboot labtestservices2003 for T189722
  • 12:33 arturo: reboot labtestservices2002 for T189722
  • 12:04 arturo: reboot labtestservices2001 for T189722
  • 11:28 godog: run compiler-update-facts
  • 11:07 arturo: reboot labtestnet2002 for T189722
  • 11:03 jynus: upgrade and reboot db1095 - this can create temp. lag on wikireplicas
  • 10:50 arturo: reboot again labtestnet2001 for T189722. Now with a proper grub menu
  • 10:44 jynus: upgrade and reboot db1102 - this can create temporary lag on wikireplicas
  • 10:44 arturo: reboot labtestnet2001 for T189722
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1012 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 09:28 jynus: repool labsdb1009 after upgrade
  • 09:11 moritzm: restarting apache on netmon* to pick up curl security updates
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1012 after kernel, mariadb and socket location upgrade (duration: 00m 57s)
  • 09:01 hashar: restarting Jenkins for java update
  • 08:50 marostegui: Stop MySQL on es1012 for mariadb, kernel and socket location upgrade
  • 08:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 for kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1004 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:41 jynus: upgrade and restart labsdb1009
  • 08:34 jynus: depool labsdb1009
  • 08:25 moritzm: installing curl security updates
  • 08:23 marostegui: Stop MySQL on pc1004 for mariadb, kernel and socket location upgrade
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1004 for kernel, mariadb and socket location upgrade (duration: 00m 57s)
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1106 in s1 - T183469 (duration: 00m 58s)
  • 06:18 marostegui: Deploy schema change on s4 primary master db1068 - T187089 T185128 T153182
  • 04:18 krinkle@tin: Synchronized wmf-config/throttle-analyze.php: (no justification provided) (duration: 00m 58s)
  • 04:17 krinkle@tin: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 00m 58s)
  • 03:56 Krinkle: Deleting stale webpagetest.* metrics on graphite1001 and graphite2001 (any wsp file last modified 600+ days ago) – T179622
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 06m 40s)
  • 00:00 reedy@tin: Synchronized wmf-config/CommonSettings.php: Allow protocol-relative URLs in TemplateStyles (duration: 00m 59s)

2018-03-19

  • 23:43 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata description override on testwiki (duration: 00m 58s)
  • 23:39 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Log ReadingLists warning (duration: 00m 58s)
  • 23:36 ayounsi@tin: Finished deploy [netbox/deploy@f7faa04]: Fixing netbox deploy issue (duration: 00m 38s)
  • 23:35 reedy@tin: Synchronized multiversion/MWRealm.php: T45956 (duration: 00m 57s)
  • 23:35 ayounsi@tin: Started deploy [netbox/deploy@f7faa04]: Fixing netbox deploy issue
  • 23:27 ayounsi@tin: Finished deploy [netbox/deploy@bed8da1]: Fixing netbox deploy issue (duration: 00m 37s)
  • 23:27 ayounsi@tin: Started deploy [netbox/deploy@bed8da1]: Fixing netbox deploy issue
  • 20:32 mutante: signing puppet certs for new host bast1002. initial puppet run, will replace bast1001 soon (T186623)
  • 20:19 bblack: discarding unused vcl on all cp frontends, 1-at-a-time
  • 20:14 bblack: discarding unused vcl on all cp backends, 1-at-a-time
  • 19:53 andrew@tin: Synchronized wmf-config/wikitech.php: fix for T189347 take 2 (duration: 00m 57s)
  • 19:42 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751) (duration: 10m 28s)
  • 19:38 andrew@tin: Synchronized wmf-config/wikitech.php: fix for T189347 (duration: 00m 57s)
  • 19:31 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751)
  • 19:25 mobrovac@tin: (no justification provided)
  • 19:24 herron: upgraded compiler03.puppet3-diffs.eqiad.wmflabs (depooled) to puppetdb4/postgres backend
  • 19:14 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751) (duration: 08m 30s)
  • 19:05 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751)
  • 19:01 mutante: DNS - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones on ns servers to recreate zone files to add new language "gor" to langs.tmpl (T189109)
  • 19:00 mutante: adding gor.wikipedia.org - new language Gorontalo https://www.ethnologue.com/language/gor | https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Gorontalo
  • 18:44 smalyshev@tin: Finished deploy [wdqs/wdqs@d6bc746]: GUI update (duration: 02m 24s)
  • 18:43 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): bring dev environment current w/ production (T186751) (duration: 10m 16s)
  • 18:42 smalyshev@tin: Started deploy [wdqs/wdqs@d6bc746]: GUI update
  • 18:33 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgFlowReadOnly on commonswiki (T186463) (duration: 00m 57s)
  • 18:33 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): bring dev environment current w/ production (T186751)
  • 18:27 catrope@tin: Synchronized dblists/: Uninstall Flow from wikis where it was never used (T188812) (duration: 00m 57s)
  • 18:14 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mapframe on knwiki (T189883) (duration: 00m 58s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add enwiki and commons as import sources to mrwikisource (T188486) (duration: 00m 58s)
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1091 (duration: 00m 59s)
  • 15:23 elukey: reboot kafka1003 for kernel upgrades (jobqueues/eventbus)
  • 15:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1091 (duration: 01m 01s)
  • 15:05 hashar: upgrading java on contint1001 / contint2001
  • 14:42 akosiaris: T184919 pool all kubernetes for service mathoid.
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:34 elukey: reboot kafka1002 (eventbus/jobqueue) for kernel upgrades
  • 14:28 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:28 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:18 ema: cp3040: discard old VCL T189892
  • 14:09 moritzm: restarting apache on contint1001 to pick up curl security update
  • 13:48 anomie: Cleaning up orphaned image_comment_temp rows on all wikis for T189985
  • 13:44 anomie@tin: Synchronized php-1.31.0-wmf.25/includes/filerepo/file/LocalFile.php: Applying fix for T189985 (duration: 00m 58s)
  • 13:22 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: Revert "Restrict FlaggedRevs to only operated on NS_MAIN on arwiki" (T148603 T189224) (duration: 00m 58s)
  • 13:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollbacker user right at arwikiquote (T189732) (duration: 00m 57s)
  • 13:09 moritzm: reimage mw1294-1296 as video scalers
  • 13:02 arturo: labtestcontrol2001: set GRUB_TIMEOUT=30 in /etc/default/grub, the previous value (10) wasn't enough to display the menu via mgmt
  • 12:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1091 (duration: 00m 57s)
  • 12:40 arturo: T189722 reboot labtestcontrol2001
  • 12:37 moritzm: installing curl security updates
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1016 original weight after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 11:55 _joe_: stopping hhvm on terbium for a test.
  • 11:44 moritzm: reimage mw1293 as video scaler
  • 11:29 godog: point codfw puppet to puppetmaster2001
  • 11:27 hashar@tin: Synchronized docroot/wwwportal/portal: (no justification provided) (duration: 00m 57s)
  • 11:17 ema: cache_misc@esams: upgrade to varnish 5.1.3-1wm4
  • 11:14 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Switching portals submodule to portals-deploy (T180777) (duration: 00m 58s)
  • 11:13 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Switching portals submodule to portals-deploy (T180777) (duration: 00m 58s)
  • 11:06 moritzm: uploaded openjdk-8 8u162-b12-1~bpo8+1 for jessie-wikimedia to apt.wikimedia.org
  • 10:58 godog: point eqsin puppet to puppetmaster2001
  • 10:53 moritzm: restarting jenkins on releases1001 to pick up Java security update
  • 10:47 godog: point ulsfo puppet to puppetmaster2001
  • 10:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T183469 (duration: 00m 58s)
  • 10:25 marostegui: Remove db1009 from tendril - T189216
  • 10:14 ema: cp3008: upgrade to varnish 5.1.3-1wm4
  • 09:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1009 from config - T189216 (duration: 00m 57s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1009 from config - T189216 (duration: 00m 58s)
  • 09:45 marostegui: Stop MySQL on db1009 - T189216
  • 09:37 elukey: restart hadoop daemons on analytics1070 for openjdk upgrades (canary)
  • 09:27 godog: reimage puppetmaster2001 with stretch - T184562
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1016 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 09:10 godog: depool codfw puppetmaster - T184562
  • 09:08 marostegui: Stop MySQL on es1016 for kernel, mariadb and socket location upgrade
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1016 for kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:57 moritzm: installing openjdk-8 security updates
  • 08:41 elukey: reboot thorium for kernel security upgrades (hosts all analytics websites, they will go down temporarily)
  • 08:26 moritzm: installing libvorbis security updates
  • 08:22 elukey: revert previous state on aqs1004, the new pkg might need some more work - T189529
  • 08:19 marostegui: Reset slave on db1106 to get it ready for s1 - https://phabricator.wikimedia.org/T183469
  • 08:11 marostegui: Reboot db1106 for kernel upgrade
  • 07:58 elukey: manually installed cassandra-2.2.6-wmf3 on aqs1004 - T189529
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T183469 (duration: 00m 57s)
  • 07:47 elukey: drain cassandra instances and reboot aqs1004 for kernel upgrades
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1106 from s5 to s1 - T183469 (duration: 01m 00s)
  • 07:27 marostegui: Reload dbproxy1002 and dbproxy1007 to get the new config - T189773
  • 06:20 marostegui: Deploy schema change on db1091 - T187089 T185128 T153182
  • 06:13 marostegui: Stop MySQL on db1091 for kernel and mariadb upgrade
  • 06:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 for schema change, kernel upgrade and mariadb upgrade (duration: 00m 58s)
  • 02:39 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 10m 54s)

2018-03-17

  • 18:41 elukey: executed apt-get clean on scb1004 to free some space (root partition disk space warning)
  • 03:09 krinkle@tin: Synchronized docroot/noc/db.php: noc: I410a56431a (duration: 00m 59s)
  • 00:13 mutante: running puppet on all cache::misc to rename director bromine to webserver_misc_static (T188163)

2018-03-16

  • 23:32 mutante: signing puppet cert for vega.codfw.wmnet, initial puppet run after fresh stretch install (T188163)
  • 18:43 mutante: creating new ganeti VM vega.codfw.wmnet to be equivalent of bromine, 1G RAM, 30G disk, 1vCPU (T189899)
  • 18:13 jynus: switching back wikireplica cloud dns to the original config
  • 17:32 jynus: reimage dbproxy1010
  • 16:29 jynus: updating wikireplica_dns 2/3
  • 16:22 moritzm: installing curl security updates
  • 16:09 marostegui: Stop MySQL on db1020 - T189773
  • 14:48 andrewbogott: reset contintcloud quotas as per https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#incorrect_quota_violations
  • 14:48 jynus: reimage dbproxy1011
  • 14:27 andrewbogott: restarting nodepool on nodepool1001
  • 14:25 elukey: reboot druid1002 for kernel updates
  • 14:14 andrewbogott: restarting rabbitmq on labcontrol1001
  • 13:57 andrewbogott: stopping nodepool temporarily during changes to nova.conf
  • 13:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2050 (duration: 00m 58s)
  • 13:15 chasemp: disable puppet across cloud things for safe rollout
  • 12:52 moritzm: uploaded libsodium23/php-acpu/php-mailparse to thirdparty/php72 (deps/extensions needed by Phabricator)
  • 12:51 ema: text-esams: reboot for kernel upgrades T188092 and to mitigate https://grafana.wikimedia.org/dashboard/db/varnish-failed-fetches?panelId=7&fullscreen&orgId=1&from=1518746284946&to=1521204628041
  • 12:12 marostegui: Reboot dbproxy1005 for kernel upgrade
  • 12:02 marostegui: Run pt-table-checksum on m2
  • 12:00 marostegui: Run pt-table-checksum on m5
  • 11:11 hashar: zuul: reenqueue all coverage jobs lost when restarting Zuul
  • 10:53 hashar: Upgrading zuul to zuul_2.5.1-wmf4 to resolve a mutex deadlock T189859
  • 10:45 jynus: disable puppet and load balance between 3 wikireplicas on dbproxy1010
  • 10:19 jynus: upgrade and restart of dbproxy1009 (passive)
  • 10:01 elukey: restart eventlogging_sync on db1108 (eventlogging db slave) as a precaution after the change of m4-master.eqiad.wmnet's CNAME
  • 10:00 moritzm: reverting the HHVM/ICU 57 setup on mwdebug2001 which was used for the dry run tests
  • 09:57 elukey: restart eventlogging-consumer@mysql-eventbus on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:56 hashar: Zuul coverage pipeline is deadlocked on an unreleased mutex. Will need a new Zuul version.
  • 09:51 elukey: restart eventlogging-consumer@mysql-m4 on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1015 after kernel, mariadb and socket upgrade (duration: 00m 57s)
  • 09:27 oblivian@tin: Finished deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2 (duration: 00m 29s)
  • 09:26 oblivian@tin: Started deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2
  • 09:17 oblivian@tin: (no justification provided)
  • 09:17 oblivian@tin: Finished deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts (duration: 00m 47s)
  • 09:16 oblivian@tin: Started deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts
  • 09:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 00m 57s)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1015 after kernel, mariadb and socket upgrade (duration: 00m 56s)
  • 08:49 jynus: upgrade and restart of dbproxy1004 (passive)
  • 08:41 marostegui: Stop MySQL on es1015 for maintenance
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1015 for kernel, mariadb and socket upgrade (duration: 00m 58s)
  • 08:40 elukey: reboot druid1006 for kernel updates
  • 08:29 elukey: reboot druid1005 for kernel updates
  • 07:53 moritzm: reimage mc2036 after mainboard replacement (T185587)
  • 07:15 marostegui: Stop MySQL on es2017 (es3 codfw master) for maintenance
  • 07:06 marostegui: Stop MySQL on es2016 (es2 codfw master) for maintenance
  • 06:52 marostegui: Stop MySQL on db2048 (s1 codfw master) for maintenance
  • 06:41 marostegui: Stop MySQL on db2051 (s4 codfw master) for maintenance
  • 06:28 marostegui: Stop MySQL on db2045 (s8 codfw master) for maintenance
  • 06:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 (duration: 00m 58s)
  • 01:46 XioNoX: librenms IRC bot moved to -operations channel. Doc on how to turn it off is on https://wikitech.wikimedia.org/wiki/LibreNMS#IRC_Alerting
  • 01:00 reedy@tin: Synchronized php-1.31.0-wmf.25/includes/specials/pagers/NewFilesPager.php: Fix T189846 (duration: 00m 58s)

2018-03-15

  • 23:25 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: Fix display issues (duration: 00m 59s)
  • 23:20 ebernhardson@tin: Synchronized php-1.31.0-wmf.25/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off Cirrus AB test (duration: 00m 58s)
  • 22:58 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: add some missing globals (duration: 00m 58s)
  • 20:38 demon@tin: Synchronized robots.txt: minor tidying (duration: 00m 58s)
  • 20:05 chasemp: disable puppet for cloud things for a safe rollout
  • 19:50 XenoRyet: updated civicrm from 9e79d63426 to 3291ad35c9
  • 19:14 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.25
  • 18:51 niharika29@tin: Synchronized php-1.31.0-wmf.25/extensions/MobileApp/: https://gerrit.wikimedia.org/r/#/c/419785/; https://gerrit.wikimedia.org/r/#/c/419784/; https://gerrit.wikimedia.org/r/#/c/419776/ (duration: 01m 14s)
  • 18:25 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/417329/ (duration: 01m 15s)
  • 18:11 maxsem@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 16s)
  • 18:09 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 15s)
  • 17:27 ppchelko@tin: Finished deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources (duration: 01m 23s)
  • 17:26 bsitzmann@tin: Finished deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327) (duration: 05m 38s)
  • 17:25 ppchelko@tin: Started deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources
  • 17:20 bsitzmann@tin: Started deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327)
  • 17:18 moritzm: installing dbus updates from stretch 9.4 point release
  • 16:43 ppchelko@tin: Finished deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints (duration: 15m 22s)
  • 16:28 ppchelko@tin: Started deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints
  • 16:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2050 for data checks (duration: 01m 15s)
  • 15:58 volans: updated facts on both CI puppet-compilers
  • 15:56 moritzm: pruning obsolete packages from jessie-wikimedia/experimental
  • 15:56 marostegui: Stop MySQL on s5 codfw master (db2052) this will break replication on s5 codfw
  • 15:51 godog: repool puppetmaster1002
  • 15:47 moritzm: installing libvirt security updates
  • 15:20 elukey: reboot druid1003 for kernel updates
  • 15:13 marostegui: Stop MySQL on s6 codfw master (db2039) this will break replication on s6 codfw
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after socket path location update (duration: 01m 15s)
  • 15:05 _joe_: restarted jobrunner, jobchron on the eqiad jobrunners
  • 14:30 elukey: reboot druid1004 for kernel updates
  • 13:51 elukey: reboot kafka1001 (eventbus/job-queues eqiad) for kernel updates
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 14s)
  • 13:33 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout, again. Last time didn't pick the right partman config
  • 13:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 15s)
  • 13:09 moritzm: restarting HHVM on canaries to pick up curl security update
  • 13:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule, clean expired rules (T189442) (duration: 01m 15s)
  • 12:54 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 12:36 moritzm: installing curl security updates on jessie/stretch
  • 12:26 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout
  • 12:08 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1007 after kernel security update (duration: 01m 14s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after socket path location update (duration: 01m 14s)
  • 11:59 moritzm: rebooting rdb1007 for kernel security update
  • 11:56 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1007 for kernel security update (duration: 01m 14s)
  • 11:52 marostegui: Stop MySQL on es1013 for socket path upgrade
  • 11:51 moritzm: rebooted rdb1005 for kernel security update
  • 11:49 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1005 after kernel security update (duration: 01m 14s)
  • 11:48 godog: reimage puppetmaster1002 with stretch
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for socket path location update (duration: 01m 14s)
  • 11:42 godog: depool puppetmaster1002 for stretch reimage
  • 11:29 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1005 for kernel security update (duration: 01m 10s)
  • 11:16 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1003 after kernel security update (duration: 01m 14s)
  • 11:04 moritzm: rebooting rdb1003 for kernel security update
  • 11:01 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1003 for kernel security update (duration: 01m 14s)
  • 10:48 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1001 after kernel security update (duration: 01m 14s)
  • 10:32 moritzm: rebooting rdb1001 for kernel security update
  • 10:24 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1001 for kernel security update (duration: 01m 14s)
  • 10:22 ema: apt.w.o: upload varnish=5.1.3-1wm4 to jessie-wikimedia/main (upstream "extrachance" fixes) T174932
  • 10:12 gehel@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic1021.eqiad.wmnet
  • 09:56 ema: apt.w.o: move varnish=5.1.3-1wm3, varnish-modules=0.12.1-1+wmf1, libvmod-netmapper=1.6-1 from jessie-wikimedia/experimental to jessie-wikimedia/main T188545
  • 09:56 moritzm: installing curl security updates on Debian
  • 09:30 godog: repool puppetmaster2002
  • 09:16 jynus: reset slave all @db1051
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore normal weight for es1017 (duration: 01m 14s)
  • 08:44 godog: roll-restart thumbor in eqiad/codfw to enable access to swift private container
  • 08:42 jynus: end of maintenance for m2
  • 08:31 jynus: setting m2 as read only
  • 08:29 gilles: setZoneAccess done
  • 08:28 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 08:18 jynus: disable puppet on db1051, db1020 for switchover preparation
  • 08:06 ayounsi@tin: Finished deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1 (duration: 01m 02s)
  • 08:05 ayounsi@tin: Started deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1
  • 08:01 jynus: switching db2044 to be a direct replica of db1051
  • 07:49 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 01m 07s)
  • 07:48 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 05s)
  • 07:30 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 39s)
  • 07:29 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1017 (duration: 01m 14s)
  • 07:21 moritzm: reimaging mc2036 after hardware replacement T185587
  • 07:07 marostegui: Stop mariadb on es1017 for kernel, mariadb and socket location upgrade
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 01m 14s)
  • 07:01 marostegui: Deploy schema change on db1084 - T187089 T185128 T153182
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 01m 15s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 (duration: 01m 15s)
  • 06:29 marostegui: Stop MySQL on db1064 for mariadb upgrade
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 10m 10s)
  • 00:25 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/client/includes/RecentChanges/ExternalChangeFactory.php: T189320 Use only local part of username when building the RC line (duration: 01m 18s)
  • 00:22 tgr@tin: Synchronized php-1.31.0-wmf.24/includes/user/ExternalUserNames.php: T189320 Add ExternalUserNames::getLocal() to get local part of username (duration: 01m 15s)
  • 00:20 ejegg: updated payments-wiki from 9068692c32 to 30f5f3edfb
  • 00:08 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/: VE fixes followup (duration: 01m 15s)
  • 00:03 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 15s)
  • 00:02 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 16s)

2018-03-14

  • 23:45 XenoRyet: updated payments-wiki from 86715f6e9e to 9068692c32
  • 23:45 tgr@tin: Synchronized wmf-config/Wikibase.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 14s)
  • 23:43 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:41 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:21 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 20s)
  • 23:18 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 15s)
  • 22:13 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/Thanks: T189752 (duration: 01m 16s)
  • 21:27 hoo: Ran scap pull on mwdebug1001 after testing https://gerrit.wikimedia.org/r/417180
  • 21:26 andrewbogott: rebuilding labtestweb2001 with Debian Stretch
  • 20:34 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.25
  • 20:32 demon@tin: Synchronized php: symlink bump to wmf.25 (duration: 01m 14s)
  • 20:27 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c (duration: 05m 37s)
  • 20:24 demon@tin: Finished scap: trying a php5/hhvm theory (duration: 06m 37s)
  • 20:21 mholloway-shell@tin: Started deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c
  • 20:17 demon@tin: Started scap: trying a php5/hhvm theory
  • 20:16 demon@tin: Finished scap: scapping, pt. 2. prior one failed because i tested something (duration: 69m 43s)
  • 19:06 demon@tin: Started scap: scapping, pt. 2. prior one failed because i tested something
  • 19:06 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "demon"; reason is "rebuilding l10n" (duration: 00m 00s)
  • 18:20 jynus: running pt-table-checksum on all m2, some lag will happen on passive replicas
  • 18:16 jynus: running pt-table-checksum on all m1, some lag will happen on passive replicas
  • 17:56 demon@tin: Started scap: rebuilding l10n
  • 17:55 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/CentralNotice: updates! (duration: 01m 16s)
  • 17:54 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "reedy"; reason is "updates!" (duration: 00m 00s)
  • 17:54 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralNotice: updates! (duration: 01m 18s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/416489 (duration: 01m 14s)
  • 17:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/419077 (duration: 01m 15s)
  • 16:58 hoo: Manually running extensions/Wikibase/repo/maintenance/dispatchChanges.php on terbium, so that dispatching can catch up
  • 16:56 jynus: deploying new firewall rules to dbproxy1001 and 7
  • 16:40 moritzm: installing cron updates from stretch 9.4 point release
  • 16:35 demon@tin: Synchronized .gitignore: ignore scap logs (duration: 01m 15s)
  • 16:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1074 original weight (duration: 01m 13s)
  • 16:12 godog: temporarily add back puppetmaster2002 as a low-weight backend
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 15:47 andrew@tin: Synchronized multiversion/MWMultiVersion.php: wikitech cleanup (duration: 01m 14s)
  • 15:25 XioNoX: Re-enabling BGP on cr2-codfw Zayo transit - T189452
  • 15:12 XioNoX: Disabling BGP on cr2-codfw Zayo transit - T189452
  • 15:02 jynus: disabling puppet in preparation for reimage of dbproxy1002 and 6
  • 14:59 moritzm: installing virt-what updates from stretch point release
  • 14:58 paravoid: rebooting furud
  • 14:44 ottomata: beginning migration of eventlogging analytics from Kafka analytics to Kafka jumbo: T183297
  • 14:33 godog: depool puppetmaster2002 for reimage
  • 14:06 Reedy: created wbc_entity_usage on ruwikimedia T188456
  • 13:50 zeljkof: EU SWAT finished
  • 13:49 zfilipin@tin: Synchronized dblists/wikidataclient.dblist: SWAT: Revert "Add ruwikimedia to wikidataclient" (T188456) (duration: 01m 14s)
  • 13:42 zfilipin@tin: Synchronized docroot/noc/conf/: SWAT: Revert "Publish throttle-analyze at noc" (T187894) (duration: 01m 15s)
  • 13:21 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull (duration: 00m 33s)
  • 13:21 ppchelko@tin: Started deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull
  • 13:21 zfilipin@tin: Synchronized docroot/noc/conf/throttle-analyze.php.txt: SWAT: Publish throttle-analyze at noc (T187894) (duration: 01m 13s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination (duration: 00m 38s)
  • 13:20 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination
  • 13:12 zfilipin@tin: Synchronized dblists/commonsuploads.dblist: SWAT: Disable upload for non-admins on kowikiversity (T189021) (duration: 01m 14s)
  • 13:06 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Remove obsolete throttle rules, add one new (T189241) (duration: 01m 15s)
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 14s)
  • 12:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 12:22 kartik@tin: Finished deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c (duration: 03m 12s)
  • 12:19 kartik@tin: Started deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1074 (duration: 01m 14s)
  • 11:45 marostegui: Stop db1074 for kernel upgrade
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for data checks and kernel upgrade (duration: 01m 14s)
  • 11:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1018 after kernel and mariadb upgrade (duration: 01m 15s)
  • 11:02 moritzm: rebooting einsteinium / icinga.wikimedia.org for kernel security update
  • 10:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1018 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:37 marostegui: Stop mariadb on es1018 for kernel and mariadb upgrade + change socket location
  • 10:35 moritzm: rebooting hydrogen for kernel security update
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1018 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2006 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:22 jynus: dropping testotrs from m2
  • 10:16 jynus: archiving and dropping bugzilla_testing from m2
  • 10:10 marostegui: Stop mariadb on pc2006 for kernel and mariadb upgrade + change socket location
  • 10:09 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2006 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:07 jynus: archiving and dropping testblog from m2
  • 10:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2005 after kernel and mariadb upgrade (duration: 01m 15s)
  • 09:50 marostegui: Stop mariadb on pc2005 for kernel and mariadb upgrade + change socket location
  • 09:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2005 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:44 moritzm: installing samba security update (just the client side libraries)
  • 09:40 marostegui: Stop mysql on es2015 to upgrade socket path
  • 09:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2004 after kernel and mariadb upgrade (duration: 01m 14s)
  • 09:34 marostegui: Stop mysql on es2014 to upgrade socket path
  • 09:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2004 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:23 marostegui: Stop mariadb on pc2004 for kernel upgrade
  • 09:13 marostegui: Stop mysql on es2013 to upgrade socket path
  • 09:08 marostegui: Stop mysql on es2012 to upgrade socket path
  • 08:57 ema: cp3041: restart varnish-be
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after kernel and mariadb upgrade (duration: 01m 15s)
  • 08:28 ema: cp3040: restart varnish-be
  • 08:21 hashar: Restarting the CI Jenkins
  • 07:45 marostegui: Reboot es2004 for kernel upgrade
  • 07:45 marostegui: Reboot es2003 for kernel upgrade
  • 07:34 marostegui: Reboot es2002 for kernel upgrade
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after kernel and mariadb upgrade (duration: 01m 14s)
  • 07:03 marostegui: Stop mariadb on es1013 for mariadb and kernel upgrade
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for kernel and mariadb upgrade (duration: 01m 14s)
  • 06:45 marostegui: Deploy schema change on db1064 with replication (this will generate lag on s4 on labs hosts) - T187089 T185128 T153182
  • 06:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 for alter table (duration: 01m 14s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 after alter table (duration: 01m 15s)
  • 03:13 mutante: bacula is working again - restored missing file set (https://gerrit.wikimedia.org/r/419341 )
  • 02:49 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 40s)
  • 02:44 Jamesofur: deleted 46 archived files
  • 02:18 mutante: helium - running bacula-dir with -f in foreground revealed: ERROR TERMINATION at parse_conf.c:485 - Config error: Could not find config Resource mysql-srv-backups - line 7, col 33 of file /etc/bacula/jobs.d/bohrium.eqiad.wmnet-mysql-predump-piwik-Weekly-Wed-production.conf
  • 02:17 mutante: helium - bacula director process failed (Bacula interrupted by signal 11: Segmentation violation), icinga alerted. attempted to restart it. then: bacula-dir - the configtest failed!
  • 00:01 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: crwiki logo (duration: 01m 15s)
  • 00:00 reedy@tin: Synchronized static/images/project-logos/crwiki.png: (no justification provided) (duration: 01m 14s)

2018-03-13

  • 23:46 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/MobileFrontend/: T188825 (duration: 01m 18s)
  • 23:43 mutante: tin: chmod -R g+w /srv/mediawiki-staging/.git/objects/* ; chmod -R g+w /srv/mediawiki-staging/php-1.31.0-wmf.24/.git/objects/*
  • 23:35 Reedy: that was Enable VirtualPageViews on Hungarian Wikipedia T184793
  • 23:35 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 15s)
  • 23:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: moar logos (duration: 01m 15s)
  • 23:24 reedy@tin: Synchronized static/images/project-logos/: YOU GET A LOGO, YOU GET A LOGO. YOU ALL GET LOGOS (duration: 01m 16s)
  • 23:11 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHTML on 96 wikis T188010 (duration: 01m 16s)
  • 23:10 mutante: restbase-dev1006 - reinstalling, manually skipping " Volume group name already in use" (T185494)
  • 22:52 eileen: civicrm revision changed from c8458c4a2f to 9e79d63426, config revision is 08b7e6216e (Benevity comma fix)
  • 20:40 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.25
  • 20:09 demon@tin: Finished scap: bootstrap wmf.25 (duration: 67m 17s)
  • 19:02 demon@tin: Started scap: bootstrap wmf.25
  • 18:47 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:46 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:42 gehel: repool wdqs1004 & wdqs2001 now that data reload is completed T189548
  • 18:39 XenoRyet: updated civicrm from 8652db05f5 to c8458c4a2f
  • 18:37 moritzm: installing reportbug updates from stretch point release
  • 18:32 moritzm: installing w3m updates from stretch point release
  • 17:55 moritzm: installing ncurses updates from stretch point release
  • 17:53 moritzm: installing ncurses updates from stretch point release
  • 17:19 awight@tin: Started scap: Beta: Fix ORES thresholds and enable JADE, T181159, T176333
  • 17:06 godog: cleanup integration-slave-jessie-1001:/srv/pbuilder/build - T189587
  • 16:45 marostegui: Clean iptables rules on dbproxy1001 to leave it as dbproxy1006
  • 16:33 marostegui: Retroactive: cleared iptables rules on dbproxy1007
  • 16:32 jynus: restarting gerrit on cobalt, stalled
  • 16:26 jynus: restarting gerrit on cobalt, stalled
  • 16:18 jynus: update CNAME for m1-master and m2-master
  • 15:50 marostegui: Deploy schema change on db1097:3314 - T187089 T185128 T153182
  • 15:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 for alter table (duration: 00m 56s)
  • 15:39 jynus: upgrade and restart dbproxy1007
  • 15:33 vgutierrez: upgrading eqiad LVSs to pybal 1.15.2
  • 15:32 jynus: upgrade and restart dbproxy1001
  • 14:55 vgutierrez: upgrading codfw LVSs to pybal 1.15.2
  • 14:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 after alter table (duration: 00m 57s)
  • 14:51 jynus: stopping db2044 (this will make proxies complain about redundancy)
  • 14:42 moritzm: rebooting chromium for kernel security update
  • 14:11 chasemp: add chico to wmf-nda (verified nda things with moritz and all the goodness)
  • 13:29 jynus: stop db1001 for maintenance (proxies will temporarily complain about lack of redundancy)
  • 13:20 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: wmf-config: enable Singapore oversample as default on all wikis (T188652) (duration: 00m 57s)
  • 12:32 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 12:26 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 12:04 reedy@tin: Synchronized wmf-config/interwiki.php: T188537 (duration: 00m 57s)
  • 11:59 moritzm: rebooting DNS recursors in codfw for kernel security update
  • 11:43 _joe_: include our own etcd package (3.2.16) on stretch
  • 11:37 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 11:33 kartik@tin: Finished deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc (duration: 03m 30s)
  • 11:30 kartik@tin: Started deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc
  • 11:23 jynus: ran update-netboot-stretch.sh
  • 11:21 moritzm: rebooting DNS recursors in esams for kernel security update
  • 10:22 moritzm: rebooting DNS recursors in ulsfo and eqsin for kernel security update
  • 10:17 vgutierrez: upgrading esams LVSs to pybal 1.15.2
  • 10:08 jynus: stopping mysql on db1063 and db1051 to validate the depool before full reimage
  • 10:07 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1001 after kernel security update (duration: 00m 57s)
  • 10:00 gehel: shutting down blazegraph on wdqs2001 for data transfer to wdqs1004 - T189548
  • 09:48 vgutierrez: upgrading ulsfo LVSs to pybal 1.15.2
  • 09:37 moritzm: rebooting poolcounter1001 for kernel security update
  • 09:15 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling poolcounter1001 for kernel security update (duration: 00m 56s)
  • 09:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 09:02 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 08:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 (duration: 00m 57s)
  • 06:58 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:56 marostegui: Deploy schema change on db1081 - T187089 T185128 T153182
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 for alter table (duration: 00m 56s)
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 after alter table (duration: 01m 19s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 30s)

2018-03-12

  • 22:52 eileen: update civicrm revision changed from a819d64d98 to 8652db05f5, config revision is 08b7e6216e - update civicrm.settings.php
  • 20:44 arlolra: Updated Parsoid to 16ced34 (T188670, T90902)
  • 20:37 arlolra@tin: Finished deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34 (duration: 10m 16s)
  • 20:36 andrewbogott: updated wikitech-static as detailed in https://wikitech.wikimedia.org/wiki/Wikitech-static#Manual_updates
  • 20:27 arlolra@tin: Started deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34
  • 20:26 andrewbogott: apt-get upgrade and reboot on wikitech-static
  • 20:25 andrewbogott: stopping apache2 on Silver in anticipation of it being decommissioned
  • 20:16 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7 (duration: 05m 29s)
  • 20:11 mholloway-shell@tin: Started deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7
  • 19:53 MaxSem: disabled 2FA for User:Ctac (T189520)
  • 19:48 chasemp: labstore1003:~# service nfs-kernel-server restar
  • 19:44 chasemp: labstore1003:~# exportfs -ra
  • 18:53 Krinkle: Clean up left-over .wsp.bak files under frontend.navtiming* on graphite1001 (following T179622)
  • 18:44 mutante: added to DNS: romd.wikimedia.org (and romd.m) for Wikimedians of Romania and Moldova User Group
  • 18:43 mutante: added to DNS: hi.wikimedia.org (and hi.m) for Hindi Wikimedian User Group
  • 18:25 ppchelko@tin: Finished deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries (duration: 15m 25s)
  • 18:09 ppchelko@tin: Started deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries
  • 17:48 ottomata: removed kafka.protocol.version setting for varnishkafka webrequest instances; version should now be properly negotiated
  • 17:29 gehel@tin: Finished deploy [wdqs/wdqs@ce72538]: new wdqs updater (duration: 04m 47s)
  • 17:27 _joe_: poweroff mw2097-2134, T189111
  • 17:24 gehel@tin: Started deploy [wdqs/wdqs@ce72538]: new wdqs updater
  • 16:34 joal@tin: Finished deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug (duration: 08m 50s)
  • 16:25 joal@tin: Started deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug
  • 15:56 mepps: updated payments-wiki from ce68e8e80b to 86715f6e9e
  • 15:51 gehel: restart blazegraph on wdqs2001 to validate new config - T175919
  • 15:43 vgutierrez: eqsin LVSs: upgrade pybal to 1.15.2
  • 15:39 ottomata: bouncing kafka main-eqiad -> jumbo-eqiad mirror maker instances
  • 15:37 ottomata: disabling puppet on kafka1020,1022,1023 to test partition.assignment.strategy change for mirror maker
  • 15:28 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 54s)
  • 15:26 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 [keeping static files] (duration: 01m 19s)
  • 15:24 vgutierrez: lvs1007,lvs1010 upgraded pybal to 1.15.2
  • 15:17 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 [keeping static files] (duration: 01m 22s)
  • 15:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 (duration: 02m 35s)
  • 15:12 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120 (duration: 00m 31s)
  • 15:11 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120
  • 15:08 joal: Provide correct log message for analytics/refinery scap deploy: Regular deploy of analytics-hadoop code
  • 15:07 joal@tin: Finished deploy [analytics/refinery@fd0a90f]: Regular a (duration: 04m 54s)
  • 15:07 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 (duration: 03m 58s)
  • 15:02 joal@tin: Started deploy [analytics/refinery@fd0a90f]: Regular a
  • 14:42 jynus: upgrade and restart es2001
  • 14:09 sbisson@tin: Finished deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test* (duration: 00m 34s)
  • 14:09 sbisson@tin: Started deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test*
  • 14:02 zeljkof: EU SWAT finished
  • 13:59 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 08s)
  • 13:24 moritzm: synchronised PHP 7.2.3 to thirdparty/php72 for stretch-wikimedia
  • 13:17 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 09s)
  • 12:44 godog: start a catalog compilation on elnath to check for puppetdb4 diffs - T177253
  • 11:26 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1002 after kernel security update (duration: 03m 09s)
  • 11:14 moritzm: reboot poolcounter1002 for kernel security update
  • 11:10 jmm@tin: Synchronized wmf-config/ProductionServices.php: depooling poolcounter1002 for kernel security update (duration: 03m 09s)
  • 10:39 _joe_: running decommission_appserver on mw2097-2134 T189111
  • 10:23 XioNoX: labs->cloud vlan rename in eqiad - T187933
  • 09:56 elukey: restart kafka mirror maker (main eqiad -> jumbo) on kafka1020 (all consumers not assigned to any partition on kafka102*)
  • 09:53 moritzm: installing util-linux security updates
  • 09:31 _joe_: decommission mw2097-mw2134 from conftool T189111
  • 08:40 moritzm: rebooting iron for kernel security update
  • 08:32 ema: cp3033/cp3031: restart varnish-be
  • 08:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2015 after kernel upgrade (duration: 00m 58s)
  • 08:20 ema: cp3033/cp3031: set transaction_timeout to 60s
  • 08:14 marostegui: Stop MySQL on es2015 for kernel upgrade
  • 08:06 ema: cp3042: restart varnish-be
  • 08:03 ema: cp3042: set transaction_timeout to 30s
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 for kernel upgrade (duration: 00m 58s)
  • 07:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2014 after kernel upgrade (duration: 01m 01s)
  • 07:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 59s)
  • 07:26 marostegui: Stop MySQL on es2014 for kernel upgrade
  • 07:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 58s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3316 as vslow,dump in s6 - T184161 (duration: 00m 58s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3315 as vslow,dump in s5 - T184161 (duration: 00m 58s)
  • 06:27 marostegui: Deploy schema change on db1103:3314 - T187089 T185128 T153182
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 for alter table (duration: 01m 06s)
  • 02:52 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 11m 56s)

2018-03-11

  • 08:50 elukey: executed sudo rm /etc/logrotate.d/kafkatee-webrequest-analytics on oxygen/rhenium to stop daily cronspam

2018-03-10

  • 14:56 ema: cp1053: restart varnish-be
  • 13:29 ema: cp1068/cp1055: restart varnish-be

2018-03-09

  • 23:29 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/ReadingLists/src/Api/ApiQueryReadingListEntries.php: T189272 fix stupid ReadingLists typo breaking production (duration: 00m 54s)
  • 19:43 foks: changed global email for User:Mathmensch
  • 19:19 MaxSem: restarted my script on tin, now with more aggressive writes
  • 18:26 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/AbuseFilter/includes/AbuseFilter.class.php: Unbreak AbuseFilter tagging T189299 (duration: 00m 59s)
  • 17:35 andrew@tin: Finished deploy [horizon/deploy@9c234d6]: Another try at fixing T188458 (duration: 03m 00s)
  • 17:32 andrew@tin: Started deploy [horizon/deploy@9c234d6]: Another try at fixing T188458
  • 16:14 andrewbogott: test log
  • 16:07 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3034.esams.wmnet
  • 15:59 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303 (duration: 00m 38s)
  • 15:59 andrewbogott: moving wikitech dns record to point to misc-web and the new labweb cluster, https://gerrit.wikimedia.org/r/#/c/417926/
  • 15:59 ppchelko@tin: Started deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303
  • 15:54 andrew@tin: Finished deploy [horizon/deploy@f59f568]: rolling out a fix for T188458 (duration: 03m 11s)
  • 15:51 andrew@tin: Started deploy [horizon/deploy@f59f568]: rolling out a fix for T188458
  • 15:30 moritzm: installing zsh security update on trusty
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 after cloning db1113:3316 - T184161 (duration: 00m 58s)
  • 15:15 moritzm: installing sensible-utils security update on trusty (Debian already fixed)
  • 15:11 ema: cp-upload_esams: reboot for retpoline kernel updates T188092
  • 13:12 marostegui: Compress s6 on db1113:3316 - T184161
  • 12:41 elukey: manually executed systemctl reset-failed to some old (not present anymore) units on kafka analytics hosts
  • 12:26 marostegui: Compress s5 on db1113:3315 - T184161
  • 12:16 marostegui: Stop mysql on db1063 to clone db1113:3316 - T184161
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 to clone db1113:3316 - T184161 (duration: 00m 58s)
  • 12:11 jynus: dropping test databases on dbstore2* instances
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after cloning db1113:3315 - T184161 (duration: 00m 58s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db1051 after cloning db1113:3315 - T184161 (duration: 00m 58s)
  • 11:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:15 marostegui: Stop MySQL on db1051 to clone db1113 - https://phabricator.wikimedia.org/T184161
  • 11:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 to clone db1113 - T184161 (duration: 00m 58s)
  • 09:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with normal load (duration: 00m 58s)
  • 09:22 ema: cp-misc_esams: reboot for retpoline kernel updates T188092
  • 08:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2058 and db2084 (duration: 00m 58s)
  • 08:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with low load (duration: 00m 58s)
  • 07:35 marostegui: Stop mariadb on db2058 and db2084 for mariadb+kernel upgrade
  • 07:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2058 and db2084 (duration: 00m 58s)
  • 07:33 marostegui: Logging for the record: es2013 was stopped and rebooted for mariadb and kernel upgrade
  • 07:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 00m 58s)
  • 07:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2012, depool es2013 (duration: 00m 58s)
  • 06:52 marostegui: Stop MariaDB on es2012 to upgrade mariadb and kernel
  • 06:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2012 for kernel and mariadb upgrade (duration: 00m 58s)
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1019 normal weight (duration: 00m 59s)
  • 05:00 andrew@tin: Finished deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278 (duration: 02m 59s)
  • 04:57 andrew@tin: Started deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278
  • 00:40 thcipriani@tin: Synchronized static/images/project-logos: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART IV (duration: 00m 58s)
  • 00:38 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART III (duration: 00m 58s)
  • 00:36 thcipriani@tin: Synchronized static/images/project-logos/urwiki-2x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART II (duration: 00m 58s)
  • 00:33 thcipriani@tin: Synchronized static/images/project-logos/urwiki-1.5x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART I (duration: 00m 59s)
  • 00:03 urandom: set compression chunk length to 32, parsoid tables (group "enwiki") - T189057

2018-03-08

  • 23:10 urandom: set compression chunk length to 32, parsoid tables (group "wikipedia") - T189057
  • 22:31 urandom: set compression chunk length to 32, parsoid tables (group "commons") - T189057
  • 22:16 reedy@tin: Synchronized php-1.31.0-wmf.24/includes/specials/pagers/BlockListPager.php: T189251 (duration: 00m 59s)
  • 22:07 MaxSem: guess what? trying T187516 again
  • 21:41 urandom: set compression chunk length to 32, parsoid tables (group "others") - T189057
  • 21:15 otto@tin: Synchronized wmf-config/ProductionServices.php: Revert: point monolog avro producer back at Kafka analytics. Too many TCP connections? T188136 (duration: 00m 58s)
  • 21:09 sbisson@tin: Finished deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3) (duration: 04m 42s)
  • 21:04 sbisson@tin: Started deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3)
  • 20:40 urandom: set compression chunk length to 32, mobile tables - T189057
  • 20:34 urandom: set compression chunk length to 32, page_summary tables - T189057
  • 20:30 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to php-1.31.0-wmf.24
  • 20:26 thcipriani@tin: Synchronized php: Ensure symlink for 1.31.0-wmf.24 is up-to-date (duration: 01m 15s)
  • 19:52 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/Echo/: https://gerrit.wikimedia.org/r/#/c/417330/ and https://gerrit.wikimedia.org/r/#/c/417340/ (duration: 01m 21s)
  • 19:33 anomie: Running `cleanupUsersWithNoId.php --table recentchanges --prefix wikidata --force` on wikidata client wikis for T181731. This shouldn't create any local SUL accounts.
  • 19:29 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/: Hooks: Don't register beta features if they're enabled for all https://gerrit.wikimedia.org/r/#/c/417277/ (duration: 01m 14s)
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (duration: 02m 40s)
  • 19:23 niharika29@tin: Synchronized wmf-config/CommonSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 15s)
  • 19:22 sbisson@tin: Started deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test*
  • 19:21 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 16s)
  • 18:43 bsitzmann@tin: Finished deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167 (duration: 06m 14s)
  • 18:37 bsitzmann@tin: Started deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167
  • 17:19 andrew@tin: Synchronized wmf-config/wikitech.php: wikitech varnish updates (duration: 01m 15s)
  • 17:05 jynus: stop and reboot db1114 for kernel regression
  • 16:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool es1019 with less weight after HW maintenance (duration: 01m 15s)
  • 16:32 bd808: Running wikireplica_dns from labcontrol1001
  • 16:14 cmjohnson: wdqs1004 down for systemboard replacement
  • 15:56 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:54 andrewbogott: restarting nodepool again
  • 15:42 andrewbogott: stopping nodepool again because something isn't quite right
  • 15:41 marostegui: Power off es1019 - T187530
  • 15:32 otto@tin: Synchronized wmf-config/ProductionServices.php: Point Mediawiki Monolog at new Kafka jumbo-eqiad cluster: T188136 (duration: 01m 16s)
  • 15:29 ottomata: merging and then deploying mediawiki-config to point monolog avro kafka producer at new kafka jumbo cluster: https://phabricator.wikimedia.org/T188136
  • 15:29 andrewbogott: disabling puppet on labnodepool1001
  • 15:17 andrewbogott: silencing nova and other openstack alerts in anticipation of service interruptions for https://phabricator.wikimedia.org/T189005
  • 15:01 marostegui: Disable puppet on db1073 - T189005
  • 15:00 marostegui: Change topology in m5, db2037 to become a slave of db1073 - T189005
  • 14:56 oblivian@tin: Synchronized wmf-config/CommonSettings.php: Use EtcdConfig everywhere (duration: 01m 15s)
  • 14:38 zeljkof: EU SWAT finished
  • 14:38 marostegui: Stop mysql on es1019 - T187530
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: SWAT: Blacklist Web of Trust junk from being added to pages (T189148) (duration: 01m 15s)
  • 14:35 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTarget.js: SWAT: Follow-up I5357a909: Fix logic for autosave from edited state (T189071) (duration: 01m 16s)
  • 14:28 mobrovac@tin: Finished deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052 (duration: 00m 33s)
  • 14:27 mobrovac@tin: Started deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052
  • 14:26 vgutierrez: uploaded pybal_1.15.2_all.deb to apt.wikimedia.org jessie-wikimedia
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: 2017 wikitext editor: Enable by default on officewiki (T188028) (duration: 01m 16s)
  • 14:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create the rollbacker group at ar.wikinews (T189206) (duration: 01m 16s)
  • 13:56 gehel: restart wdqs-updater on wdqs1005 to validate new config option - T188716
  • 13:52 sbisson@tin: Finished deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers (duration: 08m 31s)
  • 13:44 moritzm: depooling mwdebug2001, the host will temporarily be using an HHVM build linked against libicu57 to perform some tests
  • 13:43 sbisson@tin: Started deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers
  • 13:40 elukey: eventlogging analytics migrated from eventlog1001 to eventlog1002
  • 13:35 ariel@tin: Finished deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly (duration: 00m 03s)
  • 13:35 ariel@tin: Started deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly
  • 13:29 ema: cp-ulsfo: reboot for retpoline kernel updates T188092
  • 12:50 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:47 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 fully (duration: 01m 16s)
  • 11:32 moritzm: installing isc-dhcp security updates
  • 10:43 moritzm: installing libvpx security updates
  • 10:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Change db1114 load (duration: 01m 16s)
  • 10:14 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on T181121
  • 10:13 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on
  • 09:57 dcausse: restarting mjolnir-kafka-daemon.service on relforge1002 to switch to kafka jumbo
  • 09:56 dcausse: restarting mjolnir-kafka-daemon.service on relforge1001 to switch to kafka jumbo
  • 09:56 _joe_: decommissioning mw2017-2099 T187467
  • 09:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 partially (duration: 01m 16s)
  • 09:44 moritzm: rearming keyholder on neodymium after reboot
  • 09:40 moritzm: rebooting neodymium for kernel security update
  • 09:22 ema: cp-eqsin: reboot for retpoline kernel updates T188092
  • 09:12 ema: cp3043: varnish-be-restart T189085
  • 09:08 moritzm: rebooting bast1001 for kernel security update
  • 08:58 elukey: restart varnish backend on cp3041 (failed fetches)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046, db2053 and db2060 after kernel upgrade (duration: 01m 15s)
  • 08:58 moritzm: reset RAC on bast1001, serial console was stuck
  • 08:50 elukey: rebooting analytics1003 (Hadoop Hive, Oozie, etc..) for kernel updates
  • 08:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2046, db2053 and db2060 for kernel upgrade (duration: 01m 17s)
  • 08:31 elukey: reboot analytics1002 (Hadoop master standby) for kernel upgrades
  • 08:28 marostegui: Stop MySQL on db2046, db2053 and db2060 for kernel upgrade
  • 08:19 elukey: reboot analytics1001 (Hadoop master) for kernel upgrade (temp failover to analytics1002)
  • 08:09 ema: cp3040: varnish-be-restart T189085
  • 08:00 ema: cp3032: varnish-be-restart T189085
  • 07:44 elukey: reboot kafka2003 (eventbus codfw) for kernel updates
  • 07:24 elukey: reboot kafka2002 (eventbus codfw) for kernel updates
  • 07:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 for maintenance - T187530 (duration: 01m 16s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Revert: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 31s)
  • 04:27 Krinkle: Running whisper-mass-resize for ResourceLoader.* metrics on graphite1001 and graphite2001 (T179622)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 07m 37s)
  • 02:15 tgr@tin: Synchronized wmf-config/throttle.php: T189161 Temporarily remove account creation limit for event on Portuguese Wikipedia on March 08, 2018 (duration: 01m 10s)
  • 01:17 twentyafterfour: phabricator update completed
  • 01:13 twentyafterfour: preparing for phabricator update 2018-03-07/1
  • 00:37 thcipriani@tin: Synchronized wmf-config/db-eqiad.php: SWAT: wikitech: use FQDNs for m5 cluster members (duration: 01m 16s)
  • 00:28 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration for CirrusSearch to instantly index new Wikidata items T183053 (duration: 01m 15s)
  • 00:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable loginOnly mode for local auth provider on group 2 T57420 (duration: 01m 16s)

2018-03-07

  • 23:36 MaxSem: aborted due to growing DB lag
  • 23:08 MaxSem: running script for T187516
  • 23:00 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/AntiSpoof/: https://gerrit.wikimedia.org/r/#/c/417013/ (duration: 01m 16s)
  • 22:52 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/417014/ (duration: 01m 20s)
  • 22:44 MaxSem: dumping centralauth.spoofuser from db1079
  • 22:27 ejegg: deployed patch for T171987 to 1.31.0-wmf.23
  • 22:23 ejegg: deployed patch for T171987 to 1.31.0-wmf.24
  • 21:51 herron: puppetdb server reboots complete — re-enabling puppet agents
  • 21:45 herron: temporarily disabling puppet agents while puppetdb servers nitrogen and nihal are rebooted for kernel updates
  • 21:24 thcipriani@tin: Synchronized wmf-config: Improve load-order documentation for CommonSettings and InitialiseSettings noop doc change (duration: 01m 18s)
  • 21:05 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: Switch wikitech to swift (duration: 01m 15s)
  • 20:58 andrew@tin: Synchronized wmf-config/filebackend.php: Preparing wikitech to use swift for images, step two (duration: 01m 12s)
  • 20:56 andrew@tin: Synchronized wmf-config/CommonSettings.php: Preparing wikitech to use swift for images, step one (duration: 01m 16s)
  • 20:45 andrew@tin: Synchronized multiversion/MWMultiVersion.php: (no justification provided) (duration: 01m 16s)
  • 20:27 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to php-1.31.0-wmf.24
  • 19:43 Amir1: ladsgroup@terbium:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T183019)
  • 19:35 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https on fawiki and hewiki (T183019)
  • 19:18 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=mediawikiwiki --force-protocol https (T183019)
  • 18:56 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: retry (duration: 01m 15s)
  • 18:42 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 16s)
  • 18:40 tgr@tin: Synchronized static/images/project-logos: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 17s)
  • 18:30 tgr@tin: Synchronized debug.json: T187468 Switch to mwdebug hosts in codfw too (duration: 01m 15s)
  • 18:26 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T57420 Enable loginOnly mode for local auth provider on group 1 (duration: 01m 20s)
  • 17:41 moritzm: rebooting restbase-test* for kernel security update
  • 16:55 ema: cp5001: reboot for retpoline kernel updates T188092
  • 16:46 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052 (duration: 00m 33s)
  • 16:46 ppchelko@tin: Started deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052
  • 16:08 elukey: updating pcc facts for new hosts
  • 15:54 moritzm: rebooting rdb* fallback hosts in eqiad for kernel security update
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 16s)
  • 15:26 marostegui: Set disk 32:2 on db1064 as offline
  • 15:20 moritzm: rebooting krypton (running grafana among others) for kernel security update
  • 15:17 reedy@tin: Synchronized wmf-config/throttle.php: T189121 (duration: 01m 15s)
  • 14:45 Amir1: EU SWAT is done
  • 14:42 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052 (duration: 00m 36s)
  • 14:41 ppchelko@tin: Started deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052
  • 14:37 moritzm: rebooting rdb* hosts in codfw for kernel security update
  • 14:37 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 16s)
  • 14:35 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 15s)
  • 14:27 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:19 _joe_: adding mwdebug200{1,2} to ganeti in codfw, T187468
  • 14:17 urandom: reducing compression chunk length to 32kb on "wikipedia_T_page__summary".data - T189057
  • 14:10 zfilipin@tin: Synchronized wmf-config/: SWAT: Load Wikibase Quality extensions using extension registration (T106104) (duration: 01m 17s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T188626) (duration: 01m 18s)
  • 14:01 urandom: setting trace probability to 0.0, restbase eqiad cassandra cluster - T189057
  • 13:22 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all refreshLinks jobs to EventBus, file #2 - T185052 (duration: 01m 15s)
  • 13:22 moritzm: rebooting tungsten for kernel security update
  • 13:21 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all refreshLinks jobs to EventBus - T185052 (duration: 01m 15s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052 (duration: 00m 43s)
  • 13:20 moritzm: rebooting install2002 for kernel security update
  • 13:19 ppchelko@tin: Started deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052
  • 10:55 marostegui: Deploy schema change on codfw s4 master (db2051) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 10:54 moritzm: rearmed keyholders on netmon1002 and netmon2001
  • 10:50 elukey: reboot stat100[56] for kernel upgrades
  • 10:49 moritzm: reboot memcached hosts in codfw for kernel security update
  • 10:34 moritzm: rebooting netmon2001 for kernel security update
  • 10:29 moritzm: rebooting netmon1002 for kernel security update
  • 10:26 moritzm: rebooting boron for kernel security update
  • 10:11 moritzm: rebooting openldap/WMCS servers for kernel security update
  • 10:05 moritzm: rebooting openldap/corp servers for kernel security update
  • 10:03 elukey: reboot analytics10[35,52] for kernel updates - hadoop hdfs journal nodes (didn't manage to complete the work yesterday)
  • 10:03 moritzm: rebooting pool counters in codfw for kernel security update
  • 10:02 akosiaris: upload apertium-rus-ukr_0.2.0~r82706-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:56 moritzm: rebooting tureis/roentgenium for kernel security update
  • 09:53 akosiaris: upload apertium-rus_0.2.0~r82706-1+wmf1 and apertium-ukr_0.1.0~r82563-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:46 moritzm: rebooting etherpad1001 (etherpad.wikimedia.org) for kernel security update
  • 09:31 moritzm: rebooting darmstadtium (docker registry) for kernel security update
  • 09:24 moritzm: rearming keyholder on sarin after reboot
  • 09:16 moritzm: rebooting sarin for kernel security update
  • 08:57 ema: cp3033: restart varnish-be, backend connections piling up (~12k)
  • 08:40 marostegui: Deploy schema change on s7 primary master db1062 - T153182 T185128
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 after alter table (duration: 01m 16s)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2089,db2079 and db2065 after mariadb and kernel upgrade (duration: 01m 16s)
  • 07:30 marostegui: Stop mariadb on db2089,db2079 and db2065 for kernel upgrade
  • 07:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2089,db2079 and db2065 (duration: 01m 15s)
  • 06:49 marostegui: Deploy schema change on db1079 with replication enabled (this will generate lag on labs) - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 for alter table (duration: 01m 16s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 06m 03s)
  • 00:57 Amir1: Evening SWAT is done
  • 00:32 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Re-enable Wikidata descriptions (T188182) (duration: 01m 16s)

2018-03-06

  • 23:10 MaxSem: cancelled
  • 23:05 MaxSem: refreshing spoofuser
  • 23:00 MaxSem: dumping centralauth.spoofuser from db1094
  • 21:22 mutante: restbase-dev1006 powercycled via console (T185494)
  • 20:49 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.24
  • 20:44 ottomata: reverted change to point mediawiki monolog kafka producers at kafka jumbo-eqiad until deployment train is done T188136
  • 20:36 mutante: phab1001 (phabricator) - rebooting for maintenance
  • 20:35 ottomata: pointing mediawiki monolog kafka producers at kafka jumbo-eqiad cluster: T188136
  • 20:08 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache (duration: 29m 13s)
  • 19:39 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache
  • 18:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af (duration: 05m 28s)
  • 18:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af
  • 18:22 godog: puppet-merge Revert: Use hiera3 role/nuyaml backends on >= stretch
  • 17:58 marostegui: Reload haproxy on dbproxy1004 and dbproxy1009
  • 17:53 thcipriani: starting branch cut for 1.31.0-wmf.24
  • 17:53 andrewbogott: disabling puppet and apache on labpuppetmaster1001 and 1002
  • 17:47 moritzm: rebooting dbmonitor1001 for kernel security update
  • 17:42 moritzm: rebooting dbmonitor2001 for kernel security update
  • 17:38 moritzm: rebooting hassaleh for kernel security update
  • 17:34 vgutierrez: update pybal to 1.15.1 on lvs5003
  • 17:32 vgutierrez: update pybal to 1.15.1 on lvs1010
  • 17:28 vgutierrez: uploaded pybal_1.15.1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 after alter table (duration: 00m 58s)
  • 16:58 cmjohnson1: powering off rhenium to reset the idrac
  • 16:44 sbisson@tin: Finished deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch (duration: 05m 47s)
  • 16:38 sbisson@tin: Started deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch
  • 16:11 oblivian@tin: Synchronized wmf-config: Fetch data from etcd on all appservers (duration: 01m 01s)
  • 16:01 marostegui: Deploy schema change on db1069 - T187089 T185128 T153182
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 for alter table (duration: 00m 57s)
  • 15:54 jynus: deploying new query killer logic to all wikidata (s8) db replicas T188505
  • 15:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after alter table (duration: 00m 57s)
  • 15:51 moritzm: installing libvpx security updates
  • 15:50 oblivian@tin: Synchronized wmf-config: Expose etcd last modified index (duration: 01m 00s)
  • 15:45 moritzm: rebooting ununpentium for kernel security update
  • 15:39 oblivian@tin: Finished scap: Deploying Expose the latest modified index seen by EtcdConfig (duration: 09m 49s)
  • 15:29 oblivian@tin: Started scap: Deploying Expose the latest modified index seen by EtcdConfig
  • 15:28 moritzm: rebooting bromine for kernel security update
  • 15:19 mobrovac@tin: Synchronized php-1.31.0-wmf.23/includes/jobqueue/JobQueueSecondTestQueue.php: [JobQueueSecondTestQueue] Support read-only mode - T185052 (duration: 00m 58s)
  • 15:09 vgutierrez: update to pybal 1.15.0 on lvs5003
  • 15:02 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Article counts: Change 'comma' method to 'any' - T188472 (duration: 01m 00s)
  • 14:50 vgutierrez: update pybal to 1.15.0 on lvs1010
  • 14:46 hashar: tin: /srv/mediawiki-staging/php-1.31.0-wmf.23 rebased on tip of https://gerrit.wikimedia.org/r/#/c/416686/ (which reverts a merge of the master branch)
  • 14:42 gehel: rebooting maps1* (eqiad) for kernel security update completed
  • 14:36 ottomata: beginning migration of webrequest text varnishkafka logs from Kafka analytics to Kafka jumbo-eqiad T185136
  • 14:21 moritzm: rebooting labweb* for kernel security update
  • 14:13 moritzm: rebooting sca* for kernel security update
  • 14:07 gehel: rebooting maps1* (eqiad) for kernel security update
  • 14:07 moritzm: rebooting pybal-test for kernel security update
  • 14:00 _joe_: SWAT is suspended for investigation on tin's git status
  • 14:00 moritzm: rebooting oxygen for kernel security update
  • 13:16 moritzm: powercycling ms-be1038, stuck after reboot
  • 13:10 marostegui: Deploy schema change on db1094 - T187089 T185128 T153182
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 for alter table (duration: 00m 58s)
  • 12:55 moritzm: rebooting URL downloaders for kernel security update
  • 12:51 mobrovac@tin: Finished deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052 (duration: 00m 34s)
  • 12:50 mobrovac@tin: Started deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 after alter table (duration: 00m 58s)
  • 12:33 moritzm: rebooting mwlog* for kernel security update
  • 12:04 moritzm: rebooting graphite hosts in eqiad for kernel security update
  • 11:29 moritzm: rebooting k8s masters for kernel security update
  • 11:05 elukey: reboot analytics10[28,35,52] for kernel updates (one at a time, hadoop hdfs journal nodes)
  • 10:46 moritzm: powercycling ms-be1021, stuck after reboot
  • 10:45 akosiaris@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 01m 22s)
  • 10:43 moritzm: rearming keyholder on naos after reboot
  • 10:39 akosiaris: emergency add a captcha in metawiki contact pages like https://meta.wikimedia.org/wiki/Special:Contact/Stewards to stop bot abuse. phab Task to be filed later on
  • 10:39 godog: reboot ms-be1013 to try fix disk ordering
  • 10:35 moritzm: rebooting naos for kernel security update
  • 10:32 moritzm: rearming keyholder on tin after reboot
  • 10:30 gehel: kafka poller active on all production wdqs nodes - T188252
  • 10:28 moritzm: rebooting tin for kernel security update
  • 10:20 gehel: reboot completed for maps2* and maps-test*
  • 09:51 moritzm: rebooting graphite hosts in codfw for kernel security update
  • 09:42 marostegui: Stop MySQL on db1107 for mariadb and kernel upgrade
  • 09:41 vgutierrez: uploaded pybal_1.15.0_all.deb to apt.wikimedia.org jessie-wikimedia
  • 09:40 marostegui: Start proxysql on wasat
  • 09:38 moritzm: rebooting wezen for kernel security update
  • 09:27 elukey: reboot kafka2001 (eventbus codfw) for kernel updates
  • 09:24 marostegui: Deploy schema change on db1086 - T187089 T185128 T153182
  • 09:18 marostegui: Stop and reboot db1086 for kernel and mariadb upgrade
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 for alter table (duration: 00m 57s)
  • 09:17 moritzm: rebooting swift backend servers in eqiad for kernel security update
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 after alter table (duration: 00m 57s)
  • 09:05 gehel: rolling restart of maps* for kernel upgrade
  • 08:50 elukey: reboot meitnerium (archiva) for kernel updates
  • 08:38 paravoid: rebooting furud
  • 08:35 moritzm: rebooting wasat for kernel security update
  • 08:30 elukey: drain+reboot analytics[1065-1067] for kernel updates
  • 08:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Update db1069 IP (duration: 00m 57s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1069 IP (duration: 00m 57s)
  • 08:15 moritzm: rebooting ruthenium for kernel security update
  • 08:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Revert depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 57s)
  • 08:10 moritzm: rebooting bast5001 for kernel security update
  • 08:01 elukey: drain+reboot analytics[61,63,64] for kernel updates
  • 07:59 moritzm: rebooting tegmen for kernel security update
  • 07:43 marostegui: Stop mysql on db2090 db2080 db2076 db2073 db2067 for mariadb and kernel upgrade
  • 07:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 58s)
  • 07:36 moritzm: rebooting remaining swift backend servers in codfw for kernel security update
  • 07:18 marostegui: Stop MySQL on db2093 to get some data from the event scheduler
  • 06:56 marostegui: Deploy schema change on db1101:3317 - T187089 T185128 T153182
  • 06:51 marostegui: Stop mysql on db2037 to upgrade it
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 for alter table (duration: 00m 58s)
  • 05:00 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend.php: T180183: I6d72873b9d3 (duration: 00m 56s)
  • 04:59 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 - Ie5a164a9e2b (duration: 00m 57s)
  • 04:58 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta: no-op (duration: 00m 54s)
  • 04:57 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend-labs.php: beta: no-op (duration: 00m 57s)
  • 04:29 bblack: eqsin router maintenance starting soon-ish. all of eqsin will be offline and isn't in production service to begin with. We've tried to downtime all the things, but don't be shocked at spurious alerts! - T187807
  • 04:08 krinkle@tin: Synchronized multiversion/MWMultiVersion.php: Ia2acf57c6 (duration: 00m 57s)
  • 04:01 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 (duration: 01m 33s)
  • 02:26 tgr@tin: Synchronized wmf-config/CommonSettings.php: T186296 Increase ReadingLists list size limit to 5k (duration: 01m 06s)
  • 02:07 tgr@tin: Finished scap: T187226#4025352 update ReadingLists (duration: 18m 49s)
  • 01:48 tgr@tin: Started scap: T187226#4025352 update ReadingLists
  • 01:00 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: refresh wmf-config/InitialiseSettings, seems to have stuck in old state on some servers after doing the initial sync in the wrong order (duration: 00m 57s)
  • 00:54 tgr@tin: Synchronized wmf-config: T57420 Enable loginOnly mode for local auth provider on group 0 (duration: 01m 00s)
  • 00:41 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op I33f09b164e7 (duration: 00m 58s)
  • 00:38 krinkle@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only: I02a4d4 (duration: 00m 57s)

2018-03-05

  • 22:44 bawolff@tin: Synchronized php-1.31.0-wmf.23/includes/logging/LogPager.php: T188145 (duration: 00m 58s)
  • 21:32 arlolra: Updated Parsoid to d115592 (T188591)
  • 21:25 arlolra@tin: Finished deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592 (duration: 12m 12s)
  • 21:13 arlolra@tin: Started deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592
  • 20:04 gehel@tin: Finished deploy [wdqs/wdqs@1983ddf]: wdqs GUI update (duration: 01m 36s)
  • 20:03 gehel@tin: Started deploy [wdqs/wdqs@1983ddf]: wdqs GUI update
  • 20:02 hashar@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase: Fix empty condition list in metadata lookup - T188313 (duration: 01m 58s)
  • 19:51 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/416219/ (duration: 00m 57s)
  • 19:43 maxsem@tin: Synchronized php-1.31.0-wmf.23/extensions/Cite: https://gerrit.wikimedia.org/r/#/c/416467/ (duration: 00m 58s)
  • 19:30 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update (duration: 02m 36s)
  • 19:28 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update
  • 19:23 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416456/ (duration: 00m 58s)
  • 19:21 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 01m 23s)
  • 19:20 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 19:14 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416457/ (duration: 00m 58s)
  • 18:54 jynus: stop slave on db2044
  • 18:24 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken (duration: 00m 54s)
  • 18:23 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken
  • 18:20 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 03m 08s)
  • 18:16 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 17:34 elukey: drain + reboot analytics10[58-60] for kernel updates
  • 17:32 bd808: Added zhuyifei1999_ and chicocvenancio to the "toollabs-trusted" gerrit group
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186699 (duration: 00m 57s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 after alter table (duration: 00m 57s)
  • 16:00 elukey: test
  • 15:56 akosiaris: upload tiller on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:56 akosiaris: upload helm on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:55 urandom: setting trace probability to 0.001 (.1%), eqiad datacenter, restbase cassandra cluster
  • 15:52 urandom: updating `system_traces` keyspace replication strategy, restbase cassandra cluster
  • 15:51 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all of the cdnPurge to EventBus, file 2/2 - T188540 (duration: 00m 57s)
  • 15:50 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 15:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all of the cdnPurge to EventBus, file 1/2 - T188540 (duration: 00m 57s)
  • 15:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka (duration: 00m 35s)
  • 15:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka
  • 15:42 marostegui: stop and poweroff db1069 for rack change - T186699
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186699 (duration: 00m 57s)
  • 15:41 elukey: drain + reboot analytics 1055->57 for kernel updates
  • 15:38 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch 50% for refreshLinks to EventBus - T185052 (duration: 00m 57s)
  • 15:31 ppchelko@tin: Finished deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs (duration: 00m 39s)
  • 15:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs
  • 15:28 marostegui: Mark as failed disk 32:9 on db1068 (s4 primary master) - T188187
  • 15:20 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobExecutor.php: [JobExecutor] Wait for the replicas if the transaction takes too long (duration: 00m 57s)
  • 15:14 moritzm: rebooting webperf2001 for kernel security update
  • 14:57 hashar: European SWAT completed
  • 14:57 hashar@tin: Finished scap: 2017 wikitext editor: Simplify config part 2 (duration: 02m 57s)
  • 14:54 hashar@tin: Started scap: 2017 wikitext editor: Simplify config part 2
  • 14:52 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:44 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable rollbacker user right at arwikiversity - T188633 (duration: 00m 57s)
  • 14:41 hashar@tin: Finished scap: core + Flow, master/replicate race condition - T182358 T184670 (duration: 04m 24s)
  • 14:36 hashar@tin: Started scap: core + Flow, master/replicate race condition - T182358 T184670
  • 14:34 elukey: graphite metrics mw.error.* deprecated in T188749
  • 14:31 hashar@tin: Finished scap: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 23m 08s)
  • 14:11 hashar: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=bdwikimedia translate # T188853
  • 14:08 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 14:06 hashar@tin: scap aborted: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 00m 16s)
  • 14:06 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 13:55 moritzm: rolling reboot of swift backends in codfw for kernel security update
  • 13:49 moritzm: rebooting releases2001 for kernel security update
  • 13:37 moritzm: rebooting neon for kernel security update
  • 13:37 mobrovac@tin: Started restart [cpjobqueue/deploy@b5255f0]: Force RecordLintJob rebalance in Kafka - T188870
  • 13:04 moritzm: rebooting bast4002 for kernel security update
  • 13:00 marostegui: Deploy schema change on db1098:3317 - T187089 T185128 T153182
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for alter table (duration: 00m 57s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:40 moritzm: rebooting bast4001 for kernel security update
  • 12:30 marostegui: Remove db1011 from tendril as it will be decommissioned - T184703
  • 12:19 moritzm: installing libvpx security updates
  • 12:13 moritzm: installing wavpack security updates
  • 12:08 moritzm: installing freexl security updates
  • 11:59 moritzm: upgrading tor on radium
  • 11:40 moritzm: updating tor packages to 0.3.2.10
  • 11:19 moritzm: running "racadm racreset" on rhenium, mgmt inaccessible
  • 11:09 elukey: drain + reboot analytics10[50,51,53,54] for kernel updates
  • 10:53 moritzm: rebooting bast2001 for kernel security update
  • 10:46 moritzm: rebooting lithium for kernel security update
  • 10:24 elukey: drain + reboot analytics10[46-49] for kernel updates
  • 10:23 moritzm: rolling reboot of logstash* for kernel security update
  • 09:33 godog: roll restart swift in codfw to add thumbor private user
  • 09:15 marostegui: Deploy schema change on s7 codfw master (db2040), this will generate lag on codfw - T187089 T185128 T153182
  • 09:01 godog: roll-restart thumbor to apply https://gerrit.wikimedia.org/r/416240
  • 08:54 marostegui: Stop mariadb on db2037 to copy it to db1073
  • 08:25 marostegui: Stop MySQL on db2078 for mariadb and kernel upgrade
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1073 from config (duration: 00m 58s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1073 from config (duration: 00m 59s)
  • 07:06 marostegui: Deploy schema change on s2 primary master db1054 - T185128 T153182
  • 02:08 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2018-03-04

  • 20:16 tgr: T188721 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 18:05 musikanimal: T188721 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 15:59 elukey: powercycle stat1004 - available via mgmt, root login freezes while trying

2018-03-03

  • 14:16 akosiaris: 13:56:20 ema: powercycle ganeti1005 T181121
  • 13:56 ema: powercycle ganeti1005
  • 13:25 andrewbogott: forced quota update in admin-monitoring as well; the reserved fixed_ip value was incorrect
  • 13:23 andrewbogott: forcing quota update in nova with update quota_usages set reserved='-1' where project_id='contintcloud';
  • 13:10 andrewbogott: restarting rabbitmq-server on labcontrol1001
  • 13:08 andrewbogott: restarting nodepool
  • 13:05 andrewbogott: restarting nova-conductor
  • 13:02 andrewbogott: stopping nodepool for a bit while investigating openstack issues
  • 02:14 chasemp: labnodepool1001:~# service nodepool start
  • 01:30 chasemp: root@labnet1001:~# service nova-fullstack restart
  • 01:21 chasemp: labnodepool1001:~# service nodepool stop

2018-03-02

  • 19:44 jynus: restarting labsdb1010
  • 17:22 mepps: updated payments-wiki 498f49a758 to ce68e8e80b
  • 15:19 elukey: drain + reboot analytics10[41-45] for kernel updates
  • 15:15 moritzm: rebooting auth* for kernel security updates
  • 13:46 elukey: drain + reboot analytics10[38,39,40,41] for kernel updates
  • 13:22 elukey: drain + reboot analytics10[33,34,36,37] for kernel updates
  • 13:17 moritzm: upgrading labtest trusty hosts to latest 4.4 kernel
  • 12:23 moritzm: rebooting kubetcd/kubestagetcd for kernel security update
  • 12:00 moritzm: rebooting etcd* for kernel security updates
  • 11:58 elukey: drain + reboot analytics10[29,31,32] for kernel updates
  • 11:33 moritzm: draining restbase1018 for eventual reboot for kernel security update
  • 11:28 akosiaris: upload to apt.wikimedia.org component thirdparty/ci distro jessie-wikimedia docker-ce_17.12.1~ce-0~debian_amd64 T177499
  • 11:07 moritzm: rebooting mwdebug* for kernel security update
  • 10:54 ema: spare LVSs lvs[1011-1012], lvs[4001-4004]: reboot for retpoline kernel updates T188092
  • 10:53 moritzm: draining restbase1017 for eventual reboot for kernel security update
  • 10:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 (duration: 00m 57s)
  • 10:18 moritzm: draining restbase1016 for eventual reboot for kernel security update
  • 10:18 jynus: shutting down labsdb1010
  • 10:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 56s)
  • 10:01 elukey: deleted /etc/burrow/* from zookeeper main eqiad/codfw after https://gerrit.wikimedia.org/r/415818 (garbage to cleanup)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 57s)
  • 09:40 moritzm: draining restbase1015 for eventual reboot for kernel security update
  • 09:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1114 in s1 after cloning it from db1073 - T183469 (duration: 01m 01s)
  • 08:57 moritzm: rebooting scb1004 for kernel security update (was omitted from earlier reboots due to hardware issues on scb1003)
  • 08:51 moritzm: repooling scb1003 after memory module was replaced (T188385)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 57s)
  • 07:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:11 moritzm: rebooting xenon/praseodymium/cerium for kernel security update
  • 07:11 moritzm: rebooting xenon/praseodymium/xenon for kernel security update
  • 06:52 marostegui: Stop MySQL on db1073 to clone db1114 - T183469
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 to clone db1114 - T183469 (duration: 00m 58s)
  • 02:48 legoktm: manually purged ExtensionDistributor cache (T188692)
  • 01:54 mutante: cobalt (gerrit) - rebooting for kernel upgrade
  • 01:46 mutante: LDAP: added lucaswerkmeister-wmde to 'wmde' and 'nda' groups (T188105)
  • 00:49 ebernhardson@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: T148603: (duration: 00m 57s)
  • 00:48 herron: fermium (lists) and mx systems rebooted for kernel update
  • 00:46 ebernhardson@tin: Synchronized php-1.31.0-wmf.23/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT T187148: Start cirrus query explorer AB test (duration: 00m 57s)
  • 00:25 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148 Configure Cirrus AB test (step 2) (second try) (duration: 00m 57s)
  • 00:23 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: T187148 Configure Cirrus AB test (step 1) (second try) (duration: 00m 57s)
  • 00:12 ebernhardson@tin: Synchronized wmf-config/: REVERT SWAT: T187148 Configure Cirrus AB test (duration: 00m 59s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/: SWAT: T187148 Configure Cirrus AB test (duration: 01m 00s)

2018-03-01

  • 22:35 gehel: rolling restart of elasticsearch / cirrus - eqiad complete, cluster is green
  • 21:45 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.23
  • 21:33 bsitzmann@tin: Finished deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833) (duration: 05m 15s)
  • 21:28 bsitzmann@tin: Started deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833)
  • 21:17 thcipriani@tin: Synchronized php-1.31.0-wmf.23/extensions/GeoData/includes/api/ApiQueryGeoSearchElastic.php: Fix undefined property error in ApiQueryGeoSearchElastic T188659 (duration: 01m 15s)
  • 20:30 thcipriani@tin: Synchronized php: php link to 1.31.0-wmf.23 (duration: 01m 12s)
  • 20:29 andrewbogott: restarting labweb1002
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.23
  • 20:15 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/specials/pagers/NewPagesPager.php: SWAT: NewPagesPager: Use array_merge rather than + for RC query info fields T188555 (duration: 01m 14s)
  • 20:15 andrewbogott: rebooting labweb1001
  • 19:56 thcipriani@tin: Synchronized langlist-labs: SWAT: beta: add nlwiki to langlist T188582 (beta-only change) (duration: 01m 13s)
  • 19:50 gehel: new kafka based poller for wdqs now enabled on wdqs2001 - T188252
  • 19:48 thcipriani@tin: Synchronized wmf-config/throttle-analyze.php: SWAT: Revert "Automatically include commons and wikidata in $wmgThrottlingExceptions" (duration: 01m 14s)
  • 19:36 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollback for editors at zh_classicalwiki T188064 (duration: 01m 14s)
  • 19:31 gehel@tin: Finished deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues (duration: 02m 12s)
  • 19:29 gehel@tin: Started deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues
  • 19:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable responsive references by default on rowiki T187997 (duration: 01m 15s)
  • 19:21 mutante: scb1003 depooled scb1003 from all services on scb because it went down, including mgmt
  • 19:20 dzahn@neodymium: conftool action : set/pooled=no; selector: name=scb1003.eqiad.wmnet
  • 19:17 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Make last throttle limit raise work across all wikis T188630 (duration: 01m 13s)
  • 19:15 mutante: powercycling crashed scb1003
  • 19:13 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Fix throttle date for outreach dashboard T188630 (duration: 01m 13s)
  • 18:47 demon@tin: Synchronized wmf-config/: killing extension-list-labs (duration: 01m 17s)
  • 18:45 demon@tin: Synchronized wmf-config/InitialiseSettings.php: disable performance inspector in prod explicitly (duration: 01m 14s)
  • 18:43 demon@tin: Synchronized docroot/noc/: killing extension-list-labs (duration: 01m 14s)
  • 18:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833) (duration: 06m 01s)
  • 18:07 bsitzmann@tin: Started deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833)
  • 17:51 gehel: depooling wdqs2001 and switching to kafka poller - T188252
  • 17:47 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:46 mutante: re-enabling icinga notifications for wdqs1004 services, ethernet cable has been replaced (T188045)
  • 17:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 01m 14s)
  • 17:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 28s)
  • 17:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 13s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 01m 13s)
  • 16:41 jynus: reimporting database testreduce_0715 from db1009 to db2037
  • 16:36 marostegui: Restart mariadb on db1093 for binlog format change - T186321
  • 16:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T186321 (duration: 01m 13s)
  • 16:14 moritzm: rebooting hafnium for kernel security update
  • 16:06 marostegui: Fix s7 replication on labsdb1010 - T186579
  • 16:00 moritzm: rebooting radium (tor relay) for kernel security update
  • 15:52 moritzm: draining restbase1014 for eventual reboot for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 as API (duration: 01m 13s)
  • 15:32 bblack: disabling puppet on A:cp for deploy of https://gerrit.wikimedia.org/r/#/c/415204/ and friends
  • 15:30 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.23) - T188540 (duration: 01m 14s)
  • 15:26 mobrovac@tin: Synchronized php-1.31.0-wmf.22/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.22) - T188540 (duration: 01m 13s)
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 13s)
  • 15:22 moritzm: draining restbase1013 for eventual reboot for kernel security update
  • 15:19 zeljkof: EU SWAT finished
  • 15:18 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/Popups: SWAT: Fix: dont assume thumbnail URLs contain pixel size (T187955) (duration: 01m 14s)
  • 15:17 moritzm: rolling restart of swift frontends in eqiad for kernel security update
  • 15:12 godog: upload puppetdb 4.4.0-1~wmf1 to component/puppetdb4 - T177253
  • 15:00 ema: eqiad LVSs: reboot for retpoline kernel updates T188092
  • 14:36 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Import sources on maiwikimedia (T188374) (duration: 01m 13s)
  • 14:28 moritzm: rolling restart of swift frontends in codfw for kernel security update
  • 14:26 moritzm: draining restbase1012 for eventual reboot for kernel security update
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Quiz Extension at zhwikibooks (T188213) (duration: 01m 14s)
  • 14:12 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 2/2 - T188540 (duration: 01m 13s)
  • 14:10 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 1/2 - T188540 (duration: 01m 14s)
  • 14:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540 (duration: 00m 44s)
  • 14:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540
  • 13:54 moritzm: draining restbase1011 for eventual reboot for kernel security update
  • 13:50 ema: codfw LVSs: reboot for retpoline kernel updates T188092
  • 13:33 gehel: force merging enwiki_general index on codfw to reclaim space
  • 13:18 moritzm: draining restbase1010 for eventual reboot for kernel security update
  • 13:17 elukey: reboot kafka-jumbo100[5,6] for kernel updates
  • 13:16 ema: esams LVSs: reboot for retpoline kernel updates T188092
  • 12:44 moritzm: draining restbase1009 for eventual reboot for kernel security update
  • 12:39 moritzm: rolling reboot of parsoid in eqiad for kernel security update
  • 12:27 elukey: reboot kafka-jumbo1004 for kernel updates
  • 12:21 elukey: reboot kafka1023 for kernel updates
  • 11:59 moritzm: draining restbase1008 for eventual reboot for kernel security update
  • 11:48 moritzm: powercycling wtp2013, stuck in reboot
  • 11:36 elukey: reboot kafka-jumbo1003 for kernel updates
  • 11:33 jynus: restarting labsdb1011
  • 11:32 elukey: reboot kafka1022 for kernel updates
  • 11:20 elukey: reboot kafka-jumbo1002 for kernel security updates
  • 11:15 moritzm: draining restbase1007 for eventual reboot for kernel security update
  • 11:13 ema: ulsfo LVSs: reboot for retpoline kernel updates T188092
  • 11:08 elukey: reboot kafka1020 for kernel updates
  • 10:38 ema: eqsin LVSs: reboot for retpoline kernel updates T188092
  • 10:32 moritzm: rolling reboot of parsoid in codfw for kernel security update
  • 10:27 moritzm: draining restbase2012 for eventual reboot for kernel security update
  • 10:20 moritzm: rebooting labnodepool1001 for kernel security update
  • 10:02 moritzm: rebooting contint1001 for kernel security update
  • 09:59 elukey: reboot kafka1014 for kernel security updates
  • 09:57 moritzm: draining restbase2011 for eventual reboot for kernel security update
  • 09:43 elukey: reboot kafka1013 for kernel security updates
  • 09:29 elukey: rebooting analytics1030 for kernel updates
  • 09:17 moritzm: draining restbase2010 for eventual reboot for kernel security update
  • 08:52 moritzm: rebooting prometheus servers in eqiad for kernel security update
  • 08:41 moritzm: draining restbase2009 for eventual reboot for kernel security update
  • 08:34 elukey: reboot kafka1012 for kernel updates - T188594
  • 08:20 gehel: banning elastic1021 from cluster (failed memory) - T188595
  • 07:55 elukey: reboot kafka-jumbo1001 for kernel updates - T188594
  • 07:52 elukey: run kafka preferred-replica-election on kafka1012 to force broker 18 to get back among Kafka topic leaders
  • 07:26 gehel: starting rolling reboot of elasticsearch / cirrus - eqiad (kernel upgrade and config changes)
  • 07:24 demon@tin: Synchronized php-1.31.0-wmf.22/maintenance/sql.php: adding --json output mode (duration: 01m 15s)
  • 06:59 chasemp: restart nova-api on labnet1001
  • 06:57 madhuvishy: Restart nova-conductor on labcontrol1001
  • 06:26 marostegui: Deploy schema change on db1074 - T187089 T185128 T153182
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 14s)
  • 06:09 marostegui: Reload haproxy on dbproxy1005
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 23s)
  • 02:05 demon@tin: Synchronized wmf-config/: removing extension-list-wikitech (duration: 01m 13s)
  • 02:03 demon@tin: Synchronized docroot/noc/: cleanup extension-list-wikitech removal (duration: 01m 12s)
  • 01:49 demon@tin: Synchronized wmf-config/: Undeploying EmailAuth from beta, no-op (duration: 01m 16s)
  • 01:32 eileen: update civicrm revision changed from 341c734a79 to a819d64d98, config revision is 62631813fc (add geocoder extension)
  • 00:43 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up $wgEchoPerUserBlacklist setting (duration: 01m 14s)
  • 00:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Remove $wgUsejQueryThree (duration: 01m 14s)
  • 00:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswikibooks (T145394) (duration: 01m 13s)
  • 00:17 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswiki (T130279) (duration: 01m 14s)

2018-02-28

  • 23:27 eileen: civicrm revision changed from a47eafcbad to 341c734a79, config revision is 62631813fc (update civicrm submodule & vendor but not geocoder extension as yet)
  • 22:11 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.22 T188555
  • 22:00 ejegg: updated payments-wiki from 1acfc4a9a0 to 498f49a758
  • 21:57 milimetric@tin: Finished deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment (duration: 04m 19s)
  • 21:56 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.23
  • 21:53 milimetric@tin: Started deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment
  • 21:46 arlolra: Updated Parsoid to 1415a2a (T58756, T169006)
  • 21:26 arlolra@tin: Finished deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a (duration: 08m 46s)
  • 21:17 arlolra@tin: Started deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a
  • 20:53 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 (back) to 1.31.0-wmf.23
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki to 1.31.0-wmf.23
  • 20:20 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/page/WikiPage.php: WikiPage: Avoid $user variable reuse in doDeleteArticleReal() T188479 (duration: 00m 57s)
  • 19:52 demon@tin: Synchronized README: no-op, forcing co-master sync (duration: 00m 57s)
  • 19:29 gehel: rolling reboot of elasticsearch / cirrus - codfw completed
  • 18:56 demon@tin: Finished deploy [gerrit/gerrit@f16f4a4]: GO plugin (duration: 00m 10s)
  • 18:55 demon@tin: Started deploy [gerrit/gerrit@f16f4a4]: GO plugin
  • 18:53 niharika29@tin: Synchronized wmf-config/throttle.php: Clean obsolete rules and add a new one - T188529 (duration: 00m 56s)
  • 18:44 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:42 niharika29@tin: Synchronized wmf-config/Wikibase.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:32 godog: puppet reenable on einsteinium
  • 18:30 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading from full term entity id everywhere T114903 (duration: 00m 57s)
  • 18:23 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wikibase RC injection for ruwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/415078 (duration: 00m 57s)
  • 18:19 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Compact Language Links out of Beta on English Wikipedia T187677 (duration: 00m 58s)
  • 18:17 mutante: gerrit2001 - reboot for kernel upgrade
  • 18:12 godog: force a puppet run on failed hosts in eqiad for recovery
  • 18:09 apergos: rebooting dataset1001 (dumps.wm.o) for new kernel
  • 18:06 godog: stop and restart apache2 on puppetmaster1002
  • 17:58 godog: restart apache2 on puppetmaster1002
  • 17:46 milimetric@tin: Finished deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact (duration: 06m 45s)
  • 17:46 kart_: Finished running CLL preference migration script on terbium (T187677)
  • 17:39 milimetric@tin: Started deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact
  • 17:38 mutante: phab2001 - downtimed, rebooting for kernel upgrade
  • 16:44 moritzm: draining restbase2008 for eventual reboot for kernel security update
  • 16:10 moritzm: rebooting prometheus servers in codfw for kernel security update
  • 16:10 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons (duration: 00m 41s)
  • 16:09 ppchelko@tin: Started deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons
  • 16:02 moritzm: draining restbase2007 for eventual reboot for kernel security update
  • 15:45 godog: repool rhodium as puppet master backend
  • 15:22 moritzm: rebooting ores in eqiad for kernel security update
  • 15:22 ema: upgrade cache_text@eqiad to varnish 5
  • 15:20 moritzm: draining restbase2006 for eventual reboot for kernel security update
  • 15:16 zeljkof: EU SWAT finished
  • 15:15 zfilipin@tin: Synchronized php-1.31.0-wmf.23/extensions/WikibaseQualityConstraints/: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) Bump cache key for check results (T188384) (duration: 01m 02s)
  • 15:11 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Bump cache key for check results (T188384) (duration: 01m 02s)
  • 14:54 moritzm: rebooting ores in codfw for kernel security update
  • 14:53 jynus: stopping labsdb1011 to clone it to labsdb1010 T186579
  • 14:50 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Drop the medlem user group and editallpages user right (T184981) (duration: 00m 57s)
  • 14:48 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) (duration: 01m 02s)
  • 14:47 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: SWAT: Only filter statuses after collecting metadata (T188384) (duration: 01m 03s)
  • 14:38 jynus: dropping sqldata on dbstore1001
  • 14:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable HTML Previews on all wikipedias (T182319) (duration: 00m 57s)
  • 14:28 moritzm: rebooting kubestage* for kernel security update
  • 14:25 gehel@tin: Finished deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator (duration: 04m 27s)
  • 14:22 moritzm: draining restbase2005 for eventual reboot for kernel security update
  • 14:21 gehel@tin: Started deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: beta: enable VirtualPagePreviews events on beta cluster (T184793 T186728) (duration: 00m 57s)
  • 13:13 moritzm: draining restbase2004 for eventual reboot for kernel security update
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2011 - T187886 (duration: 00m 59s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2011 - T187886 (duration: 00m 58s)
  • 12:35 moritzm: draining restbase2003 for eventual reboot for kernel security update
  • 12:00 marostegui: Reboot db1115 tendril master to pick up new my.cnf options - T184704
  • 11:49 moritzm: draining restbase2002 for eventual reboot for kernel security update
  • 11:37 marostegui: Reset slave all on db2093 - T184704
  • 11:35 moritzm: rebooting eqiad job runners for kernel security update
  • 11:18 moritzm: powercycling restbase2001, stuck in reboot
  • 11:10 godog: rollout thumbor 1.15 to codfw/eqiad
  • 10:59 godog: upload python-thumbor-wikimedia 1.15 - T187822 T187350
  • 10:59 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1261.eqiad.wmnet
  • 10:54 moritzm: draining restbase2001 for eventual reboot for kernel security update
  • 10:43 moritzm: rebooting remaining mediawiki app servers in eqiad
  • 09:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2083, db2082 and db2081 after kernel upgrade (duration: 00m 57s)
  • 09:25 ema: upgrade cache_text@codfw to varnish 5
  • 09:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083, db2082 and db2081 for kernel upgrade (duration: 00m 56s)
  • 09:06 marostegui: Reboot db2083, db2082 and db2081 for kernel and mariadb upgrade
  • 08:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 - T162807 (duration: 00m 57s)
  • 08:42 filippo@neodymium: conftool action : set/pooled=yes; selector: name=neodymium.eqiad.wmnet
  • 08:42 filippo@neodymium: conftool action : set/pooled=no; selector: name=neodymium.eqiad.wmnet
  • 08:34 marostegui: Reboot db2069 for kernel upgrade
  • 08:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2069 - T162807 (duration: 00m 57s)
  • 08:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T162807 (duration: 00m 57s)
  • 08:10 moritzm: rebooting remaining mediawiki API servers in eqiad
  • 07:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 - T162807 (duration: 00m 57s)
  • 07:51 marostegui: Reboot db2062 for mariadb and kernel upgrade
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2085 (duration: 00m 57s)
  • 07:15 marostegui: Upgrade kernel and mariadb on db2085
  • 07:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2085 for mariadb and kernel upgrade (duration: 01m 00s)
  • 06:32 marostegui: Deploy schema change on db1060 (with replication) - this will cause lag on labs servers - T187089 T185128 T153182
  • 06:31 kart_: (Re)Starting CLL preference migration script on terbium (T187677)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 05:43 demon@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 04:55 krinkle@tin: Synchronized wmf-config/profiler.php: Iba417de75a and Ied984d (duration: 01m 06s)
  • 03:01 kart_: Starting CLL preference migration script on terbium (T187677)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 21s)
  • 00:55 demon@tin: Synchronized scap/plugins/wmfbetaautoupdate.py: no-op (duration: 01m 14s)
  • 00:24 papaul: OS install on wdqs200[4-6]
  • 00:03 thcipriani@tin: Synchronized php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameUserJob.php: LocalRenameUserJob: escape backreferences in replacement title T188171 (duration: 01m 13s)

2018-02-27

  • 23:38 krinkle@tin: Synchronized dblists/: remove pp_stage1_raw.dblist (duration: 01m 14s)
  • 21:23 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/user/User.php: Add a missing check of $wgActorTableSchemaMigrationStage T188437 (duration: 01m 14s)
  • 20:42 ppchelko@tin: Finished deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull (duration: 02m 29s)
  • 20:39 ppchelko@tin: Started deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull
  • 20:37 ppchelko@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers (duration: 00m 25s)
  • 20:36 ppchelko@tin: Started deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.23
  • 20:08 herron: eqiad puppet master reboots finished -- re-enabling puppet agents
  • 20:02 herron: temporarily disabling puppet agents and rebooting eqiad puppet masters for kernel update
  • 20:02 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache (duration: 32m 10s)
  • 19:30 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache
  • 19:08 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (duration: 04m 16s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241
  • 19:03 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only) (duration: 00m 22s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only)
  • 18:32 otto@tin: Started restart [eventstreams/deploy@7629e16]: service restart to publish page change related streams: T187241 (scb2001 only)
  • 18:32 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only) (duration: 00m 03s)
  • 18:32 otto@tin: Started deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only)
  • 18:02 moritzm: rebooting kubernetes workers in eqiad for kernel security update
  • 17:46 moritzm: rebooting kubernetes workers in codfw for kernel security update
  • 17:41 jynus: restarting ferm on db2049, seems failed one day ago
  • 17:38 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:32 thcipriani: starting branch cut for 1.31.0-wmf.23 T183962
  • 17:14 godog: upload puppetdb 2.3.8-1~wmf1+stretch to stretch-wikimedia - T184562
  • 17:10 urandom: restarting Cassandra, restbase1007-a to test jmx_exporter
  • 16:53 elukey: restart cassandra-a on aqs1004 to test the prometheus jmx agent before complete rollout - T184795
  • 16:52 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH everywhere (duration: 00m 56s)
  • 16:50 ema: lvs1010: retpoline kernel/libs upgrade T188092
  • 16:46 ema: cp1008: retpoline kernel/libs upgrade T188092
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1081 (duration: 02m 04s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 55s)
  • 16:26 moritzm: rebooting mw1293-mw1298 for kernel security update
  • 16:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:10 thcipriani: restarting jenkins for plugin update
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:06 moritzm: rebooting restbase-dev for kernel security update
  • 15:49 awight: Restarting ORES celery workers, changing from 35 -> 45 workers per node.
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1081 - T186321 (duration: 00m 56s)
  • 15:37 marostegui: Stop MySQL and reboot db1081 for kernel upgrade, mariadb upgrade and binlog format change - T186321
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T186321 (duration: 00m 55s)
  • 15:33 moritzm: installing squid security updates
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 57s)
  • 15:20 moritzm: powercycling thumbor1004, stuck during reboot
  • 15:19 ottomata: beginning migration of varnishkafka webrequest upload from Kafka analytics to kafka jumbo
  • 15:11 ema: upgrade cache_text@esams to varnish 5 T184448
  • 15:02 gilles: EU SWAT finished
  • 15:02 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 55s)
  • 15:00 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: (T187822) (duration: 00m 56s)
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix: Add missed line in wgLogo (T185977) (duration: 00m 56s)
  • 14:44 moritzm: rebooting thumbor in eqiad for kernel security update
  • 14:31 bblack: puppet disable on RPS-using hosts to be careful with RPS hosts https://gerrit.wikimedia.org/r/#/c/414676/ - cp*, lvs*, labstore
  • 14:27 chasemp: silence labvirt1019/1020 in icinga
  • 14:24 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation (duration: 00m 04s)
  • 14:23 ariel@tin: Started deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation
  • 14:15 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T188292) New throttle rule for cswiki (T187990) New throttle rule (T188034) (duration: 00m 57s)
  • 14:05 marostegui: Update tendril shard table for the "tendril" replication topology - T184704
  • 13:33 gehel: starting rolling restart of elasticsearch / cirrus codfw (config changes + kernel upgrade)
  • 13:25 moritzm: rebooting thumbor in codfw for kernel security update
  • 13:22 godog: upload ruby-mysql 2.9.1-1~bpo9+1 to stretch-wikimedia - T184562
  • 13:00 Amir1: inserting wikidata-related interwikis to site_identifiers table using eval.php in enwiki (T183019)
  • 12:35 marostegui: Remove /srv/tmp/dbstore1001 files from es1017 to free up space - T186596
  • 12:16 Hauskatze: The global rename: Darkweasel94 → Tokfo has FINISHED - T187629
  • 11:56 moritzm: rebooting mw1221-mw1235 (API servers) for kernel security update
  • 11:08 moritzm: rebooting mw1240-mw1258 (app servers) for kernel security update
  • 11:00 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=scb1003.eqiad.wmnet
  • 10:57 moritzm: keeping scb1003 depooled for T188385
  • 10:51 _joe_: updating python-conftool everywhere to 1.0.0
  • 10:51 _joe_: uploaded python-conftool 1.0.0 to stretch-wikimedia
  • 10:49 moritzm: powercycling scb1003, stuck during reboot
  • 10:29 Hauskatze: Starting big global rename: Darkweasel94 → Tokfo - with DBA/OPS green light - T187629
  • 10:07 akosiaris: poweroff sca1004 for T181121 tests
  • 10:05 moritzm: reboot scb in eqiad for kernel security updates
  • 10:03 _joe_: uploading conftool-1.0.0-1 to jessie-wikimedia
  • 09:16 godog: reimage rhodium - T184562
  • 08:42 gehel: powercycling wdqs1004 - T188045
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1084 (duration: 00m 56s)
  • 08:24 gilles@tin: Synchronized private/PrivateSettings.php: Separate Thumbor Swift user for private containers (duration: 00m 56s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 56s)
  • 07:04 marostegui: Stop MySQL on db1084 for kernel and mariadb upgrade
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 56s)
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1084 (duration: 00m 56s)
  • 06:59 demon@tin: Synchronized README: no-op (duration: 00m 56s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Increase traffic for db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly repool db1103:3312 (duration: 00m 56s)
  • 06:33 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:21 marostegui: Stop MySQL on db1115 to copy it to db2093 - tendril (dbtree) service will be down for this maintenance - T184704
  • 06:20 marostegui: Reload haproxy on dbproxy1005
  • 05:26 krinkle@tin: Synchronized wmf-config/profiler.php: I1e7dc263b43 (duration: 00m 56s)
  • 05:00 krinkle@tin: Synchronized wmf-config/profiler.php: I34687c0569af (duration: 00m 57s)
  • 03:28 krinkle@tin: Synchronized wmf-config/profiler.php: various refactor and clean up for T180183 (no-op) (duration: 00m 54s)
  • 03:12 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta only (no-op) (duration: 00m 56s)
  • 02:58 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 [keeping static files] (duration: 01m 24s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 11s)
  • 01:39 mutante: install1002 - re-enabling disabled puppet
  • 00:55 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: add very likely bad faith filter on svwiki (T174560) (duration: 00m 57s)
  • 00:49 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on svwiki (T174560) (duration: 00m 56s)
  • 00:40 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on simplewiki (T182012) (duration: 00m 56s)
  • 00:39 demon@tin: Synchronized wmf-config/CommonSettings.php: beta-only change: lsctorestaticarray (duration: 00m 56s)
  • 00:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on all wikinews wikis (T188000), all private wikis (T188009), test2wiki, loginwiki, votewiki and wikimania2017wiki (T188008) (duration: 00m 56s)

2018-02-26

  • 23:37 bd808@tin: Finished scap: wikitech: use 'labswiki' database on m5-master (T188029) (duration: 03m 21s)
  • 23:34 bd808@tin: Started scap: wikitech: use 'labswiki' database on m5-master (T188029)
  • 23:31 bd808: Pulled T188029 change to silver
  • 22:57 demon@tin: Synchronized wmf-config/: fileimporter/fileexporter improvements (duration: 00m 58s)
  • 22:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: fileimporter/fileexporter improvements (duration: 00m 57s)
  • 22:09 andrewbogott: hotfixed mediawiki on silver to use m5-master for wikitech. This will be finalized with the merge of https://gerrit.wikimedia.org/r/#/c/414733/
  • 22:07 andrewbogott: made mysql on silver read-only, hopefully for good. T188029
  • 22:05 andrewbogott: logging a log to test logging a log
  • 22:03 andrewbogott: testing the log by logging a test
  • 19:46 catrope@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: T184937 (duration: 01m 03s)
  • 19:46 mutante: running puppet on cache::misc servers to add new director for design.wm
  • 19:29 catrope@tin: Synchronized wmf-config/CommonSettings.php: Simplify 2017 wikitext editor config (part 1) (duration: 00m 54s)
  • 19:26 catrope@tin: Synchronized wmf-config/throttle.php: Add throttle rule (T188129) (duration: 00m 56s)
  • 19:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add mushroomobserver.org to wgCopyUploadsDomains (T188203) (duration: 00m 57s)
  • 19:08 herron: codfw puppet master kernel updates complete re-enabling puppet agents
  • 18:31 gehel@tin: Finished deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh (duration: 06m 28s)
  • 18:24 gehel@tin: Started deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh
  • 18:13 demon@tin: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor: Ignore empty repositories (duration: 00m 56s)
  • 17:34 jynus: deploying new query killer to db1109
  • 17:32 akosiaris: shutdown sca1004 on ganeti1005 for T181121
  • 16:39 andrewbogott: making wikitech read-only (via a local patch) while I migrate the database to m5
  • 16:33 marostegui: Reboot db1111 storage crashed - T187526
  • 16:31 papaul: Maintenance: removing Msw-d4-codfw for replacement:T187534
  • 16:29 mutante: restarted stashbot on toolforge because it didn't react to !log
  • 16:26 mutante: test !log
  • 16:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 56s)
  • 15:45 andrewbogott: made wikitech read/write again pending a bit more preliminary work
  • 15:43 cmjohnson1: swapping failed disk db1068
  • 15:42 andrewbogott: marking wikitech read-only (via a local edit to CommonSettings.php) for https://phabricator.wikimedia.org/T188029
  • 15:32 addshore: EU SWAT done
  • 15:31 addshore@tin: Finished scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations (duration: 11m 29s)
  • 15:19 addshore@tin: Started scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations
  • 15:12 Amir1: This might have performance implications; roll it back if it affects these wikis too much
  • 15:12 gehel: reboot of relforge completed, cluster is green again
  • 15:11 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading full entity id from wb_terms table in three wikis (T114903) (duration: 00m 56s)
  • 14:54 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add patrol rights/groups to fawikisource (T187662) (duration: 00m 56s)
  • 14:52 gehel: rebooting relforge for kernel upgrade
  • 14:50 godog: upload puppetdb 4.4.0-1~wmf1 to stretch-wikimedia - T177253
  • 14:48 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable statement usage tracking in several wikis (T151717) (duration: 00m 57s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespaces to urwiktionary (T186393) (duration: 00m 56s)
  • 14:28 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 55s)
  • 14:15 moritzm: rebooting scb in codfw for kernel security updates
  • 14:10 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php: SWAT: Added option to continue script from particular User ID Use a replica dedicated to slow queries (if available) (T187880) (duration: 00m 58s)
  • 13:09 moritzm: rebooting video scalers in eqiad for kernel security update
  • 11:12 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:11 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:01 moritzm: powercycling mw1264 (stuck after reboot)
  • 10:10 moritzm: rebooting mw canaries for kernel security update
  • 09:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 and db2070 (duration: 00m 55s)
  • 09:23 elukey: copied burrow 0.1 from jessie-wikimedia to stretch-wikimedia
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1103:3314 (duration: 00m 56s)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1103:3314 after mariadb and kernel upgrade (duration: 00m 56s)
  • 07:08 marostegui: Deploy schema change on db1103:3312 - T187089 T185128 T153182
  • 06:59 marostegui: Stop MySQL on db1103:3312 and 3314 to upgrade it and kernel
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 (duration: 00m 54s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui: Stop MySQL db2070 and db2055 to copy data to db2055 (and upgrade kernel and mariadb)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2055 and db2070 (duration: 01m 07s)
  • 06:15 marostegui: Stop MySQL on db1115 tendril database to copy it to db2093. Tendril (dbtree) service will be down for maintenance - T184704
  • 02:55 XioNoX: labs->cloud vlan rename in codfw - T187933
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 07m 12s)
  • 02:15 XioNoX: disabling ALGs on MR routers

2018-02-25

  • 07:35 marostegui: Fix s7 replication on labsdb1010 - T186579

2018-02-24

  • 06:11 marostegui: Reload haproxy on dbproxy1005
  • 01:42 demon@tin: Synchronized docroot/noc/conf/highlight.php: one last time (duration: 00m 57s)
  • 01:18 demon@tin: Synchronized docroot/noc/conf/index.php: fix dblist links from listing (duration: 00m 56s)
  • 01:13 Reedy: added eqsin ipv6 range to botpasswords ip range restriction T188111
  • 01:08 demon@tin: Synchronized docroot/noc/: dblists cleanup (duration: 00m 57s)
  • 01:07 demon@tin: Synchronized tests/: no-op (duration: 00m 59s)

2018-02-23

  • 22:36 demon@tin: Finished deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file (duration: 00m 10s)
  • 22:35 demon@tin: Started deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file
  • 21:27 demon@tin: Finished scap: pos mysql code (duration: 23m 09s)
  • 21:04 demon@tin: Started scap: pos mysql code
  • 20:48 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.22
  • 20:39 no_justification: wmf.21, that is
  • 20:38 demon@tin: rebuilt and synchronized wikiversions files: roll wikidatawiki back to wmf.11, busted
  • 20:35 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.22
  • 19:10 ebernhardson: restart relforge elasticsearch cluster to test entity extraction on larger dataset
  • 18:28 Amir1: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=enwiki --force-protocol https (T183019)
  • 17:22 ema: libvmod-netmapper 1.6-1 uploaded to apt.w.o/experimental T188089
  • 16:37 moritzm: rebooting image scalers in codfw for kernel security updates
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1083 (duration: 01m 14s)
  • 15:58 moritzm: rebooting job runners in codfw for kernel security updates
  • 15:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 02m 21s)
  • 15:15 jynus: about to deploy gerrit:413375 disabling puppet on affected hosts
  • 14:59 elukey: update facts on puppet compiler
  • 14:40 moritzm: installing kernel updates on API servers in codfw
  • 14:09 jynus: restarting tendril database - will cause unavailability of dbtree for a while
  • 13:44 moritzm: reboot ocg1003 for tests
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 and fully repool db1076 (duration: 01m 13s)
  • 12:28 hashar@tin: Synchronized wmf-config/throttle.php: Define new throttle rule - T188090 (duration: 01m 11s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 01m 21s)
  • 12:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 - T186321 (duration: 01m 12s)
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1076 - T186321 (duration: 01m 13s)
  • 11:29 marostegui: Restart mariadb on db1076 for binlog format change - T186321
  • 11:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for binlog format change - T186321 (duration: 01m 08s)
  • 11:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 after alter table (duration: 01m 12s)
  • 11:02 moritzm: installing kernel updates on mw* in codfw
  • 10:30 hashar: releases1001: sudo -u jenkins rm -fR /var/lib/jenkins/jobs/mediawiki-private-nightlies/workspace/BRANCH/REL1_??/mediawiki-snapshot-REL1_??-2018???? # T188080
  • 10:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:01 elukey: restart hhvm on mw1230
  • 09:54 elukey: restart hhvm on mw1286
  • 09:50 elukey: restart hhvm on mw1227
  • 08:05 marostegui: MariaDB and kernel upgrade on db1083
  • 07:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083, fully repool db1089 - T162807 (duration: 01m 12s)
  • 06:55 marostegui: Reboot db2093 to test /srv auto-mounting
  • 06:40 marostegui: Deploy schema change on db1090 - T187089 T185128 T153182
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 for alter table (duration: 01m 13s)
  • 05:58 mutante: puppetmaster1001 - signing puppet certs for kafkamon1001/kafkamon2001 - initial puppet runs, adding as role spare (T187901)
  • 05:40 mutante: ganeti1004 - initial startup of kafkamon1001 - booting to PXE, installing stretch (T187901)
  • 04:56 mutante: ganeti: ganeti2004 - creating new VM kafkamon2001 - vcpus=2,memory=8g,disk=60G, row_A codfw (T187901)
  • 04:53 mutante: ganeti: creating new VM kafkamon1001 - vcpus=2,memory=8g,disk=60G, row_A eqiad (T187901)
  • 02:46 demon@tin: Finished deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin (duration: 00m 10s)
  • 02:46 demon@tin: Started deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin
  • 02:10 demon@tin: Synchronized docroot/: mw.org docroot moving (duration: 01m 13s)
  • 01:45 eileen: update process control process-control config revision is 1605238b2e
  • 01:20 eileen: update civicrm revision changed from aa251f1a93 to a47eafcbad, config revision is c1787646bc
  • 01:19 demon@tin: Synchronized static/favicon/: smaller favicons (duration: 01m 12s)
  • 01:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: point mkwikt favicon to en version, dupe (duration: 01m 15s)
  • 01:08 demon@tin: Synchronized wmf-config/InitialiseSettings.php: rtl wikibooks logo (duration: 01m 13s)
  • 01:06 demon@tin: Synchronized static/favicon/wikibooks-rtl.ico: rtl wikibooks logo (duration: 01m 12s)
  • 00:52 demon@tin: Synchronized static/images/project-logos/: new project logos for urdu wikt (duration: 01m 13s)
  • 00:37 krinkle@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: Ia54cd7 - rm use of MW_LANG (duration: 01m 13s)

2018-02-22

  • 22:33 demon@tin: Synchronized php-1.31.0-wmf.22/includes/filerepo/file/LocalFile.php: Id5cdd8ec (duration: 01m 12s)
  • 22:32 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: Id5cdd8ec (duration: 01m 12s)
  • 22:30 demon@tin: Synchronized php-1.31.0-wmf.22/includes/Storage/: Id5cdd8ec (duration: 01m 13s)
  • 22:16 maxsem@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 12s)
  • 22:14 maxsem@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 14s)
  • 21:51 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: I9334d36e (duration: 01m 15s)
  • 21:37 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1004.eqiad.wmnet
  • 21:11 gehel: powercycling wdqs1004 (complete loss of network)
  • 20:39 demon@tin: Synchronized php-1.31.0-wmf.22/includes/libs/objectcache/WANObjectCache.php: betterer logging for cache ttl reduction, Iea029e78 (duration: 01m 13s)
  • 19:33 XioNoX: redirecting Facebook bots large source of traffic to codfw ( https://gerrit.wikimedia.org/r/#/c/413446/ )
  • 19:14 akosiaris: rolling restart of eqiad appservers. sudo cumin -b3 -s 30 'A:mw-eqiad' 'restart-hhvm' T188019
  • 19:12 twentyafterfour@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/413437/
  • 19:03 chasemp: baham:~# authdns-update
  • 19:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2073 (duration: 01m 12s)
  • 17:23 elukey: installed linux-perf-4.9 on phab1001 to experiment with perf tracing
  • 17:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1076 (duration: 01m 12s)
  • 17:05 XioNoX: rolling back "redirecting ns2 traffic to radon"
  • 17:02 ema: reboot eeden with new kernel 4.9.0-0.bpo.6
  • 16:58 XioNoX: redirecting ns2 traffic to radon
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 (duration: 01m 12s)
  • 16:28 ejegg: updated CiviCRM from b27e6a5019 to aa251f1a93
  • 16:26 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Use EventBus for refreshLinks in test wikis, file 2/2 - T185052 (duration: 01m 12s)
  • 16:25 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for refreshLinks in test wikis, file 1/2 - T185052 (duration: 01m 12s)
  • 16:23 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052 (duration: 00m 36s)
  • 16:23 ppchelko@tin: Started deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052
  • 16:22 mobrovac@tin: scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 16:13 jynus: tendril and dbtree database currently under maintenance
  • 16:04 ejegg: updated payments-wiki from fe311c2d26 to 1acfc4a9a0
  • 15:26 ema: finished upgrading cache_text@ulsfo to varnish 5
  • 15:24 elukey: manually removing from cp1008 and cache::misc old files related to the varnishkafka jumbo testing instance (after https://gerrit.wikimedia.org/r/413370)
  • 14:58 matthiasmullie: EU SWAT finished
  • 14:52 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable 3D file display (duration: 01m 12s)
  • 14:50 mlitn@tin: Synchronized php-1.31.0-wmf.21/extensions/3D/extension.json: Remove MMV dependency for 3D (duration: 01m 12s)
  • 14:41 ottomata: beginning migration of webrequest_misc from Kafka analytics to jumbo: T185136
  • 14:40 mlitn@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:38 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable 3D file display (duration: 01m 13s)
  • 14:32 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2171.codfw.wmnet
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Show HTML summaries on cswiki (T182321) (duration: 01m 13s)
  • 13:41 ema: bounce pybal on lvs1003 to try to establish missing etcd connections (zotero, thumbor, wdqs) https://phabricator.wikimedia.org/P6730
  • 13:30 moritzm: rebooting kubernetes1001
  • 13:21 ema: upgrade pybal on lvs1003 to 1.14.4
  • 12:42 _joe_: ended live-hacking on mwdebug1001 (T185078)
  • 12:24 _joe_: live-hacking ProductionServices.php on mwdebug1001 for testing (T185078)
  • 11:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and slowly repool db1076 (duration: 01m 12s)
  • 11:40 kartik@tin: Finished deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1 (duration: 03m 37s)
  • 11:39 akosiaris: purge ORES from scb hosts T168073 T171851
  • 11:37 kartik@tin: Started deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1
  • 11:19 _joe_: upgrading python-conftool on all cache hosts
  • 10:55 ema: upgrading python-conftool on cp5007
  • 10:51 _joe_: upgrading python-conftool on cp1008
  • 10:42 jynus: stop db2073 for maintenance
  • 10:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and fully repool db1104 (duration: 01m 13s)
  • 10:37 _joe_: benchmarking EtcdConfig failure scenarios on mwdebug1001, T185078
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 14s)
  • 10:18 ema: upgrade cache_text @ ulsfo to varnish 5
  • 10:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2073 for maintenance (duration: 01m 12s)
  • 10:08 moritzm: uploaded Linux 4.9.82-1~wmf1 for jessie-wikimedia to apt.wikimedia.org (retpoline-enabled kernel)
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low traffic and depool db1067 - T162807 (duration: 01m 12s)
  • 09:59 akosiaris: reboot kraz.wikimedia.org (irc.wikimedia.org)
  • 09:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 12s)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 - T186321 (duration: 01m 12s)
  • 09:20 marostegui: Stop MySQL on db1104 to switch its binlog to statement - T186321
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T186321 (duration: 01m 13s)
  • 09:19 moritzm: rebooting multatuli
  • 09:03 ema: eqiad LVSs: upgrade pybal to 1.14.4
  • 08:48 jynus: tendril and dbtree database currently under maintenance
  • 08:47 ema: codfw LVSs: upgrade pybal to 1.14.4
  • 08:35 marostegui: Stop tendril database (db1011) to copy it to db1115 - tendril will be offline while the copy is in progress - T184704
  • 08:32 ema: esams LVSs: upgrade pybal to 1.14.4
  • 08:24 ema: ulsfo LVSs: upgrade pybal to 1.14.4
  • 08:05 marostegui: Disable puppet on db1011 - T184704
  • 07:48 krinkle@tin: Synchronized wmf-config/FeaturedFeedsWMF.php: I73945d7d - minor clean-up (duration: 01m 13s)
  • 07:32 _joe_: starting tests on mwdebug1001 again
  • 07:32 marostegui: Deploy schema change on db1076 - T187089 T185128 T153182
  • 07:24 marostegui: Stop MySQL on db1076 for mariadb and kernel upgrade + alter table
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 01m 14s)
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 (duration: 01m 13s)
  • 06:21 marostegui: Stop puppet and mysql on db1011 to get ready to copy its data to db1115 - T184704
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 53s)
  • 01:05 anomie: Running cleanupBlocks.php on more wikis for T187834: alswiki bgwiki bhwiki cawiki dewiki elwiki eswiki frwiki hewiki hiwiki huwiki hywiki jawiki jawikibooks jawikinews jawikiquote jawikisource jawiktionary kawiki kowiki mswiki mswiktionary rowiki sourceswiki
  • 01:01 anomie: Running cleanupBlocks.php on mediawikiwiki for T187834
  • 00:46 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 03m 07s)
  • 00:43 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:41 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 00m 27s)
  • 00:40 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:25 tgr@tin: Synchronized wmf-config/CommonSettings-labs.php: T57420 enable loginOnly flag in beta (duration: 01m 12s)
  • 00:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9 (duration: 06m 05s)
  • 00:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9
  • 00:13 demon@tin: Synchronized php-1.31.0-wmf.22/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 13s)
  • 00:12 demon@tin: Synchronized php-1.31.0-wmf.21/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 21s)
  • 00:00 mutante: LDAP - added uid 'raz-shuty' to group 'wmde' (T187442)

2018-02-21

  • 21:50 elukey: restart hhvm on mw1224 - high load alarms
  • 21:46 elukey: restart hhvm on mw1235 - high load alarms
  • 21:44 elukey: restart hhvm on mw1233 - high load alarms
  • 21:39 awight@tin: Finished deploy [ores/deploy@addba9c]: T187914 on the scb* cluster (duration: 10m 02s)
  • 21:34 elukey: restart hhvm on mw1232 - high load alarms
  • 21:30 ppchelko@tin: Finished deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636 (duration: 15m 59s)
  • 21:30 elukey: restart hhvm on mw1229 - high load alarms
  • 21:29 awight@tin: Started deploy [ores/deploy@addba9c]: T187914 on the scb* cluster
  • 21:28 awight@tin: Finished deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster (duration: 13m 03s)
  • 21:27 elukey: restart hhvm on mw1227 - high load alarms
  • 21:23 elukey: restart hhvm on mw1221 - high load alarms
  • 21:15 awight@tin: Started deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster
  • 21:14 ppchelko@tin: Started deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636
  • 20:53 twentyafterfour: MediaWiki Train for 1.31.0-wmf.22 is blocked by T187942
  • 20:39 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:38 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:34 twentyafterfour: rolling back group1 to wmf.21
  • 20:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.22 (duration: 01m 08s)
  • 20:27 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.22
  • 20:10 mutante: phab2001 - testing phab restart cron
  • 19:34 ebernhardson@tin: Synchronized wmf-config/PoolCounterSettings.php: Increase pool counter workers for cirrus namespace lookup (duration: 01m 13s)
  • 19:24 ottomata: applying changes to kafkatee module, first rhenium then oxygen. will require manual config fixings
  • 18:59 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for Burmese Wiktionary T187882 (duration: 01m 06s)
  • 18:48 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespace localization for sdwiki T186943 (duration: 01m 13s)
  • 18:39 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Added new throttle rule for Wikipedia Women in Red editathon T187803 (duration: 01m 12s)
  • 18:37 chasemp: labsdb rm -fR /usr/local/lib/mediawiki-config && puppet agent --test
  • 18:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Topic namespace alias of zhwiki T187546 (duration: 01m 13s)
  • 18:12 _joe_: stopped testing on mwdebug1001 for SWAT window
  • 17:43 ema: eqsin LVSs: upgrade pybal to 1.14.4
  • 17:34 _joe_: resuming tests on mwdebug1001
  • 17:17 ema: eqiad LVSs: bounce pybal for labweb proxyfetch config changes
  • 17:12 ppchelko@tin: Finished deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437 (duration: 01m 23s)
  • 17:11 ppchelko@tin: Started deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437
  • 17:07 _joe_: finished testing on mwdebug1001 for swat
  • 16:56 oblivian@puppetmaster1001: conftool action : edit; selector: name=ReadOnly,scope=eqiad
  • 16:40 _joe_: testing various etcd failure scenarios on mwdebug1001, T185078
  • 16:39 ppchelko@tin: Finished deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437 (duration: 01m 33s)
  • 16:37 ppchelko@tin: Started deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437
  • 16:27 ema: lvs1010: restart pybal
  • 16:00 godog: restart rsyslogd on lithium and wezen - T136312
  • 15:50 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve private wiki thumbnails with Thumbor (T169144) (duration: 01m 12s)
  • 15:44 no_justification: pruned old 1.29.x and 1.30.x versions that somehow stuck around. Also 1.31.0-wmf.* cache/ directories for unused branches. T157030
  • 15:37 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve officewiki thumbnails with Thumbor (T169144) (duration: 01m 11s)
  • 15:27 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 11s)
  • 15:24 chasemp: reboot labtestservices2002
  • 15:24 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 12s)
  • 15:19 gilles: Thumbor private wiki support deployment
  • 15:08 zeljkof: EU SWAT finished
  • 15:08 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removing Mobile beta feedback link (T187712) (duration: 01m 12s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Page Previews EventLogging instrumentation (T185973) (duration: 01m 13s)
  • 14:52 _joe_: rolling restart another 4 api appservers
  • 14:49 oblivian@tin: Synchronized wmf-config: Serve configuration to mwdebug hosts via etcd (duration: 01m 16s)
  • 14:42 _joe_: restarted hhvm on mwdebug1001 too
  • 14:38 _joe_: restarting hhvm on mwdebug1002
  • 14:06 _joe_: restarting hhvm on misbehaving api appservers
  • 14:02 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T187870) (duration: 01m 13s)
  • 13:28 marostegui: Reboot db2092 for a kernel upgrade
  • 13:26 moritzm: powercycling ganeti1007
  • 12:43 _joe_: rolling restart of hhvm on api servers under high load
  • 12:38 elukey: restart hhvm on mw1234 - high load
  • 12:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify that db1067 is now s1 candidate master - T186321 (duration: 01m 13s)
  • 12:26 elukey: restart hhvm on mw1231 - high load, hhvm-dump-debug in /home/elukey/hhvm.6759.bt
  • 12:21 elukey: restart hhvm on mw1227 - high load, hhvm-dump-debug in /home/elukey/hhvm.23382.bt
  • 12:10 moritzm: uploading retpoline-enabled gcc-4.9 to apt.wikimedia.org / jessie-wikimedia to be able to use it on boron for building Linux (trying to adapt our pbuilder setup to also include security.debian.org ran into a few proxy-related problems and this is really a rare corner case anyway)
  • 12:02 ema: lvs5003: pybal upgraded to 1.14.4
  • 12:01 ema: pybal 1.14.4 uploaded to apt.w.o
  • 11:17 moritzm: installing db5.3 security updates
  • 11:12 jynus: cloning db2011 to db2044
  • 10:40 kart_: Finished running CLL preference migration script dry-run on terbium (T187677)
  • 10:33 marostegui: Reload haproxy on dbproxy1005 - T187722
  • 10:26 marostegui: Remove db2030 from tendril - T187768
  • 10:09 moritzm: installing openssh bugfix updates from jessie/stretch point releases
  • 10:01 kart_: Running CLL preference migration script dry-run on terbium (T187677)
  • 09:46 moritzm: installing dbus updates from stretch point release
  • 09:23 moritzm: installing sqlite security updates on stretch
  • 08:35 godog: roll-restart thumbor in codfw and eqiad to apply https://gerrit.wikimedia.org/r/c/412980
  • 08:20 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 07:20 marostegui: Stop Mariadb on db1108 for kernel upgrade
  • 06:36 marostegui: Deploy schema change on db1105:3312 - T187089 T185128 T153182
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 01m 17s)
  • 05:00 eileen: enable major gifts address job
  • 04:41 eileen: update civicrm revision changed from 43a7641597 to b27e6a5019, config revision is ef884a2c5d
  • 04:13 andrew@tin: Finished deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more (duration: 02m 45s)
  • 04:10 andrew@tin: Started deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more
  • 03:34 andrew@tin: Finished deploy [horizon/deploy@0e28f49]: updating branded graphics (duration: 02m 49s)
  • 03:31 andrew@tin: Started deploy [horizon/deploy@0e28f49]: updating branded graphics
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 06m 18s)
  • 02:15 no_justification: running `initSiteStats.php --update` for all wikis in medium.dblist. T187845
  • 02:01 no_justification: running `initSiteStats.php --update` for all wikis in small.dblist. T187845
  • 01:54 no_justification: WikipediaMobileFirefoxOS submodule references caused labsdb* (and related) puppet failures. They should recover now (self reverted my docroot changes). Filed T187850
  • 01:51 demon@tin: Synchronized docroot/: revert docroot improvements. some servers don't like improvements (duration: 01m 12s)
  • 01:36 demon@tin: Synchronized docroot/: Swapping wikimedia.org docroot for symlink (second try, old WPFirefoxMobileOS cleanup was still needed) (duration: 01m 12s)
  • 01:16 eileen: update civicrm revision changed from efba904b06 to 43a7641597, config revision is ef884a2c5d
  • 01:10 cwd: disabled process-control
  • 01:08 eileen: start outage to upgrade civicrm to 4.7.31
  • 00:56 mutante: gerrit2001 - restarted gerrit to test that gerrit:411397 and gerrit:411394 don't break anything - didn't touch cobalt right now to minimize affecting users and their logins
  • 00:43 thcipriani@tin: Synchronized wmf-config/abusefilter.php: SWAT: Allow CheckUsers and Stewards to access private data from the AbuseLog T160357 (duration: 01m 12s)
  • 00:29 thcipriani@tin: Synchronized php-1.31.0-wmf.21/includes/page/WikiPage.php: SWAT: site_stats: Unbreak counting newly created pages (duration: 01m 12s)
  • 00:26 thcipriani@tin: Synchronized php-1.31.0-wmf.21/resources/src/mediawiki/mediawiki.ForeignStructuredUpload.js: SWAT: Follow-up I0bb4ed7f7: Use correct "this" T187523 (duration: 01m 13s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable x-kill feature everywhere T186714 T184322 (duration: 01m 13s)

2018-02-20

  • 22:58 ejegg: restarted donations queue consumer
  • 22:26 ejegg: turned off donations queue consumer for timing test
  • 22:25 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/Thanks/modules/ext.thanks.revthank.js: T187757 (duration: 01m 14s)
  • 22:20 chasemp: T184209 create labs-instance-transport1-b-codfw
  • 22:06 eileen: update civicrm revision changed from 915a4419c8 to efba904b06, config revision is 8c7ce87207 (extended report update for regex)
  • 21:44 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.22
  • 21:39 no_justification: ran `namespaceDupes.php --wiki=enwikiversity` for T187660
  • 21:18 twentyafterfour@tin: Finished scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961 (duration: 46m 59s)
  • 20:34 ejegg: updated CiviCRM from 31115684f6 to 915a4419c8
  • 20:31 twentyafterfour@tin: Started scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961
  • 20:20 chasemp: labtestmetal2001:~# aptitude install linux-image-4.4.0-109-generic && aptitude install linux-image-extra-4.4.0-109-generic
  • 20:17 chasemp: labtestmetal mkfs -t xfs -i size=512 /dev/mapper/labtestmetal2001--vg-data
  • 20:16 andrew@tin: Finished deploy [horizon/deploy@b02c819]: trying to get a clean deploy (duration: 01m 54s)
  • 20:14 andrew@tin: Started deploy [horizon/deploy@b02c819]: trying to get a clean deploy
  • 20:10 andrew@tin: Finished deploy [horizon/deploy@b02c819]: a couple of bug fixes (duration: 02m 55s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@b02c819]: a couple of bug fixes
  • 20:07 andrew@tin: Started deploy [horizon/deploy@6a40f84]: a couple of bug fixes
  • 19:57 twentyafterfour: Cutting new branch wmf/1.31.0-wmf.22 - Deployment blockers: T183961
  • 19:45 demon@tin: Synchronized docroot/mediawiki/keys/: symlink magic (duration: 00m 56s)
  • 19:26 mobrovac@tin: Started restart [changeprop/deploy@5fdc03a]: (no justification provided)
  • 19:00 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2 (duration: 02m 47s)
  • 18:57 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 (duration: 14m 02s)
  • 18:43 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875
  • 18:34 arlolra@tin: Finished deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113 (duration: 10m 37s)
  • 18:23 arlolra@tin: Started deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113
  • 18:03 ppchelko@tin: Finished deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875 (duration: 16m 01s)
  • 17:52 moritzm: installing cups updates from jessie point release
  • 17:50 gilles: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --wiki=officewiki --backend=local-multiwrite --private
  • 17:47 ppchelko@tin: Started deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875
  • 17:41 andrew@tin: Finished deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts (duration: 00m 55s)
  • 17:40 andrew@tin: Started deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts
  • 17:11 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 1 (duration: 00m 56s)
  • 16:33 godog: roll-restart thumbor in codfw/eqiad to apply https://gerrit.wikimedia.org/r/412935
  • 16:25 moritzm: installing initramfs-tools update from jessie point release
  • 16:17 jynus: drop s3 from dbstore2001
  • 16:14 gilles@tin: Synchronized private/PrivateSettings.php: Add Thumbor secret to Swift configuration (duration: 00m 56s)
  • 15:37 oblivian@puppetmaster1001: conftool action : edit; selector: dc=esams,name=cp3033.esams.wmnet
  • 15:36 bblack: eqsin: restarting all varnish backends for storage changes (not in prod traffic flow, yet!)
  • 15:27 _joe_: upgrading conftool on swift proxies, thumbor
  • 15:25 _joe_: upgrading conftool on parsoid,wdqs
  • 15:23 _joe_: upgrading conftool on aqs, restbase, ores clusters
  • 15:19 _joe_: upgrading conftool on the mediawiki appservers
  • 15:15 _joe_: upgrading conftool on the maps cluster
  • 15:10 _joe_: installing python-conftool on puppetmasters, cumin masters
  • 14:53 godog: roll-restart thumbor after rollback
  • 14:50 volans: running puppet on thumbor1002 (was already logged in)
  • 14:40 zeljkof: EU SWAT finished
  • 14:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the sitename of newiki (T186952) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft namespace to hiwikiversity. (T187535) (duration: 00m 56s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoconfirmed at zhwikt (T187018) (duration: 00m 55s)
  • 14:10 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T187171) (duration: 00m 55s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: throttle: add new rule for Wikidata edit-a-thon (T187655) (duration: 00m 56s)
  • 13:29 marostegui: Upgrade kernel and reboot db1113 and db1114
  • 13:23 marostegui: Stop MySQL and reboot db1111 for kernel and mariadb upgrade
  • 13:17 marostegui: Stop MySQL and reboot db1112 for kernel and mariadb upgrade
  • 13:03 moritzm: installing libav security updates
  • 12:11 _joe_: upgrading conftool to 1.0.0~beta2 on scb*
  • 11:24 jynus: upgrading mariadb-client on neodymium and sarin
  • 11:09 marostegui: Deploy schema change on labtestweb2001 - T153182 T185128 T187089
  • 11:00 marostegui: Deploy schema change on s2 codfw master (db2035) with replication, this will generate lag on codfw - T187089 T185128 T153182
  • 11:00 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2037 and db2044 (duration: 00m 55s)
  • 10:58 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2037 and db2044 (duration: 00m 53s)
  • 10:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2037 and db2044 (duration: 00m 55s)
  • 10:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2030 from config - T187768 (duration: 00m 55s)
  • 10:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2030 from config - T187768 (duration: 00m 56s)
  • 10:13 volans: unified python-requests-mock packages in apt.wikimedia.org jessie-wikimedia to be 1.3.0-3~wmf1, removed binaries for 1.3.0-3
  • 09:49 marostegui: Deploy schema change on s6 primary master db1061 - T185128 T153182
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 55s)
  • 09:16 marostegui: Data checks for db2037 before removing it from s4 - T187722
  • 09:14 elukey: restart zookeeper on druid1001 (follower) to verify that the last changes are no-op
  • 09:12 marostegui: Deploy schema change on db1088 - T187089 T185128 T153182
  • 09:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 55s)
  • 09:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316 and db1085 (duration: 00m 55s)
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:02 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:01 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:56 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:51 oblivian@puppetmaster2001: conftool action : edit; selector: scope=common
  • 08:32 _joe_: uploading conftool 1.0.0~beta1 on stretch
  • 08:26 _joe_: uploading conftool 1.0.0~beta1 to jessie
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 08:09 godog: powercycle ganeti1006
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 01m 10s)
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 07:27 marostegui: Deploy schema change on db1096:3316 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 00m 56s)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1085 (duration: 00m 55s)
  • 06:58 marostegui: Upgrade mariadb and kernel on db1085
  • 06:26 marostegui: Deploy schema change on db1085 (with replication - this will generate lag on labs hosts) - T187089 T185128 T153182
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 00m 56s)
  • 04:56 krinkle@tin: Synchronized docroot/mediawiki/keys/: Ie26638ed0c - rm old 2009 keys file (duration: 00m 56s)
  • 04:27 krinkle@tin: Synchronized w/extract2.php: Ib6d77e863b - clean up MW_LANG indirection (duration: 00m 55s)
  • 03:40 krinkle@tin: Synchronized wmf-config/CommonSettings.php: Ie4c7879f8ac - Clean up TemplateSandboxEditNamespaces config (duration: 00m 57s)
  • 03:37 Krinkle: It seems 'scap pull' on mwdebug1002 is acting weird (prompt doesn't return until 3-5 minutes after last line of "Finished rsync common")
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 50s)

2018-02-19

  • 23:21 eileen: re-enable omnirecipient jobs - process-control config revision is 8c7ce87207
  • 22:03 volans: uploaded cumin_3.0.1-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 20:03 volans: uploaded cumin_3.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 19:29 volans: uploaded python3-requests-mock, python-requests-mock and python-requests-mock-doc for version 1.3.0-3~wmf1 to apt.wikimedia.org jessie-wikimedia
  • 18:53 volans: disabled all notifications on Icinga for db2030
  • 18:04 volans: uploaded clustershell_1.8-1~wmf1_all.deb, python-clustershell_1.8-1~wmf1_all.deb and python3-clustershell_1.8-1~wmf1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:04 elukey@tin: Finished deploy [eventlogging/analytics@8bebdf7]: (no justification provided) (duration: 00m 05s)
  • 17:04 elukey@tin: Started deploy [eventlogging/analytics@8bebdf7]: (no justification provided)
  • 16:29 _joe_: uploading conftool 1.0.0beta1 to reprepro for jessie
  • 16:22 andrew@tin: Finished deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002 (duration: 00m 10s)
  • 16:22 andrew@tin: Started deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002
  • 16:11 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 22s)
  • 16:10 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 16:10 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 17s)
  • 16:09 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 14:59 jynus: testing new dbproxy1010 configuration locally to pool labsdb1010 for analytics
  • 13:44 godog: roll-restart prometheus after retention period bump
  • 13:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 (duration: 00m 55s)
  • 13:19 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 13:16 ema: upgrade cache_text@eqsin to varnish 5
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1098 s6 and s7 (duration: 00m 55s)
  • 12:27 marostegui: Deploy schema change on db1063 - T187089 T185128 T153182
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 for alter table (duration: 00m 55s)
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 (duration: 00m 55s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 56s)
  • 11:07 jdrewniak@tin: Synchronized portals: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:06 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 56s)
  • 10:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 55s)
  • 10:35 marostegui: Deploy schema change on db1093 - T187089 T185128 T153182
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 56s)
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1098 s6 and s7 (duration: 00m 56s)
  • 10:10 marostegui: Upgrade mariadb and kernel on db1098
  • 09:59 marostegui: Enable GTID on dbstore2002:3313 and dbstore2001:3316
  • 09:57 marostegui: Enable GTID on dbstore2002 and dbstore2001 for x1
  • 09:55 jynus: reenable gtid replication on db1053 and db2042
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1260.eqiad.wmnet
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1259.eqiad.wmnet
  • 09:43 marostegui: Upgrade mariadb and kernel on db2033
  • 09:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1090 (duration: 00m 55s)
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 - T162807 (duration: 00m 55s)
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for mariadb and kernel upgrade (duration: 00m 55s)
  • 08:49 marostegui: Deploy schema change on db1098:3316 - T187089 T185128 T153182
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 55s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1090 (duration: 00m 55s)
  • 08:11 godog: repool mw1227 - T149287
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2034 to x1 codfw master - T184888 (duration: 00m 56s)
  • 07:58 moritzm: installing werkzeug security updates on trusty
  • 07:42 marostegui: Change topology on x1 codfw - T184888
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1090 (duration: 00m 55s)
  • 07:01 marostegui: Reboot db1090 for kernel upgrade, mariadb upgrade, socket path location upgrade
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 55s)
  • 06:44 marostegui: Stop MySQL on db1089 to update its socket path
  • 06:42 marostegui: Deploy schema change on s6 codfw master (db2039), this will generate lag on codfw - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1105 - T162807 (duration: 00m 56s)
  • 05:29 andrew@tin: Finished deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes (duration: 03m 14s)
  • 05:26 andrew@tin: Started deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 10m 59s)

2018-02-18

  • 15:49 _joe_: rolling restart (1 at a time, staggered by 2 minutes) of 18 api appservers in eqiad

2018-02-17

  • 17:33 twentyafterfour: restarting apache on phab1001 to clear deadlocked workers. refs T182832
  • 03:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 [keeping static files] (duration: 01m 17s)
  • 03:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 (duration: 04m 32s)

2018-02-16

  • 21:12 hashar: Upgraded Zuul to https://gerrit.wikimedia.org/r/#/c/411322/3 | T187567
  • 20:40 andrew@tin: Finished deploy [horizon/deploy@efcba2b]: sudo dashboard update (duration: 01m 16s)
  • 20:39 andrew@tin: Started deploy [horizon/deploy@efcba2b]: sudo dashboard update
  • 20:11 andrew@tin: Finished deploy [horizon/deploy@1fdd122]: two more small fixes (duration: 01m 21s)
  • 20:10 andrew@tin: Started deploy [horizon/deploy@1fdd122]: two more small fixes
  • 19:54 andrew@tin: Finished deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix (duration: 03m 12s)
  • 19:51 andrew@tin: Started deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix
  • 18:34 hashar: upgraded zuul
  • 16:21 andrew@tin: Finished deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements (duration: 08m 00s)
  • 16:13 andrew@tin: Started deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements
  • 16:06 cmjohnson1: labstore1006 and labstore1007 down for rack relocation
  • 16:03 andrew@tin: Finished deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints (duration: 02m 18s)
  • 16:00 andrew@tin: Started deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints
  • 15:40 andrew@tin: Finished deploy [horizon/deploy@29f9afb]: second attempt at ocata branch (duration: 03m 22s)
  • 15:37 andrew@tin: Started deploy [horizon/deploy@29f9afb]: second attempt at ocata branch
  • 15:29 andrew@tin: Finished deploy [horizon/deploy@58d2718]: first attempt at ocata branch (duration: 01m 28s)
  • 15:28 andrew@tin: Started deploy [horizon/deploy@58d2718]: first attempt at ocata branch
  • 15:27 godog: shut ms-be1018 for bbu swap - T186988
  • 15:16 akosiaris: run T181121#3978654 oneliner once more on sca1004, this time the VM has no DRBD
  • 15:14 akosiaris: poweroff sca1004, switch from DRBD to plain disk template T181121
  • 14:15 akosiaris: doing more IO stress tests on ganeti1005. T181121. Seems like we can reproduce
  • 14:06 chasemp: T184209 initial setup of labs-instances2-b-codfw and hosts
  • 13:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1094 (duration: 00m 56s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 and db1067 - T162807 (duration: 00m 55s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 jynus: reload dbproxy1008 configuration
  • 12:44 jynus: reload dbproxy1003 configuration
  • 12:37 ema: cp3049: restart varnish-fe to clear 'child restarted' alert
  • 12:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1094 (duration: 00m 56s)
  • 12:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 12:17 marostegui: Stop MySQL on db1094 for mariadb upgrade, kernel upgrade and socket location upgrade
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 (duration: 00m 56s)
  • 12:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 00m 56s)
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:35 jynus: stopping mysql on db1043, db2012 for cloning data away
  • 11:33 jynus: changing socket location on phabricator db hosts T148507
  • 11:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:28 ema: cp3036: restart varnish-fe to clear 'child restarted' alert
  • 11:28 hashar: Switching operations/mediawiki-config job for composer to Docker | https://gerrit.wikimedia.org/r/#/c/411206/
  • 11:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 00m 56s)
  • 11:09 elukey: restart nfacctd on rhenium to see if it picks up the new kafka topic config (3 partitions)
  • 11:06 marostegui: Stop MySQL on db1093 for mariadb and kernel upgrade, also update socket path
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 00m 56s)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1053 (duration: 00m 56s)
  • 09:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1053 (duration: 00m 56s)
  • 08:48 akosiaris: doing IO stress tests on ganeti1005. T181121
  • 08:34 akosiaris: manually allocate logstash1008 on ganeti1005 to undo the manual override of sensible allocation rules by ganeti
  • 08:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 (duration: 00m 57s)
  • 08:14 akosiaris: powercycle ganeti1006 T181121
  • 08:13 akosiaris: powercycle ganeti1006
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 59s)
  • 06:41 moritzm: installing quagga security updates
  • 06:35 marostegui: Deploy schema change on s5 primary master db1070 - T185128 T153182
  • 00:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.21/extensions/ProofreadPage/modules/page/ext.proofreadpage.page.edit.js: SWAT: T187454 fix text selection on #wpTextbox1 (duration: 00m 58s)

2018-02-15

  • 23:43 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 56s)
  • 22:54 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/MassMessage/includes/MassMessage.php: fix use statement, T187510 (duration: 00m 57s)
  • 21:50 ejegg: updated CiviCRM from 61acc9175e to 31115684f6
  • 20:22 twentyafterfour: 1.31.0-wmf.21 deployed: no apparent change in fatalmonitor error rate. refs T183960
  • 20:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.21
  • 20:11 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/TwoColConflict/includes/TwoColConflictHooks.php: sync https://gerrit.wikimedia.org/r/#/c/410809/ (duration: 01m 13s)
  • 20:09 twentyafterfour: syncing a patch before deploying 1.31.0-wmf.21 to all wikis.
  • 19:55 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Follow-up 77be427a1: Enable the Beta Feature on all wikis T185708 (duration: 01m 12s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Portal and Portal talk namespace alias of zhwiki T184866 (duration: 01m 13s)
  • 19:13 thcipriani@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Set SPARQL endpoint for category search T184840 (duration: 01m 12s)
  • 18:42 arlolra@tin: Finished deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195 (duration: 08m 34s)
  • 18:33 arlolra@tin: Started deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195
  • 18:11 bsitzmann@tin: Finished deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475) (duration: 05m 54s)
  • 18:06 bsitzmann@tin: Started deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475)
  • 17:24 foks: removed 2FA from User:Lea Lacroix (WMDE)
  • 17:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 (duration: 01m 12s)
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1097:3315, db1089, db1066 (duration: 01m 12s)
  • 16:32 andrew@tin: Finished deploy [horizon/deploy@4e7ccc5]: lots of updates (duration: 03m 13s)
  • 16:29 andrew@tin: Started deploy [horizon/deploy@4e7ccc5]: lots of updates
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3315 (duration: 01m 12s)
  • 15:34 ema: upgrade upload @ eqsin to varnish 5
  • 15:27 marostegui: Deploy schema change on db1051 - T187089 T185128 T153182
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051, fully repool db1097:3314, increase weight for db1097:3315 (duration: 01m 13s)
  • 15:15 zeljkof: EU SWAT finished
  • 15:14 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Log accessing private abusefilter details (T160357) (duration: 01m 12s)
  • 14:58 moritzm: installing erlang security updates on labcontrol1001
  • 14:53 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable the visual diff beta feature (T185708) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.21/includes/Revision.php: SWAT: Log the reason why revision->getContent() returns null (T184670) (duration: 01m 12s)
  • 14:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable log channel T184670 (T184670) (duration: 01m 12s)
  • 14:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2042 (duration: 01m 11s)
  • 14:20 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2042 (duration: 01m 12s)
  • 14:09 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add RevisionStore to wmgMonologChannels: (duration: 01m 13s)
  • 12:01 addshore: script run for T185738 done
  • 11:59 milimetric@tin: Finished deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars (duration: 09m 33s)
  • 11:58 addshore: addshore@terbium:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki elwiktionary --batchsize 1000 # T185738
  • 11:49 milimetric@tin: Started deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars
  • 10:58 marostegui: Stop replication in sync on db1089 and db1066
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 and slowly repool db1097:3315 (duration: 01m 12s)
  • 10:38 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 fully (duration: 01m 12s)
  • 10:28 marostegui: Upgrade mariadb on db1066
  • 10:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 (duration: 01m 12s)
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1097:3314 (duration: 01m 12s)
  • 09:48 marostegui: Deploy schema change on db1097:3315 - T187089 T185128 T153182
  • 09:39 marostegui: Upgrade kernel and mariadb on db1097
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 for s4 and s5 (duration: 01m 12s)
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 (duration: 01m 12s)
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1082 (duration: 01m 12s)
  • 08:54 moritzm: installing erlang security updates on labtestcontrol*
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1082 (duration: 01m 13s)
  • 08:18 marostegui: Upgrade kernel + mariadb on db1082 (sanitarium master in s5)
  • 07:55 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1066 - T162807 (duration: 01m 12s)
  • 07:39 marostegui: Deploy schema change on db1082 (sanitarium master) with replication, this will generate lag on labs - T187089 T185128 T153182
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 13s)
  • 07:35 moritzm: installing libvorbis security updates on stretch
  • 07:30 twentyafterfour: phabricator upgrade finished. phd is back online.
  • 07:27 twentyafterfour: phabricator database migration finished
  • 07:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1110 (duration: 01m 12s)
  • 07:09 jynus: reimage dbproxy1003 to stretch
  • 07:04 twentyafterfour: Applying patch "phabricator:20180215.maniphest.02.populate.php" to host "m3-master.eqiad.wmnet"...
  • 07:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1110 (duration: 01m 13s)
  • 06:57 twentyafterfour: apache restarted, update appears to be successful
  • 06:57 twentyafterfour: phabricator database migrations applied
  • 06:50 twentyafterfour: shutting down apache on phab1001 to deploy update, downtime should be only a couple of minutes
  • 06:49 twentyafterfour: starting phabricator upgrade tagged release/2018-02-15/1
  • 06:45 twentyafterfour: restarted apache on phab1001 and reset cluster.read-only to false
  • 06:44 jynus: set db1059 in read-write
  • 06:38 jynus: merging dns update for phabricator db
  • 06:35 jynus: set db1043 as read only
  • 06:34 twentyafterfour: set cluster.read-only in phabricator
  • 06:33 jynus: about to set phabricator.wikimedia.org as read only
  • 06:28 jynus: scheduling downtime for phabricator on phab1001
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1110 (duration: 01m 13s)
  • 06:06 marostegui: Upgrade mysql on db1110
  • 05:57 jynus: restarting dbproxy1008 for kernel upgrade
  • 05:43 andrew@tin: Finished deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon (duration: 03m 06s)
  • 05:40 andrew@tin: Started deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 07m 25s)
  • 02:01 mutante: phab1001 - restarted apache to fix server status page
  • 01:27 twentyafterfour: restarting apache2 on phab1001 to free deadlocked php processes.
  • 01:03 twentyafterfour: using the current phabricator maintenance window to deploy https://gerrit.wikimedia.org/r/#/c/410626/
  • 01:03 twentyafterfour: the scheduled phabricator upgrade is delayed until 06:00 UTC Thursday because of large database migrations. Doing the upgrade at a time when DBAs are available to assist.
  • 00:52 maxsem@tin: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 14s)
  • 00:49 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 13s)

2018-02-14

  • 23:39 AaronSchulz: Running initSiteStats.php on s3 for T186947
  • 22:04 aaron@tin: Synchronized php-1.31.0-wmf.20/includes/SiteStats.php: f549559dc0 (duration: 01m 13s)
  • 21:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with full weight (duration: 01m 13s)
  • 21:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f (duration: 06m 01s)
  • 21:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f
  • 21:30 arlolra@tin: Finished deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed (duration: 15m 12s)
  • 21:15 arlolra@tin: Started deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed
  • 21:00 ema: upgrade cp1099 to varnish 5 (last upload@eqiad host)
  • 20:54 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice: Sync CentralNotice again after proper rebase (duration: 01m 14s)
  • 20:43 ema: upgrade cp1074 to varnish 5
  • 20:42 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice/: sync https://gerrit.wikimedia.org/r/#/c/410346/ for Ejegg (duration: 01m 15s)
  • 20:40 twentyafterfour: Group1 wikis are now running MediaWiki 1.31.0-wmf.21 - still no blockers on T183960
  • 20:38 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:37 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:33 ema: upgrade cp1073 to varnish 5
  • 20:05 ema: upgrade cp1072 to varnish 5
  • 19:44 ema: upgrade cp1071 to varnish 5
  • 19:25 XioNoX: enabling netflow on cr1-eqiad
  • 19:24 no_justification: ran namespaceDupes.php --fix for hiwiki
  • 19:24 demon@tin: Synchronized wmf-config/InitialiseSettings.php: portal aliases for hiwiki (duration: 01m 13s)
  • 19:22 ema: upgrade cp1064 to varnish 5
  • 19:20 no_justification: running updateCollation.php on nowikimedia
  • 19:19 demon@tin: Synchronized wmf-config/InitialiseSettings.php: nowikimedia collation, T185630 (duration: 01m 13s)
  • 19:16 andrewbogott: rebooting labvirt1019 so I can have a look at the raid setup, for T172538
  • 19:14 no_justification: ran namespaceDupes.php --fix on wawiktionary
  • 19:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: wawiktionary namespaces, T185289 (duration: 01m 13s)
  • 19:11 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Revert prior, busted the canaries (duration: 01m 15s)
  • 19:08 demon@tin: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 19:06 demon@tin: rebuilt and synchronized wikiversions files: namespace aliases for zhwiki, T184866
  • 19:00 ema: upgrade cp1063 to varnish 5
  • 17:43 ema: upgrade cp1062 to varnish 5
  • 17:42 moritzm: updated jenkins packages on apt.wikimedia.org for stretch (thirdparty/ci) and jessie (thirdparty) to 2.89.4
  • 17:39 hashar: CI Jenkins seems all happy following the upgrade ^o^
  • 17:34 moritzm: updating remaining python-cryptography updates from jessie point release
  • 17:32 hashar: Upgrading Jenkins on contint1001 / contint2001
  • 17:30 godog: roll-restart ms-fe to pick up https://gerrit.wikimedia.org/r/c/410199/
  • 17:22 moritzm: installing uwsgi jessie update on graphite*
  • 17:20 godog: roll-upgrade thumbor 1.14 in eqiad/codfw
  • 16:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 09s)
  • 16:56 ema: upgrade cp1050 to varnish 5
  • 16:50 marostegui: Deploy schema change on db1110 - T187089 T185128 T153182
  • 16:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 01m 12s)
  • 16:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with low weight (duration: 01m 12s)
  • 16:19 ema: upgrade cp1049 to varnish 5
  • 15:59 jynus: upgrade and restart db1088
  • 15:52 moritzm: rolling out debdeploy 0.0.99.2 (cumin masters already upgraded for a while, just synching the clients)
  • 15:51 andrewbogott: powering down labvirt1008 so chris can re-apply thermal paste
  • 15:45 moritzm: installing libgcrypt security updates on trusty
  • 15:31 zeljkof: EU SWAT finished
  • 15:24 godog: roll-upgrade thumbor to 1.13 - T187159 T179954 T187088
  • 15:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add suppressredirect to autoconfirmed at zhwikt" (T187018) (duration: 01m 13s)
  • 15:18 ema: upgrade cp1048 to varnish 5
  • 14:47 moritzm: installing PHP security updates
  • 14:44 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable flood flag at zhwikt (T187018) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Require 7 days & 10 edits for autoconfirmed at zhwiktionary (T187018) (duration: 01m 13s)
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity (T185347) (duration: 01m 12s)
  • 14:21 akosiaris: reboot ganeti1008 for kernel upgrade T181121
  • 14:14 zfilipin@tin: Synchronized wmf-config/reverse-proxy.php: SWAT: wgSquidServersNoPurge: add eqsin, remove dead IP (T156027) (duration: 01m 12s)
  • 14:11 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/mmv.3d.head.js: Fix 3D badge (duration: 01m 12s)
  • 14:10 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge and Webkit thumb load detection (duration: 01m 13s)
  • 13:44 elukey: rollback java 8 upgrade for archiva - issues with Analytics builds
  • 13:34 elukey: installed openjdk-8 on meitnerium, manually upgraded java-update-alternatives to java8, restarted archiva
  • 13:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 original weight (duration: 01m 12s)
  • 13:16 jynus: stop slave and rolling schema change on db1059 m3 replica
  • 13:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 (duration: 01m 12s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1106 (duration: 01m 12s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1106 (duration: 01m 12s)
  • 12:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1106 (duration: 01m 12s)
  • 11:25 marostegui: Deploy schema change on db1106 - T187089 T185128 T153182
  • 11:16 marostegui: Stop MySQL and reboot db1106 for mysql and kernel upgrade
  • 11:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 12s)
  • 11:14 filippo@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1002 after disk replacement (duration: 01m 12s)
  • 11:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 (duration: 01m 12s)
  • 10:46 jynus: dropping test databases from m5 T186585
  • 10:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 10:28 moritzm: installing libvorbis security updates on trusty systems
  • 10:13 marostegui: Deploy schema change on db1100 - T187089 T185128 T153182
  • 10:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 01m 12s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316,3315 (duration: 01m 12s)
  • 09:50 akosiaris: set standard weight for all ores* hosts
  • 09:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool slowly db1096:3316,3315 (duration: 01m 13s)
  • 09:08 marostegui: Deploy schema change on s5 dbstore1002 https://phabricator.wikimedia.org/T187089 https://phabricator.wikimedia.org/T185128 https://phabricator.wikimedia.org/T153182
  • 09:02 marostegui: Stop MySQL on db1096:3315 and 3316 for mysql+kernel upgrade
  • 08:45 jynus@tin: Synchronized wmf-config/db-eqiad.php: Rebalance s8 (duration: 01m 13s)
  • 08:38 akosiaris: pybal restart on lvs1003 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:29 akosiaris: pybal restart on lvs1006, lvs1009, lvs1012 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:08 _joe_: powercycled ganeti1008
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 (duration: 01m 12s)
  • 06:44 marostegui: Stop replication in sync db1089 and db1067 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 12s)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1096:3315 for alter table (duration: 01m 13s)
  • 06:30 marostegui: Deploy schema change on db1096:3315 - T187089 T185128 T153182
  • 05:55 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 03m 13s)
  • 05:52 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 05:52 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 00m 20s)
  • 05:51 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 39s)
  • 02:02 demon@tin: Synchronized fonts/: removing executable bits, no-op (duration: 01m 15s)
  • 01:33 demon@tin: Finished deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now) (duration: 00m 11s)
  • 01:32 demon@tin: Started deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now)
  • 00:25 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add uploader user group to mznwiki and make it automagically added T187187 (duration: 01m 12s)
  • 00:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable xkill on top wikis that use x aspect T187265 (duration: 01m 14s)

2018-02-13

  • 21:19 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.21
  • 21:07 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 49s)
  • 21:07 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 20:43 twentyafterfour@tin: Finished scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis (duration: 31m 01s)
  • 20:41 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 00m 21s)
  • 20:41 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:26 jynus: upgrading labsdb1010 database - proxies will complain for some time
  • 20:18 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 01m 17s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:12 twentyafterfour@tin: Started scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis
  • 20:11 twentyafterfour: Currently there are no blockers listed on T183960 and the train is leaving the station.
  • 20:05 twentyafterfour: MediaWiki Train 1.31.0-wmf.21 branched, prepped and patched | Changelog uploaded to https://www.mediawiki.org/wiki/MediaWiki_1.31/wmf.21/Changelog | Blockers: T183960
  • 19:03 jynus: upgrade and restart db2042
  • 18:53 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 (duration: 01m 58s)
  • 18:25 elukey: Analytics Hadoop cluster upgrade to Java 8 about to start - complete cluster shutdown is needed - T166248
  • 18:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc (duration: 05m 28s)
  • 18:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc
  • 18:00 twentyafterfour: Preparing to cut new MediaWiki branch wmf/1.31.0-wmf.21 - report deployment blockers for this branch in phabricator: T183960
  • 17:54 godog: repool mw1256 after disk swap - T186535
  • 17:20 demon@tin: Synchronized README: forcing git config sync, setting core.sharedRepository=group, T187076 (duration: 01m 12s)
  • 17:13 cmjohnson1: sorry snapshot1001 is going down for rack relocation
  • 17:12 cmjohnson1: stat1001 going down for rack relocation
  • 17:04 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 17:03 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 16:36 demon@tin: Synchronized scap/plugins/clean.py: no-op, consistency (duration: 00m 55s)
  • 16:23 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 0 (duration: 00m 56s)
  • 16:17 cmjohnson1: replacing disk on poolcounter1002
  • 15:35 marostegui: Deploy schema change on s5 codfw master (db2052), this will generate lag on codfw - T187089 T185128 T153182
  • 15:30 bblack: deploying changes to URL-encoding normalization on caches - https://gerrit.wikimedia.org/r/407488
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 55s)
  • 15:01 zeljkof: EU SWAT finished
  • 14:59 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 55s)
  • 14:58 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 54s)
  • 14:37 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change logos for sdwiki (T185865) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized php-1.31.0-wmf.20/extensions/ContentTranslation/extension.json: SWAT: Add ext.cx.widgets.overlay dependency to template editor (T187119) (duration: 00m 55s)
  • 14:22 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for sdwiki (T184521) (duration: 00m 57s)
  • 13:51 marostegui: Reboot db2066 to pick up new kernel
  • 13:50 marostegui: Deploy schema change on dbstore2001 - T187089 T185128 T153182
  • 12:51 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 56s)
  • 12:20 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:19 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:07 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge (duration: 00m 56s)
  • 11:57 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 55s)
  • 11:56 marostegui: Deploy schema change on db2066 - T187089 T185128 T153182
  • 11:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 and db2059 (duration: 00m 55s)
  • 11:47 jynus: reenabling puppet on all eqiad databases
  • 11:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099 (duration: 00m 56s)
  • 11:37 marostegui: Stop MySQL on db2059 and db2038 for kernel upgrade
  • 11:29 ema: lvs1003: restart pybal to reconnect to etcd
  • 11:27 ema: lvs1006/1010: restart pybal to reconnect to etcd
  • 11:26 ema: lvs4005: restart pybal to reconnect to etcd
  • 11:23 ema: esams primary LVSs: restart pybal to reconnect to etcd
  • 11:21 ema: esams secondary LVSs: restart pybal to properly reconnect to etcd
  • 11:14 ema: repool cp3007
  • 11:13 ema: depool cp3007 to test pybal's behavior on lvs3002
  • 10:51 filippo@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 for disk replacement (duration: 00m 56s)
  • 10:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099 (duration: 00m 54s)
  • 10:08 godog: roll-restart ms-fe in codfw/eqiad after applying https://gerrit.wikimedia.org/r/c/409942/
  • 10:03 ema: restart pybal on lvs2003
  • 09:58 ema: restart pybal on lvs2006
  • 09:52 filippo@neodymium: conftool action : set/pooled=no; selector: name=ms-fe2005.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2038 and db2059 (duration: 00m 55s)
  • 09:32 marostegui: Stop mysql on db2075 for mysql and kernel upgrade
  • 09:30 marostegui: Stop replication in sync on db1089 and dbstore1002 - T162807
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 09:22 elukey: powercycle analytics1062 - not reachable via ssh, frozen via serial console
  • 09:22 jynus: disabling puppet on all eqiad databases
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 09:20 marostegui: Stop replication in sync on db1089 and db1065 - T162807
  • 09:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2084:3315, depool db2075 (duration: 00m 55s)
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 08:52 marostegui: Stop replication in sync on db1089 and db1099:3311 - T162807
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089, db1099 - T162807 (duration: 00m 55s)
  • 08:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 (duration: 00m 56s)
  • 08:37 hashar: tin.eqiad.wmnet: removing live hack in /srv/mediawiki-staging/scap/plugins/clean.py | T187160
  • 08:32 moritzm: installing wavpack security updates
  • 08:09 moritzm: installing exim security updates on trusty hosts
  • 07:02 marostegui: Deploy schema change on s5 db2089 db2084 db2075 db2039 db2059 - T187089
  • 06:28 marostegui: reload haproxy on dbproxy1005
  • 05:10 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp50(0[12345789]|1[12]).eqsin.wmnet
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 29s)
  • 00:24 cwd: re-enabled p-c
  • 00:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/modules/ve-mw/ui/pages/: T187112 (duration: 00m 56s)
  • 00:10 cwd: disabled p-c jobs for reboot
  • 00:04 demon@tin: Synchronized wmf-config/: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 57s)
  • 00:03 demon@tin: Synchronized wmf-config/InitialiseSettings.php: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 56s)

2018-02-12

  • 23:47 demon@tin: Finished deploy [gerrit/gerrit@6adde70]: reviewers plugin (duration: 00m 12s)
  • 23:46 demon@tin: Started deploy [gerrit/gerrit@6adde70]: reviewers plugin
  • 23:32 mutante: terbium,wasat: touch /var/log/mediawiki/purge_abusefilter.log ; set owner/permissions like other logfiles
  • 23:13 elukey: manual restart of Yarn Node Managers on analytics1058/31 (failed due to root partition filled up for the issue logged before)
  • 23:09 elukey: cleaned up tmp files on all analytics hadoop worker nodes, job filling up tmp
  • 21:27 andrew@tin: Finished deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content (duration: 03m 18s)
  • 21:24 andrew@tin: Started deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content
  • 21:06 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5 (duration: 05m 46s)
  • 21:00 mholloway-shell@tin: Started deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5
  • 20:21 andrew@tin: Finished deploy [horizon/deploy@c009388]: updating puppet dashboard (duration: 03m 22s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c009388]: updating puppet dashboard
  • 20:13 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/UUID.php: T186909 (duration: 00m 56s)
  • 20:08 andrew@tin: Finished deploy [horizon/deploy@cba66d2]: more submodule tinkering (duration: 01m 15s)
  • 20:07 ppchelko@tin: Finished deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API (duration: 15m 10s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@cba66d2]: more submodule tinkering
  • 20:01 andrew@tin: Finished deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks (duration: 01m 02s)
  • 20:00 andrew@tin: Started deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks
  • 19:58 andrew@tin: Finished deploy [horizon/deploy@9d73005]: fixes to post-install checks (duration: 01m 01s)
  • 19:57 andrew@tin: Started deploy [horizon/deploy@9d73005]: fixes to post-install checks
  • 19:52 ppchelko@tin: Started deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API
  • 19:50 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 45s)
  • 19:50 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:48 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 03s)
  • 19:47 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:44 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes (duration: 01m 06s)
  • 19:43 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes
  • 19:17 niharika29@tin: Synchronized wmf-config/filebackend.php: Proxy public wiki thumb.php requests through Thumbor T169144 (duration: 00m 55s)
  • 19:13 andrew@tin: Finished deploy [horizon/deploy@01021b4]: trying another force (duration: 00m 17s)
  • 19:13 andrew@tin: Started deploy [horizon/deploy@01021b4]: trying another force
  • 19:12 niharika29@tin: Synchronized php-1.31.0-wmf.20/extensions/PageAssessments/: Fix 500 error with PageAssessments API T185037 (duration: 00m 56s)
  • 19:07 niharika29@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Stop PHP errors from going to the hhvm channel T45086 (duration: 00m 56s)
  • 18:58 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 07m 39s)
  • 18:50 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:48 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 14s)
  • 18:35 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:34 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 06m 47s)
  • 18:27 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:23 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: ores1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 18:12 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 30s)
  • 18:09 gehel@tin: Finished deploy [wdqs/wdqs@b6bd483]: new WDQS GUI (duration: 01m 53s)
  • 18:07 gehel@tin: Started deploy [wdqs/wdqs@b6bd483]: new WDQS GUI
  • 18:00 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:47 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 13m 18s)
  • 17:45 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards again (duration: 00m 17s)
  • 17:45 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards again
  • 17:34 gilles: added thumborUrl to PrivateSettings.php on labs, in preparation for https://gerrit.wikimedia.org/r/#/c/407611/
  • 17:34 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:21 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 [keeping static files] (duration: 02m 08s)
  • 17:18 elukey: home dirs on stat1004 moved to /srv/home (/home symlinks to it)
  • 17:10 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards (duration: 00m 54s)
  • 17:09 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards
  • 16:56 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns5001.wikimedia.org
  • 16:52 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/ApiVisualEditor.php: T186934 (duration: 00m 57s)
  • 16:27 andrew@tin: Finished deploy [horizon/deploy@4d1bdeb]: updating requirements.txt (duration: 01m 04s)
  • 16:26 andrew@tin: Started deploy [horizon/deploy@4d1bdeb]: updating requirements.txt
  • 16:16 andrew@tin: Finished deploy [horizon/deploy@de72527]: scap debugging run (duration: 00m 24s)
  • 16:16 andrew@tin: Started deploy [horizon/deploy@de72527]: scap debugging run
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 15:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 55s)
  • 15:28 marostegui: Stop replication in sync on db1089 and db1105:3311 - T162807
  • 15:23 moritzm: installing libtasn security updates
  • 15:02 reedy@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/maintenance/: Fix maintenance scripts (duration: 00m 56s)
  • 15:01 godog: roll-upgrade thumbor to 1.12 - T186500 T186594 T186492
  • 14:54 elukey: upload prometheus-burrow-exporter 0.0.4 on jessie/stretch-wikimedia
  • 14:51 ottomata: emitting IP field from varnishkafka-eventlogging instance T186833
  • 14:51 zeljkof: EU SWAT finished
  • 14:47 filippo@neodymium: conftool action : set/pooled=no; selector: name=mw1227.eqiad.wmnet
  • 14:44 addshore@tin: Finished scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description) (duration: 19m 56s)
  • 14:44 andrew@tin: Finished deploy [horizon/deploy@de72527]: just checking that this still doesn't work (duration: 00m 04s)
  • 14:44 andrew@tin: Started deploy [horizon/deploy@de72527]: just checking that this still doesn't work
  • 14:38 moritzm: uploading cassandra 3.11.0-wmf5 to component/cassandra311 for stretch-wikimedia/apt.wikimedia.org (T186619)
  • 14:24 addshore@tin: Started scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description)
  • 14:22 otto@tin: Finished deploy [eventlogging/analytics@01d5761]: T186833 (duration: 00m 04s)
  • 14:22 otto@tin: Started deploy [eventlogging/analytics@01d5761]: T186833
  • 14:20 godog: grant group write for wikidev on tin on /srv/mediawiki-staging/php-1.31.0-wmf.20/.git
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 06s)
  • 13:11 marostegui: Deploy schema change on db2084 and db2075 - T185128 T153182
  • 12:03 moritzm: upgrading jessie-based servers in deployment-prep/beta to the HHVM build using ICU 57 (component/icu57)
  • 11:15 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:14 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 10:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 00m 55s)
  • 10:07 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 10:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 00m 55s)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 09:51 elukey: reboot mw1302 (hhvm defunct processes, hungs registered in dmesg, very high load)
  • 09:46 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 56s)
  • 09:29 moritzm: installing libdatetime-timezone-perl SUA update
  • 09:25 godog: install swift stretch updates on ms-be eqiad - T177739
  • 09:19 marostegui: Deploy schema change on s5 - T185128 T153182
  • 09:05 marostegui: Stop replication in sync on db1089 and db2048 - T162807
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 08:57 moritzm: installing glibc security updates on trusty (harmless in our environment; CVE-2018-1000001 is non-exploitable due to disabled unprivileged user namespaces)
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T184599 (duration: 00m 55s)
  • 08:36 marostegui: Reboot db1087 to pick new kernel
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092, depool db1087 - T184599 (duration: 00m 55s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318, depool db1092 - T184599 (duration: 00m 55s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318, depool db1099:3318 - T184599 (duration: 00m 55s)
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104, depool db1101:3318 - T184599 (duration: 00m 55s)
  • 08:01 hashar: Upgrading CI Jenkins plugins
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1109, depool db1104 - T184599 (duration: 00m 55s)
  • 07:46 moritzm: installing exim security updates on remaining hosts
  • 07:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 - T184599 (duration: 00m 55s)
  • 06:53 marostegui: Reboot db1109 to pick up new kernel
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T184599 (duration: 00m 56s)
  • 06:40 marostegui: Drop dewiki database from s8 servers - T184599
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 11m 40s)

2018-02-11

  • 14:06 moritzm: installing exim4 security updates on MXs

2018-02-10

  • 16:51 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/specials/SpecialLog.php: SpecialLog: Fix results when no offender is specified - T186950 (duration: 00m 57s)
  • 01:10 demon@tin: Finished deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend (duration: 00m 10s)
  • 01:10 demon@tin: Started deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend

2018-02-09

  • 23:28 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:26 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:01 jynus: restart haproxy on dbproxy1005
  • 22:47 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again) (duration: 00m 03s)
  • 22:47 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again)
  • 22:45 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:43 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:42 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again) (duration: 00m 40s)
  • 22:42 tgr@tin: Synchronized php-1.31.0-wmf.20/includes/parser/ParserOutput.php: emergency fix for T186927 (duration: 00m 57s)
  • 22:42 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again)
  • 22:36 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (duration: 09m 59s)
  • 22:26 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again (duration: 00m 03s)
  • 22:10 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again
  • 22:08 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches (duration: 00m 17s)
  • 22:08 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches
  • 21:40 andrew@tin: Finished deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try (duration: 00m 14s)
  • 21:40 andrew@tin: Started deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try
  • 21:28 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/includes/api/ApiQueryAbuseLog.php: T186914 (duration: 00m 54s)
  • 21:20 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Block/TopicList.php: T186911 (duration: 00m 55s)
  • 21:10 ejegg@tin: Synchronized php-1.31.0-wmf.20/extensions/CentralNotice/CentralNoticePageLogPager.php: Sync CentralNotice for banner content log fix (duration: 00m 56s)
  • 20:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/user/User.php: Avoid pointless DB_MASTER connections in User::clearSharedCache() (duration: 00m 55s)
  • 20:08 demon@tin: Synchronized php-1.31.0-wmf.20/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 55s)
  • 20:07 demon@tin: Synchronized php-1.31.0-wmf.20/includes/MediaWiki.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 57s)
  • 19:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Scribunto/common/Hooks.php: silence divide by zero / no such index 0 errors (duration: 00m 56s)
  • 18:31 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.20
  • 18:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/filerepo/file/LocalFile.php: Fix CommentStore->createComment() call in LocalFile.php (duration: 01m 12s)
  • 18:08 bblack: cp4023: after a brief period of levelling off a bit: sharp, steep recovery of mbox lag ramp back to ~6K. not sure if this is a new floor or will drop further, but seems pretty ok.
  • 18:03 bblack: cp4023: now seems to be leveling off on lag and decreasing objhdr locks. either expiry thread prio helped (which argues for our prio-related patches) or it was naturally going to end?
  • 17:44 bblack: cp4023: experimental, "renice -19 39007" (backend cache-timeout aka expiry thread), to see if mbox lag resolves on its own quicker
  • 17:19 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 16:53 andrew@tin: Finished deploy [horizon/deploy@de72527]: Rolling out pyldap wheel (duration: 02m 26s)
  • 16:51 andrew@tin: Started deploy [horizon/deploy@de72527]: Rolling out pyldap wheel
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:29 demon@tin: Finished deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change (duration: 00m 10s)
  • 16:29 demon@tin: Started deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change
  • 15:49 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1 T186866
  • 15:47 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1
  • 15:47 akosiaris: upload etherpad-lite 1.6.3-1 to apt.wikimedia.org/jessie-wikimedia/main T186866
  • 15:00 herron: upgraded mailman on fermium for security updates
  • 14:24 demon@tin: Synchronized php-1.31.0-wmf.20/tests/phpunit/includes/db/LBFactoryTest.php: no-op to prior (duration: 01m 12s)
  • 13:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 13:33 demon@tin: Finished deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin (duration: 00m 10s)
  • 13:33 demon@tin: Started deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin
  • 10:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 10:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 for data checksumming - T162807 (duration: 01m 11s)
  • 10:36 moritzm: uploaded php-luasandbox 2.0.14~stretch2 for stretch-wikimedia to apt.wikimedia.org (this removes the php-luasandbox binary from our internal luasandbox build in favour of the php-luasandbox package maintained by legoktm from stretch-backports). As such, the php-luasandbox source package we build internally now only provides the HHVM extension (and we can retire it entirely when migrating to PHP7)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1080 - T162807 (duration: 01m 11s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 12s)
  • 09:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 11s)
  • 09:06 marostegui: Fix data drifts on db1067 - T162807
  • 08:45 demon@tin: Synchronized wmf-config/: rm cleanchanges (duration: 01m 14s)
  • 08:44 demon@tin: Synchronized multiversion/submodules.json: rm CleanChanges (duration: 01m 13s)
  • 07:57 marostegui: Stop replication on labsdb1004 to fix replication issues
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 - T162807 (duration: 01m 11s)
  • 07:39 elukey: forced remount of /mnt/hdfs on stat1005
  • 06:52 marostegui: Fix replication on labsdb1010 - T186579
  • 06:47 marostegui: Fix data drifts, upgrade kernel, mariadb and socket path on db1080 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T162807 (duration: 01m 12s)
  • 02:41 andrew@tin: Finished deploy [horizon/deploy@60cac8e]: updating with designate dashboard (duration: 02m 42s)
  • 02:38 andrew@tin: Started deploy [horizon/deploy@60cac8e]: updating with designate dashboard
  • 00:18 demon@tin: rebuilt and synchronized wikiversions files: surprise, it broke. revert group1 back to wmf.20
  • 00:16 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20 *duck and cover*

2018-02-08

  • 23:49 ppchelko@tin: Finished deploy [restbase/deploy@c0f0dcd]: Fix a typo that prevented the mobile partial content from having an etag (duration: 15m 44s)
  • 23:33 ppchelko@tin: Started deploy [restbase/deploy@c0f0dcd]: Fix a typo that prevented the mobile partial content from having an etag
  • 22:37 bsitzmann@tin: Finished deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95 (duration: 05m 07s)
  • 22:32 bsitzmann@tin: Started deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95
  • 22:17 demon@tin: rebuilt and synchronized wikiversions files: mw.org back to wmf.20
  • 22:08 XioNoX: rebooting cr1-eqsin
  • 21:59 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess? (duration: 00m 03s)
  • 21:58 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess?
  • 21:53 ottomata: finished upgrade of scb to librdkafka 0.11 and node-rdkafka 2
  • 21:49 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 49s)
  • 21:49 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 35s)
  • 21:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 46s)
  • 21:47 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:40 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 15s)
  • 21:40 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:40 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 04s)
  • 21:40 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:39 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 47s)
  • 21:38 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 24s)
  • 21:38 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:38 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:32 herron: restarted rsyslogd services on lithium and wezen to clear rsyslog tls listener on port 6514 icinga alerts
  • 21:23 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 54s)
  • 21:23 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 01m 03s)
  • 21:23 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:22 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 25s)
  • 21:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:22 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:13 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 21:12 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:11 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 45s)
  • 21:10 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:09 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 21s)
  • 21:09 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five (duration: 01m 25s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five
  • 20:52 andrew@tin: Finished deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four (duration: 01m 36s)
  • 20:50 andrew@tin: Started deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four
  • 20:34 ppchelko@tin: Started restart [changeprop/deploy@5fdc03a]: Restart CP to force rule rebalance
  • 20:27 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 46s)
  • 20:26 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 20:26 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 20:26 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 20:24 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 22s)
  • 20:24 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 20:20 ottomata: starting deploy process to update scb cluster to librdkafka 0.11 and node-rdkafka 2. we will depool, stop puppet, deploy, test, start puppet on each node
  • 20:03 no_justification: gerrit: killed about 12 parallel clones of mediawiki/extensions/Math that had been running for 2-3 days (wtf?)
  • 19:24 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/AbstractRevision.php: T186077 (duration: 01m 11s)
  • 19:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on svwiki (T176082) (duration: 01m 11s)
  • 19:17 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Campaigns/CampaignsSecondaryAuthenticationProvider.php: T185870 (duration: 01m 13s)
  • 19:02 bsitzmann@tin: Finished deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94 (duration: 08m 21s)
  • 19:00 bblack: lvs@ulsfo - all back to normal
  • 18:55 bblack: lvs@ulsfo - puppet disabled, trying tagged vlan deploy
  • 18:54 bsitzmann@tin: Started deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94
  • 18:38 arlolra: Updated Parsoid to 961a5cf (T186630)
  • 18:27 arlolra@tin: Finished deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf (duration: 08m 11s)
  • 18:26 andrew@tin: Finished deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three (duration: 01m 16s)
  • 18:25 andrew@tin: Started deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three
  • 18:19 arlolra@tin: Started deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf
  • 18:10 ema: upgrade cp2026 to varnish 5
  • 17:55 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns400[12].wikimedia.org
  • 17:21 akosiaris: repool sca1004 (zotero) for T181121
  • 17:21 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 17:16 ema: upgrade cp2024 to varnish 5
  • 16:58 ema: upgrade cp2022 to varnish 5
  • 16:39 moritzm: installing PHP7 security updates
  • 16:32 moritzm: installing mysql security updates on auth*
  • 16:31 ema: upgrade cp2020 to varnish 5
  • 16:30 bblack: puppet disabled on all ntp servers for initial ulsfo recdns/ntp config process
  • 16:25 bblack: puppet disabled on lvs400[67] for initial ulsfo recdns config process
  • 16:23 elukey: stop archiva on meitnerium to swap /var/lib/archiva from the root partition to a new separate one - T186020
  • 16:20 akosiaris: reboot ganeti1005 T181121
  • 16:18 akosiaris: depool sca1004 (zotero) for T181121
  • 16:17 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 16:13 bblack: rebooting dns400[12] (downtimed, currently spare::system)
  • 16:13 ema: upgrade cp2017 to varnish 5
  • 16:11 andrew@tin: Finished deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two (duration: 01m 24s)
  • 16:10 andrew@tin: Started deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two
  • 16:05 bblack: ntp servers back to normal
  • 16:04 andrew@tin: Finished deploy [horizon/deploy@2f176e2]: updating with designate dashboard (duration: 01m 11s)
  • 16:03 andrew@tin: Started deploy [horizon/deploy@2f176e2]: updating with designate dashboard
  • 15:57 ema: upgrade cp2014 to varnish 5
  • 15:48 moritzm: installing libio-socket-ssl-perl update from jessie point release
  • 15:47 bblack: disabling puppet on all global dns recursors for controlled config deploy
  • 15:35 ema: upgrade cp2011 to varnish 5
  • 15:18 ema: upgrade cp2008 to varnish 5
  • 15:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1073 - T162807 (duration: 01m 12s)
  • 14:59 moritzm: installing icu security updates from jessie/stretch point releases
  • 14:56 ema: upgrade cp2005 to varnish 5
  • 14:49 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 14:47 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on meta and mediawiki.org (duration: 01m 12s)
  • 14:43 zeljkof: EU SWAT finished
  • 14:31 moritzm: upgrading deployment-mediawiki04 to HHVM linked against ICU 57
  • 14:23 ema: upgrade cp2002 to varnish 5
  • 13:54 marostegui: Rename dewiki tables on s8 slaves - T184599
  • 13:53 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file , T185454 (duration: 00m 02s)
  • 13:53 ariel@tin: Started deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file , T185454
  • 13:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with low weight - T162807 (duration: 01m 11s)
  • 13:41 marostegui: Drop dewiki already renamed tables and database on s8 master (db1071) - T184599
  • 13:22 marostegui: Fixing data drifts on db1073, also upgrade kernel, socket location and mysql - T162807
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T162807 (duration: 01m 12s)
  • 13:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T184599 (duration: 01m 12s)
  • 13:09 moritzm: upgrade deployment servers and script runners to HHVM 3.18.7
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T184599 (duration: 01m 11s)
  • 13:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 - T184599 (duration: 01m 11s)
  • 13:02 moritzm: upgrade mwdebug servers to HHVM 3.18.7
  • 12:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T184599 (duration: 01m 11s)
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T184599 (duration: 01m 11s)
  • 12:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T184599 (duration: 01m 11s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 - T184599 (duration: 01m 11s)
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T184599 (duration: 01m 11s)
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:37 marostegui: Fix replication on labsdb1010 - T186579
  • 11:33 akosiaris: reboot ganeti1005 T181121
  • 11:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 (duration: 01m 11s)
  • 11:12 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 12s)
  • 11:00 marostegui: Drop wikidata renamed tables and database from s5 eqiad hosts - T184599
  • 10:07 marostegui: Drop deleted databases from sanitarium and labsdb hosts - T186685
  • 10:07 moritzm: upgrading remaining nginx-full packages on mw* in eqiad to 1.13.6-2+wmf1~jessie1
  • 08:07 moritzm: upgrade remaining app servers to HHVM 3.18.7
  • 07:27 _joe_: depooled mw1256 from traffic, scap (faulty disk, T186535); now powering it off
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 02:20 eileen: Update CiviCRM civicrm revision changed from 71b1e35b99 to 61acc9175e (deploy citibank, benevity import updates)
  • 01:30 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 01:30 andrew@tin: Finished deploy [horizon/deploy@9223ba7]: Now with static content, I hope (duration: 01m 15s)
  • 01:29 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 00:35 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/: Revert "Use wgEditSubmitButtonLabelPublish from upstream", Assume wpTextbox1 has an API registered already (duration: 01m 12s)
  • 00:33 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/CirrusSearch/: T186765: Add special handling for profiles into config dump (duration: 01m 27s)

2018-02-07

  • 23:59 mutante: restarted icinga-wm, too quiet
  • 21:53 ebernhardson: mwdebug1001 back to standard deployed versions
  • 21:51 bsitzmann@tin: Finished deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643) (duration: 06m 41s)
  • 21:44 bsitzmann@tin: Started deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643)
  • 21:40 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:40 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:39 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:39 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:33 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png (duration: 03m 55s)
  • 21:29 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png
  • 21:27 ebernhardson: deploying wmf.20 to en* (except enwiki) on mwdebug1001 to debug new cirrus errors in wmf.20/wmf.19 mixed sister search
  • 21:13 andrew@tin: Finished deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times (duration: 01m 24s)
  • 21:12 andrew@tin: Started deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times
  • 21:07 demon@tin: rebuilt and synchronized wikiversions files: mw.org also back to wmf.17
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:04 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 02m 38s)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 44s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:00 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 05s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 20:39 demon@tin: rebuilt and synchronized wikiversions files: revert, huge spike in db lag
  • 20:36 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 19:47 ejegg: updated SmashPig from 1f56978c0c to 1ebee97a45
  • 19:43 ejegg: updated payments-wiki from 39a7ef32e5 to fe311c2d26
  • 19:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS_MAIN to $wgNamespacesWithSubpages for cawikimedia T185436 (duration: 01m 12s)
  • 19:11 chasemp: after conversation with andrew we moved labweb to public for T186729
  • 19:09 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Rename Project NS on Wikimedia Canada Chapter wiki T185661 (duration: 01m 11s)
  • 18:55 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove old "accountcreator" rules now handled by default T185417 T186462 (duration: 01m 12s)
  • 18:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tidy: Re-do this as a sorted negative list that gets shorter over time (duration: 01m 13s)
  • 18:07 jynus: fixing ferm breakage by restarting the service on db1051
  • 17:38 awight: ORES celery workers restarted on scb100[1-4]
  • 16:53 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options - https://gerrit.wikimedia.org/r/408718 (Unbreak ExtensionDistributor) (duration: 01m 12s)
  • 16:47 gehel: upgrade of tilerator / kartotherian on maps eqiad completed, sorry for the noise...
  • 16:46 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 21s)
  • 16:46 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:44 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:44 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:43 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:42 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:39 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:39 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:38 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:38 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:37 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 17s)
  • 16:37 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:31 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 20s)
  • 16:31 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:30 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 17s)
  • 16:28 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:28 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 16:27 gehel: upgrading tilerator / kartotherian on maps eqiad
  • 16:00 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1271.eqiad.wmnet
  • 14:37 moritzm: installing poppler security updates
  • 14:33 zeljkof: EU SWAT finished
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Updates to enable transliteration for crhwiki (T23582) (duration: 01m 11s)
  • 14:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "Portal" namespace on it.wikiquote (T185232) (duration: 01m 13s)
  • 14:05 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 47s)
  • 14:03 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:58 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: (no justification provided) (duration: 03m 02s)
  • 13:55 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:38 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 02m 45s)
  • 13:36 moritzm: installing p7zip security updates
  • 13:35 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:35 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 21s)
  • 13:34 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:33 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 06s)
  • 13:32 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:20 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 55s)
  • 13:19 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:18 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 22s)
  • 13:17 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:16 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:16 marostegui: Drop wikidata tables and database from s5 codfw hosts - T184599
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 11s)
  • 12:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight (duration: 01m 11s)
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low weight (duration: 01m 40s)
  • 11:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186321 (duration: 01m 11s)
  • 11:09 elukey: install libc6-dbg on phab1001 to get a more precise gdb stack trace - T182832
  • 11:04 marostegui: Stop MySQL on db1069 for MySQL upgrade, kernel upgrade and change binlog format to statement - T186321
  • 10:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186321 (duration: 01m 09s)
  • 09:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1051 after the BBU change - T186049 (duration: 01m 14s)
  • 09:41 kartik@tin: Finished deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901) (duration: 03m 44s)
  • 09:38 marostegui: Failover back labsdb1010 to labsdb1009 - T174569
  • 09:37 kartik@tin: Started deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901)
  • 09:18 marostegui: Failover labsdb1009 to labsdb1010 - T174569
  • 09:16 marostegui: Failover back labsdb1010 to labsdb1011 - T174569
  • 09:05 marostegui: Failover labsdb1011 to labsdb1010 - T174569
  • 08:43 marostegui: Change triggers for s3 on db1095 - T174569
  • 08:21 marostegui: Change triggers for s1 on db1095 - T174569
  • 08:11 marostegui: Change triggers for s5 on db1095 - T174569
  • 07:53 marostegui: Change triggers for s8 on db1095 - T174569
  • 07:17 marostegui: Change triggers for s7 on db1102 - T174569
  • 07:05 marostegui: Change triggers for s6 on db1102 - T174569
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Start repooling db1051 after the BBU change - T186049 (duration: 01m 15s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 34s)
  • 01:14 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking, another batch. (T186645) (duration: 01m 11s)
  • 01:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable AICaptcha data collection everywhere (T186244) (duration: 01m 11s)
  • 00:45 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Support fallback values for referrer policy (T180921) (duration: 01m 12s)
  • 00:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options (duration: 01m 11s)
  • 00:28 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on wikis with < 10 errors in all high-priority categories (T184656) (duration: 01m 09s)

2018-02-06

  • 23:02 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 04s)
  • 23:02 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 23:00 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 03s)
  • 23:00 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:56 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 02m 45s)
  • 22:53 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:42 ejegg: updated SmashPig standalone from 778e8f87b4 to 1f56978c0c
  • 22:23 hashar: Zuul/CI seems to be working fine now
  • 21:49 hashar: Flushing Zuul queue and upgrading to zuul_2.5.1-wmf2 | T186381
  • 21:49 hashar: Flushing Zuul queue and upgrading
  • 21:41 hashar: Going to shutdown Zuul in a few for an emergency hotfix | T186381
  • 21:35 andrew@tin: Finished deploy [horizon/deploy@a316e45]: (no justification provided) (duration: 01m 00s)
  • 21:34 andrew@tin: Started deploy [horizon/deploy@a316e45]: (no justification provided)
  • 21:14 legoktm: restarted zuul due to patch being stuck (T186381)
  • 20:25 andrew@tin: Finished deploy [horizon/deploy@fbf761e]: (no justification provided) (duration: 01m 21s)
  • 20:23 andrew@tin: Started deploy [horizon/deploy@fbf761e]: (no justification provided)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.20
  • 20:11 demon@tin: Synchronized php: symlink swap (duration: 01m 17s)
  • 19:25 hashar: Restarted Zuul due to T186381
  • 18:55 demon@tin: Finished scap: bootstrap wmf.20 @ testwiki (duration: 26m 09s)
  • 18:55 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 00m 15s)
  • 18:55 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:47 arlolra: Updated Parsoid to 8a0ff6c (T183515, T129372, T181408)
  • 18:46 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 06m 23s)
  • 18:40 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:39 arlolra@tin: Finished deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c (duration: 03m 47s)
  • 18:35 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 18:29 demon@tin: Started scap: bootstrap wmf.20 @ testwiki
  • 18:22 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 (duration: 07m 29s)
  • 18:15 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 16:56 elukey: restart httpd on phab1001
  • 16:50 gehel: upgrading kartotherian / tilerator on maps codfw completed
  • afk: restarting jenkins for updates
  • 16:41 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 16:41 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:40 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:40 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 36s)
  • 16:39 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:38 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 16:37 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 16:36 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 30s)
  • 16:36 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:35 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 01s)
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2003.codfw.wmnet
  • 16:30 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:30 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:29 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 02m 34s)
  • 16:29 mutante: mw1262 started hhvm, it had Unhandled server exception: Class undefined: Psr\Log\LogLevel
  • 16:27 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:26 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:24 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 34s)
  • 16:24 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:15 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:14 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 22s)
  • 16:14 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:10 gehel: upgrading kartotherian / tilerator on maps codfw
  • 15:36 elukey: drain + shutdown of analytics1038 to replace faulty BBU - T185409
  • 15:02 zeljkof: EU SWAT finished
  • 15:01 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow eliminators to undelete at urwiki (T185829) (duration: 00m 55s)
  • 14:53 marostegui: Poweroff db1051 for BBU replacement - T186049
  • 14:50 akosiaris: upgrade service-checker to 0.1.4 on scb1001
  • 14:45 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Typo, it's 2018 not 2017 (T185794) (duration: 00m 55s)
  • 14:39 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T186530) (duration: 00m 55s)
  • 14:35 chasemp: disable puppet on labs things for a cautious change rollout
  • 14:33 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on test wikis (duration: 00m 56s)
  • 14:28 marostegui: Changing triggers on s2 - T174569
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on fiwiki, hewiki, ruwiki, svwiki (T185945) (duration: 00m 55s)
  • 14:14 mlitn@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionsDetailsWidget.js: T184380 (duration: 00m 55s)
  • 14:10 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Add entityUsageModifierLimits config for Wikibase (T185693) (duration: 00m 55s)
  • 14:07 urandom: re-enable smartpath on restbase1010 (revert experiment) - T178177
  • 13:35 gehel: upgrading prometheus-elasticsearch-exporter across all elasticsearch nodes
  • 12:32 marostegui: Power cycled dbstore1001 after it crashed - T186596
  • 11:54 marostegui: Sanitize s4 - T174569
  • 11:11 _joe_: forcing a resync of /dev/md1 on conf2001 to verify if the higher timeouts avoid consensus loss in etcd
  • 11:02 ema: restart pybal on codfw primary LVSs to make them reconnect to etcd
  • 11:01 ema: restart pybal on codfw secondary LVSs to make them reconnect to etcd
  • 10:57 ema: restart pybal on eqiad primary LVSs to make them reconnect to etcd
  • 10:55 ema: restart pybal on eqiad secondary LVSs to make them reconnect to etcd
  • 10:47 _joe_: rolling restart of the eqiad etcd cluster
  • 10:39 _joe_: rolling restart of the codfw cluster to pick up the config changes
  • 09:38 marostegui: Sanitizing s2 - T174569
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1077 (duration: 00m 55s)
  • 08:21 elukey: rollback apache/httpd changes on phab1001 (restart required)
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low weight (duration: 00m 53s)
  • 07:06 marostegui: Stop MySQL on db1077 for a full upgrade
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for MariaDB and kernel upgrade (duration: 00m 56s)
  • 06:49 marostegui: Fix replication on labsdb1010 - T186579
  • 03:32 demon@tin: Finished deploy [gerrit/gerrit@f25f017]: adding gitiles plugin (duration: 00m 10s)
  • 03:32 demon@tin: Started deploy [gerrit/gerrit@f25f017]: adding gitiles plugin
  • 03:17 foks: reset email for User:Andrewman327
  • 02:32 demon@tin: Synchronized tests/Defines.php: no op (duration: 00m 55s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 15s)
  • 01:47 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AICaptcha data collection on group0/group1 T186244 (duration: 00m 56s)
  • 00:25 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ps.svg: SWAT: Update the ps mobile wordmark T184442 (duration: 00m 55s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure settings feedback link T182217 (duration: 00m 56s)

2018-02-05

  • 23:21 mutante: nihal - restarted puppetdb service
  • 23:07 mobrovac@tin: Finished deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395 (duration: 03m 29s)
  • 23:04 mobrovac@tin: Started deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395
  • 22:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 55s)
  • 22:45 mobrovac@tin: Synchronized wmf-config/jobqueue.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 56s)
  • 22:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate (duration: 00m 54s)
  • 22:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate
  • 21:47 mholloway-shell@tin: Finished deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a (duration: 06m 38s)
  • 21:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023 (duration: 02m 27s)
  • 21:45 chasemp: asw-b-codfw# rollback 0 pending questions on T183167
  • 21:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023
  • 21:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a
  • 21:07 tgr@tin: Finished scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki (duration: 18m 24s)
  • 20:48 tgr@tin: Started scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki
  • 19:44 demon@tin: Synchronized wmf-config/InitialiseSettings.php: collation for abwiki (duration: 00m 55s)
  • 19:32 demon@tin: Finished scap: adding collation for Abkhaz (duration: 05m 12s)
  • 19:27 demon@tin: Started scap: adding collation for Abkhaz
  • 19:26 demon@tin: Synchronized multiversion/MWWikiversions.php: drop php5.3 support (duration: 00m 56s)
  • 19:22 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder extension for urwiki (duration: 00m 56s)
  • 19:05 elukey: executed 'echo '/srv/apache2_dump/core.%h.%e.%p.%t' > /proc/sys/kernel/core_pattern' on phab1001 - T182832
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps (duration: 14m 42s)
  • 18:42 ppchelko@tin: Started deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps
  • 18:37 mutante: added bstorm to acl*operations-team (project 29) on Phabricator (T185493)
  • 18:35 elukey: add 'ulimit -c unlimited' to /etc/default/apache2 to see if httpd's CoreDumpDirectory works properly on phab1001
  • 18:35 mutante: welcome new root shell user bstorm
  • 18:31 mutante: added bstorm to the 'wmf' and 'ops' LDAP groups (modify-ldap-groups on terbium) (T185493)
  • 18:30 ppchelko@tin: Finished deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content (duration: 12m 04s)
  • 18:18 ppchelko@tin: Started deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content
  • 18:07 gehel@tin: Finished deploy [wdqs/wdqs@d7eb899]: wdqs blazegraph + gui + updater upgrade (duration: 02m 36s)
  • 18:04 gehel@tin: Started deploy [wdqs/wdqs@d7eb899]: wdqs blazegraph + gui + updater upgrade
  • 17:52 ejegg: updated payments-wiki from 341cb573a1 to 39a7ef32e5
  • 17:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@d970b61]: Update mobileapps to 7a9fab3 (duration: 05m 45s)
  • 17:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@d970b61]: Update mobileapps to 7a9fab3
  • 16:10 marostegui: Renaming wikidata tables on s5 on eqiad - T184599
  • 16:03 marostegui: Renaming wikidata tables on s5 on codfw - T184599
  • 15:54 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c9c774e]: Update mobileapps to 1411ccb (duration: 06m 06s)
  • 15:48 mholloway-shell@tin: Started deploy [mobileapps/deploy@c9c774e]: Update mobileapps to 1411ccb
  • 15:26 elukey: temporarily setting CoreDumpDirectory /srv/apache2_dump for httpd on phab1001 (+ httpd reload) to investigate core dumps for T182832
  • 14:46 hashar: European SWAT completed. I have not deployed matmarex patches to change Abkhaz collation ( https://gerrit.wikimedia.org/r/#/c/406185/ https://gerrit.wikimedia.org/r/#/c/406187/ )
  • 14:41 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder for Estonian Wikipedia (etwiki) - T186107 (duration: 00m 55s)
  • 14:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgNamespaceRobotPolicies on ptwiki's NS_USER to noindex - T185660 (duration: 00m 55s)
  • 14:31 hashar@tin: Synchronized wmf-config/throttle.php: Add throttle rules - T185794 T185811 (duration: 00m 55s)
  • 14:24 hashar@tin: Synchronized php-1.31.0-wmf.17/extensions/Flow/includes/Import/OptInController.php: OptInController catch both errors and exception - T184670 (duration: 00m 55s)
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Fix typo in arwikibooks rollbacker group - T185720 (duration: 00m 56s)
  • 14:14 hashar@tin: Synchronized wmf-config/throttle.php: Add throttle rule for an event - T185930 (duration: 00m 55s)
  • 14:11 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add 'rollbacker' group at arwikibooks - T185720 (duration: 00m 56s)
  • 13:20 marostegui: Rename dewiki tables on s8 master (db1071 - with no replication) before dropping them - T184599
  • 12:20 marostegui: Drop empty wikidata database from s5 master (db1070) - T184599
  • 12:17 marostegui: Drop old and renamed wikidata tables from s5 master (db1070) - T184599
  • 11:30 godog: expand smart metrics checking rollout with https://gerrit.wikimedia.org/r/#/c/403621/
  • 11:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify db1078 comment as it is the new candidate master for s3 (duration: 00m 55s)
  • 11:04 hashar: Upgraded jenkins-debian-glue to 0.18.4-wmf1 | T186494
  • 11:03 elukey: restart eventlogging/forwarder legacy-zmq on eventlog1001 due to slow memory leak over time (cached memory down to zero)
  • 10:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1078 (duration: 00m 55s)
  • 09:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1078 traffic (duration: 00m 55s)
  • 09:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1078 traffic - T186321 (duration: 00m 55s)
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 with low traffic - T186321 (duration: 00m 53s)
  • 08:44 marostegui: Stop MySQL on db1078, upgrade mariadb, kernel and socket location - T186321
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T186321 (duration: 00m 55s)
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 56s)
  • 07:45 marostegui: Deploy schema change on s8 primary master (db1071) - T174569
  • 07:43 elukey: install libjson-c2-dbg on phab1001 to allow better debugging of httpd/mod-php stuck process - T182832
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 03s)

2018-02-04

  • 22:40 elukey: restart aphlict.service on phab1001 to force it to pick up the new logfile (/var/log/aphlict/aphlict.log rather than the .log.1)
  • 06:18 _joe_: reduced raid resync speed on conf2* to 5000 KB/s
  • 04:33 _joe_: restarted etcdmirror on conf2002, failure caused by raid resyncs in codfw

2018-02-03

  • 03:55 legoktm: restarting zuul to drop 407165,3 from the queue

2018-02-02

  • 23:40 no_justification: gerrit: one last restart to try and force gerrit/phab session restart
  • 22:42 jynus: reloading m2 dbproxy
  • 22:08 no_justification: cobalt/gerrit2001: purged libbcprov-java libbcpkix-java, cleaned up old symlinks
  • 21:45 demon@tin: Finished deploy [gerrit/gerrit@98f5d9a]: Gerrit 2.14.6 (duration: 00m 14s)
  • 21:45 demon@tin: Started deploy [gerrit/gerrit@98f5d9a]: Gerrit 2.14.6
  • 21:42 no_justification: cobalt: disabling puppet so it doesn't restart gerrit
  • 21:41 no_justification: bringing down gerrit for upgrade
  • 20:54 demon@tin: Synchronized docroot/wikipedia.org/spec.yaml: expose swagger spec (duration: 00m 56s)
  • 20:47 elukey: truncated /var/log/aphlict/aphlict.log to 1G (was 26G) to avoid overhead for the upcoming first logrotate on phab1001
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original traffic for db1100 (duration: 00m 54s)
  • 16:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 (duration: 00m 55s)
  • 16:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1100 (duration: 00m 54s)
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1100 - T186321 (duration: 00m 54s)
  • 15:50 marostegui: Restart MySQL on db1100 - T186321
  • 15:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T186321 (duration: 00m 55s)
  • 15:34 moritzm: uploaded HHVM 3.18.5+dfsg+wmf5+icu57 to jessie-wikimedia/component/icu57 (HHVM 3.18.8 linked against an ICU 57 backport from stretch)
  • 15:25 mutante: ganeti1004 - stopped and started VM ununpentium
  • 14:53 akosiaris: reboot ganeti1005 after emptying it. T181121
  • 13:59 elukey: reboot meitnerium via gnt-instance reboot on ganeti1005 to pick up new disk config - T186020
  • 13:16 moritzm: installing w3m security updates on trusty
  • 12:57 moritzm: installing updated kernels on remaining jessie DB servers
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 11:57 godog: roll-restart nginx on thumbor and swift-proxy on ms-fe to apply https://gerrit.wikimedia.org/r/407411
  • 11:39 moritzm: uploaded php-wikidiff2 1.5.1+deb9u2 to apt.wikimedia.org (despite the source package name, this package now only builds hhvm-wikidiff2, as php-wikidiff2 is instead updated via stretch-backports; the old internal package will eventually be phased out when we move to PHP7)
  • 11:12 ema: cache_upload: repool cp4026 (varnish 5)
  • 11:07 ema: cache_upload: upgrade cp4026 to varnish 5
  • 10:43 ema: cache_upload: repool cp4025 (varnish 5)
  • 10:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 55s)
  • 10:39 ema: cache_upload: upgrade cp4025 to varnish 5
  • 10:24 ema: cache_upload: repool cp4024 (varnish 5)
  • 10:20 ema: cache_upload: upgrade cp4024 to varnish 5
  • 10:18 moritzm: installing ruby security updates on trusty
  • 09:57 godog: roll-upgrade thumbor to 1.11 - T178072 T185478 T185483 T185485 T183907 T179954
  • 09:46 gilles: Add thumborUrl to Swift config in PrivateSettings.php
  • 09:13 ema: cache_upload: repool cp4023 (varnish 5)
  • 09:08 ema: cache_upload: upgrade cp4023 to varnish 5
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 08:37 elukey: apt-get install php5-dbg on phab1001 as an attempt to get better gdb output for T182832
  • 08:35 ema: cache_upload: repool cp4022 (varnish 5)
  • 08:29 ema: cache_upload: upgrade cp4022 to varnish 5
  • 08:23 marostegui: Stop replication in sync db1089 - db1065 - T162807
  • 08:23 moritzm: installing curl security updates on trusty (Debian already updated)
  • 08:21 ema: cache_upload: repool cp4021 (varnish 5)
  • 08:14 ema: cache_upload: upgrade cp4021 to varnish 5
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 07:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 07:10 marostegui: Fixing data drifts on db1065 - T162807
  • 05:37 elukey: truncate /var/log/aphlict/aphlict.log to 25G as a temporary measure to keep phab1001's root partition from filling up

2018-02-01

  • 23:37 mutante: creating new 100GB virtual disk for ganeti VM meitnerium (T186020)
  • 23:12 eileen: updated civicrm from 849bba4186 to 71b1e35b99 (deploy civitoken)
  • 22:37 ejegg: updated payments-wiki from 40145892e7 to 341cb573a1
  • 21:56 raita: Removed 2FA from User:Jehochman
  • 21:52 raita: Removed 2FA from User:Superzerocool (on Mon, Jan 29): https://phabricator.wikimedia.org/T185731
  • 20:50 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool of db1083 (duration: 00m 55s)
  • 20:41 jynus: deployed modified query killer to enwiki replicas
  • 20:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: emergency depool of db1083 (duration: 00m 55s)
  • 19:19 chasemp: labservices1001:~# logrotate --force /etc/logrotate.conf
  • 19:17 chasemp: labservices1002:~# logrotate --force /etc/logrotate.conf
  • 19:04 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 [keeping static files] (duration: 01m 16s)
  • 19:02 demon@tin: Pruned MediaWiki: 1.31.0-wmf.15 (duration: 14m 55s)
  • 16:26 andrewbogott: apt-get install 'designate' on labservices1001 and 1002 — routine upgrade
  • 15:39 moritzm: upgrading nginx on mw1266-mw1299 (for T164456)
  • 15:27 moritzm: restarting apache/HHVM on deployment servers to pick up libxml2/curl security updates
  • 15:14 moritzm: installing curl security updates
  • 14:48 moritzm: installing tiff security updates
  • 14:44 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 (duration: 00m 57s)
  • 14:40 moritzm: restarting nginx on sodium to pick up libxml2 security update
  • 14:34 moritzm: restarting apache on rutherfordium to pick up libxml2 security update
  • 14:01 moritzm: restarting nginx on puppetdb hosts to pick up libxml2 security update
  • 13:56 moritzm: restarting nginx on meitnerium/archiva to pick up libxml2 security update
  • 13:42 gehel: restarting nginx on wdqs* for upgrade
  • 13:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1051 reason for depooling (duration: 00m 56s)
  • 13:23 akosiaris: force puppet run on all postgres servers for https://gerrit.wikimedia.org/r/407424
  • 13:20 jynus: stop and reimage es2019
  • 13:13 moritzm: restarting apache on krypton to pick up libxml2 security update
  • 13:13 gehel: restarting postgresql and nodejs services on maps*
  • 13:09 gehel: upgrade nginx on elastic*
  • 12:54 moritzm: restarting nginx on debug proxies to pick up libxml2 security update
  • 12:53 moritzm: restarting apache on hafnium to pick up libxml2 security update
  • 12:06 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2018, depool es2019 (duration: 00m 57s)
  • 12:03 moritzm: restarting squid on URL downloaders to pick up libxml2 security update
  • 11:53 moritzm: installing libxml2 security updates
  • 10:33 godog: roll restart thumbor to lower subprocess timeout - T185479
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 53s)

2018-01-31

  • 23:56 mutante: restarting apache on phabricator server, same pattern as described in T182832
  • 23:06 bblack: re-pooling ulsfo in DNS - T185228
  • 23:04 bblack: re-pooling ulsfo in DNS
  • 23:00 bblack: restarting ulsfo varnish-fe processes
  • 22:55 bblack: un-downtiming various ulsfo things
  • 22:28 mepps: updated civicrm from c70f01cd83 to 849bba4186
  • 22:04 mepps: updated civicrm from c70f01cd83 to 63c918837c
  • 21:57 mholloway-shell@tin: Finished deploy [mobileapps/deploy@18d263a]: Update mobileapps to 3d717fa (duration: 06m 11s)
  • 21:51 mholloway-shell@tin: Started deploy [mobileapps/deploy@18d263a]: Update mobileapps to 3d717fa
  • 21:35 mutante: fixed icinga config for cp4024 parents
  • 20:29 demon@tin: Synchronized .gitmodules: consistency (duration: 00m 54s)
  • 20:25 demon@tin: Synchronized docroot/wikimedia.org/: bye bye firefox os. you will (not) be missed (duration: 00m 58s)
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4032.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4031.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4030.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4029.ulsfo.wmnet
  • 18:54 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4027.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4026.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4025.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4024.ulsfo.wmnet
  • 18:53 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4023.ulsfo.wmnet
  • 18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4022.ulsfo.wmnet
  • 18:52 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4021.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4032.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
  • 18:47 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4030.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4029.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4027.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4026.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4025.ulsfo.wmnet
  • 18:46 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4024.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4023.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4022.ulsfo.wmnet
  • 18:45 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4021.ulsfo.wmnet
  • 18:40 robh: putting all ulsfo servers into maint mode
  • 18:10 XioNoX: deactivating bgp session from ulsfo to office
  • 16:17 marostegui: Optimize wbc_entity_usage on s6 on db1102
  • 16:15 robh: depooling ulsfo for https://phabricator.wikimedia.org/T185228
  • 15:44 akosiaris: reimage ores100{1..9} T171851
  • 14:37 godog: bump prometheus global instance retention to 15 months - T160677
  • 12:25 marostegui: Fix replication on labsdb1004
  • 09:32 moritzm: rolling restart of thumbor/nginx to pick up libxml security update
  • 08:21 moritzm: installing clamav security update on fermium
  • 07:55 moritzm: installing libxml security updates
  • 07:48 marostegui: Stop MySQL on db1030 - T184397
  • 07:47 marostegui: Remove db1030 from tendril - T184397
  • 07:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 56s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 57s)
  • 07:08 marostegui: Force BBU relearn on db1051 - T186049
  • 06:19 elukey: restart varnish backend on cp4024 - failed fetches / 503s
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 46s)
  • 01:51 mutante: catchpoint: recycled gwicke's user and turned it into a user for volans, upgraded him to admin (T162857)
  • 00:55 krinkle@tin: Synchronized wmf-config: no-op, adding files for beta cluster (duration: 00m 59s)
  • 00:51 krinkle@tin: Synchronized wmf-config/profiler.php: no-op (comment-only) (duration: 00m 58s)

2018-01-30

  • 21:39 demon@tin: Synchronized docroot/noc/conf/open.dblist: (no justification provided) (duration: 00m 57s)
  • 21:38 demon@tin: Synchronized dblists/open.dblist: Adding open.dblist (duration: 00m 57s)
  • 19:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 (duration: 00m 57s)
  • 18:32 mutante: powercycling amslvs4, to be reinstalled as bast3003
  • 18:08 moritzm: installing PHP security updates
  • 15:52 moritzm: installing libxml2 security updates
  • 15:35 moritzm: installing libxcursor security updates
  • 15:30 jynus: stop and reimage es2018
  • 14:42 moritzm: installing curl security updates on app server canaries along with HHVM restart
  • 13:15 moritzm: installing rsync security updates on trusty
  • 12:15 moritzm: installing libxtst updates
  • 10:57 moritzm: installing ffmpeg security updates
  • 09:34 moritzm: installing wireshark security updates
  • 08:35 moritzm: installing libxml2 security updates
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 00:31 demon@tin: rebuilt and synchronized wikiversions files: not changing versions, testing something

2018-01-29

2018-01-28

  • 18:39 bblack: testme

2018-01-26

  • 16:32 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: BETA: Enable FileImporter on testwiki with open config PT 2/2 (duration: 00m 56s)
  • 16:30 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: BETA: Enable FileImporter on testwiki with open config PT 1/2 (duration: 00m 58s)
  • 06:40 niharika29@tin: Finished deploy [scholarships/scholarships@5d2fca4]: Update deadline for scholarships application (duration: 00m 03s)
  • 06:39 niharika29@tin: Started deploy [scholarships/scholarships@5d2fca4]: Update deadline for scholarships application
  • 04:31 urandom: bootstrapping restbase2009-c - T184100
  • 02:32 urandom: bootstrapping restbase2009-b - T184100

2018-01-25

  • 22:24 mutante: restarting gerrit service to apply a few small config changes https://gerrit.wikimedia.org/r/#/q/topic:gerrit-trivial-tweaks+(status:open+OR+status:merged)
  • 22:07 mutante: restarting apache on phabricator server
  • 22:06 urandom: bootstrapping restbase2009-a - T184100
  • 18:13 _joe_: restart hhvm on a few api appservers, high cpu load
  • 14:52 urandom: bootstrapping restbase2008-c - T184100
  • 07:44 urandom: bootstrapping restbase2008-b - T184100
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 32s)
  • 01:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es2011 (duration: 00m 56s)
  • 01:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2011 (duration: 00m 57s)
  • 00:24 urandom: bootstrapping restbase2008-a - T184100

2018-01-24

  • 23:15 ema: cp4025: restart varnish backend due to mbox lag
  • 19:57 jynus: starting es2011 reimage
  • 19:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2011 (duration: 00m 57s)
  • 18:35 no_justification: gerrit: restarting services, will be back momentarily
  • 18:32 urandom: bootstrapping restbase2007-c - T184100
  • 08:16 ema: cp4024: restart varnish-be due to 503s
  • 06:26 urandom: bootstrapping restbase2007-b - T184100
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 33s)
  • 01:14 matt_flaschen: SWAT complete
  • 01:10 matt_flaschen: Deployed 'T185304: NWE: Don't attempt to set selection on unattached textarea' in extensions/VisualEditor
  • 01:02 mattflaschen@tin: Synchronized php-1.31.0-wmf.17/extensions/VisualEditor/: (no justification provided) (duration: 00m 58s)
  • 00:40 mobrovac@tin: Finished deploy [zotero/translators@8f53531]: Update translators to 528296d (duration: 00m 08s)
  • 00:39 mobrovac@tin: Started deploy [zotero/translators@8f53531]: Update translators to 528296d
  • 00:08 urandom: bootstrapping restbase2007-a - T184100

2018-01-23

  • 17:37 robh: mc2036 offline until mainboard fix
  • 14:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Unify comments about sanitarium masters (duration: 00m 56s)
  • 14:36 zeljkof: EU SWAT finished
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the project namespace in Nepali Wikipedia (T184865) (duration: 00m 56s)
  • 14:17 zeljkof: continuing EU SWAT
  • 14:14 zeljkof: EU SWAT finished
  • 14:13 zfilipin@tin: Synchronized php-1.31.0-wmf.17/extensions/WikibaseQualityConstraints/: SWAT: Add missing DISTINCT to SPARQL query (T184705) (duration: 01m 02s)
  • 13:03 moritzm: installing libxtst, libxfixes, libxrandr, libxi security updates
  • 10:56 moritzm: installing libx11 security updates
  • 10:43 moritzm: installing sudo security updates
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 00m 56s)
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1089 (duration: 00m 56s)
  • 08:24 moritzm: installing gdk-pixbuf security updates
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 56s)
  • 06:50 elukey: restart varnish backend on cp4021, 503s and mailbox lag
  • 06:47 marostegui: Stop replication in sync on db2048 and db1089 - T162807
  • 06:23 marostegui: Stop replication in sync db1089 and db1105:3311 - T162807
  • 06:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 00m 57s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 31s)

2018-01-22

  • 22:38 mutante: rebooting the-server-formerly-known-as-amslvs4 to PXE to reinstall it as bast3003. doesn't work
  • 21:02 ottomata: restarting archiva
  • 19:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add Draft namespace to newiki (T184157) (duration: 00m 56s)
  • 19:34 catrope@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionDetailsWidget.js: T184380 (duration: 00m 56s)
  • 19:31 catrope@tin: Synchronized php-1.31.0-wmf.17/extensions/InputBox/InputBox.hooks.php: T185367 (duration: 00m 58s)
  • 18:16 gehel@tin: Finished deploy [wdqs/wdqs@f59ed29]: (no justification provided) (duration: 02m 12s)
  • 18:15 gehel: updating wdqs GUI
  • 18:14 gehel@tin: Started deploy [wdqs/wdqs@f59ed29]: (no justification provided)
  • 17:11 joal@tin: Finished deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands) (duration: 10m 14s)
  • 17:01 joal@tin: Started deploy [analytics/refinery@5b8edb8]: Regular weekly deploy (before freeze for all-hands)
  • 16:51 volans: upgraded debdeploy and cumin to latest released on neodymium/sarin - T182575
  • 15:49 moritzm: upgrade image scalers in eqiad to HHVM 3.18.7
  • afk: restarting jenkins
  • 14:59 moritzm: upgrade mw1221-mw1235 to HHVM 3.18.7
  • 14:43 zeljkof: EU SWAT finished
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update officewiki logo, add HD logo for officewiki (T184575) (duration: 00m 56s)
  • 14:39 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update officewiki logo, add HD logo for officewiki (T184575) (duration: 00m 56s)
  • 14:36 elukey: truncate (again) /var/log/upstart/neutron-server.log on labtestnet2001
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow bureaucrats@mr.wiki to grant&revoke accountcreator (T184553) (duration: 00m 56s)
  • 14:26 moritzm: uploaded debdeploy 0.0.99.2 for jessie-wikimedia, stretch-wikimedia, trusty-wikimedia to apt.wikimedia.org
  • 14:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add https://audiovis.nac.gov.pl to $wgCopyUploadsDomains (T184853) (duration: 00m 56s)
  • 14:12 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Remove $wgWBQualityConstraintsIncludeDetailInApi setting (T180614) (duration: 00m 56s)
  • 14:11 gehel: cleanup leftover logrotate configuration on wdqs*
  • 14:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable fine grained lua tracking for arwiki, fawiki, viwiki (T185032) (duration: 00m 57s)
  • 13:46 marostegui: Force BBU relearn on db1016 - T166344
  • 12:38 volans: upgraded cumin on labpuppetmasters hosts to 2.0.0
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1063 as vslow - T184397 (duration: 00m 56s)
  • 12:22 moritzm: upgrade mw1238-mw1258 to HHVM 3.18.7
  • 12:01 marostegui: Change x1 codfw topology: db2034 to replicate from eqiad T184888
  • 11:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2036 (duration: 00m 57s)
  • 11:38 volans: uploaded cumin_2.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 09:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:50 jynus: running heavy reads on db2043, db2036 to try to reproduce s3 codfw crash
  • 09:25 marostegui: Stop replication in sync db1099:3311 and db1089 - T162807
  • 09:21 marostegui: Stop MySQL on db1030 to clone db1063 - T184397
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1030 - T184397 (duration: 00m 56s)
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 00m 56s)
  • 08:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1067 weight (duration: 00m 56s)
  • 08:31 moritzm: upgrading video scalers to HHVM 3.18.7
  • 07:51 marostegui: Stop replication in sync db1089 and db1066 - T162807
  • 07:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067, depool db1066 - T162807 (duration: 00m 56s)
  • 07:11 marostegui: Stop replication in sync db1089 and db1067 - T162807
  • 07:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 56s)
  • 07:04 elukey: truncated /var/log/upstart/neutron-server.log on labtestnet2001 - / disk space exhausted
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1063 from s8 to s6 - T184397 (duration: 00m 58s)
  • 06:18 marostegui: Compress ruwiki on db1102 - T182450
  • 02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 07m 24s)

2018-01-21

  • 17:21 marostegui: Compress frwiki and jawiki on db1102 - T182450
  • 12:03 marostegui: Defragment s2 on db1102 - T182450
  • 02:35 urandom: bootstrapping restbase2012-b - T184100

2018-01-20

  • 23:20 urandom: bootstrapping restbase2012-a - T184100
  • 17:36 elukey: forced bbu learn cycle on analytics1038 (cache policy flapping from WriteBack to WriteThrough)
  • 16:57 urandom: bootstrapping restbase2011-c - T184100
  • 12:53 urandom: bootstrapping restbase2011-b - T184100
  • 03:32 urandom: bootstrapping restbase2011-a - T184100

2018-01-19

  • 22:53 matt_flaschen: Ran (time foreachwikiindblist flow.dblist extensions/Flow/maintenance/FlowFixInconsistentBoards.php --force) 2>&1|tee --append ~/FlowFixInconsistentBoards_all_2018-01-19_actual_force.txt
  • 21:28 urandom: bootstrapping restbase2010-c - T184100
  • 19:43 mutante: ms-be3003 - power up via mgmt to check if still connected and usable as temp bastion (T184936)
  • 18:58 urandom: bootstrapping restbase2010-b - T184100
  • 17:58 chasemp: labcontrol1002:~# ip addr del 208.80.154.12/32 dev eth0
  • 17:58 chasemp: labcontrol1002:~# ip addr del 208.80.154.102/32 dev eth0
  • 17:55 chasemp: labcontrol1001:~# ip addr del 208.80.154.94/32 dev eth0
  • 17:50 reedy@tin: Synchronized dblists/s3.dblist: alphasort and remove dupes (duration: 01m 01s)
  • 17:11 jynus: stopping mariadb on db2016,17,18,19,23,28&29 T184090
  • 16:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 and depool db1067 (duration: 00m 56s)
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 - T162807 (duration: 00m 56s)
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 - T174569 (duration: 00m 56s)
  • 15:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 - T162807 (duration: 00m 56s)
  • 15:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Tune s1 and s3 database weights (duration: 00m 57s)
  • 15:38 anomie: Running migrateArchiveText.php on all wikis that need it (T184629)
  • 15:24 anomie: Running migrateArchiveText.php on metawiki (T184629)
  • 15:23 godog: bootstrap cassandra-a on restbase2010 - T184100
  • 14:48 anomie: Running migrateArchiveText.php on testwiki (T184629)
  • 14:31 moritzm: installing krb5 updates from jessie point release
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 56s)
  • 12:03 moritzm: installing imagemagick security updates
  • 11:46 moritzm: installing sensible-utils security update
  • 11:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: Decommission old codfw masters (duration: 00m 55s)
  • 11:20 moritzm: upgrading tor on radium to 0.3.2.9
  • 11:18 jynus@tin: Synchronized wmf-config/db-codfw.php: Decommission old codfw masters (duration: 00m 56s)
  • 11:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 56s)
  • 10:55 jynus: restarting es2002
  • 10:53 moritzm: updated tor packages on apt.wikimedia.org to 0.3.2.9-1~d80
  • 10:19 jynus: stop mariadb at db2018 to clone it away
  • 10:02 jynus: restarting es2001
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 54s)
  • 09:56 ema: cp4026 restart varnish-be because of mbox lag
  • 09:10 marostegui: Stop replication in sync db1089 and db1105:3311 - T162807
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 00m 57s)
  • 09:08 godog: start cassandra-a on restbase1015 - T184100
  • 07:11 marostegui: Stop x1 on dbstore2002 to copy its content to db2034 - T184888
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 06:31 marostegui: Stop replication in sync db1089 and db1099:3311 - T162807
  • 06:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 06:22 marostegui: Deploy schema change on db1109 - T174569
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T174569 (duration: 00m 57s)
  • 03:21 TimStarling: on bast1001: restarting bacula-fd with master key decryption enabled, restarting restore job
  • 01:20 TimStarling: attempting to restore home_pmtpa from bacula to bast1001
  • 00:19 ebernhardson: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: T185246: Removing unused citizendium from $wgRelatedSitesPrefixes (duration: 00m 56s)
  • 00:11 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: T185250 Switch wiktionary sister search on enwiki to title only (step 2) (duration: 00m 56s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T185250 Switch wiktionary sister search on enwiki to title only (step 1) (duration: 00m 57s)

2018-01-18

  • 23:11 urandom: bootstrapping restbase1015-b -- T184100
  • 22:36 herron: added ruby-rgen-0.7.0-1 (backported package from jessie) to trusty-wikimedia apt repo (T182894)
  • 21:03 arlolra@tin: Finished deploy [parsoid/deploy@a95fede]: Update Parsoid config, again (duration: 09m 39s)
  • 20:53 arlolra@tin: Started deploy [parsoid/deploy@a95fede]: Update Parsoid config, again
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.17
  • 20:09 mutante: releases1001 - /srv/patches got created, initial manual rsync using /usr/local/sbin/sync-srv-patches created by rsync::quickdatacopy, mw patches exist on nightlies server now
  • 20:09 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/Score/includes/Score.php: SWAT: Always pass FileBackend instance to `new FileRepo()` T185204 (duration: 01m 12s)
  • 20:01 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: (no justification provided) (duration: 01m 09s)
  • 20:00 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: (no justification provided)
  • 20:00 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: (no justification provided) (duration: 00m 44s)
  • 19:59 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: (no justification provided)
  • 19:56 arlolra@tin: Finished deploy [parsoid/deploy@8736b8c]: Updating Parsoid config (duration: 02m 01s)
  • 19:54 arlolra@tin: Started deploy [parsoid/deploy@8736b8c]: Updating Parsoid config
  • 19:53 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/VisualEditor/modules/ve-mw/ui/pages/ve.ui.MWTemplatePlaceholderPage.js: SWAT: Update TitleInput getTitle to getMWTitle (duration: 01m 09s)
  • 19:24 arlolra: Updated Parsoid to af06386 (T45094)
  • 19:20 ema: cache_upload: upgrade cp3049 to varnish 5
  • 19:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update linter stats for commonswiki less frequently T184280 (duration: 01m 13s)
  • 19:17 arlolra@tin: Finished deploy [parsoid/deploy@fcc2b63]: Updating Parsoid to af06386 (duration: 09m 32s)
  • 19:08 arlolra@tin: Started deploy [parsoid/deploy@fcc2b63]: Updating Parsoid to af06386
  • 19:00 ema: cache_upload: repool cp3046 (varnish 5)
  • 18:58 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: T184670: Hide Flow beta feature everywhere but testwiki (duration: 01m 10s)
  • 18:54 ema: cache_upload: upgrade cp3046 to varnish 5
  • 18:47 bsitzmann@tin: Finished deploy [mobileapps/deploy@669fb5b]: Update mobileapps to 2690899 (T184328 T184557 T177007 T184669 T177430 T185050) (duration: 07m 03s)
  • 18:40 bsitzmann@tin: Started deploy [mobileapps/deploy@669fb5b]: Update mobileapps to 2690899 (T184328 T184557 T177007 T184669 T177430 T185050)
  • 18:39 ema: cache_upload: repool cp3045 (varnish 5)
  • 18:33 ema: cache_upload: upgrade cp3045 to varnish 5
  • 18:23 mlitn@tin: Finished deploy [3d2png/deploy@74b1ed7]: Updating 3d2png repo (duration: 00m 50s)
  • 18:22 mlitn@tin: Started deploy [3d2png/deploy@74b1ed7]: Updating 3d2png repo
  • 18:00 ema: cache_upload: repool cp3044 (varnish 5)
  • 17:55 ema: cache_upload: upgrade cp3044 to varnish 5
  • 17:39 moritzm: rebooting sodium (and temporarily disable icinga-wm due to some expected spam due to clients failing to run apt-get update)
  • 17:33 jynus: starting compare.py on s3 codfw (it triggered db2036 crash before)
  • 17:31 ema: cache_upload: repool cp3039 (varnish 5)
  • 17:26 ema: cache_upload: upgrade cp3039 to varnish 5
  • 17:02 ema: cache_upload: repool cp3036 (varnish 5)
  • 16:55 ema: cache_upload: upgrade cp3036 to varnish 5
  • 15:54 ema: cache_upload: repool cp3048 (varnish 5)
  • 15:49 ema: cache_upload: upgrade cp3048 to varnish 5
  • 15:40 moritzm: rebooting labsdb1004 for kernel security update
  • 15:40 ema: cache_upload: repool cp3047 (varnish 5)
  • 15:34 ema: cache_upload: upgrade cp3047 to varnish 5
  • 15:33 moritzm: reboot labsdb1006 (OSM slave) for kernel security update
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 01m 12s)
  • 15:15 mforns@tin: Finished deploy [analytics/refinery@78f98d9]: deploying refinery to add ISO codes to pageviews by country (duration: 04m 12s)
  • 15:11 mforns@tin: Started deploy [analytics/refinery@78f98d9]: deploying refinery to add ISO codes to pageviews by country
  • 15:01 marostegui: Stop replication in sync db1089 and db1066 - T162807
  • 15:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 01m 11s)
  • 14:58 moritzm: installing bind security updates (we only use the client-side tools)
  • 14:58 volans: reprepro includedeb jessie-wikimedia python-requests-mock_1.3.0-3_all.deb
  • 14:45 ema: cache_upload: repool cp3038 (varnish 5)
  • 14:44 herron: disabling puppet agents during deploy of 404587, 404689
  • 14:39 ema: cache_upload: upgrade cp3038 to varnish 5
  • 14:39 godog: restart hhvm on mw1233
  • 14:31 _joe_: restarting hhvm on a few API appservers
  • 14:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T174569 (duration: 01m 12s)
  • 14:28 ema: cache_upload: repool cp3035 (varnish 5)
  • 14:25 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2043 to s3 master after db2036 crash (duration: 01m 12s)
  • 14:25 godog: restart hhvm on mw1227
  • 14:23 ema: cache_upload: upgrade cp3035 to varnish 5
  • 14:19 jynus: starting mysql on db2043
  • 14:17 jynus: stopping mysql on db2043
  • 14:10 zeljkof: EU SWAT finished
  • 14:10 ema: cache_upload: repool cp3037 (varnish 5)
  • 14:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikibooks (T185182) (duration: 01m 13s)
  • 13:54 ema: cache_upload: upgrade cp3037 to varnish 5
  • 13:49 moritzm: upgrade mw* servers in eqiad running 3.18.5+dfsg-1+wmf3 (recent installations) to 3.18.5+dfsg-1+wmf4
  • 13:19 jynus: changing topology of codfw s3 databases
  • 13:05 akosiaris: reboot poolcounter2001 for PCID/INVPCID CPU feature enabling
  • 13:03 akosiaris: reboot webperf1001 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:57 akosiaris: enable puppet across the fleet after nitrogen (puppetdb) reboot
  • 12:56 akosiaris: reboot nitrogen for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:52 jgleeson: turned on donations queue consumer process-control job (actual time of change 17/01/18 ~16:20)
  • 12:45 akosiaris: reboot seaborgium for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 12:43 elukey: bohrium rebooted for kernel upgrades
  • 12:43 akosiaris: disable puppet across the fleet for nitrogen (puppetdb) reboot
  • 12:40 elukey: set piwik in readonly mode and stopped mysql on bohrium (prep step for reboot)
  • 12:36 akosiaris: reboot chlorine.eqiad.wmnet etcd1003.eqiad.wmnet etcd1005.eqiad.wmnet fermium.wikimedia.org install1002.wikimedia.org krypton.eqiad.wmnet kubestagetcd1003.eqiad.wmnet logstash1009.eqiad.wmnet mwdebug1001.eqiad.wmnet sca1004.eqiad.wmnet for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 11:34 akosiaris: reboot logstash1008 etcd1002 kubestagetcd1002.eqiad.wmnet for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 11:12 ema: cp3046: restart varnish-be due to mbox lag
  • 11:06 volans: disabled puppet on tegmen to test impact on puppetdb - T170740
  • 10:57 akosiaris: reboot actinium.wikimedia.org aluminium.wikimedia.org argon.eqiad.wmnet boron.eqiad.wmnet bromine.eqiad.wmnet darmstadtium.eqiad.wmnet dbmonitor1001.wikimedia.org dubnium.wikimedia.org dysprosium.wikimedia.org etcd1001.eqiad.wmnet etcd1004.eqiad.wmnet fermium.wikimedia.org hassium.eqiad.wmnet kubestagetcd1001.eqiad.wmnet logstash1007.eqiad.wmnet meitnerium.wikimedia.org mendelevium.eqiad.wmnet mwdebug1002.eqiad.wmnet m
  • 10:45 ema: cp3034: restart varnishxcps and varnishmedia, they were both using 100% of a cpu core
  • 10:35 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.31.0-wmf.17$ mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintStatements.php --wiki wikidatawiki (T184720)
  • 10:30 akosiaris: reboot etherpad1001 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 10:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2034 from s1 as it will be in x1 - T184888 (duration: 01m 12s)
  • 10:25 mobrovac@tin: Finished deploy [restbase/deploy@5c353f7]: Use stable package names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541 (duration: 12m 18s)
  • 10:12 mobrovac@tin: Started deploy [restbase/deploy@5c353f7]: Use stable package names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541
  • 10:10 mobrovac@tin: Finished deploy [restbase/deploy@04e7cdb]: Use stable package names, normalise cache-control headers, update top definition - T184199 T184833 T184541 (duration: 02m 29s)
  • 10:07 mobrovac@tin: Started deploy [restbase/deploy@04e7cdb]: Use stable package names, normalise cache-control headers, update top definition - T184199 T184833 T184541
  • 10:07 moritzm: rebooting rdb1002/rdb1004/rdb1006/rdb1008 for kernel security update
  • 09:58 akosiaris: reboot etcd1006 for PCID, INVPCID feature enabling (INVPCID not supported on current hardware, but still enabling it cluster wide)
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 09:43 ema: cache_upload: repooled cp3034 running varnish 5
  • 09:38 elukey: reboot thorium (analytics webserver) for security upgrade - This maintenance will cause temporary unavailability of the Analytics websites
  • 09:27 marostegui: Stop replication in sync db1089 and db2048 (codfw master) - T162807
  • 09:26 jynus: reimage es2003 to stretch
  • 09:21 elukey: reboot druid1001 for kernel upgrades
  • 09:20 akosiaris: reboot oresrdb2001 for PCID/INVPCID CPU feature enabling
  • 09:10 akosiaris: reboot alcyone pollux sca2004 poolcounter2002 serpens for PCID/INVPCID CPU feature enabling
  • 09:07 marostegui: Stop replication in sync db1089 db1067 - T162807
  • 08:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 13s)
  • 08:37 godog: bootstrap cassandra-c on restbase1013
  • 08:30 moritzm: reboot iron for kernel security update
  • 06:27 marostegui: Deploy schema change on s8 db1087 (sanitarium master) with replication (this will generate lag on labs servers) - T174569
  • 06:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T174569 (duration: 01m 12s)
  • 06:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 - T174569 (duration: 01m 13s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.16) (duration: 07m 18s)
  • 01:08 twentyafterfour: phabricator deployment finished without incident.
  • 01:01 twentyafterfour: Evening SWAT completed. Starting phabricator deployment of #phabricator-2018-07-17 [release/2017-01-17/1]
  • 01:00 twentyafterfour@tin: Finished scap: Evening SWAT (duration: 24m 29s)
  • 00:35 twentyafterfour@tin: Started scap: Evening SWAT

2018-01-17

  • 23:38 mutante: [terbium:~] $ echo 'https://annual.wikimedia.org' | mwscript purgeList.php
  • 22:54 urandom: bootstrapping restbase1013-b - T184100
  • 22:00 andrewbogott: rebooting californium, silver, labcontrol1001, labservices1001
  • 21:03 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.17 (duration: 01m 11s)
  • 20:57 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.17
  • 20:45 thcipriani@tin: Synchronized php-1.31.0-wmf.17/vendor/wikibase/data-model-services: Add missing files from wikibase/data-model-services 3.9.0 (duration: 01m 15s)
  • 20:41 thcipriani@tin: Synchronized php-1.31.0-wmf.17/includes/ServiceWiring.php: [MCR] RevisionStore::getTitle final logged fallback to master PART II (duration: 01m 12s)
  • 20:40 thcipriani@tin: Synchronized php-1.31.0-wmf.17/includes/Storage/RevisionStore.php: [MCR] RevisionStore::getTitle final logged fallback to master PART I (duration: 01m 04s)
  • 20:35 pnorman@tin: Finished deploy [kartotherian/deploy@ecdda41]: (no justification provided) (duration: 05m 44s)
  • 20:30 pnorman@tin: Started deploy [kartotherian/deploy@ecdda41]: (no justification provided)
  • 20:05 andrewbogott: rebooted labservices1002, labcontrol1002, labnet1002
  • 19:56 andrewbogott: rebooting labpuppetmaster1001
  • 19:46 andrewbogott: rebooting labpuppetmaster1002
  • 19:45 papaul: Powering down mw2140 for main board replacement
  • 18:20 niharika29@tin: Synchronized php-1.31.0-wmf.17/includes/EditPage.php: Update Save/Publish button flag from 'constructive' to 'progressive' https://gerrit.wikimedia.org/r/#/c/404733/ (duration: 01m 14s)
  • 18:09 moritzm: uploading HHVM 3.18.5+wmf4 for stretch-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out)
  • 18:08 ejegg: turned off main silverpop recipient data fetch job
  • 17:55 mutante: gerrit login page design changed (https://gerrit.wikimedia.org/r/402665) in case you were worried it was a fake page trying to steal your login, heh
  • 17:44 moritzm: resetting RAC on labsdb1004 (serial console inaccessible)
  • 17:17 chasemp: reboot labstore2003
  • 17:12 madhuvishy: Rebooting labstore2004
  • 17:08 godog: bootstrap cassandra-a on restbase1013
  • 17:06 ema: upgrade pybal on primary LVSs to 1.14.3 T184715, T184721
  • 16:52 ema: upgrade secondary LVSs to pybal 1.14.3 T184715, T184721
  • 16:33 XioNoX: routing ns2 to radon
  • 16:26 ema: reboot baham (codfw authdns) for kernel upgrade
  • 16:24 XioNoX: routing ns1 to eqiad
  • 16:17 chasemp: labmon1001:~# service grafana-server
  • 16:17 ema: reboot radon (eqiad authdns) for kernel upgrade
  • 16:13 jgleeson: updated civicrm from 354f32fe8a to c70f01cd83
  • 16:12 chasemp: labmon1001:~# /sbin/reboot
  • 16:09 XioNoX: routing ns0 to codfw (baham)
  • 16:07 moritzm: upgrading HHVM in codfw to 3.18.7 (wmf4)
  • 16:06 moritzm: upgrading nginx on mwdebug servers to 1.13.6-2+wmf1~jessie1
  • 16:05 jgleeson: turned off donations queue consumer process-control job
  • 16:00 ema: pybal 1.14.3 uploaded to apt.w.o
  • 15:51 chasemp: labstore1002:~# /sbin/reboot
  • 15:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 after fixing data drifts - T162807 (duration: 01m 12s)
  • 15:41 _joe_: dropping ruwiki htmlCacheUpdate records stuck in the old jobqueue
  • 15:36 moritzm: upgrading nginx on mw servers in codfw to 1.13.6-2+wmf1~jessie1
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1104 (duration: 01m 12s)
  • 14:57 moritzm: resetting RAC on labsdb1007 (serial console inaccessible)
  • 14:53 moritzm: resetting RAC on labsdb1006 (serial console inaccessible)
  • 14:38 chasemp: labstore1001:~# /sbin/reboot
  • 14:27 zeljkof: EU SWAT finished
  • 14:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create "eliminator" user group on ur.wikipedia (T184607) (duration: 01m 12s)
  • 14:14 moritzm: repooling chromium
  • 14:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft Namespace in enwikiversity (T184957) (duration: 01m 12s)
  • 14:07 moritzm: rebooting chromium for kernel security update
  • 14:04 gehel: restart of elasticsearch / cirrus eqiad completed (cluster still recovering)
  • 14:03 moritzm: depooling chromium
  • 13:51 chasemp: reboot labstore2003
  • 13:46 akosiaris: reboot sca2003 webperf2001 planet2001 poolcounter2002 mx2001 kubetcd200{1,2,3} install2002 dbmonitor2001 alsafi acrux hassaleh diadem nihal pybal-test200{1,2,3} releases2001 tureis for PCID, INVPCID
  • 13:45 chasemp: labstore2002:~# sudo update-grub && /sbin/reboot
  • 13:40 chasemp: labstore2001:~# /sbin/reboot
  • 13:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 (duration: 01m 13s)
  • 13:31 akosiaris: reboot acrab for PCID,INVPCID enabling
  • 13:22 marostegui: Deploy schema change on db1099:3318 - https://phabricator.wikimedia.org/T174569
  • 13:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 - T174569 (duration: 01m 12s)
  • 13:17 moritzm: upgrading app server canaries to 3.18.5+wmf4
  • 13:12 marostegui: Fixing drifts on db1065 - T162807
  • 12:28 moritzm: uploading HHVM 3.18.5+wmf4 for jessie-wikimedia to apt.wikimedia.org (3.18.7 with the patch https://github.com/facebook/hhvm/commit/bd7b2bcfe70b053a3a001480653012f68599250f backed out)
  • 12:10 moritzm: updating HHVM in deployment-prep to 3.18.5+wmf4
  • 11:40 godog: bootstrap cassandra-b on restbase1016
  • 11:28 moritzm: rearmed keyholder on neodymium
  • 11:24 moritzm: rebooting neodymium for kernel security update
  • 11:19 _joe_: restarted nginx on mw1346, was in a bad state
  • 10:51 moritzm: reset RAC on chromium, serial console is inaccessible
  • 10:42 moritzm: repooling hydrogen
  • 10:39 moritzm: rebooting hydrogen for kernel security update
  • 10:34 moritzm: depooling hydrogen again
  • 10:22 moritzm: repooling hydrogen (and pdns-recursor restarted), experiment concluded
  • 10:14 moritzm: depooling hydrogen (and keeping pdns-recursor stopped for a few minutes to check whether problems with load-balanced recdns traffic are still an issue)
  • 10:11 moritzm: reset RAC on hydrogen, serial console was inaccessible
  • 10:01 godog: start cassandra-a on restbase1016
  • 09:52 elukey: reboot druid1002 for kernel upgrades
  • 09:46 elukey: removed upstart config for brrd on eventlog1001 (failing and spamming syslog, old leftover?)
  • 09:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Full repool db1101:3318 (duration: 01m 11s)
  • 09:30 moritzm: rebooting flerovium and furud for kernel security update
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1101:3318 (duration: 01m 12s)
  • 09:14 godog: reimage restbase1016 - T184100
  • 09:06 elukey: reboot analytics1003 for kernel upgrades
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 01m 11s)
  • 08:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3318 (duration: 15m 42s)
  • 08:44 elukey: reboot stat100[456] for kernel upgrades
  • 07:48 elukey: restart varnish backend on cp4024 (ton of 503s, icinga alerting for mailbox lag)
  • 07:46 oblivian@neodymium: conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw12([0-1][0-9]|20)\.eqiad\.wmnet
  • 07:45 _joe_: depooling mw1209-1220 from the appserver cluster for decommissioning, T185004
  • 06:47 marostegui: Remove labsdb1001 and labsdb1003 from tendril - T184832
  • 06:40 marostegui: Stop MySQL on labsdb1001 (already dead) and labsdb1003 - T184832
  • 06:29 marostegui: Stop replication in sync on db1089 and s1 codfw master (db2048) - T162807
  • 06:28 marostegui: Deploy schema change on db1104 - T174569
  • 06:21 marostegui: Upgrade mariadb and kernel on db1104
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T174569 (duration: 01m 14s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.16) (duration: 07m 11s)
  • 00:28 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T182616 Remove cirrus AB test config for hewiki (duration: 01m 09s)
  • 00:26 ebernhardson@tin: Synchronized fc-list: SWAT: T184664 Updating fonts list and sorting it (duration: 01m 12s)
  • 00:21 ebernhardson@tin: Synchronized fc-list: SWAT: T184664 Updating fonts list and sorting it (duration: 01m 12s)
  • 00:10 ebernhardson@tin: Synchronized php-1.31.0-wmf.16/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T182616 Turn off cirrus AB test on hewiki (duration: 01m 12s)
  • 00:08 ebernhardson@tin: Synchronized php-1.31.0-wmf.17/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T182616 Turn off cirrus AB test on hewiki (duration: 01m 14s)

2018-01-16

  • 22:57 niharika29@tin: Finished deploy [scholarships/scholarships@728d203]: Update privacy statement and delete invalidated translation files. T184659 (duration: 00m 02s)
  • 22:57 niharika29@tin: Started deploy [scholarships/scholarships@728d203]: Update privacy statement and delete invalidated translation files. T184659
  • 22:53 thcipriani@tin: rebuilt and synchronized wikiversions files: group0 to 1.31.0-wmf.17
  • 22:40 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for htmlCacheUpdate jobs for all wikis but en, commons and wikidata - T182023 (duration: 01m 12s)
  • 22:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@19b9bdd]: Switch htmlCacheUpdates for all but en, commons, wikidata T182023 (duration: 00m 35s)
  • 22:39 ppchelko@tin: Started deploy [cpjobqueue/deploy@19b9bdd]: Switch htmlCacheUpdates for all but en, commons, wikidata T182023
  • 22:19 thcipriani@tin: Synchronized php-1.31.0-wmf.17/extensions/WikimediaMessages/WikimediaMessages.hooks.php: Update access to ORES isModelEnabled() (duration: 01m 13s)
  • 22:19 ottomata: apt-get install librdkafka1=0.9.4-1~jessie1 librdkafka++1=0.9.4-1~jessie1 on scb* to put librdkafka back at node-rdkafka compat version (somehow this was upgraded yesterday...very dangerous!!)
  • 22:16 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.17 and rebuild l10n cache (duration: 25m 45s)
  • 21:50 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.17 and rebuild l10n cache
  • 21:15 andrewbogott: rebooting labvirt1014 and 1015
  • 21:04 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.16
  • 20:59 andrewbogott: rebooting labvirt1013
  • 20:42 demon@tin: Finished scap: wmf.17 files, no bootstrap of i18n tho (x2) (duration: 06m 33s)
  • 20:35 demon@tin: Started scap: wmf.17 files, no bootstrap of i18n tho (x2)
  • 20:34 demon@tin: scap aborted: wmf.17 files, no bootstrap of i18n tho (duration: 08m 59s)
  • 20:34 andrewbogott: rebooting labvirt1011
  • 20:32 herron: re-enabling puppet agents
  • 20:25 demon@tin: Started scap: wmf.17 files, no bootstrap of i18n tho
  • 20:24 herron: temporarily disabling puppet agents while troubleshooting puppet crl
  • 20:21 andrewbogott: rebooting labvirt1010
  • 20:07 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.16
  • 20:03 andrewbogott: rebooting labvirt1009
  • 19:46 andrewbogott: rebooting labvirt1008
  • 19:46 thcipriani@tin: Synchronized php-1.31.0-wmf.16/includes/Storage/RevisionStore.php: RevisionStore, fix loadSlotContent with no $blobFlags T184749 (duration: 01m 13s)
  • 19:30 twentyafterfour: restarted wikibugs (several attempts, eventually it worked)
  • 18:50 chasemp: reboot labvirt1020
  • 18:44 chasemp: reboot labvirt1019
  • 18:35 andrewbogott: rebooting labvirt1007
  • 18:30 arlolra@tin: Finished deploy [parsoid/deploy@1026fd2]: Updating Parsoid to 231bfff (duration: 13m 13s)
  • 18:25 herron: removing ganeti VM puppetcompiler1001
  • 18:19 moritzm: rebooting labmon1002 for kernel security update
  • 18:17 arlolra@tin: Started deploy [parsoid/deploy@1026fd2]: Updating Parsoid to 231bfff
  • 17:59 moritzm: rebooting labnet100[34] and labcontrol100[34] for kernel security update
  • 17:52 herron: re-enabled puppet agents
  • 17:50 andrewbogott: rebooting labvirt1005
  • 17:45 herron: disabled puppet agents troubleshooting T184444
  • 17:31 andrewbogott: rebooting labvirt1004
  • 17:11 andrewbogott: upgrading and rebooting labvirt1002
  • 17:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1092 original weight (duration: 01m 12s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1092 (duration: 01m 12s)
  • 16:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1092 (duration: 01m 12s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 - T174569 (duration: 01m 08s)
  • 16:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1101:3317 (duration: 01m 12s)
  • 16:11 oblivian@neodymium: conftool action : set/pooled=no; selector: cluster=api_appserver,name=mw120[1-8]\.eqiad\.wmnet
  • 16:10 _joe_: depooling mw1201-1208 from the API cluster, T185004
  • 16:09 moritzm: rebooting praseodymium for kernel security update
  • 16:08 godog: bootstrap cassandra-c on restbase1018
  • 16:04 chasemp: add arturo to acl*operations-team
  • 16:03 moritzm: rebooting labweb* hosts for kernel security update
  • 15:57 andrewbogott: rebooting labvirt1001
  • 15:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1101:3317 after kernel upgrade (duration: 01m 12s)
  • 15:44 elukey: reboot druid1003 for kernel upgrades
  • 15:41 moritzm: rebooting achernar for kernel security update
  • 15:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101:3317 after kernel upgrade (duration: 01m 12s)
  • 15:31 moritzm: rebooting acamar for kernel security update
  • 15:30 marostegui: Deploy schema change on db1101:3318 - T174569
  • 15:11 marostegui: Upgrade mariadb and kernel on db1101
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 db1101:3318 for schema change, mariadb upgrade and kernel upgrade - T162807 (duration: 01m 12s)
  • 14:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 01m 09s)
  • 14:52 moritzm: rebooting graphite1001 for kernel security update
  • 14:35 moritzm: rebooting graphite1003 for kernel security update
  • 14:25 moritzm: powercycling labtestservices2003, stuck in reboot
  • 14:18 moritzm: powercycling labtestservices2001, stuck in reboot
  • 14:13 zeljkof: EU SWAT finished
  • 14:12 elukey: reboot druid100[56] for kernel upgrades
  • 14:11 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Restrict sending mails to new users" config change (T184470) (duration: 01m 13s)
  • 14:09 godog: bootstrap cassandra-b on restbase1018
  • 14:01 moritzm: rebooting labtest* hosts for kernel security update
  • 13:56 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=druid1004.*.wmnet
  • 13:54 moritzm: rebooting graphite1002 for kernel security update
  • 13:52 elukey: reboot druid1004 for kernel upgrades
  • 13:46 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=druid1004.*.wmnet
  • 13:46 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=druid1004*.wmnet
  • 13:44 moritzm: rebooting graphite2002 for kernel security update
  • 13:31 oblivian@neodymium: conftool action : set/weight=25; selector: cluster=api_appserver,name=mw134[3-8]\.eqiad\.wmnet
  • 13:28 moritzm: rebooting graphite2001 for kernel security update
  • 13:20 oblivian@neodymium: conftool action : set/pooled=yes; selector: cluster=api_appserver,name=mw134[3-7]\.eqiad\.wmnet
  • 12:56 elukey: reboot kafka100[23] for kernel upgrades
  • 11:59 ariel@tin: Finished deploy [dumps/dumps@c165ca0]: enable 7z prefetch files for page content dumps (duration: 00m 04s)
  • 11:59 ariel@tin: Started deploy [dumps/dumps@c165ca0]: enable 7z prefetch files for page content dumps
  • 11:51 moritzm: rebooting mc2* hosts for kernel security update
  • 11:27 elukey: reboot kafka1001 for kernel upgrades
  • 11:12 moritzm: reboot maerlant for kernel security update
  • 11:08 moritzm: uploaded HHVM 3.18.7 for stretch-wikimedia to apt.wikimedia.org
  • 10:59 godog: roll-restart swift object server - T167400
  • 10:57 moritzm: reboot nescio for kernel security update
  • 10:13 godog: start cassandra-a on restbase1018 - T184100
  • 09:56 moritzm: upgrading canary app servers to HHVM 3.18.7
  • 09:50 marostegui: Stop replication in sync db1089 - db1105:3311 - T162807
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 12s)
  • 09:38 _joe_: started refreshLinks additional jobs for commonswiki,ruwiki
  • 09:30 oblivian@neodymium: conftool action : set/weight=10; selector: cluster=api_appserver,name=mw13(39|4[12]).eqiad.wmnet
  • 09:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 01m 12s)
  • 09:24 oblivian@neodymium: conftool action : set/pooled=yes:weight=1; selector: cluster=api_appserver,name=mw13(39|4[12]).eqiad.wmnet
  • 09:16 moritzm: installing libxml2 security updates on mw* servers (so that it gets picked up along the HHVM 3.18.7 rollout)
  • 09:12 moritzm: installing krb5 security updates (we're just using rev deps)
  • 09:08 jynus: upgrade and reboot db1031 after switchover
  • 08:49 moritzm: rearmed key holder on sarin
  • 08:45 moritzm: rebooting sarin for kernel security update
  • 08:38 marostegui: Stop replication in sync db1089 - db1099:3311 - T162807
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066, depool db1099:3311 - T162807 (duration: 01m 12s)
  • 08:28 jynus: master x1 eqiad failover has finished
  • 08:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Promote db1055 as the new x1 master (duration: 00m 49s)
  • 08:17 jynus: setting db1031 (x1 master) as read only
  • 08:11 jynus: start x1 eqiad master failover
  • 08:02 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 and db1056 after maintenance (duration: 00m 49s)
  • 07:42 jynus: moving replication topology of x1 replicas
  • 07:34 marostegui: Deploy schema change on dbstore1001 (s8) - T174569
  • 07:30 marostegui: Stop replication in sync db1066 and db1089 - T162807
  • 07:30 jynus: upgrade and reboot db1056
  • 07:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 and db1089 - T162807 (duration: 01m 13s)
  • 07:29 oblivian@neodymium: conftool action : set/weight=25; selector: name=mw1340.eqiad.wmnet
  • 07:17 jynus: upgrade and reboot db1055
  • 07:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 and db1056 for maintenance (duration: 01m 12s)
  • 07:03 marostegui: Deploy schema change on dbstore1002 (s8) - T174569
  • 06:32 marostegui: Deploy schema change on db1092 - T174569
  • 06:24 marostegui: Upgrade kernel on db1092
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T174569 (duration: 01m 32s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 10s)

2018-01-15

  • 23:40 demon@tin: Synchronized wmf-config/InitialiseSettings.php: turn educationprogram back on for cs.wikipedia -- turns out there was no consensus and a patch should never have been written 😡 (duration: 01m 13s)
  • 18:50 _joe_: pooled mw1340 as an api appserver
  • 18:43 oblivian@puppetmaster1001: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:42 oblivian@neodymium: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:34 oblivian@neodymium: conftool action : set/pooled=active; selector: name=mw1338.eqiad.wmnet
  • 18:01 moritzm: uploading HHVM 3.18.7 (3.18.5+dfsg-1+wmf3) for jessie-wikimedia to apt.wikimedia.org
  • 17:44 moritzm: updating HHVM in deployment-prep to HHVM 3.18.7
  • 17:08 godog: bootstrap cassandra-c on restbase1017
  • 16:53 jynus: upgrade and restart db2018
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 and db1089 - T162807 (duration: 01m 12s)
  • 16:49 jynus: finished codfw s3 master switchover
  • 16:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover s3 codfw master from db2018 to db2036 (duration: 01m 12s)
  • 16:41 _joe_: restarting hhvm on mw1227, threads stuck in HPHP::jit::enterTCImpl
  • 16:31 marostegui: Force WB on db2033 - T184888
  • 16:24 jynus: restarting db2036 to set as master
  • 16:20 jynus: starting codfw s3 master switchover
  • 15:55 marostegui: Stop replication in sync db1067 and db1089 - T162807
  • 15:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 12s)
  • 15:44 jynus: upgrade and restart db2074
  • 15:33 jynus: upgrade and restart db2057
  • 15:08 jynus: upgrade and restart db2050
  • 14:58 jynus: upgrade and restart db2043
  • 14:46 jynus: upgrade and restart db2036
  • 14:41 zeljkof: EU SWAT finished
  • 14:40 zfilipin@tin: Synchronized php-1.31.0-wmf.16/extensions/ContentTranslation: SWAT: CX1: Fix translation view UI overlaps (T184662 T184130) (duration: 01m 16s)
  • 14:08 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable lua fine grained usage tracking in some wikis (T184322) (duration: 01m 14s)
  • 14:05 moritzm: reboot rdb* hosts in codfw for kernel security update
  • 13:41 gehel: starting rolling reboot of elasticsearch / cirrus eqiad for kernel upgrade
  • 13:38 elukey: reboot eventlog1001 for kernel updates
  • 13:20 elukey: reboot kafka2003 for kernel upgrades
  • 12:04 jynus: upgrade and restart db2017
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 01m 12s)
  • 11:54 moritzm: rebooting ores1* for kernel security update
  • 11:36 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw13(3[8-9]|4[0-9]).*
  • 11:21 godog: upload scap 3.7.6-1 - T127762
  • 11:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 14s)
  • 11:09 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 14s)
  • 11:08 godog: bootstrap cassandra-a on restbase1017
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 10:52 gehel: lowering disk watermark on elasticsearch eqiad to shuffle shards around
  • 10:51 jynus: s2 codfw master switchover finished
  • 10:51 hashar: Upgrading zuul to 2.5.1 on contint1001 / contint2001 | T158243
  • 10:51 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover codfw s2 master from db2017 to db2035 (duration: 01m 12s)
  • 10:50 elukey: reboot kafka2002 for kernel updates
  • 10:48 hashar: Upgrading zuul to 2.5.1 on contint1001 / contint2001
  • 10:27 jynus: upgrade and restart db2035
  • 10:22 jynus: starting codfw s2 master switchover
  • 10:16 jynus: start proxysql on terbium
  • 10:15 moritzm: reboot wasat for kernel security update
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 01m 09s)
  • 09:58 elukey: rolling reboots of aqs hosts (1005->1009) for kernel updates
  • 09:45 marostegui: Deploy schema change on s8 codfw master (db2045) with replication (this will generate lag on s8 codfw) - T174569
  • 09:32 elukey: reboot kafka2001 for kernel updates
  • 09:11 hashar: upgrading Zuul on contint2001 (zuul-merger) | https://gerrit.wikimedia.org/r/#/c/356181/
  • 09:09 hashar: upgrading Zuul on contint1001 | https://gerrit.wikimedia.org/r/#/c/356181/
  • 09:07 elukey: reboot aqs1004 for kernel updates
  • 08:44 jynus: disconnecting codfw -> eqiad replication for x1
  • 08:42 moritzm: reboot wezen for kernel security update
  • 08:22 moritzm: rebooting bast1001 for kernel security update
  • 08:15 moritzm: rebooting terbium for kernel security update
  • 08:11 ema: lvs400[56]: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 07:58 _joe_: reenabling puppet on all systems where it was previously enabled, after various testing
  • 07:50 _joe_: forcing puppet run on the puppetmasters to force pluginsync for function change
  • 07:41 _joe_: disabling puppet in all of production before merging https://gerrit.wikimedia.org/r/402345
  • 07:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Replace db1063 with db1087 as vslow in s8 (duration: 01m 12s)
  • 07:11 marostegui: Deploy schema change on silver (labswiki) and labtestweb2001 (labtestwiki) - T174569
  • 06:52 marostegui: Upgrade MariaDB on db1065
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 to fix data drifts on the archive table - T162807 (duration: 01m 13s)
  • 06:13 marostegui: Deploy schema change on db1070 (s5 master) - T174569
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 07m 50s)

2018-01-12

  • 20:07 mutante: mw1227 hhvm-restart
  • 20:07 mutante: mw1227 - high load: hhvm-dump-debug > /root/hhvm-dump-debug-2017012.log | Backtrace saved as /tmp/hhvm.2203.bt.
  • 19:19 ejegg: disabled Omnimail recipient load backfill job
  • 19:09 bblack: leftover cruft from expired digicert-2016 certs all cleaned up now :)
  • 19:08 jynus: upgrade and restart db2091
  • 18:32 jynus: upgrade and restart db2088
  • 18:28 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet
  • 17:59 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1014.eqiad.wmnet
  • 17:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet
  • 17:41 jynus: upgrade and restart db2064
  • 17:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1012.eqiad.wmnet
  • 17:33 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1010.eqiad.wmnet
  • 17:31 demon@tin: Synchronized docroot/mediawiki/: prettier keys page (duration: 01m 13s)
  • 17:28 cwd: re-enabled payments,civi,listener,p-c
  • 17:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 09s)
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1010.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1009.eqiad.wmnet
  • 17:06 cwd: disabled payments/civi/listener
  • 17:06 cwd: disabled process-control jobs
  • 17:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1089 (duration: 01m 12s)
  • 16:52 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1009.eqiad.wmnet
  • 16:52 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1008.eqiad.wmnet
  • 16:46 jynus: upgrade and restart db2063
  • 16:34 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1008.eqiad.wmnet
  • 16:34 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1007.eqiad.wmnet
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 (duration: 01m 12s)
  • 16:19 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1007.eqiad.wmnet
  • 16:18 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 16:15 jynus: upgrade and restart db2056
  • 15:44 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 15:44 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2005.codfw.wmnet
  • 15:39 jynus: upgrade and restart db2049
  • 14:27 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2005.codfw.wmnet
  • 13:24 jynus: upgrade and restart db2041
  • 12:55 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 12:37 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 12:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2003.codfw.wmnet
  • 12:27 jynus: stop db2035 replication for maintenance
  • 12:23 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2035 for maintenance (duration: 01m 13s)
  • 12:15 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2003.codfw.wmnet
  • 12:14 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 11:42 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 11:41 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1066 (duration: 01m 12s)
  • 11:07 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 10:50 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2007.codfw.wmnet
  • 10:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1105:3311 and slowly repool db1066 (duration: 01m 13s)
  • 10:33 twentyafterfour@tin: Finished deploy [phabricator/deployment@61f1099]: (no justification provided) (duration: 03m 49s)
  • 10:33 elukey: reboot analytics1066->69 for kernel updates
  • 10:30 twentyafterfour@tin: Started deploy [phabricator/deployment@61f1099]: (no justification provided)
  • 10:29 twentyafterfour@tin: Finished deploy [phabricator/deployment@61f1099]: (no justification provided) (duration: 00m 07s)
  • 10:29 twentyafterfour@tin: Started deploy [phabricator/deployment@61f1099]: (no justification provided)
  • 10:24 moritzm: reboot job runners in codfw for kernel security update (along with update to HHVM 3.18.6)
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight db1105:3311 (duration: 01m 13s)
  • 10:11 godog: upload scap 3.7.5-1 - T184774
  • 10:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 and db1100 (duration: 01m 22s)
  • 10:02 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw2140.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1100, db1105:3311, db1105:3312 (duration: 01m 23s)
  • 09:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1100 (duration: 01m 22s)
  • 09:11 godog: reboot ms-be2023 - sdn failed and raid controller isn't happy
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1105:3311 and db1105:3312 (duration: 01m 23s)
  • 09:07 elukey: reboot analytics1063->65 for kernel updates
  • 09:04 marostegui: Upgrade kernel on db1100
  • 09:00 elukey: forced remount of /mnt/hdfs on stat1005 after OOM
  • 08:46 moritzm: reboot remaining API servers in codfw for kernel security update (along with update to HHVM 3.18.6)
  • 08:14 moritzm: reboot video scalers in codfw for kernel security update
  • 07:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3312 with low weight (duration: 01m 22s)
  • 07:01 marostegui: Stop replication in sync db1089 db1105:3311 - T162807
  • 06:46 marostegui: Update mariadb and kernel on db1105 - T184256
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311, db1105:3312 - T162807 T184256 (duration: 01m 22s)
  • 06:24 marostegui: Deploy schema change on db1100 - T174569
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T174569 (duration: 01m 22s)
  • 00:51 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/WikibaseQualityConstraints/extension.json: SWAT: Declare dependency on jquery.makeCollapsible (duration: 01m 21s)
  • 00:43 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/WikibaseQualityConstraints/modules/ui/ConstraintReportGroup.less: SWAT: Do not hide default [Expand] link (duration: 01m 22s)
  • 00:40 thcipriani@tin: Synchronized php-1.31.0-wmf.16/extensions/WikibaseQualityConstraints/modules/ui/ConstraintReportGroup.less: SWAT: Do not hide default [Expand] link (duration: 01m 24s)

2018-01-11

  • 22:35 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/403774
  • 22:04 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403762/
  • 20:57 ottomata: restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403753/
  • 20:52 andrewbogott: rebooting labvirt1003
  • 20:12 twentyafterfour@tin: rebuilt and synchronized wikiversions files: Rollback group1 to wmf.15 due to T184749 refs T180749
  • 20:12 andrewbogott: rebooting labvirt1017 for kernel upgrade
  • 20:04 catrope@tin: Finished scap: SWAT (duration: 30m 12s)
  • 19:58 gehel: elasticsearch / cirrus / codfw rolling reboot completed. Cluster still recovering
  • 19:34 catrope@tin: Started scap: SWAT
  • 19:20 catrope@tin: Synchronized php-1.31.0-wmf.16/includes/: Deprecate old interwiki search result widget (duration: 02m 17s)
  • 19:09 catrope@tin: Synchronized php-1.31.0-wmf.16/extensions/Flow/modules/styles/flow/widgets/editor/mw.flow.ui.EditorWidget.less: T184631 (duration: 01m 22s)
  • 18:06 ema: lvs4007: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 18:00 jynus: upgrade and restart db1102- it may add some minutes of lag to some wikis on wikireplicas
  • 17:32 jynus: shutting down db1059 for maintenance
  • 16:57 akosiaris: upgrade apertium on scb100* nodes done
  • 16:55 godog: start rolling restart of restbase-test / restbase-dev cluster
  • 16:54 jynus: upgrade and restart db1095- it may add some minutes of lag to some wikis on wikireplicas
  • 16:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1099:3311 (duration: 01m 22s)
  • 16:28 moritzm: rebooting mwlog2001 for kernel security update
  • 16:19 moritzm: rebooting mwlog1001 for kernel security update
  • 16:05 moritzm: rebooting notebook1001 for kernel security update
  • 16:05 akosiaris: upgrade apertium on scb200* nodes
  • 15:59 moritzm: reboot lithium for kernel security update
  • 15:51 moritzm: reboot oxygen for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight (duration: 01m 21s)
  • 15:28 moritzm: reboot ruthenium for kernel security update
  • 15:26 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 15:18 akosiaris: clear trusty-wikimedia from apertium packages. The apertium service has been on jessie for a long time and all users should have migrated by now. If not, they should
  • 15:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3311 weight (duration: 01m 23s)
  • 15:08 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1008.eqiad.wmnet
  • 15:05 moritzm: rolling reboot of prometheus in eqiad for kernel security update
  • 15:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1007.eqiad.wmnet
  • 14:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067,db1099:3318,db1099:3311, depool db1066 (duration: 01m 19s)
  • 14:58 marostegui: Upgrade mariadb and kernel on db1066
  • 14:47 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1007.eqiad.wmnet
  • 14:47 godog: continue swift frontend eqiad roll-restart, ms-fe1007 / ms-fe1008
  • 14:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Promote db2040 as the new codfw-s7 master (duration: 01m 22s)
  • 14:40 moritzm: rolling reboot of prometheus in codfw for kernel security update
  • 14:37 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1271.eqiad.wmnet
  • 14:36 joal@tin: Finished deploy [analytics/refinery@ed8ecbc]: Patching interlanguage link and manually add a jar to our collection (duration: 04m 10s)
  • 14:36 jynus: running scap pull on mw1271
  • 14:32 joal@tin: Started deploy [analytics/refinery@ed8ecbc]: Patching interlanguage link and manually add a jar to our collection
  • 14:26 moritzm: powercycling mw1271
  • 14:25 zeljkof: EU SWAT finished
  • 14:17 jynus: upgrade and restart db2029
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create extendedconfirmed for kowiki (T184675) (duration: 01m 23s)
  • 14:14 akosiaris: set migration_downtime to 2000ms for seaborgium
  • 14:01 moritzm: reboot hafnium for kernel security update
  • 14:00 moritzm: reboot tungsten for kernel security update
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1099:3318 weight (duration: 01m 15s)
  • 13:56 jynus: perform master switchover of s7 codfw
  • 13:42 moritzm: rebooting ores2* for kernel security update
  • 13:34 jynus: upgrade and restart db2077
  • 13:34 moritzm: rebooting bast2001 for kernel security update
  • 13:31 moritzm: migrating instances off ganeti1001 for subsequent reboot for kernel security update
  • 13:27 moritzm: failover the ganeti master in eqiad to ganeti1004
  • 12:39 volans: Icinga failover back to einsteinium completed - T170353
  • 12:38 moritzm: rearmed keyholder on naos
  • 12:36 moritzm: migrating instances off ganeti1007 for subsequent reboot for kernel security update
  • 12:34 moritzm: rebooting naos for kernel security update
  • 12:28 volans: Start Icinga failover back to einsteinium - T170353
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 with low weight (duration: 01m 44s)
  • 12:07 marostegui: Stop replication in sync db1089 db1099:3311 - T162807
  • 12:03 moritzm: migrating instances off ganeti1006 for subsequent reboot for kernel security update
  • 11:33 moritzm: migrating instances off ganeti1005 for subsequent reboot for kernel security update
  • 11:14 moritzm: migrating instances off ganeti1004 for subsequent reboot for kernel security update
  • 11:07 moritzm: reboot remaining job runners in eqiad for kernel security update (along with update to HHVM 3.18.6)
  • 11:02 akosiaris: upload cg3_1.0.0~r12254-1+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:02 moritzm: migrating instances off ganeti1003 for subsequent reboot for kernel security update
  • 10:56 akosiaris: upload apertium_3.4.2~r68466-3+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main T181464
  • 10:54 akosiaris: set kvm:migration_downtime to 30ms for both eqiad/codfw ganeti clusters. Then set migration_downtime 30000 for nitrogen/nihal
  • 10:52 moritzm: rearmed keyholder on tin
  • 10:47 moritzm: rebooting tin for kernel security update
  • 10:43 marostegui: Upgrade and restart db1099:3311 and db1099:3318
  • 10:41 jynus: upgrade and restart db2068
  • 10:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1110 original weight (duration: 01m 04s)
  • 10:27 moritzm: rolling reboot of sca/zotero clusters for kernel security update
  • 10:23 jynus: upgrade and restart db2061
  • 10:20 akosiaris: upload hfst_3.13.0~r3461-1+wmf1_amd64 to apt.wikimedia.org/jessie-wikimedia/main T181463
  • 10:14 moritzm: migrating instances off ganeti1002 for subsequent reboot for kernel security update
  • 10:07 jynus: upgrade and restart db2054
  • 10:06 moritzm: rebooting rhenium for kernel security update
  • 10:00 elukey: reboot analytics1059-61 for kernel updates
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 weight (duration: 01m 06s)
  • 09:41 moritzm: reboot bast4002 for kernel security update
  • 09:34 elukey: reboot analytics1055->1058 for kernel updates
  • 09:32 godog: cleanup ores metrics older than 30d - T169969
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low weight - T174569 (duration: 01m 08s)
  • 09:24 gehel: relforge reboot completed
  • 09:08 gehel: reboot of relforge* for kernel upgrade
  • 09:04 elukey: reboot analytics1051->1054 for kernel updates
  • 09:00 gehel: logstash rolling restart completed
  • 08:57 moritzm: reboot remaining mediawiki app servers in eqiad for kernel security update (along with update to HHVM 3.18.6)
  • 08:55 marostegui: Upgrade db1110 kernel - T184256
  • 08:36 moritzm: powercycling wtp2013 (apparently didn't come back up after reboot)
  • 08:27 marostegui: Fix data drifts on enwiki.archive on codfw - T162807
  • 08:21 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=logstash1007.eqiad.wmnet
  • 08:17 gehel: rolling restart of logstash for kernel upgrade
  • 07:50 marostegui: Deploy schema change on db1110 - T174569
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T174569 (duration: 01m 03s)
  • 07:47 moritzm: reboot remaining mediawiki API servers for kernel security update (along with update to HHVM 3.18.6)
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T174569 (duration: 01m 03s)
  • 07:24 marostegui: Drop external_user table from s3 - T184247
  • 07:17 foks: Removed 2FA from Amjaabc
  • 07:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 db1099:3318 - T162807 T184256 (duration: 01m 02s)
  • 06:32 marostegui: Deploy schema change on db1082.s5 with replication (this will generate lag on labs) - T174569
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T174569 (duration: 01m 02s)
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T174569 (duration: 01m 03s)
  • 06:21 marostegui: Upgrade mariadb+kernel on db1089
  • 06:17 marostegui: Force BBU relearn on db1059 - T184160
  • 02:37 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 11m 10s)
  • 00:57 urandom: bootstrapping restbase1011-c -- T184100

2018-01-10

  • 23:50 eileen: civicrm revision changed from 429a5c5385 to 354f32fe8a, deploy contact change, contact search fixes, install cleanup
  • 21:57 twentyafterfour: group1 looks stable. This concludes the MediaWiki train for today.
  • 21:54 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.16 (duration: 01m 02s)
  • 21:53 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.16
  • 21:47 twentyafterfour@tin: Finished scap: group0 to 1.31.0-wmf.16 refs T180749 (duration: 38m 29s)
  • 21:09 twentyafterfour@tin: Started scap: group0 to 1.31.0-wmf.16 refs T180749
  • 20:49 twentyafterfour@tin: Synchronized php-1.31.0-wmf.16: Sync wmf.16 to deploy multiple patches from addshore refs T180749 (duration: 10m 23s)
  • 20:14 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: Update eventstreams with newer service-template-node: T171011 (duration: 04m 11s)
  • 20:09 otto@tin: Started deploy [eventstreams/deploy@ee854df]: Update eventstreams with newer service-template-node: T171011
  • 20:09 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: Update eventstreams deploy test to scb2002: T171011 (duration: 00m 24s)
  • 20:09 otto@tin: Started deploy [eventstreams/deploy@ee854df]: Update eventstreams deploy test to scb2002: T171011
  • 20:05 jynus: upgrade and restart dbstore2002
  • 20:00 jynus: upgrade and restart dbstore2001
  • 19:45 jynus: upgrade and restart db2047
  • 19:32 urandom: bootstrapping restbase1011-b -- T184100
  • 19:22 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for Paris University and sort other by date T184618 (duration: 01m 03s)
  • 19:00 jynus: upgrade and restart db1059
  • 18:45 chasemp: reboot labtestvirt2002.codfw.wmnet w/ new kernel
  • 18:40 andrewbogott: upgrading labvirt1018 kernel and rebooting
  • 18:23 jynus: upgrade and restart db2040
  • 17:59 jynus: upgrade and restart db2087
  • 17:48 andrewbogott: installing linux-image-generic-lts-xenial on labtestvirt2003
  • 17:44 jynus: upgrade and restart db2086
  • 16:55 elukey: reboot analytics1047->50 for kernel updates
  • 16:43 akosiaris: wtp* rolling restarts for meltdown finished
  • 16:39 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 16:38 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 16:35 godog: bounce thumbor-instances on thumbor1001
  • 16:26 anomie: Running cleanupUsersWithNoId.php on dewiki and wikidatawiki
  • 16:22 ottomata: restarting kafka jumbo brokers to apply java.security certpath restrictions
  • 16:08 godog: roll-restart swift frontend in eqiad for kernel upgrade
  • 16:06 moritzm: migrating instances off ganeti2001 for subsequent reboot for kernel security update
  • 16:05 moritzm: switched ganeti master node in codfw to ganeti2004
  • 16:03 marostegui: Deploy schema change on db1096.s5 - https://phabricator.wikimedia.org/T174569
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T174569 (duration: 01m 02s)
  • 15:59 godog: start cassandra-a on restbase1011
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T174569 (duration: 01m 03s)
  • 15:32 moritzm: rebooting yubico auth servers for kernel security update
  • 15:14 moritzm: reboot netmon1002 / netmon2001 for kernel security update
  • 14:54 ema: codfw LVSs: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:51 godog: start cassandra-a on restbase1011 - T184100
  • 14:50 zeljkof: EU SWAT finished
  • 14:50 jynus: dropping dewiki from dbstore2001:3318 T184599
  • 14:47 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: translationadmin: remove configuration equal to CommonSettings.php (T184314) (duration: 01m 02s)
  • 14:46 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: translationadmin: typo fix (duration: 01m 03s)
  • 14:42 chasemp: new meltdown images are live in cloud land
  • 14:34 jynus: dropping wikidatawiki from dbstore2001:3315 T184599
  • 14:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Lift the cap on IP address to create accounts on mrwiki (T184579) (duration: 01m 04s)
  • 14:05 moritzm: migrating instances off ganeti2002 for subsequent reboot for kernel security update
  • 13:37 moritzm: migrating instances off ganeti2003 for subsequent reboot for kernel security update
  • 13:26 _joe_: restarting pybal on lvs2003
  • 13:03 mobrovac@tin: Finished deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974 (duration: 08m 00s)
  • 12:55 mobrovac@tin: Started deploy [restbase/deploy@a2aabfb]: API: add top-by-country, change recommendation route, fix duplicates in onthisday - T181520 T170877 T175974
  • 12:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T174569 (duration: 01m 03s)
  • 12:54 marostegui: Deploy schema change on db1097:3315 - https://phabricator.wikimedia.org/T174569
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T174569 (duration: 01m 03s)
  • 12:38 moritzm: migrating instances off ganeti2004 for subsequent reboot for kernel security update
  • 12:19 moritzm: migrating instances off ganeti2005 for subsequent reboot for kernel security update
  • 12:11 moritzm: rebooting einsteinium for kernel security update
  • 11:51 moritzm: migrating instances off ganeti2006 for subsequent reboot for kernel security update
  • 11:45 godog: downtime decommissioned restbase cassandra 2 hosts
  • 11:39 moritzm: rebooting mw1201-mw1208 for kernel security update (along with update to HHVM 3.18.6)
  • 11:33 marostegui: Deploy schema change on db1106 - T174569
  • 11:26 elukey: reboot analytics1044->47 for kernel updates
  • 11:23 moritzm: migrating instances off ganeti2007 for subsequent reboot for kernel security update
  • 11:19 volans: Icinga failover to tegmen completed - T170353
  • 11:12 moritzm: migrating instances off ganeti2008 for subsequent reboot for kernel security update
  • 11:07 volans: start failover of Icinga to tegmen - T170353
  • 10:55 elukey: reboot analytics1040->43 for kernel updates
  • 10:29 godog: reimage restbase1011 to test HBA mode - T184100
  • 10:16 moritzm: rebooting bast4001 for kernel security update
  • 10:06 elukey: rebooting analytics1035 (hadoop worker node and hdfs journal node) for kernel updates
  • 10:02 moritzm: rebooting tegmen for kernel security update
  • 09:50 godog: shut cassandra 2 on restbase legacy nodes - T184100
  • 09:40 moritzm: rebooting kubernetes workers (plus staging hosts) for kernel security update
  • 09:39 ema: eqiad LVSs: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:32 marostegui: Upgrade kernel on db1067
  • 09:27 godog: stop restbase on cassandra 2 nodes - T184100
  • 09:15 marostegui: Deploy schema change on db1051 - T174569
  • 09:12 moritzm: rebooting radium (tor relay) for kernel security update
  • 08:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 01m 05s)
  • 08:38 marostegui: Deploy schema change on s5 dbstore1001 - T174569
  • 08:33 moritzm: rebooting mw1299-mw1306 (job runners) for kernel security update (along with update to HHVM 3.18.6)
  • 08:28 hashar: contint1001: upgraded Zuul 2.5.0-8-gcbc7f62-wmf4jessie1 .. 2.5.0-8-gcbc7f62-wmf6 | T158243
  • 08:13 marostegui: Deploy schema change on s5 dbstore1002 - T174569
  • 07:44 moritzm: rebooting mw1262-mw1275 for kernel security update (along with update to HHVM 3.18.6)
  • 07:37 marostegui: Drop external_user from wikidatawiki - T184247
  • 06:17 marostegui: Deploy schema change on s5 codfw master (db2052) with replication (this will generate lag on codfw) - T174569
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 02s)
  • 01:39 mutante: mw1226 - high load - hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1739PST.log ; restart-hhvm
  • 00:43 mutante: rebooting gerrit server for kernel upgrade
  • 00:18 mutante: rebooting phabricator server for kernel upgrade

2018-01-09

  • 22:52 godog: ms-be1033 truncate unrotated and big server.log
  • 22:22 aaron@tin: Synchronized php-1.31.0-wmf.16/includes/Setup.php: 68b4bbf (duration: 01m 15s)
  • 22:20 mutante: netmon2001 - arming keyholder for rancid
  • 21:10 mepps: updated SmashPig from 45aa62650c to 778e8f87b4
  • 20:57 twentyafterfour@tin: Finished scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2) (duration: 36m 34s)
  • 20:21 twentyafterfour@tin: Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749 (attempt 2)
  • 20:14 twentyafterfour@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="test2wiki" --outdir="/tmp/scap_l10n_3984299293" --threads=10 --lang en --quiet' returned non-zero exit status 1 (duration: 02m 44s)
  • 20:13 mutante: netmon2001 - rebooting
  • 20:12 twentyafterfour@tin: Started scap: Deploy 1.31.0-wmf.16 to test wikis and rebuild l10n. refs T180749
  • 20:04 mutante: gerrit2001 - rebooting
  • 20:00 mutante: phab2001 - reboot for upgrade
  • 19:20 mepps: rolled back SmashPig from 0c45b1a684 to 45aa62650c
  • 19:07 mepps: updated SmashPig from 45aa62650c to 0c45b1a684
  • 18:42 mutante: ms-fe3002,ms-fe3001 - powering down, removing from puppet and icinga, ms-be* removing from puppet/icinga (T169518)
  • 18:38 mutante: ms-fe3001 - shutting down for decom, removed from puppet
  • 18:38 mutante: mw1227 still not showing recovery, using restart-hhvm
  • 18:29 mutante: mw1227 killed it one more time and also restarted apache.. now load going down
  • 18:26 mutante: mw1227 hhvm-dump-debug > /root/hhvm-dump-debug-20170109-1024PST.log ; then killed hhvm and restarted it with systemctl
  • 17:56 twentyafterfour: MediaWiki Train: Branching 1.31.0-wmf.16
  • 17:41 moritzm: rebooting image scalers in codfw for kernel security update (along with HHVM update)
  • 17:30 volans: re-enabled Icinga event handlers on RAID checks for lvs3001
  • 17:17 ema: failover traffic back to lvs3001, raid rebuilt
  • 17:15 godog: depool restbase cassandra 2 nodes - T184100
  • 16:35 cmjohnson1: disabling puppet for decom on mw1180-1200
  • 16:28 volans: disabled Icinga event handlers on RAID checks for lvs3001, WIP on the host
  • 16:18 gehel: starting cluster reboot for elasticsearch / cirrus codfw
  • 16:09 bd808: data-services: added s8.{analytics,web}.db.svc.eqiad.wmflabs and aliases (T181643, T184179)
  • 16:09 elukey: re-started mysql on dbstore1002 (and slave replication) after hw maintenance
  • 15:44 godog: roll-restart swift frontends in codfw and eqiad
  • 15:40 akosiaris@tin: Finished deploy [servermon/servermon@10e165e]: Testing scap check (duration: 00m 02s)
  • 15:40 akosiaris@tin: Started deploy [servermon/servermon@10e165e]: Testing scap check
  • 15:31 gehel: reboot maps-test* for kernel upgrade
  • 15:30 elukey: stop mysql on dbstore1002 as prep step for shutdown (stop all slaves, mysql stop)
  • 15:23 herron: puppet master reboots complete. re-enabling puppet agents
  • 15:18 ema: lvs3001 disk swap: failover traffic to lvs3003 T166965
  • 15:10 elukey: reboot analytics1028 (hadoop worker and hdfs journal node) for kernel updates
  • 15:07 anomie: Creating MCR tables on all wikis (T183486)
  • 15:01 herron: temporarily disabling puppet agents and rebooting puppet masters for security updates
  • 15:00 elukey: reboot kafka-jumbo1006 for kernel updates
  • 14:59 ema: lvs3001: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267, replace sdb T166965
  • 14:48 moritzm: rolling reboot of scb in eqiad for kernel security update
  • 14:41 elukey: reboot kafka-jumbo1005 for kernel updates
  • 14:36 godog: upgrade and roll-restart thumbor in codfw/eqiad - T182656 T183907 T169144
  • 14:32 elukey: reboot kafka1023 for kernel updates
  • 14:21 elukey: reboot kafka-jumbo1004 for kernel updates
  • 14:14 moritzm: rolling reboot of scb in codfw for kernel security update
  • 14:14 ema: lvs3003: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:07 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Save -> Publish on remaining Wikinewses which haven't updated - https://gerrit.wikimedia.org/r/#/c/403077/ (duration: 00m 53s)
  • 14:06 ema: lvs3002: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:04 elukey: reboot kafka1022 for kernel updates
  • 14:01 godog: copy poolcounter from jessie-wikimedia into stretch-wikimedia - T183385
  • 13:51 elukey: reboot kafka-jumbo1003 for kernel updates
  • 13:34 moritzm: rebooting remaining video scalers in eqiad for kernel security update (along with HHVM update)
  • 13:10 elukey: reboot kafka1020 for kernel updates
  • 13:07 mobrovac@tin: Finished deploy [restbase/deploy@837f5a9]: Force deploy on all targets - T184110 (duration: 07m 23s)
  • 13:00 mobrovac@tin: Started deploy [restbase/deploy@837f5a9]: Force deploy on all targets - T184110
  • 12:58 moritzm: rebooting labnodepool* for kernel security update
  • 12:55 akosiaris@tin: Finished deploy [servermon/servermon@10e165e]: Update servermon (duration: 00m 02s)
  • 12:54 akosiaris@tin: Started deploy [servermon/servermon@10e165e]: Update servermon
  • 12:23 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1014.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1012.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1011.eqiad.wmnet
  • 12:22 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1010.eqiad.wmnet
  • 12:19 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1009.eqiad.wmnet
  • 12:17 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1008.eqiad.wmnet
  • 12:17 moritzm: rebooting scb2001 for kernel security update
  • 12:09 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1007.eqiad.wmnet
  • 12:07 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 12:05 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2005.codfw.wmnet
  • 12:04 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 12:03 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2003.codfw.wmnet
  • 11:58 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 11:56 godog: roll-restart restbase c3 nodes in codfw/eqiad
  • 11:50 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:43 moritzm: rebooting app servers mw1238-mw1258 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 11:25 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 11:17 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 11:03 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 11:03 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2006.codfw.wmnet
  • 11:02 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2004.codfw.wmnet
  • 11:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2001.codfw.wmnet
  • 10:59 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 10:07 ema: cp3041 soft lockup, rebooting
  • 10:03 elukey: reboot kafka-jumbo1002 for kernel updates
  • 09:59 ema: failover traffic lvs3002 -> lvs3004 (new kernel)
  • 09:51 ema: lvs3004: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:35 elukey: reboot kafka1014 for kernel updates
  • 09:32 godog: deploy restbase to cassandra 3 nodes
  • 09:11 godog: roll restart swift in eqiad for kernel upgrade
  • 08:39 moritzm: rebooting app servers in codfw for kernel security update
  • 08:15 jynus: stopping dbstore2001:s5 for cloning to s8
  • 06:32 _joe_: restarting pdfrender on scb1003
  • 06:29 marostegui@tin: Synchronized docroot/noc/conf/s8.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 48s)
  • 06:27 marostegui@tin: Synchronized dblists/s8.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 50s)
  • 06:26 marostegui@tin: Synchronized dblists/s5.dblist: Deploy the dblist files with the correct databases after the split (duration: 00m 53s)
  • 06:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove read_only from s5 and s8 T177208 T181645 (duration: 00m 27s)
  • 06:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Splitting s5 and s8 T177208 T181645 (duration: 00m 50s)
  • 06:07 jynus: stopping slave and resetting on db1071
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Set s5 on read-only to start failover T177208 T181645 (duration: 00m 50s)
  • 05:12 marostegui: Start pre-failover tasks T177208 T181645
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 05m 31s)
  • 00:43 mutante: phabricator servers: upgraded php5-*, openssh
  • 00:17 mutante: netmon1002/2001 - upgraded php7.0 related packages | krypton (webserver_misc_apps) - upgraded php5 packages
  • 00:08 mutante: contint1001/2001 - upgraded php5-related packages
  • 00:06 mutante: releases1001/2001 - upgraded kernel image, planet - upgraded openssl et al

2018-01-08

  • 23:56 mutante: rutherfordium (people.wm.org) - upgrading PHP5
  • 21:52 bsitzmann@tin: Finished deploy [mobileapps/deploy@1bfd4b0]: Update mobileapps to d20915c (T184430 T184429) (duration: 05m 33s)
  • 21:47 bsitzmann@tin: Started deploy [mobileapps/deploy@1bfd4b0]: Update mobileapps to d20915c (T184430 T184429)
  • 21:30 arlolra: Updated Parsoid to e133312 (T182349, T183893, T159985)
  • 21:22 arlolra@tin: Finished deploy [parsoid/deploy@1dac474]: Updating Parsoid to e133312 (duration: 10m 31s)
  • 21:12 arlolra@tin: Started deploy [parsoid/deploy@1dac474]: Updating Parsoid to e133312
  • 21:05 mutante: new Wikipedia language: "inh" - recreating/reloading DNS zones to add "inh" (Ingush) from langs.tmpl (T184374) https://wikitech.wikimedia.org/wiki/Add_a_wiki#DNS
  • 20:09 ejegg: rolled back smashpig payments listener from 0e703f502d to 45aa62650c
  • 19:34 ottomata: rebooting analytics1002 and then analytics1001 to apply proxyuser changes and kernel update
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove language button from Wikidata and MediaWiki T183665 (duration: 00m 51s)
  • 19:04 ejegg: updated SmashPig payments listener from 45aa62650c to 0e703f502d
  • 18:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2040 (duration: 00m 50s)
  • 18:10 gehel@tin: Finished deploy [wdqs/wdqs@c680f55]: (no justification provided) (duration: 02m 03s)
  • 18:08 gehel@tin: Started deploy [wdqs/wdqs@c680f55]: (no justification provided)
  • 16:57 milimetric@tin: Finished deploy [analytics/refinery@f99e7dd]: Update and re-run interlanguage job (duration: 11m 28s)
  • 16:45 milimetric@tin: Started deploy [analytics/refinery@f99e7dd]: Update and re-run interlanguage job
  • 16:36 jynus: stopping replication on db2040
  • 16:28 cormacparle: About to run refreshFileHeaders.php on all wikis to fix https://phabricator.wikimedia.org/T178849
  • 15:23 elukey: reboot kafka1013 for kernel updates
  • 15:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Fix db2039 comments (duration: 00m 50s)
  • 15:12 ema: cache_upload: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 15:03 hashar@tin: Synchronized dblists/group1-wikipedia.dblist: Add test2wiki as a group1 wiki - T182326 (duration: 00m 50s)
  • 14:57 gehel: rolling reboot of maps servers for kernel upgrade
  • 14:56 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking in hewiki - T172914 (duration: 00m 50s)
  • 14:51 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add Translation: namespace on Punjabi Wikisource - T179807 (duration: 00m 50s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Turn on mapframe for Arabic Wikipedia - T183764 (duration: 00m 51s)
  • 14:33 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add new namespace aliases on zhwiki - T183711 (duration: 00m 50s)
  • 14:28 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable commons import in tawikisource - T181774 (duration: 00m 48s)
  • 14:27 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Update logo for chrwiki, add the HD version T180553 (duration: 00m 50s)
  • 14:26 ema: cache_text: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 14:25 hashar@tin: Synchronized static/images/project-logos: Update logo for chrwiki, add the HD version T180553 (duration: 00m 51s)
  • 14:23 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Move wiktionary HD logo to wiktionaries - T183922 (duration: 00m 50s)
  • 14:21 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgKartographerStaticMapframe for lvwiki - T183981 (duration: 00m 51s)
  • 14:16 hashar@tin: Synchronized wmf-config/Wikibase-production.php: Don’t check constraints on example properties - T183267 (duration: 00m 51s)
  • 13:50 moritzm: rebooting mw image scalers in eqiad for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 13:42 gehel: rolling restart of wdqs servers for kernel upgrades
  • 13:41 elukey: reboot analytics10[36-39] for kernel updates
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1109 original status (duration: 00m 50s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up db1109 (duration: 00m 52s)
  • 13:07 joal@tin: Finished deploy [analytics/aqs/deploy@ab85797]: Add pageview top-by-country endpoint (duration: 17m 57s)
  • 13:05 moritzm: rebooting mw1259/mw1260 (video scalers) for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 12:59 elukey: reboot kafka1012 for kernel updates
  • 12:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 27s)
  • 12:49 joal@tin: Started deploy [analytics/aqs/deploy@ab85797]: Add pageview top-by-country endpoint
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 59s)
  • 12:37 fdans@tin: Finished deploy [analytics/aqs/deploy@ab85797]: (no justification provided) (duration: 00m 16s)
  • 12:37 fdans@tin: Started deploy [analytics/aqs/deploy@ab85797]: (no justification provided)
  • 12:35 moritzm: rebooting mw1209-mw1220 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 12:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: revert warm up s8 future hosts - T177208 (duration: 02m 58s)
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Warm up s8 future hosts - T177208 (duration: 00m 52s)
  • 12:18 akosiaris@tin: Finished deploy [servermon/servermon@b9832c5]: Update servermon (duration: 00m 02s)
  • 12:18 akosiaris@tin: Started deploy [servermon/servermon@b9832c5]: Update servermon
  • 12:01 moritzm: rebooting mw1221-mw1235 for kernel security update (along with update to HHVM 3.18.6 where applicable)
  • 11:38 moritzm: rebooting mwdebug* for kernel security update
  • 11:28 godog: puppet node deactivate wtp10[568] - T177374
  • 11:06 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 11:05 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 10:50 godog: roll restart swift in codfw for kernel upgrades
  • 10:40 akosiaris@tin: Finished deploy [servermon/servermon@53b81d8]: Update servermon (duration: 00m 02s)
  • 10:40 akosiaris@tin: Started deploy [servermon/servermon@53b81d8]: Update servermon
  • 10:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 and db1089 - T162807 (duration: 00m 50s)
  • 10:26 hashar: Started docker on contint1001 / contint2001 . They were missing the overlay/overlayfs kernel modules | T184410
  • 10:04 elukey: drain + reboot analytics1029,1031->1034 for kernel updates
  • 10:03 jynus: fixing wrong events on db2039, db1071, db2023, db2045, db2052, db1100
  • 09:53 godog: Flashing Smart Array P840 in Slot 3 [ 4.52 -> 6.06 ] on ms-be2037 - T184390 T141756
  • 09:46 hashar: rebooting CI
  • 09:46 godog: reboot ms-be2037 - T184390
  • 09:39 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 to mw1261,mw2251,mw1276 and all videoscalers (Recently rebooted/reimaged)
  • 09:38 hashar: upgrading contint1001 / contint1002 | T184267
  • 09:24 ema: cache_misc: upgrade to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 09:17 _joe_: starting 3 manual loops for consuming refreshLinks jobs for ruwiki
  • 09:14 marostegui: Force BBU relearn on db1059 - T184160
  • 08:30 moritzm: installing remaining openssl updates
  • 07:24 marostegui: Stop MySQL on db1039 for decommission - T184262
  • 07:17 marostegui: Remove db1039 from tendril - T184262
  • 07:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1039 as it will be decommissioned - T184262 (duration: 00m 50s)
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1039 as it will be decommissioned - T184262 (duration: 00m 50s)
  • 06:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 and db1089 - T162807 (duration: 00m 51s)
  • 06:32 marostegui: Disable BBU auto-learn on db1011
  • 06:17 marostegui: Deploy schema change on s7 primary master (db1062) - T174569
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.15) (duration: 06m 17s)

2018-01-07

  • 20:25 demon@tin: Synchronized wmf-config/interwiki.php: auto-sync with my plugin was busted 🙃 (duration: 00m 50s)
  • 19:56 demon@tin: Synchronized php-1.31.0-wmf.15/maintenance/Maintenance.php: fix stuff (duration: 00m 51s)
  • 19:32 demon@tin: Finished scap: Delete alswik(ibooks|iquote|tionary), mowik(ipedia|tionary) (duration: 21m 32s)
  • 19:10 demon@tin: Started scap: Delete alswik(ibooks|iquote|tionary), mowik(ipedia|tionary)
  • 08:52 elukey: re-enabled puppet on db110[78] - eventlogging_sync restarted on db1108 (analytics-slave) - T168414

2018-01-06

  • 08:09 elukey: re-enable eventlogging mysql consumers after database maintenance - T168414
  • 06:59 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw[1329-1333] (new appservers, was 120)
  • 06:49 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw1335 (new jobrunner, was 120)

2018-01-05

  • 22:27 tgr: T184263 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=eswiki --logwiki=metawiki "Mega849" "Mega809"
  • 20:47 demon@tin: Pruned MediaWiki: 1.31.0-wmf.12 [keeping static files] (duration: 02m 11s)
  • 18:15 otto@tin: Finished deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed) (duration: 00m 19s)
  • 18:15 otto@tin: Started deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed)
  • 18:14 otto@tin: Finished deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed) (duration: 02m 11s)
  • 18:11 otto@tin: Started deploy [analytics/superset/deploy@990bc38]: Running superset with python3 (fingers crossed)
  • 16:40 jynus: upgrade and restart labsdb1010
  • 16:29 akosiaris@tin: Finished deploy [servermon/servermon@cf88f3f]: Update servermon to 3c8538a (duration: 00m 02s)
  • 16:29 akosiaris@tin: Started deploy [servermon/servermon@cf88f3f]: Update servermon to 3c8538a
  • 16:07 akosiaris@tin: Finished deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a (duration: 00m 02s)
  • 16:07 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 16:06 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 16:05 akosiaris@tin: Finished deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a (duration: 00m 23s)
  • 16:04 akosiaris@tin: Started deploy [servermon/servermon@3c8538a]: Update servermon to 3c8538a
  • 15:50 marostegui: Upgrade db2071 kernel - T184256
  • 15:48 moritzm: rebooting multatuli for kernel update
  • 15:41 marostegui: Upgrade db2072 (mariadb and kernel) - T184256
  • 14:25 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 14:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 14:17 gehel: reboot maps1002 for kernel upgrade
  • 14:03 fdans@tin: (no justification provided)
  • 13:57 elukey@tin: Finished deploy [analytics/aqs/deploy@792c95d]: Add pageviews by country endpoint (duration: 01m 12s)
  • 13:56 elukey@tin: Started deploy [analytics/aqs/deploy@792c95d]: Add pageviews by country endpoint
  • 13:54 ema: upgrade cp3046 to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 13:53 fdans@tin: Finished deploy [analytics/aqs/deploy@792c95d]: (no justification provided) (duration: 00m 18s)
  • 13:52 fdans@tin: Started deploy [analytics/aqs/deploy@792c95d]: (no justification provided)
  • 13:37 gehel: rebooting wdqs1003 for kernel upgrade
  • 13:24 fdans@tin: Finished deploy [analytics/aqs/deploy@792c95d]: (no justification provided) (duration: 01m 32s)
  • 13:22 fdans@tin: Started deploy [analytics/aqs/deploy@792c95d]: (no justification provided)
  • 13:22 gehel: rebooting elastic1017 for kernel upgrade
  • 13:19 fdans: deploying Analytics Query Service
  • 12:44 elukey: reboot kafka-jumbo1001 for kernel updates
  • 12:43 ema: upgrade cp3007 to latest jessie point release (8.10) T182656 and linux kernel 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 12:04 jynus: upgrade and restart labsdb1011
  • 12:03 ema: reboot cp1008 into linux 4.9.65-3+deb9u1~bpo8+2 (KPTI) T184267
  • 10:15 godog: reboot restbase2004 to test kernel upgrade
  • 10:14 jynus: reboot labsdb1009
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 - T163190 (duration: 00m 27s)
  • 09:20 elukey: drain and reboot analytics1030 for kernel updates
  • 09:11 godog: reboot ms-be1014 to test update stretch kernel
  • 08:54 elukey: ran git checkout modules/role/manifests/puppetmaster/standalone.pp on labs-puppetmaster.wikimedia.org to unblock sync from prod
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T163190 (duration: 00m 28s)
  • 07:37 _joe_: rebooting mw1276 too, kernel upgrade
  • 07:25 _joe_: rebooting mw1261
  • 06:49 marostegui: Stop replication in sync on db1039 and db1098:3317 - T163190
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 - T163190 (duration: 00m 27s)
  • 06:24 marostegui: Deploy schema change on db1094 - T174569
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T163190 (duration: 00m 51s)
  • 03:54 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Undeploy EducationProgram from test2wiki (duration: 00m 48s)

2018-01-04

  • 23:30 apergos: rebooted releases1001 and 2001 (new kernel)
  • 22:09 moritzm: uploaded linux-meta 1.16 for jessie-wikimedia to apt.wikimedia.org (which installs the new KPTI-enabled kernel with the new ABI)
  • 22:03 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.15
  • 22:00 twentyafterfour: No blockers remain for T180748, proceeding to deploy wmf.15 to all wikis
  • 21:53 twentyafterfour@tin: Synchronized php-1.31.0-wmf.15/extensions/TitleBlacklist/TitleBlacklistPreAuthenticationProvider.php: Deploy 332fab0 to stop logspam and unblock the train (duration: 01m 02s)
  • 21:37 moritzm: uploaded linux-4.9.65-3+deb9u1~bpo8+2 for jessie-wikimedia to apt.wikimedia.org (provides KPTI backport)
  • 21:35 twentyafterfour@tin: Synchronized php-1.31.0-wmf.15/includes/parser/Parser.php: Deploy 601cf9d (duration: 01m 03s)
  • 21:33 twentyafterfour: deploying patches to unblock the train
  • 21:25 moritzm: reboot multatuli for kernel update
  • 20:06 twentyafterfour: There are still open blockers for wmf.15 - see T180748 .. attempting to resolve them to unblock the train.
  • 20:03 twentyafterfour: preparing to deploy the train (filling in for no_justification)
  • 19:51 joal@tin: Finished deploy [analytics/refinery@a69a2cd]: Regular analytics deploy (duration: 04m 38s)
  • 19:46 joal@tin: Started deploy [analytics/refinery@a69a2cd]: Regular analytics deploy
  • 18:58 bsitzmann@tin: Finished deploy [mobileapps/deploy@8bcffa9]: Update mobileapps to a4ba9fd (T182330 T177430 T170690 T182652 T184198) (duration: 06m 01s)
  • 18:52 bsitzmann@tin: Started deploy [mobileapps/deploy@8bcffa9]: Update mobileapps to a4ba9fd (T182330 T177430 T170690 T182652 T184198)
  • 18:27 jynus: upgrade and restart labsdb1009
  • 17:42 moritzm: upgrading HHVM on eqiad video scalers to 3.18.6
  • 17:40 demon@tin: Finished deploy [gerrit/gerrit@1e1a79d]: deploying hooks plugin (duration: 00m 10s)
  • 17:40 demon@tin: Started deploy [gerrit/gerrit@1e1a79d]: deploying hooks plugin
  • 16:38 jynus: upgrade and restart db2089 (s5/s6)
  • 16:14 jynus: upgrade and restart db2087 (s6/s7)
  • 15:44 jynus: upgrade and restart db2076
  • 15:36 jynus: upgrade and restart db2067
  • 15:31 demon@tin: Synchronized php-1.31.0-wmf.15/extensions/ActiveAbstract/: unbreak, T184177 (duration: 01m 02s)
  • 15:17 jynus: upgrade and restart db2060
  • 15:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 - T163190 (duration: 01m 02s)
  • 15:03 moritzm: upgrading HHVM on eqiad image scalers to 3.18.6
  • 14:54 jynus: restart db2046 database to move socket location
  • 14:24 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Adding Movepage-summary to wgForceUIMsgAsContentMsg T183848 (duration: 01m 02s)
  • 14:11 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Restrict sending mails to new users T182541 (duration: 01m 02s)
  • 13:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T163190 (duration: 01m 01s)
  • 13:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T163190 (duration: 01m 02s)
  • 13:35 marostegui: Stop replication in sync db1079 db1101:3317 T163190
  • 13:17 moritzm: upgrading HHVM on mw1180-mw1220 to 3.18.6
  • 12:53 moritzm: upgrading HHVM on mwdebug* to 3.18.6
  • 12:45 mobrovac@tin: Finished deploy [restbase/deploy@66b7efe]: Switch Mathoid to Cassandra 3 and drop Cassandra 2 references - T179419 (duration: 04m 05s)
  • 12:41 mobrovac@tin: Started deploy [restbase/deploy@66b7efe]: Switch Mathoid to Cassandra 3 and drop Cassandra 2 references - T179419
  • 12:07 mobrovac@tin: Finished deploy [mathoid/deploy@c9957ce]: Mathoid v0.7.1 - T172767 (duration: 05m 05s)
  • 12:02 mobrovac@tin: Started deploy [mathoid/deploy@c9957ce]: Mathoid v0.7.1 - T172767
  • 12:00 moritzm: upgrading HHVM on API canaries (mw1276-mw1279) to HHVM 3.18.6
  • 10:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T163190 (duration: 01m 01s)
  • 10:39 marostegui: Stop replication in sync on db1079 and db1101:3317 - T163190
  • 10:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T163190 (duration: 01m 02s)
  • 10:16 mobrovac@tin: Finished deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557 (duration: 02m 38s)
  • 10:14 mobrovac@tin: Started deploy [mathoid/deploy@7f664ff]: Update Mathoid in codfw to v0.7.0, take #2 - T183557
  • 09:58 jynus: restart and upgrade db2053
  • 09:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 - T163190 (duration: 03m 09s)
  • 09:38 moritzm: rebooting mw1307 and wtp1025 for kernel update
  • 09:13 moritzm: rebooting kubernetes1001 for kernel update
  • 08:57 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 on mw133[67] (new jobrunners)
  • 08:53 marostegui: Fixing inconsistencies on s7 - T163190
  • 08:48 marostegui: Deploy schema change on db1069 (s7) - T174569
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T174569 (duration: 01m 02s)
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T174569 (duration: 01m 02s)
  • 06:48 marostegui: Deploy schema change on db1079 (s7) with replication enabled - this will generate lag on labs replicas - T174569
  • 06:27 marostegui: Deploy schema change on db1068 (s4) master - T174569
  • 06:23 marostegui: Issue a BBU re-learn cycle on db1059 - T184160
  • 02:49 legoktm@tin: Synchronized php-1.31.0-wmf.15/extensions/Flow/Hooks.php: Fix CheckUser type check thingy - T182834 (duration: 01m 01s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 07m 50s)
  • 01:50 ladsgroup@tin: Synchronized dblists/group0.dblist: SWAT: Move testwiki2 from group0 to group1 (T182326) (duration: 01m 02s)

2018-01-03

  • 23:02 twentyafterfour: restarted apache on phab1001 to clear hung workers (refs T182832)
  • 22:31 bd808@tin: Finished deploy [striker/deploy@69f1b15]: Enhance membership request workflow and fix Diffusion repo creation (T168027, T182142) (duration: 00m 31s)
  • 22:31 bd808@tin: Started deploy [striker/deploy@69f1b15]: Enhance membership request workflow and fix Diffusion repo creation (T168027, T182142)
  • 21:41 ejegg: re-enabled ingenico audit
  • 21:27 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.15 (duration: 01m 01s)
  • 21:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.15
  • 21:26 twentyafterfour: deploying 1.31.0-wmf.15 to "Group 1" wikis
  • 21:01 ottomata: deleting stale topics from main kafka clusters: T149594
  • 20:56 mutante: uranium - revoked puppet cert, node deactivate, removing from DNS (T183209)
  • 20:50 mutante: uranium (ex-ganglia-web) is going into eternal downtime on Icinga.. shutdown -h RIP (T183209)
  • 20:23 thcipriani: updateCollation for eswiki running in screen as thcipriani on terbium
  • 20:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Do not enable lua fine grained tracking for any wiki T172914 (duration: 01m 02s)
  • 20:16 thcipriani@tin: Synchronized php-1.31.0-wmf.15/extensions/VisualEditor/lib/ve: SWAT: Update VE core submodule to master T182907 T183590 (duration: 01m 06s)
  • 20:06 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Close wikimania2017.wikimedia.org PART II T182493 (duration: 01m 04s)
  • 20:04 thcipriani@tin: Synchronized dblists/closed.dblist: SWAT: Close wikimania2017.wikimedia.org PART I T182493 (duration: 01m 02s)
  • 19:53 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Extension:Translate default permissions for Wikimedia wikis T178793 (duration: 01m 02s)
  • 19:42 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set category collation to uca-es-u-kn for eswiki T183802 (duration: 01m 02s)
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Setup some namespace aliases for eswiki T183612 (duration: 01m 02s)
  • 19:14 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration deboosting scientific articles on Wikidata T183510 (duration: 01m 02s)
  • 18:53 volans: restarted ircecho on einsteinium
  • 18:37 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf2+deb9u1 for stretch-wikimedia to apt.wikimedia.org
  • 18:25 ottomata: deploying change to produce statsv metrics to main kafka clusters from varnishkafka. statsv on hafnium will be restarted to consume from main. might cause a short blip in statsv metrics.
  • 18:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Decom db2028 (duration: 01m 01s)
  • 18:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Decom db2028, repool pc2005 (duration: 01m 01s)
  • 17:47 otto@tin: Finished deploy [statsv/statsv@362d1a9]: statsv (duration: 00m 02s)
  • 17:47 otto@tin: Started deploy [statsv/statsv@362d1a9]: statsv
  • 17:35 godog: upload prometheus-jmx-exporter 0.10-3 to jessie/stretch
  • 17:35 demon@tin: Synchronized php-1.31.0-wmf.15/extensions/Wikibase: I9da46c36 (duration: 02m 00s)
  • 17:35 jynus: restart and upgrade db2046
  • 17:07 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2251.*.wmnet
  • 17:02 jynus: performing schema change on db2039 (s6)
  • 16:51 papaul: powering down pc2005 for maintenance
  • 16:18 otto@tin: Finished deploy [statsv/statsv@0a86be8]: revert (duration: 00m 02s)
  • 16:18 otto@tin: Started deploy [statsv/statsv@0a86be8]: revert
  • 16:18 otto@tin: Finished deploy [statsv/statsv@c390cdf]: no-op deployment of statsv with support for multiple topics (duration: 00m 03s)
  • 16:18 otto@tin: Started deploy [statsv/statsv@c390cdf]: no-op deployment of statsv with support for multiple topics
  • 16:14 papaul: powering down mw2251 for memory replacement and firmware upgrade
  • 16:02 urandom: drop unused keyspaces in legacy restbase cluster - T183745
  • 15:51 niharika29@tin: Finished deploy [scholarships/scholarships@ec05ae7]: Remove outdated i18n files (duration: 00m 02s)
  • 15:51 niharika29@tin: Started deploy [scholarships/scholarships@ec05ae7]: Remove outdated i18n files
  • 15:48 jynus: stop pc2005's database for maintenance T183750
  • 15:46 jynus@tin: Synchronized wmf-config/db-codfw.php: "Depool" pc2005 (duration: 01m 02s)
  • 15:38 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2251.*.wmnet
  • 15:28 niharika29@tin: Finished deploy [scholarships/scholarships@ec05ae7]: Update i18n files (duration: 00m 02s)
  • 15:28 niharika29@tin: Started deploy [scholarships/scholarships@ec05ae7]: Update i18n files
  • 15:14 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover s6-master db2028 to db2039 (duration: 01m 01s)
  • 15:08 jynus: stopping db2028's mysql to apply new config
  • 15:01 godog: roll-restart thumbor in eqiad after upgrade - T183907
  • 15:00 ottomata: restarting kafka-jumbo brokers to enable tls version and cipher suite restrictions
  • 14:55 jynus: switchover db2028 to db2039 as codfw-s6-master
  • 14:39 godog: rollout python-thumbor-wikimedia 1.8 - T183907
  • 14:30 zeljkof: EU SWAT finished
  • 14:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Translation NS for kowikisource (T183836) (duration: 01m 00s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add patrol to Image-reviewer on Commons (T183835) (duration: 01m 02s)
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T174569 (duration: 01m 02s)
  • 13:17 moritzm: upgrading mw1261-mw1265 to HHVM 3.18.5+dfsg-1+wmf2
  • 13:07 moritzm: uploaded hhvm 3.18.5+dfsg-1+wmf2 (including the fixes from 3.18.6) for jessie-wikimedia to apt.wikimedia.org
  • 12:53 moritzm: importing linux 4.9.65-3+deb9u1~bpo8+1 for jessie-wikimedia to apt.wikimedia.org
  • 12:14 mobrovac@tin: Finished deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557 (duration: 02m 10s)
  • 12:12 mobrovac@tin: Started deploy [mathoid/deploy@63b2ddc]: Bring back codfw in sync with eqiad - T183557
  • 11:57 moritzm: upgrading app servers in deployment-prep to hhvm 3.18.5+dfsg-1+wmf2 (which contains the patches from 3.18.6)
  • 11:52 jynus: upgrade and restart db2039
  • 11:49 jynus: disabling puppet on db2039 and db2028 in preparation for gerrit:401706 deployment
  • 11:47 akosiaris: boot ganeti1006. It exhibited page allocation stalls on Jan 1. T181121
  • 11:39 marostegui: Deploy schema change on db1086 - T174569
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T174569 (duration: 01m 01s)
  • 11:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 - T174569 (duration: 01m 02s)
  • 11:32 mobrovac@tin: Finished deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557 (duration: 02m 15s)
  • 11:29 mobrovac@tin: Started deploy [mathoid/deploy@91648aa]: Update to Mathoid v0.7.0 in codfw only for T183557
  • 11:28 mobrovac@tin: Finished deploy [mathoid/deploy@91648aa]: (no justification provided) (duration: 00m 40s)
  • 11:28 mobrovac@tin: Started deploy [mathoid/deploy@91648aa]: (no justification provided)
  • 11:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1336.*.eqiad.wmnet
  • 11:00 mobrovac@tin: Started restart [changeprop/deploy@3c4f51d]: Pick up the new RESTBase DNS
  • 10:48 mobrovac@tin: Started restart [mobileapps/deploy@bf85a55]: Pick up the new RESTBase DNS
  • 10:45 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase,name=codfw
  • 09:57 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1337.*.eqiad.wmnet
  • 08:59 elukey: stop eventlogging mysql insertion on eventlog1001 to allow db1107 maintenance - T168414
  • 06:57 marostegui: Deploy schema change on db1101:3317 - T174569
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 - T174569 (duration: 01m 01s)
  • 06:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 - T174569 (duration: 01m 10s)
  • 06:47 kartik@tin: Finished deploy [cxserver/deploy@66e384e]: Update cxserver to cc01477 (duration: 04m 49s)
  • 06:43 kartik@tin: Started deploy [cxserver/deploy@66e384e]: Update cxserver to cc01477
  • 06:37 marostegui: Deploy schema change on s1 master db1052 - T174569
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.12) (duration: 06m 56s)
  • 01:26 eileen: civicrm updated civicrm revision changed from ffa9d7fc7a to 429a5c5385, config revision is a7b9b58595
  • 01:18 eileen: update process-control to use different reference to civicrm_root (symlinks) process-control config revision is a7b9b58595
  • 01:01 reedy@tin: Synchronized php-1.31.0-wmf.15/resources/src/mediawiki/mediawiki.editfont.css: T182320 (duration: 01m 01s)
  • 00:59 reedy@tin: Synchronized php-1.31.0-wmf.15/extensions/Flow: T182320 (duration: 01m 18s)
  • 00:58 reedy@tin: Synchronized php-1.31.0-wmf.15/extensions/CodeMirror: T182320 (duration: 00m 59s)
  • 00:51 eileen: rollback smashPig SmashPig revision changed from ab7802d5b3 to 45aa62650c (locked), config revision is 4a4c61ae1b
  • 00:38 reedy@tin: Synchronized php-1.31.0-wmf.12/resources/src/mediawiki.rcfilters/dm/: RCFilters (duration: 01m 02s)
  • 00:36 reedy@tin: Synchronized php-1.31.0-wmf.15/resources/src/mediawiki.rcfilters/dm/: RCFilters (duration: 01m 02s)
  • 00:09 reedy@tin: Synchronized wmf-config/CirrusSearch-common.php: Lower ElasticSearch index refresh interval for Wikidata to 5s (duration: 01m 02s)
  • 00:06 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Add wmgCirrusSearchRefreshInterval (duration: 01m 02s)

2018-01-02

  • 22:04 herron: upgrading trusty puppet agents to puppet 4
  • 21:00 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.15
  • 20:59 demon@tin: Synchronized php-1.31.0-wmf.15/includes/Setup.php: Aaron made me do it (duration: 01m 04s)
  • 20:48 ottomata: restarting kafka-jumbo brokers for version 1.0 upgrade
  • 19:15 demon@tin: Finished scap: wmf.15 bootstrap (duration: 34m 55s)
  • 18:46 subbu: started linter-reparse script on terbium to reprocess itwiki pages (safe to kill -9 the script at any point)
  • 18:40 demon@tin: Started scap: wmf.15 bootstrap
  • 18:37 ebernhardson: T183053 update index.refresh_interval for wikidatawiki_{content,general} on eqiad to 5s
  • 18:30 jgleeson: Updating Smashpig from 45aa62650c to ab7802d5b3
  • 18:21 arlolra@tin: Finished deploy [parsoid/deploy@4d55952]: Updating Parsoid to 28d7734 (duration: 11m 57s)
  • 18:20 awight@tin: Finished deploy [ores/deploy@eb0f776]: Update ORES service to eb0f776: T182614 (duration: 19m 55s)
  • 18:10 moritzm: rebooting multatuli for kernel test
  • 18:09 arlolra@tin: Started deploy [parsoid/deploy@4d55952]: Updating Parsoid to 28d7734
  • 18:00 awight@tin: Started deploy [ores/deploy@eb0f776]: Update ORES service to eb0f776: T182614
  • 17:28 demon@tin: Pruned MediaWiki: 1.31.0-wmf.11 (duration: 01m 24s)
  • 17:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.10 (duration: 01m 29s)
  • 17:05 ejegg: updated payments-wiki from e91db27108 to 40145892e7
  • 17:01 jynus: add missing mysql grants to db1103:s4
  • 16:53 jynus: add missing mysql grants to db1097:s4
  • 16:51 herron: restarted exim and spamd services on fermium, mx1001 and mx2001 for openssl update
  • 16:48 elukey@puppetmaster1001: conftool action : set/weight=30; selector: name=mw13(29|3[0-3]).*.eqiad.wmnet
  • 16:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1029 (duration: 00m 51s)
  • 15:55 moritzm: installing openssl updates on restbase* hosts
  • 15:53 elukey@puppetmaster1001: conftool action : set/weight=20; selector: name=mw13(29|3[0-3]).*.eqiad.wmnet
  • 15:33 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1335.*.eqiad.wmnet
  • 15:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1055 & db1056 x1 weight (duration: 00m 50s)
  • 15:16 akosiaris: boot ganeti1008 with older 4.4 kernel and migrate multiple VMs to it. T181121
  • 15:05 zeljkof: EU SWAT finished
  • 15:04 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Set watchcreations preference to true by default on Commons (T178750) (duration: 00m 51s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set watchcreations preference to true by default on Commons (T178750) (duration: 00m 51s)
  • 14:54 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mapframe on lvwiki (T183661) (duration: 00m 51s)
  • 14:42 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Revert "Switch Wikipedias from $wgLogoHD to direct using of a SVG" (T178942) (duration: 00m 51s)
  • 14:41 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Switch Wikipedias from $wgLogoHD to direct using of a SVG" (T178942) (duration: 00m 51s)
  • 14:33 zfilipin@tin: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:32 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Switch Wikipedias from $wgLogoHD to direct using of a SVG (T178942) (duration: 01m 59s)
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoreview/editor at ruwikt (T183719) (duration: 00m 51s)
  • 14:12 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create rollbacker user group for ruwiktionary (T183655) (duration: 00m 52s)
  • 14:07 foks: removed 2FA for Martin_Urbanec
  • 14:00 moritzm: installing further openssl updates
  • 13:49 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1333.*.eqiad.wmnet
  • 13:48 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1332.*.eqiad.wmnet
  • 13:47 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1331.*.eqiad.wmnet
  • 13:45 marostegui: Deploy alter table db1098:3317 - T174569
  • 13:45 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1330.*.eqiad.wmnet
  • 13:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 - T174569 (duration: 00m 51s)
  • 13:42 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1329.*.eqiad.wmnet
  • 13:41 elukey: enable live traffic for new appservers mw1329->mw1333 (T165519)
  • 13:00 moritzm: installing openssl updates on remaining mw* hosts in eqiad
  • 12:25 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 51s)
  • 12:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 & db1056 as x1 replicas (2nd try) (duration: 00m 51s)
  • 12:21 akosiaris: empty ganeti1008 for kernel downgrade. T181121
  • 12:11 jynus: add missing mysql grants to db1055 and db1056
  • 11:42 moritzm: installing ncurses security updates
  • 11:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Revert: Repool db1055 & db1056 as x1 replicas (duration: 00m 51s)
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 50s)
  • 11:31 mobrovac@tin: Finished deploy [citoid/deploy@ee0bdf4]: Update to service template v0.5.4 - T151394 (duration: 04m 19s)
  • 11:30 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db1055 & db1056 as x1 replicas (duration: 00m 50s)
  • 11:27 mobrovac@tin: Started deploy [citoid/deploy@ee0bdf4]: Update to service template v0.5.4 - T151394
  • 09:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=api_appserver,name=mw1(1.*|200).eqiad.wmnet
  • 09:44 _joe_: setting api_appservers in the mw1180-1200 range to pooled=inactive, T183895
  • 09:43 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=appserver,name=mw1(1.*|200).eqiad.wmnet
  • 09:37 _joe_: setting appservers in the mw1180-1200 range to pooled=inactive, T183895
  • 09:28 godog: reboot ms-be1033 - T183724
  • 08:52 _joe_: restarting also mw1226-8, mw1223, mw1201,mw1203, mw1205-7
  • 08:36 _joe_: likewise for mw1285,mw1235,mw1232
  • 08:29 _joe_: restarting hhvm on mw1280,1282 for the same reasons
  • 08:26 _joe_: restarting hhvm on mw1317, multiple threads stuck in HPHP::jit::enterTCImpl
  • 08:23 elukey: restart druid coordinators on druid* to pick up new jvm settings
  • 08:19 _joe_: restarting hhvm on mw1313, concurrency HPHP::VariableUnserializer::unserializeVariant
  • 08:06 marostegui: Deploy alter table on db1039 (already depooled) - T174569
  • 07:56 marostegui: Deploy schema change on dbstore1001.s7 - T174569
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 52s)
  • 06:42 marostegui: Stop db1110 and dbstore1002.s5 replication in sync
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 to reimport dewiki.langlinks on dbstore1002 (duration: 00m 50s)

2018-01-01
