Server Admin Log

From Wikitech

2018-06-22

  • 19:54 jynus: applying transaction manually on dbstore1002 due to weird bug with savepoint on tokudb+image table

2018-06-21

  • 21:38 ebernhardson: banned elastic1036 from search cluster, waited for all load to shift away, and unbanned
  • 16:38 ejegg: updated CiviCRM from 349d43eb8b to e2c8ddd70e
  • 13:28 vgutierrez: Bump AES128-SHA pageview replacement to 4%
  • 12:49 chasemp: remove labvirt1019 canary to start debug of network
  • 11:08 hashar: Refreshed operations-puppet-tests-docker jenkins job to a new Docker container build that includes isc-dhcp-server | https://gerrit.wikimedia.org/r/c/integration/config/+/441367
  • 02:51 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.999) (duration: 07m 40s)
  • 02:33 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.8) (duration: 14m 12s)

2018-06-20

  • 21:18 ejegg: updated 'domain' settings in CiviCRM so name is now 'Wikimedia Foundation' and description is 'donate.wikimedia.org'
  • 20:08 bearND: rolled back "Update mobileapps to 9d856ec" (was just on canary)
  • 20:07 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@3420e67]: Update mobileapps to 9d856ec (duration: 03m 51s)
  • 20:03 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@3420e67]: Update mobileapps to 9d856ec
  • 19:54 ottomata: removed Kafka MirrorMaker from kafka10(12|13|14)
  • 16:01 sbisson@deploy1001: Finished deploy [kartotherian/deploy@9af9191]: Kartotherian: make Pl fallback to EN (duration: 03m 08s)
  • 15:58 sbisson@deploy1001: Started deploy [kartotherian/deploy@9af9191]: Kartotherian: make Pl fallback to EN
  • 15:57 sbisson@deploy1001: Finished deploy [kartotherian/deploy@9af9191]: Kartotherian: make Pl fallback to EN (duration: 00m 27s)
  • 15:56 sbisson@deploy1001: Started deploy [kartotherian/deploy@9af9191]: Kartotherian: make Pl fallback to EN
  • 15:52 sbisson@deploy1001: Finished deploy [kartotherian/deploy@9af9191]: (no justification provided) (duration: 03m 36s)
  • 15:49 sbisson@deploy1001: Started deploy [kartotherian/deploy@9af9191]: (no justification provided)
  • 14:59 dcausse: disk space issue on elastic1020 is due to shard rebalancing (currently receiving 2 enwiki_general shards but removing one wikidatawiki_content)
  • 14:07 imarlier@deploy1001: Finished deploy [performance/navtiming@742edb0]: (no justification provided) (duration: 00m 04s)
  • 14:07 imarlier@deploy1001: Started deploy [performance/navtiming@742edb0]: (no justification provided)
  • 14:03 imarlier@deploy1001: Finished deploy [performance/navtiming@8914e26]: (no justification provided) (duration: 00m 05s)
  • 14:03 imarlier@deploy1001: Started deploy [performance/navtiming@8914e26]: (no justification provided)
  • 13:43 imarlier@deploy1001: Finished deploy [performance/navtiming@995cb0f]: (no justification provided) (duration: 00m 05s)
  • 13:43 imarlier@deploy1001: Started deploy [performance/navtiming@995cb0f]: (no justification provided)
  • 13:38 anomie: Re-running populateExternallinksIndex60.php on plwiki and ptwiki for phab:T59176 (initial run collided with the s2 master switch).
  • 08:57 _joe_: removing user.log as well on tegmen
  • 08:54 _joe_: running logrotate on tegmen
  • 08:54 _joe_: killall nsca on tegmen
  • 08:51 _joe_: restarting nsca daemon on tegmen (gone wild, hundreds of subprocesses)
  • 08:47 _joe_: removing user.log.1 and messages.log.1 on tegmen to save some space
  • 07:40 _joe_: powercycled mw1272, down since yesterday
  • 07:29 hashar: deploy1001: rebased php-1.32.0-wmf.8/extensions/Translate to catch up with a non production merged change ( https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Translate/+/441127 ).
  • 02:20 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.8) (duration: 08m 17s)

2018-06-19

  • 16:10 mobrovac@deploy1001: Finished deploy [proton/deploy@43af7d9]: (no justification provided) (duration: 00m 27s)
  • 16:09 mobrovac@deploy1001: Started deploy [proton/deploy@43af7d9]: (no justification provided)
  • 14:13 herron: re-enabling puppet agents
  • 14:11 herron: increased client_max_body_size on puppetdb nginx frontends from 30m to 60m https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/441044/
  • 14:06 herron: temporarily disabling puppet agents for deploy of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/441044/
  • 12:41 _joe_: hard reboot of ms-be1019 - unable to ssh, console showing i/o errors only
  • 04:56 ebernhardson: unban elastic1035
  • 04:45 ebernhardson: ban elastic1035 from cluster to allow it to recover
  • 02:36 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.999) (duration: 06m 53s)
  • 02:19 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.8) (duration: 07m 39s)

2018-06-18

  • 19:49 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@bcb2904]: GUI update (duration: 18m 52s)
  • 19:30 smalyshev@deploy1001: Started deploy [wdqs/wdqs@bcb2904]: GUI update
  • 19:30 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@37f6f32]: GUI update (duration: 00m 56s)
  • 19:29 smalyshev@deploy1001: Started deploy [wdqs/wdqs@37f6f32]: GUI update
  • 18:41 Trey314159: reindexing Serbian wikis on elastic@eqiad (T196404)
  • 16:39 urandom: DROP unused Cassandra keyspaces - T197080
  • 10:52 _joe_: removing wtp1043 from all pybal configuration until the disk is replaced T196886
  • 10:04 _joe_: initialize namespace "ci" on the kubernetes staging cluster T196654
  • 09:56 joal@deploy1001: Finished deploy [analytics/refinery@e9dbe79]: Regular weekly deploy (duration: 06m 54s)
  • 09:49 joal@deploy1001: Started deploy [analytics/refinery@e9dbe79]: Regular weekly deploy
  • 03:04 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.999) (duration: 14m 30s)
  • 02:33 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.8) (duration: 13m 12s)
  • 02:27 Trey314159: reindexing Serbian wikis on elastic@codfw (T196404)

2018-06-17

  • 22:24 no_justification: gerrit: started back up, nvm
  • 22:23 no_justification: gerrit: stopping services for a few minutes
  • 13:05 Trey314159: reindexing Serbo-Croatian wikis on elastic@eqiad (T196658)
  • 03:46 Trey314159: reindexing Serbo-Croatian wikis on elastic@codfw (T196658)

2018-06-16

  • 20:42 mdholloway: disabled puppet on maps-test2004 for testing new map styles setup
  • 19:24 Trey314159: reindexing Croatian wikis on elastic@eqiad (T196658)
  • 13:19 Trey314159: reindexing Croatian wikis on elastic@codfw (T196658)
  • 00:15 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Ica4cb6644 (duration: 00m 59s)

2018-06-15

  • 18:48 Trey314159: reindexing Bosnian wikis on elastic@eqiad (T196658)
  • 18:10 Trey314159: reindexing Bosnian wikis on elastic@codfw (T196658)
  • 17:15 krinkle@deploy1001: Synchronized wmf-config/mc.php: I619a2ff5db611 (duration: 00m 58s)
  • 15:49 mutante: install2002 - disabling puppet temp, live hacking DHCP config for debugging backup2001 install issue
  • 15:37 mutante: switching noc.wikimedia.org site from terbium to mwmaint1001 backend, running puppet on all cache::misc cp servers (T192092)
  • 15:19 ottomata: bouncing kafka broker on kafka-jumbo1001 to test https://gerrit.wikimedia.org/r/#/c/440520/
  • 15:19 gehel: rolling restart of elasticsearch eqiad for plugin upgrade completed - T194245
  • 14:49 elukey: restart varnishkafka-eventlogging on cp4028, errors logged
  • 14:43 elukey: restart varnishkafka-eventlogging on cp5012 as attempt to clear out the errors (not needed but logging it anyway)
  • 14:12 papaul: OS install on backup2001
  • 13:41 mepps: updated process control to 3b9f51e0fd
  • 12:56 legoktm@deploy1001: Synchronized wmf-config/mc.php: Make sure that mcrouter BagOStuff goes through ObjectCache::newFromParams() - T197450 (duration: 00m 57s)
  • 12:38 jynus: killing refresh counts on commonswiki
  • 12:30 jynus: reenabling db2040 consistency after slaves caught up
  • 12:13 papaul: OS install on bast2002
  • 10:32 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1119 after alter table (duration: 00m 58s)
  • 09:48 godog: delete cpjobqueue metrics older than 10d - T196067
  • 09:45 jynus: reenabling db2048 consistency after slaves caught up
  • 09:43 jynus: reducing temp. db2040 consistency to speed up slave lag catch up
  • 09:20 godog: fully remove ms-be1036 from swift due to hw failure - T196873
  • 09:03 bawolff: deploy patch T197279
  • 08:05 gehel: rolling restart of elasticsearch eqiad for plugin upgrade - T194245
  • 05:50 moritzm: slow rollout of debmonitor
  • 05:34 moritzm: installing gnupg security updates on trusty (Debian already fixed)
  • 05:13 marostegui: Deploy schema change on db1119 T191316 T192926 T89737 T195193
  • 05:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 for alter table (duration: 00m 57s)
  • 05:06 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1067 after alter table (duration: 01m 07s)

2018-06-14

  • 23:40 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.3 (duration: 04m 38s)
  • 23:31 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.2 (duration: 04m 50s)
  • 23:24 jforrester@deploy1001: Synchronized php-1.32.0-wmf.8/resources/src/mediawiki.rcfilters/styles/mw.rcfilters.less: T195903 SWAT for Roan (duration: 00m 59s)
  • 23:12 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T196400 for SWAT (duration: 01m 00s)
  • 21:56 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.999/extensions/PageTriage/includes/Hooks.php: Fix event presentation class names T197262 (duration: 00m 57s)
  • 21:47 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.999 T196585
  • 21:44 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.8/extensions/PageTriage/includes/Hooks.php: Fix event presentation class names T197262 (duration: 00m 52s)
  • 21:27 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.1 (duration: 06m 54s)
  • 21:18 thcipriani@deploy1001: Finished scap: testwiki to 1.32.0-wmf.999 (multi-content-revisions T196585) and rebuild l10n cache (duration: 65m 50s)
  • 21:12 bblack: re-enable and run puppet on cache_upload - T192555
  • 20:45 bblack: re-enable and run puppet on rest of cache_text (eqiad, eqsin, esams) - T192555
  • 20:21 bblack: re-enable and run puppet on text@ulsfo - T192555
  • 20:12 thcipriani@deploy1001: Started scap: testwiki to 1.32.0-wmf.999 (multi-content-revisions T196585) and rebuild l10n cache
  • 20:11 bblack: re-enable and run puppet on text@codfw - T192555
  • 19:21 vgutierrez: Reenable puppet in cache:misc nodes - T192555
  • 19:13 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.8
  • 18:17 niharika29@deploy1001: Synchronized wmf-config/CommonSettings.php: Bump ExtensionDistributor default to REL1_31 (duration: 00m 57s)
  • 18:14 niharika29@deploy1001: Synchronized langlist-labs: beta: declare beta sr.wikipedia and beta crh.wikipedia to langlist-labs (duration: 00m 58s)
  • 18:10 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on a few more wikis - T195263 (duration: 01m 00s)
  • 17:34 gehel: rolling restart of elasticsearch codfw completed - T194245
  • 17:10 ottomata: bouncing kafka broker on kafka2003 to make sure ACLs are ok
  • 17:08 ottomata: applying ACLs to Kafka main-codfw and main-eqiad - T196081
  • 16:41 vgutierrez: disable puppet on cache nodes before merging gerrit/440114 - T192555
  • 15:39 ottomata: switching EventStreams service to be backed by main-eqiad - T185225
  • 14:48 ejegg: disabled donations and refund queue consumers
  • 14:45 ejegg: updated CiviCRM from 69091d8 to f54f981
  • 14:40 mutante: moving wikidata query dispatcher from terbium to mwmaint1001 - scheduled downtime - check turned into a WARN - disabling puppet on mwmaint1001, removing crons on terbium, waiting a couple minutes for them to finish, re-enabling puppet on mwmaint1001 (T192092)
  • 14:32 moritzm: (slow) initial rollout of debmonitor-client
  • 13:47 zeljkof: EU SWAT finished
  • 13:46 zfilipin@deploy1001: Synchronized php-1.32.0-wmf.8/extensions/Wikibase: SWAT: Revert "Statement transclusion: when entity of unknown type in statement, display ID as string" (T195615) (duration: 01m 21s)
  • 13:35 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set a few of namespace aliases on ruwikisource (T196719) (duration: 00m 59s)
  • 13:11 volans@deploy1001: Finished deploy [debmonitor/deploy@476fd8b]: Release v0.1.4 (duration: 00m 19s)
  • 13:10 volans@deploy1001: Started deploy [debmonitor/deploy@476fd8b]: Release v0.1.4
  • 12:54 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 12:29 ema: cp3037: run update-ocsp-all
  • 12:23 gehel: rolling restart of elasticsearch codfw for plugin upgrade - T194245
  • 11:38 marostegui: Deploy schema change on db1067 T191316 T192926 T89737 T195193
  • 11:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1067 for alter table (duration: 00m 57s)
  • 11:16 elukey: upgrade cassandra on aqs* to 2.2.6-wmf5
  • 11:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 after alter table (duration: 01m 01s)
  • 10:04 addshore: @terbium: for i in {2001..3000}; do echo Lexeme:L$i; done | mwscript purgePage.php --wiki wikidatawiki // T197222
  • 09:59 addshore: @terbium: for i in {1001..2000}; do echo Lexeme:L$i; done | mwscript purgePage.php --wiki wikidatawiki // T197222
  • 09:57 addshore: @terbium: for i in {1..1000}; do echo Lexeme:L$i; done | mwscript purgePage.php --wiki wikidatawiki // T197222
  • 09:15 elukey: add debmonitor term to analytics-in4 on cr1/cr2 eqiad
  • 08:59 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@2680ba8]: Decrease the checkerJob delays recheck to 10 minutes (duration: 01m 18s)
  • 08:58 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@2680ba8]: Decrease the checkerJob delays recheck to 10 minutes
  • 08:31 elukey: restart hadoop hdfs master nodes to pick up the new journal node settings
  • 08:29 gehel: restart of elasticsearch / relforge for plugin updates
  • 08:08 mutante: switch backend for dbtree.wikimedia.org away from terbium to mwmaint1001 (T192092)
  • 08:07 elukey: roll restart of hadoop journal nodes to pick up the new configuration (two more journal nodes added)
  • 08:03 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETAONLY: senses for beta wikidatas (duration: 00m 58s)
  • 07:59 moritzm: resuming rolling restart of cassandra on restbase2* to pick up OpenJDK security update
  • 07:41 marostegui: Deploy schema change on db1114 T191316 T192926 T89737 T195193
  • 07:41 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1114 for alter table (duration: 00m 58s)
  • 07:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 after alter table (duration: 00m 58s)
  • 05:47 mutante: LDAP - added user mepps to wmf group (T192472)
  • 05:15 marostegui: Deploy schema change on db1083 T191316 T192926 T89737 T195193
  • 05:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 for alter table (duration: 00m 58s)
  • 05:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 after alter table (duration: 01m 01s)
  • 05:03 marostegui: Deploy schema change on s4 primary master (db1068) T191316 T192926 T195193
  • 03:06 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Thu Jun 14 03:06:19 UTC 2018 (duration 10m 30s)
  • 02:55 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.8) (duration: 14m 42s)
  • 02:22 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.7) (duration: 09m 41s)

2018-06-13

  • 23:15 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/432093/ (duration: 00m 58s)
  • 23:07 maxsem@deploy1001: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/440178/ (duration: 01m 00s)
  • 21:20 awight@deploy1001: Finished deploy [ores/deploy@36037b6]: New badwords for ORES in English: T196468 (duration: 72m 56s)
  • 20:07 awight@deploy1001: Started deploy [ores/deploy@36037b6]: New badwords for ORES in English: T196468
  • 19:26 dduvall@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.8 (duration: 00m 57s)
  • 19:25 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.8
  • 18:28 twentyafterfour: finished swat (28m overtime! :P)
  • 18:22 twentyafterfour: ran namespaceDupes for urwiktionary
  • 18:20 twentyafterfour@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: sync 5b47244 14ca2ba 63bc100m and 2569a77 refs T196488, T196744, T196727, T196763 (duration: 00m 57s)
  • 17:55 twentyafterfour@deploy1001: Synchronized static/images/project-logos/: sync bnwikivoyage logos refs T196803 (duration: 00m 58s)
  • 17:53 twentyafterfour@deploy1001: Synchronized static/images/project-logos/bnwikivoyage-1.5x.png: static/images/project-logos/bnwikivoyage-2x.png static/images/project-logos/bnwikivoyage.png sync bnwikivoyage logos refs T196803 (duration: 00m 58s)
  • 17:49 twentyafterfour@deploy1001: Synchronized wmf-config: Sync https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/436430/ for SWAT refs T184244 (duration: 01m 00s)
  • 17:08 mutante: terbium - closing unused screen sessions for all Amir users (2)
  • 16:53 ppchelko@deploy1001: Finished deploy [restbase/deploy@f521e7e]: Add page_language to title_revision table T197082 (duration: 15m 44s)
  • 16:38 ppchelko@deploy1001: Started deploy [restbase/deploy@f521e7e]: Add page_language to title_revision table T197082
  • 16:31 XioNoX: moving mr1-eqiad interfaces to new router
  • 16:14 mutante: rsyncing /home dirs from terbium to mwmaint1001, they will appear later in a subdir "home-terbium" like it was done for tin->deploy1001 (T192092)
  • 16:13 urandom: ALTERing Cassandra schema - T197082
  • 16:11 moritzm: installing plexus-archiver security updates
  • 16:07 moritzm: installing imagemagick security updates on trusty (Debian already fixed)
  • 15:55 elukey: rolling restart of aqs on aqs100[4-9] to pick up the new config changes
  • 15:25 jynus: stopping db1053 and db1059 in preparation for decomm T194634 T196606
  • 15:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1076 (duration: 00m 58s)
  • 14:19 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 (duration: 00m 57s)
  • 14:15 anomie@deploy1001: Synchronized php-1.32.0-wmf.8/includes/Category.php: Backporting fix for T195397 (gerrit:440053) (duration: 01m 00s)
  • 14:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1076 with low weight (duration: 00m 58s)
  • 14:01 zeljkof: EU SWAT finished
  • 14:00 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Implementing Patroller User Rights for azwiki (T196488) (duration: 00m 57s)
  • 13:54 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: English aliases for extra namespaces on urwiktionary (T196614) (duration: 00m 58s)
  • 13:48 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix wrong language in ur.wiktionary namespace (T196614) (duration: 00m 58s)
  • 13:28 elukey@deploy1001: Finished deploy [analytics/aqs/deploy@160206f]: (no justification provided) (duration: 04m 11s)
  • 13:26 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make ProofreadPage operate on correct namespaces in pmswikisource (T197033) (duration: 00m 57s)
  • 13:24 elukey@deploy1001: Started deploy [analytics/aqs/deploy@160206f]: (no justification provided)
  • 13:12 zfilipin@deploy1001: Synchronized wmf-config/: SWAT: Enable FileImporter monolog channel in production (T195370) (duration: 01m 00s)
  • 13:02 moritzm: rolling restart of cassandra in codfw to pick up OpenJDK security update
  • 13:00 elukey@deploy1001: Finished deploy [analytics/aqs/deploy@84fab89]: Update AQS for T190213 (duration: 02m 38s)
  • 12:57 elukey@deploy1001: Started deploy [analytics/aqs/deploy@84fab89]: Update AQS for T190213
  • 12:46 elukey: restart mirror maker on kafka1012->1014 to pick up new openjdk-7 upgrades
  • 12:28 elukey: rolling restart of kafka on kafka1012->23 for openjdk-7 upgrades
  • 12:16 arturo: T196633 extend downtime for labcontrol1003
  • 11:21 moritzm: installing perl security updates
  • 10:59 marostegui: Deploy schema change on db1105:3311 T191316 T192926 T89737 T195193
  • 10:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 for alter table (duration: 00m 58s)
  • 10:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 after alter table (duration: 00m 59s)
  • 10:10 akosiaris: upload apertium-apy_0.11.3-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 09:00 marostegui: Restart mysql on dbstore1002 for maintenance
  • 08:46 ema: depool cp1053 T165252
  • 08:04 marostegui: Stop MySQL and reboot db1076 - T197063
  • 08:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1076 for binlog change - T197063 (duration: 00m 57s)
  • 07:59 oblivian@deploy1001: Synchronized wmf-config/ProductionServices.php: Remove unused redis shards from the jobqueue T197003 (duration: 00m 58s)
  • 07:57 Pchelolo: restart cpjobqueue on scb1001 because it has lost all its workers
  • 07:55 marostegui: Deploy schema change on db1089 T191316 T192926 T89737 T195193
  • 07:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 for alter table (duration: 00m 58s)
  • 07:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Restore s2 default read-only message (duration: 00m 57s)
  • 07:41 marostegui: Stop MySQL on db1054 for socket update and binlog change
  • 07:33 godog: start removing ms-be1036 from swift rings - T196873
  • 06:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 after alter table (duration: 00m 59s)
  • 06:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove read only from s2 - T194870 (duration: 00m 33s)
  • 06:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove read only from s2 - T194870 (duration: 00m 34s)
  • 06:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Set s2 on read-only for primary db master maintenance - T194870 (duration: 01m 08s)
  • 06:00 marostegui: Starting s2 failover from db1054 to db1066 - T194870
  • 05:11 marostegui: Starting topology changes in order to get ready for s2 failover - T194870
  • 05:09 marostegui: Disable gtid on db1066
  • 05:03 marostegui: Deploy schema change on dbstore1001:s1 T191316 T192926 T89737 T195193
  • 03:14 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Wed Jun 13 03:14:53 UTC 2018 (duration 10m 19s)
  • 03:04 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.8) (duration: 15m 45s)
  • 02:31 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.7) (duration: 12m 21s)
  • 01:12 aaron@deploy1001: Synchronized wmf-config/mc.php: Add "memcached-mcrouter" to $wgObjectCaches as default for testwiki (duration: 00m 58s)
  • 01:01 aaron@deploy1001: Synchronized php-1.32.0-wmf.7/tests/phpunit/includes/db/LBFactoryTest.php: (no justification provided) (duration: 00m 57s)
  • 01:00 aaron@deploy1001: Synchronized php-1.32.0-wmf.7/includes/libs/rdbms/lbfactory/LBFactory.php: f83bad6 (duration: 00m 59s)
  • 00:55 aaron@deploy1001: Synchronized php-1.32.0-wmf.8/tests/phpunit/includes/db/LBFactoryTest.php: (no justification provided) (duration: 00m 58s)
  • 00:53 aaron@deploy1001: Synchronized php-1.32.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php: c2df9668d13 (duration: 00m 58s)
  • 00:46 bawolff@deploy1001: Synchronized php-1.32.0-wmf.7/extensions/CentralAuth/AntiSpoof/CentralAuthAntiSpoofHooks.php: Deploy I5f25c5 (prevent users from registering previously renamed users) (duration: 00m 57s)
  • 00:44 bawolff@deploy1001: Synchronized php-1.32.0-wmf.7/extensions/CentralAuth/extension.json: Deploy I5f25c5 (prevent users from registering previously renamed users) (duration: 00m 58s)
  • 00:40 bawolff@deploy1001: Synchronized php-1.32.0-wmf.8/extensions/CentralAuth/AntiSpoof/CentralAuthAntiSpoofHooks.php: Deploy I5f25c5 (prevent users from registering previously renamed users) (duration: 00m 57s)
  • 00:39 bawolff@deploy1001: Synchronized php-1.32.0-wmf.8/extensions/CentralAuth/extension.json: Deploy I5f25c5 (prevent users from registering previously renamed users) (duration: 00m 57s)
  • 00:37 bawolff@deploy1001: Synchronized wmf-config/CommonSettings.php: Deploy I5f25c5 (prevent users from registering previously renamed users) (duration: 00m 59s)

2018-06-12

  • 23:47 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add sites to the wgCopyUploadsDomains whitelist of Wikimedia Commons T195270, T195928 (duration: 00m 59s)
  • 23:05 tzatziki: resetting passwords for compromised accounts (T197046)
  • 23:00 bblack: cp3043 - done, reimaged, in live service for cache_upload
  • 22:59 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp3043.esams.wmnet
  • 22:37 tzatziki: (from yesterday) resetting passwords for compromised accounts (T197046)
  • 22:24 twentyafterfour: phabricator: I scheduled a 24 hour downtime in icinga for the phd service, to give me time to work on this issue. See T196840
  • 22:23 twentyafterfour: phabricator: taking phd offline to relieve the load on the m3 database cluster
  • 21:46 bblack: cp3046 - restart varnish backend for mbox lag
  • 21:40 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3043.esams.wmnet
  • 21:39 bblack: cp3043 - starting process to move to reimage into cache_upload
  • 19:16 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.8
  • 18:57 herron: restarted icinga service on einsteinium
  • 18:42 dduvall@deploy1001: Finished scap: testwiki to php-1.32.0-wmf.8 and rebuild l10n cache (duration: 39m 39s)
  • 18:03 dduvall@deploy1001: Started scap: testwiki to php-1.32.0-wmf.8 and rebuild l10n cache
  • 17:38 ariel@deploy1001: Finished deploy [dumps/dumps@038c8b3]: sync after snapshot1009 install (duration: 00m 04s)
  • 17:37 ariel@deploy1001: Started deploy [dumps/dumps@038c8b3]: sync after snapshot1009 install
  • 17:37 ariel@deploy1001: Finished deploy [dumps/dumps@038c8b3]: sync after snapshot1009 install (duration: 00m 07s)
  • 17:37 ariel@deploy1001: Started deploy [dumps/dumps@038c8b3]: sync after snapshot1009 install
  • 16:54 marxarelli: starting branch cut for 1.32.0-wmf.8
  • 16:11 volans@deploy1001: Finished deploy [debmonitor/deploy@0eca14a]: Release v0.1.3 (duration: 00m 22s)
  • 16:11 volans@deploy1001: Started deploy [debmonitor/deploy@0eca14a]: Release v0.1.3
  • 15:40 bblack: cp3043 - nevermind, doing different approach later in the day, still pooled in text for now!
  • 15:29 bblack: cp3043 switching from text to upload shortly, downtimed in icinga for 2h - https://gerrit.wikimedia.org/r/c/operations/puppet/+/439936
  • 15:07 ema: cp3039: restart varnish-backend
  • 14:38 addshore: file exporter importer slot done
  • 14:38 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: FileImporter/Exporter Enable FileExporter/Importer on group0 wikis T195370 (duration: 00m 51s)
  • 14:20 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: FileImporter/Exporter Allow setting of export target for FileExporter T195370 (duration: 00m 50s)
  • 14:09 addshore@deploy1001: Finished scap: FileExporter backport - Pre deployment backport (extension not yet deployed) (duration: 30m 37s)
  • 13:38 addshore@deploy1001: Started scap: FileExporter backport - Pre deployment backport (extension not yet deployed)
  • 13:16 moritzm: installing openjdk-8 security updates on restbase-dev along with cassandra restarts
  • 12:38 ema: cp3035: restart varnish-be, mbox lag
  • 12:34 _joe_: repooling mw1230 after reimaging T196881
  • 12:14 marostegui: Deploy schema change on db1099:3311 T191316 T192926 T89737 T195193
  • 12:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 for alter table (duration: 00m 52s)
  • 12:11 marostegui: Deploy schema change on dbstore1002:s1 T191316 T192926 T89737 T195193
  • 12:05 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: dc=.*,service=mathoid,cluster=kubernetes,name=.*
  • 11:47 moritzm: updated component/cassandra311 on apt.wikimedia.org to 3.11.2
  • 10:26 jynus: setting expire_logs_days on db1066 as 30
  • 10:21 godog: bounce stuck rsyslog on lithium / wezen - T136312
  • 09:41 vgutierrez: cp3037 has been depooled due to unknown hardware issues T196974
  • 08:48 marostegui: Stop replication on db2094 to change triggers for archive table
  • 08:36 volans: running puppet on failed hosts post small puppet outage and puppetdb reboot
  • 08:35 akosiaris: rebalance ganeti codfw cluster
  • 08:35 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3037.esams.wmnet
  • 08:33 akosiaris: reboot puppetdb1001 for spec-ctrl enable. Bundling it with a minor puppet outage to only have a torrent of harmless puppet failures once
  • 08:15 akosiaris: ganeti2002 reboot for microcode update
  • 08:04 akosiaris: ganeti2006 reboot for microcode update
  • 08:03 marostegui: Deploy schema change on s1 codfw primary master (db2048) with replication, this will generate lag on codfw T191316 T192926 T89737 T195193
  • 07:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1121 after alter table (duration: 00m 50s)
  • 07:43 akosiaris: ganeti2007 reboot for microcode update
  • 07:41 akosiaris: ganeti2003 reboot for microcode update
  • 07:31 mutante: closing idle screen session on tin (about to be decommissioned, don't use anymore)
  • 06:37 marostegui: Deploy schema change on db1121 with replication, this will generate lag on labsdb:s4 T191316 T192926 T89737 T195193
  • 06:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1121 for alter table (duration: 00m 50s)
  • 06:31 marostegui: Stop replication on db1095, db1102, db1125 to change triggers - T192926
  • 06:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 after alter table (duration: 00m 51s)
  • 05:09 marostegui: Deploy schema change on db1091 T191316 T192926 T89737 T195193
  • 05:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 for alter table (duration: 00m 52s)
  • 02:45 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Tue Jun 12 02:45:53 UTC 2018 (duration 10m 18s)
  • 02:35 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.7) (duration: 14m 10s)
  • 00:35 legoktm: remove non-deployers from wmf-deployment Gerrit group (T196959)
  • 00:16 ejegg: updated CiviCRM from 69091d8b5f to f54f9810ab

2018-06-11

  • 23:45 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Lower CirrusSearch delayed job drop timeout (duration: 00m 50s)
  • 23:26 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tune CirrusSearch slow logging (duration: 00m 48s)
  • 23:18 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Promote Cirrus MLR models from AB test to prod (duration: 00m 51s)
  • 23:02 twentyafterfour: phabricator: restarting apache2 on phab1001 to free up apache workers
  • 22:17 awight@deploy1001: Finished deploy [ores/deploy@6ee8775]: ORES: bswiki, euwiki, srwiki models (duration: 33m 58s)
  • 21:43 awight@deploy1001: Started deploy [ores/deploy@6ee8775]: ORES: bswiki, euwiki, srwiki models
  • 20:57 arlolra: Updated Parsoid to 06b74d2 (T191843)
  • 20:49 arlolra@deploy1001: Finished deploy [parsoid/deploy@97cdab8]: Updating Parsoid to 06b74d2 (duration: 17m 09s)
  • 20:32 arlolra@deploy1001: Started deploy [parsoid/deploy@97cdab8]: Updating Parsoid to 06b74d2
  • 20:27 otto@deploy1001: Finished deploy [eventstreams/deploy@6b013f9]: Enable composite stream and timestamp since param - T196009 , T187418 (duration: 09m 52s)
  • 20:17 otto@deploy1001: Started deploy [eventstreams/deploy@6b013f9]: Enable composite stream and timestamp since param - T196009 , T187418
  • 20:02 ottomata: bouncing varnishkafka-webrequest on cp3039,cp3047,cp2007,cp3010
  • 19:56 ottomata: bouncing varnishkafka on cp3032
  • 19:29 aaron@deploy1001: Synchronized php-1.32.0-wmf.7/includes/libs/rdbms/ChronologyProtector.php: 11e596776f940 - add some logging details (duration: 00m 53s)
  • 18:50 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Restore full config (duration: 00m 16s)
  • 18:49 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Restore full config
  • 18:15 urandom: convert timeline indices to time-windowed compaction - T196024
  • 17:59 twentyafterfour: phabricator: taking phd offline while I clear out the queue backlog (downtime is logged in icinga) see T196840
  • 17:51 twentyafterfour: phabricator: rebuilding git parent caches
  • 17:33 gehel@deploy1001: Finished deploy [wdqs/wdqs@37f6f32]: new version of wdqs GUI and updater (duration: 13m 38s)
  • 17:27 twentyafterfour: phabricator: restarting phd for D1067
  • 17:25 twentyafterfour: Phabricator: deploying hotfix (D1067) refs T196840 T196860 T196855
  • 17:20 gehel@deploy1001: Started deploy [wdqs/wdqs@37f6f32]: new version of wdqs GUI and updater
  • 17:19 gehel@deploy1001: Finished deploy [wdqs/wdqs@37f6f32]: new version of wdqs GUI and updater (duration: 00m 03s)
  • 17:19 gehel@deploy1001: Started deploy [wdqs/wdqs@37f6f32]: new version of wdqs GUI and updater
  • 17:17 gehel@deploy1001: Finished deploy [wdqs/wdqs@37f6f32]: new version of wdqs GUI and updater (wdqs1009 only) (duration: 00m 26s)
  • 17:16 gehel@deploy1001: Started deploy [wdqs/wdqs@37f6f32]: new version of wdqs GUI and updater (wdqs1009 only)
  • 17:00 akosiaris: ganeti2008 reboot for microcode update
  • 16:42 marostegui: Set disk 32:1 offline on db1065 to get a new one - T196806
  • 16:02 akosiaris@deploy1001: Finished deploy [proton/deploy@97ec4bf]: (no justification provided) (duration: 01m 32s)
  • 16:00 akosiaris@deploy1001: Started deploy [proton/deploy@97ec4bf]: (no justification provided)
  • 15:44 akosiaris@deploy1001: Finished deploy [proton/deploy@97ec4bf]: (no justification provided) (duration: 00m 33s)
  • 15:43 akosiaris@deploy1001: Started deploy [proton/deploy@97ec4bf]: (no justification provided)
  • 15:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 after alter table (duration: 00m 50s)
  • 15:33 akosiaris@deploy1001: Finished deploy [proton/deploy@97ec4bf]: (no justification provided) (duration: 00m 22s)
  • 15:33 akosiaris@deploy1001: Started deploy [proton/deploy@97ec4bf]: (no justification provided)
  • 15:31 marostegui: Set offline disk 32:3 on db1063 - T196806
  • 15:31 marostegui: Set offline disk 32:1 on db1065 - T196806
  • 15:02 akosiaris@deploy1001: Finished deploy [proton/deploy@97ec4bf]: (no justification provided) (duration: 35m 33s)
  • 14:55 marostegui: Deploy schema change on db1084 T191316 T192926 T89737 T195193
  • 14:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 for alter table (duration: 00m 50s)
  • 14:53 akosiaris: reboot bohrium for kernel upgrades and spec-ctrl enabling. Manually stopped mysql beforehand
  • 14:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1081 after alter table (duration: 00m 50s)
  • 14:35 akosiaris: reboot mx1001, poolcounter1001 for kernel upgrades and spec-ctrl enabling
  • 14:26 akosiaris@deploy1001: Started deploy [proton/deploy@97ec4bf]: (no justification provided)
  • 14:25 godog: upload scap 3.8.2-1 - T196710
  • 14:21 hoo: Updated operations/dumps/dcat (536bd5b..559dee3) on snapshot1008
  • 13:58 marostegui: Deploy schema change on db1081 T191316 T192926 T89737 T195193
  • 13:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1081 for alter table (duration: 00m 48s)
  • 13:56 otto@deploy1001: Finished deploy [eventlogging/analytics@08a1dff]: Producing events with kafka timestamp set to event time - T196407 (duration: 00m 04s)
  • 13:56 otto@deploy1001: Started deploy [eventlogging/analytics@08a1dff]: Producing events with kafka timestamp set to event time - T196407
  • 13:54 otto@deploy1001: Finished deploy [eventlogging/eventbus@08a1dff]: Producing events with kafka timestamp set to event time - T196407 (duration: 01m 55s)
  • 13:52 otto@deploy1001: Started deploy [eventlogging/eventbus@08a1dff]: Producing events with kafka timestamp set to event time - T196407
  • 13:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 after alter table (duration: 00m 50s)
  • 13:47 hashar: European SWAT completed
  • 13:35 hashar@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix wgMetaNamespace for pswikivoyage - T196837 (duration: 00m 49s)
  • 13:29 hashar@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix wgMetaNamespace for pswikivoyage - T196837 (duration: 00m 50s)
  • 13:21 volans@deploy1001: Finished deploy [debmonitor/deploy@81d7333]: Release v0.1.2 (duration: 00m 56s)
  • 13:20 volans@deploy1001: Started deploy [debmonitor/deploy@81d7333]: Release v0.1.2
  • 13:20 hashar@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Use uploaded HD logo for bewikiquote - T196134 (duration: 00m 50s)
  • 13:17 hashar@deploy1001: Synchronized static/images/project-logos: Change logo files for bewikiquote - T196134 (duration: 00m 50s)
  • 13:13 hashar@deploy1001: Synchronized static/images/project-logos: Revert "Change bewikiquote logo" - T196134 (duration: 00m 51s)
  • 13:01 volans@deploy1001: Finished deploy [debmonitor/deploy@81d7333]: Release v0.1.2 (duration: 07m 16s)
  • 12:54 volans@deploy1001: Started deploy [debmonitor/deploy@81d7333]: Release v0.1.2
  • 11:45 arturo: T196633 downtime labcontrol100[3,4] due to unexpected puppet errors on installation of keystone
  • 11:38 arturo: T196633 deploy keystone to labcontrol100[3,4].wikimedia.org. Dormant daemon, no DB yet
  • 11:30 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@b5396cd]: Tune cirrus jobs concurrencies (duration: 00m 42s)
  • 11:30 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@b5396cd]: Tune cirrus jobs concurrencies
  • 11:23 marostegui: Deploy schema change on db1103:3314 T191316 T192926 T89737 T195193
  • 11:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 for alter table (duration: 00m 50s)
  • 11:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 after alter table (duration: 00m 51s)
  • 10:52 mutante: phab1002 - editing cached scap config /srv/deployment/phabricator/deployment-cache/.config to replace tin.eqiad with deploy1001.eqiad deployment server, run puppet. other options: run scap with --refresh-config, delete cached .config file (T196019) (T175288)
  • 10:29 _joe_: depooling permanently mw1230 for disk replacement, T196881
  • 10:11 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 10:10 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 09:07 marostegui: Deploy schema change on db1097:3314 T191316 T192926 T89737 T195193
  • 09:06 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 for alter table (duration: 00m 52s)
  • 08:32 moritzm: installing gnupg2 security updates
  • 08:27 gehel: restart elastic1020 to enable G1 GC - T156137
  • 08:25 moritzm: installing gnupg1 security updates
  • 07:52 moritzm: installing gnupg security updates
  • 07:37 marostegui: Deploy schema change on dbstore1002:s4 T191316 T192926 T89737 T195193
  • 07:29 moritzm: installing openjdk-7 security updates
  • 06:39 elukey: restart pdfrender on scb1002
  • 06:14 marostegui: Deploy schema change on s4 codfw master (db2051) this will generate lag on codfw - T191316 T192926 T89737 T195193
  • 06:10 marostegui: Stop replication on db2095 to update triggers - T192926
  • 05:32 marostegui: Restart mysql on eqiad sanitariums (db1095, db1102, db1124, db1125) to pick up new replication filters - T196748
  • 05:28 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool pc2005 - T196339 (duration: 00m 52s)
  • 05:25 marostegui: Restart mysql on codfw sanitariums (db2094, db2095) to pick up new replication filters - T196748
  • 05:17 marostegui: Stop MySQL and reboot pc2005 for intel-microcode update and final HW check - T196339
  • 05:15 marostegui: Deploy schema change on s6 primary master (db1061) - T191316 T192926 T195193
  • 02:42 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Mon Jun 11 02:42:42 UTC 2018 (duration 10m 16s)
  • 02:32 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.7) (duration: 13m 06s)

2018-06-10

  • 12:30 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: CSP in report mode for group0 (duration: 00m 55s)
  • 12:01 bawolff: disable some botpasswords (T194204)
  • 11:08 _joe_: restarting gerrit on cobalt as the web interface is unresponsive

2018-06-08

  • 22:41 no_justification: gerrit: restarting
  • 19:36 urandom: upgrade Cassandra to 3.11.2, restbase1015 & restbase1018 - T178905
  • 18:49 urandom: upgrade Cassandra to 3.11.2, restbase1009 & restbase1014 - T178905
  • 18:02 urandom: upgrade Cassandra to 3.11.2, restbase1017 & restbase1013 - T178905
  • 17:33 no_justification: gerrit: up mostly, but will see some errors about "wip" label for a bit until reindexing completes.
  • 17:28 cmjohnson: powering flerovium down to move to a different space in the rack
  • 17:14 demon@deploy1001: Finished deploy [gerrit/gerrit@7324140]: 2.15.2 (duration: 00m 11s)
  • 17:14 demon@deploy1001: Started deploy [gerrit/gerrit@7324140]: 2.15.2
  • 17:12 no_justification: gerrit: taking offline for 2.14 -> 2.15 upgrade
  • 15:35 urandom: upgrade Cassandra to 3.11.2, restbase1012 - T178905
  • 15:26 marostegui: Restart MySQL on codfw sanitarium hosts db2094, db2095 - https://phabricator.wikimedia.org/T196748
  • 15:00 urandom: upgrade Cassandra to 3.11.2, restbase1008 - T178905
  • 14:17 urandom: upgrade Cassandra to 3.11.2, restbase1011 & restbase1016 - T178905
  • 12:14 twentyafterfour: running phabricator public_task_dump script manually to confirm that it's working as expected.
  • 09:31 arturo: merging https://gerrit.wikimedia.org/r/#/c/436337/ for the eqiad1 openstack deployment (labcontrol1003/labcontrol1004)
  • 09:19 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Unify and update sanitarium comments - T190704 (duration: 00m 50s)
  • 09:18 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Unify and update sanitarium comments - T190704 (duration: 00m 50s)
  • 09:06 akosiaris@deploy1001: Started deploy [proton/deploy@97ec4bf]: (no justification provided)
  • 08:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1066 after reboot (duration: 00m 50s)
  • 08:18 marostegui: Stop MySQL and reboot db1066 for intel-microcode install - T194870
  • 08:18 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1066 for reboot (duration: 00m 50s)
  • 07:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 after alter table (duration: 00m 51s)
  • 07:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Restore db1091 original weight (duration: 00m 50s)
  • 07:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1091 (duration: 00m 50s)
  • 07:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1091 (duration: 00m 50s)
  • 06:47 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T196630 Remove unneeded description from performance survey definition (duration: 00m 52s)
  • 06:44 elukey: bounce kafka mirror maker main-eqiad-to-main-codfw (kafka200*) due to errors in the logs (also lag metrics not displaying)
  • 06:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 with low weight after reimage (duration: 00m 50s)
  • 05:34 marostegui: Deploy sanitarium events on db1124 - T190704
  • 05:23 marostegui: Stop MySQL on db1091 for reimage
  • 05:22 marostegui: Deploy schema change on db1113:3316 - T191316 T192926 T195193 T89737
  • 05:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 for alter table (duration: 00m 52s)
  • 02:12 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@0346959]: Update mobileapps to 5ea008c (duration: 00m 42s)
  • 02:11 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@0346959]: Update mobileapps to 5ea008c

2018-06-07

  • 23:58 demon@deploy1001: Finished deploy [gerrit/gerrit@a07d943]: No-op of current deployed version, want to sync repo state (x2) (duration: 00m 14s)
  • 23:58 demon@deploy1001: Started deploy [gerrit/gerrit@a07d943]: No-op of current deployed version, want to sync repo state (x2)
  • 23:57 demon@deploy1001: Finished deploy [gerrit/gerrit@a07d943]: No-op of current deployed version, want to sync repo state (duration: 00m 06s)
  • 23:57 demon@deploy1001: Started deploy [gerrit/gerrit@a07d943]: No-op of current deployed version, want to sync repo state
  • 21:49 mobrovac@deploy1001: Finished deploy [proton/deploy@97ec4bf]: Initial deploy to production - T186748 (duration: 02m 19s)
  • 21:47 mobrovac@deploy1001: Started deploy [proton/deploy@97ec4bf]: Initial deploy to production - T186748
  • 20:37 herron: ircecho restarted
  • 19:47 urandom: rolling Cassandra restart, restbase2005, restbase2006, restbase2012 -- T178905
  • 19:36 XenoRyet: updated payments-wiki config from 3197c729ee to b11d120362
  • 19:29 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.7
  • 19:16 herron: stopped ircecho
  • 18:28 urandom: rolling Cassandra restart, restbase2004, restbase2008, restbase2011 -- T178905
  • 18:20 urandom: restarting Cassandra, restbase2003-c -- T178905
  • 18:13 subbu: Updated Parsoid (T183706, T192726, T194879, T196357, T196360, T43716)
  • 18:10 ssastry@deploy1001: Finished deploy [parsoid/deploy@2f80639]: Updating Parsoid to 7819c9e7 (duration: 10m 59s)
  • 17:59 ssastry@deploy1001: Started deploy [parsoid/deploy@2f80639]: Updating Parsoid to 7819c9e7
  • 17:21 awight@deploy1001: Finished deploy [ores/deploy@65ce165]: New home page for ORES; T196580 (take 2) (duration: 04m 34s)
  • 17:16 awight@deploy1001: Started deploy [ores/deploy@65ce165]: New home page for ORES; T196580 (take 2)
  • 17:16 awight@deploy1001: Finished deploy [ores/deploy@65ce165]: New home page for ORES; T196580 (duration: 03m 19s)
  • 17:16 reedy@deploy1001: Synchronized wmf-config/interwiki-labs.php: labs! (duration: 00m 57s)
  • 17:13 awight@deploy1001: Started deploy [ores/deploy@65ce165]: New home page for ORES; T196580
  • 16:45 urandom: rolling Cassandra restart, restbase2001, restbase2002, restbase2007 - T178905
  • 16:21 urandom: rolling Cassandra restart, restbase1007 - T178905
  • 15:59 hoo: Finished running "foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https" T196360 T183706 T195014 T196357
  • 15:38 urandom: upgrade Cassandra to 3.11.2, restbase1010-{a,b,c} - T178905
  • 15:30 hoo: Running "foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https" T196360 T183706 T195014 T196357
  • 15:23 hoo: Emptied out the sites and site_identifiers tables on pswikivoyage, pmswikisource, bnwikivoyage and sahwikiquote for T122520
  • 15:20 hoo@deploy1001: Synchronized dblists/wikidataclient.dblist: Enable WikidataClient on sahwikiquote and pswikivoyage - T183706, T196360 (duration: 00m 57s)
  • 15:08 marostegui: Sanitize wikis on db1124 (current sanitarium for s3) - T196362 T196358 T196359 T195008 T193187
  • 15:04 reedy@deploy1001: Synchronized wmf-config/interwiki.php: (no justification provided) (duration: 00m 56s)
  • 14:56 ottomata: beginning rolling restarts of all cluster kafka brokers to apply log.message.timestamp.type=CreateTime - T196407
  • 14:53 reedy@deploy1001: rebuilt and synchronized wikiversions files: new wikis
  • 14:53 marostegui: Sanitize wikis on db1095 (old sanitarium) - T196362 T196358 T196359 T195008 T193187
  • 14:53 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: new wikis (duration: 00m 57s)
  • 14:50 reedy@deploy1001: Synchronized static/images/: new wikis (duration: 00m 57s)
  • 14:48 reedy@deploy1001: Synchronized multiversion/MWMultiVersion.php: idwikimedia (duration: 00m 57s)
  • 14:47 reedy@deploy1001: Synchronized dblists/: 5 new wikis (duration: 00m 55s)
  • 14:45 Reedy: created new wiki databases T183706 T192726 T194879 T196357 T196360
  • 14:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1088 (duration: 00m 57s)
  • 14:11 reedy@deploy1001: Synchronized wmf-config/jobqueue-labs.php: labs (duration: 00m 56s)
  • 14:10 reedy@deploy1001: Synchronized wmf-config/LabsServices.php: labs (duration: 00m 57s)
  • 14:02 reedy@deploy1001: Synchronized docroot/noc/: Add some more symlinked configs (duration: 00m 57s)
  • 13:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1066 (duration: 00m 57s)
  • 13:31 reedy@deploy1001: Synchronized php-1.32.0-wmf.7/extensions/WikimediaMaintenance/addWiki.php: add all the wikis (duration: 00m 56s)
  • 13:29 reedy@deploy1001: Synchronized php-1.32.0-wmf.6/extensions/WikimediaMaintenance/addWiki.php: add all the wikis (duration: 00m 58s)
  • 13:26 reedy@deploy1001: Synchronized dblists/all-labs.dblist: labs (duration: 00m 54s)
  • 13:24 reedy@deploy1001: Synchronized wikiversions-labs.json: labs (duration: 00m 57s)
  • 13:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1066 (duration: 00m 56s)
  • 13:17 zeljkof: EU SWAT finished
  • 13:16 zfilipin@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: When using the FileExporter set it as BetaFeature by default (T195370) (duration: 00m 56s)
  • 13:10 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add FileExporter to BetaFeaturesWhiteList (T195370) (duration: 00m 57s)
  • 12:59 marostegui: Deploy schema change on db1088 - T191316 T192926 T195193 T89737
  • 12:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 57s)
  • 12:54 jynus: stop, clone and reimage db1091
  • 12:52 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 (duration: 00m 57s)
  • 12:46 mutante: planet1001/puppetmaster: revoke old cert, sign new cert request, initial puppet run, reinstalled, will turn service active-active again once done (T168490)
  • 12:26 mutante: planet1001 - schedule downtime, boot to PXE, reinstall with stretch (ganeti) (T168490)
  • 12:15 moritzm: repooled mw1280 after hardware maintenance (T195734)
  • 11:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1085 after alter table (duration: 00m 57s)
  • 10:29 moritzm: installing openssl security updates
  • 08:50 marostegui: Deploy schema change on db1085 with replication, this will generate lag on labsdb for s6 section - T191316 T192926 T195193 T89737
  • 08:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 00m 56s)
  • 08:49 akosiaris: slow reboot of all ganeti eqiad VMs (except bohrium, puppetdb1001, poolcounter1001, mx1001) for kernel upgrades and picking up spec-ctrl cpu flag
  • 08:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1093 after alter table (duration: 00m 55s)
  • 08:34 marostegui: Stop replication on db1102:3316 and db1125:3316 to update triggers for archive table - T192926
  • 08:15 jynus: running ANALYZE on db2091 T196526
  • 07:46 marostegui: Deploy schema change on db1093 - T191316 T192926 T195193 T89737
  • 07:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 56s)
  • 07:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1098:3316 after alter table (duration: 00m 57s)
  • 07:09 moritzm: installing git security updates on trusty (Debian already fixed)
  • 06:55 jynus: relaxing write consistency on db2048 due to ongoing maintenance (sync_binlog,flush_log)
  • 06:26 marostegui: Deploy sanitarium events on db1125 - T190704
  • 06:24 jynus: phabricator maintenance finished
  • 06:10 jynus: start database maintenance on phabricator - brief interruptions could happen
  • 05:15 marostegui: Deploy event_sanitarium on codfw sanitariums - T190704
  • 05:12 marostegui: Deploy schema change on db1098:3316 - T191316 T192926 T195193 T89737
  • 05:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 56s)
  • 05:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 after alter table (duration: 00m 58s)
  • 04:01 eileen: civicrm revision changed from 1aaa798e9b to 69091d8b5f, config revision is c60375be8d
  • 03:05 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Thu Jun 7 03:05:05 UTC 2018 (duration 10m 19s)
  • 02:54 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.7) (duration: 07m 02s)
  • 02:30 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.6) (duration: 11m 24s)
  • 02:25 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@a07af40]: log
  • 02:00 paravoid: starting exim4 and reenabling puppet on mx1001, due to T196598
  • 01:12 ejegg: disabled fundraising mailing statistics fetch jobs
  • 01:00 ejegg: updated CiviCRM from 7a06a6a387 to 1aaa798e9b
  • 00:51 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@a07af40]: (no justification provided) (duration: 01m 02s)
  • 00:50 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@a07af40]: (no justification provided)
  • 00:33 ejegg: restarted fundraising jobs
  • 00:25 legoktm@deploy1001: Finished scap: Preference for responsive MonoBook, plus set mobile width cutoff to 550px (gerrit:437875, gerrit:437814) (duration: 64m 01s)

2018-06-06

  • 23:51 urandom: upgrade Cassandra to 3.11.2, restbase2012-{a,b,c} - T178905
  • 23:49 eileen: civicrm revision changed from ce960c0642 to 7a06a6a387, config revision is b7bc5dc31f
  • 23:43 eileen: civicrm revision changed from ac13914d03 to ce960c0642, config revision is b7bc5dc31f
  • 23:23 urandom: upgrade Cassandra to 3.11.2, restbase2009-{a,b,c} - T178905
  • 23:21 legoktm@deploy1001: Started scap: Preference for responsive MonoBook, plus set mobile width cutoff to 550px (gerrit:437875, gerrit:437814)
  • 23:19 eileen: civicrm revision changed from 0b97f1f5b2 to ac13914d03
  • 23:07 cwd: disabled process-control for civi update
  • 22:22 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@0346959]: Update mobileapps to 5ea008c (duration: 05m 33s)
  • 22:16 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@0346959]: Update mobileapps to 5ea008c
  • 20:47 ebernhardson: sighup logstash on logstash100[789] to reload config for gerrit.wikimedia.org/r/437657
  • 20:42 urandom: upgrade Cassandra to 3.11.2, restbase2005-{a,b,c} - T178905
  • 20:19 bearND: rolled back mobileapps deploy
  • 20:18 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@a07af40]: Update mobileapps to 3bf9be5 (T196402 T195948) (duration: 08m 37s)
  • 20:17 urandom: upgrade Cassandra to 3.11.2, restbase2011-{a,b,c} - T178905
  • 20:10 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@a07af40]: Update mobileapps to 3bf9be5 (T196402 T195948)
  • 19:38 urandom: upgrade Cassandra to 3.11.2, restbase2008-{a,b,c} - T178905
  • 19:19 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.7 (duration: 00m 56s)
  • 19:18 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.7
  • 18:55 herron: stopped exim on mx1001 in prep for upgrade to stretch
  • 18:54 urandom: upgrade Cassandra to 3.11.2, restbase2004-{a,b,c} - T178905
  • 18:38 andrewbogott: rebooting labvirt1019, 1020
  • 18:36 andrewbogott: rebooting labvirt1018, 1021, 1022
  • 18:29 andrewbogott: rebooting labvirt1017
  • 18:23 andrewbogott: rebooting labvirt1016
  • 18:21 andrewbogott: rebooting labvirt1015
  • 18:19 urandom: upgrade Cassandra to 3.11.2, restbase2003-{a,b,c} - T178905
  • 18:14 andrewbogott: rebooting labvirt1013
  • 18:06 andrewbogott: rebooting labvirt1012
  • 17:59 andrewbogott: rebooting labvirt1011
  • 17:59 urandom: upgrade Cassandra to 3.11.2, restbase2010-{a,b,c} - T178905
  • 17:44 andrewbogott: rebooting labvirt1010
  • 17:01 urandom: upgrade Cassandra to 3.11.2, restbase2007-{a,b,c} - T178905
  • 16:59 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 fully (duration: 00m 56s)
  • 16:51 andrewbogott: rebooting labvirt1008
  • 16:33 andrewbogott: rebooting labvirt1007
  • 16:25 jynus: stop mysql @ db1051 in preparation for decom
  • 16:25 andrewbogott: rebooting labvirt1006
  • 16:16 andrewbogott: rebooting labvirt1005
  • 16:09 andrewbogott: rebooting labvirt1004
  • 16:05 XioNoX: lvs1002 repooled
  • 16:04 andrewbogott: rebooting labvirt1002
  • 15:57 twentyafterfour: reloading apache on phab1001 to free up some resources
  • 15:56 andrewbogott: rebooting labvirt1014
  • 15:54 urandom: upgrade Cassandra to 3.11.2, restbase2001-{a,b,c} - T178905
  • 15:51 XioNoX: disable pybal on lvs1002 - T187962
  • 15:39 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@402d729]: Adjust the cirrus concurrencies (duration: 00m 40s)
  • 15:38 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@402d729]: Adjust the cirrus concurrencies
  • 15:32 _joe_: cross enabling videoscalers,jobrunners in their respective pools
  • 15:27 urandom: upgrade Cassandra to 3.11.2, restbase2001-{b,c} - T178905
  • 15:26 XioNoX: stop pybal on lvs1001
  • 15:26 awight@deploy1001: Finished deploy [ores/deploy@65e979f]: ORES: new draft topic model; T176336 (duration: 25m 37s)
  • 15:13 _joe_: adding jobrunners, videoscalers to both pools with equal weight in codfw
  • 15:06 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=videoscaler,dc=codfw,service=nginx
  • 15:06 urandom: upgrade Cassandra to 3.11.2, restbase1007-{b,c} - T178905
  • 15:01 awight@deploy1001: Started deploy [ores/deploy@65e979f]: ORES: new draft topic model; T176336
  • 14:39 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=videoscaler,dc=codfw,service=nginx,name=mw22(4[1-5]|5[3-8]|6[1-9]).*
  • 14:35 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=videoscaler,dc=codfw,service=nginx,name=mw21(5[3-9]|6).*
  • 14:22 andrewbogott: rebooting labvirt1009
  • 14:21 jynus: setting m2 on read write
  • 14:18 jynus: setting gerrit on read only
  • 14:14 jynus: starting s2-master switchover from db1051 to db1065
  • 14:06 andrewbogott: rebooting labvirt1003
  • 13:58 jynus: disabling puppet on db1051, db1065
  • 13:35 marostegui: Deploy schema change on db1096:3316 - T191316 T192926 T195193 T89737
  • 13:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 00m 57s)
  • 13:06 moritzm: installing elfutils security updates
  • 13:01 akosiaris: starting slow rolling restart of all VMs on ganeti01.svc.codfw.wmnet
  • 12:59 akosiaris: add +spec_ctrl to ganeti01.svc.codfw.wmnet cluster default cpu_type
  • 12:50 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 with low load (duration: 00m 56s)
  • 12:24 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch CirrusSearch jobs to EventBus for all wikis, file 2/2 - T189137 (duration: 00m 56s)
  • 12:23 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@c8d62da]: Enable cirrus for everything T190327 (duration: 00m 47s)
  • 12:23 mobrovac@deploy1001: Synchronized wmf-config/jobqueue.php: Switch CirrusSearch jobs to EventBus for all wikis - T189137 (duration: 00m 57s)
  • 12:22 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@c8d62da]: Enable cirrus for everything T190327
  • 11:38 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase s4 weight for db1097 and db1103 (duration: 00m 56s)
  • 11:35 jynus: stop and reimage db1084
  • 11:27 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 58s)
  • 10:19 awight@deploy1001: Finished deploy [ores/deploy@65e979f]: ORES canary deployment to ores2002.codfw.wmnet; T176336 (duration: 03m 44s)
  • 10:16 marostegui: Deploy schema change on dbstore1002:s6 - T191316 T192926 T195193 T89737
  • 10:15 awight@deploy1001: Started deploy [ores/deploy@65e979f]: ORES canary deployment to ores2002.codfw.wmnet; T176336
  • 10:15 awight@deploy1001: Finished deploy [ores/deploy@bf182e2]: ORES canary deployment to ores2002.codfw.wmnet; T176336 (duration: 00m 06s)
  • 10:15 awight@deploy1001: Started deploy [ores/deploy@bf182e2]: ORES canary deployment to ores2002.codfw.wmnet; T176336
  • 08:29 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011 - T190704
  • 08:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool all sanitariums masters - T190704 (duration: 00m 56s)
  • 08:24 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: wikidatawiki dispatching: dispatchMaxTime 720 (4 dispatchers at once) (duration: 00m 56s)
  • 08:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool all sanitariums masters - T190704 (duration: 00m 57s)
  • 07:48 marostegui: Stop replication on all sanitarium masters to move labsdb1011 - T190704
  • 07:30 marostegui: Stop MySQL on labsdb1011 to install intel-microcode and reboot
  • 06:32 marostegui: Deploy schema change on s6 codfw master (db2039), this will generate lag on s6 codfw - T191316 T192926 T195193 T89737
  • 06:14 marostegui: Stop slave on db2095:3316 to rebuild archive_insert and archive_update triggers - T192926
  • 06:14 ppchelko@deploy1001: Finished deploy [restbase/deploy@baa70b7]: Public release of feed availability endpoint T196402, take 2 (duration: 07m 13s)
  • 06:06 ppchelko@deploy1001: Started deploy [restbase/deploy@baa70b7]: Public release of feed availability endpoint T196402, take 2
  • 06:05 ppchelko@deploy1001: Finished deploy [restbase/deploy@baa70b7]: Public release of feed availability endpoint T196402 (duration: 11m 45s)
  • 06:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool all sanitariums masters - T190704 (duration: 01m 09s)
  • 05:53 ppchelko@deploy1001: Started deploy [restbase/deploy@baa70b7]: Public release of feed availability endpoint T196402
  • 05:46 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - T190704
  • 05:31 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010 - T190704
  • 05:25 marostegui: Restart MySQL on labsdb1010
  • 05:24 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 - T190704
  • 05:17 marostegui: Deploy schema change on db1070 s5 primary master - T191316 T192926 T195193
  • 04:44 kartik@deploy1001: Finished deploy [cxserver/deploy@8ce20ba]: Update cxserver to 391d7b6 (Fixing T196462) (duration: 03m 06s)
  • 04:41 kartik@deploy1001: Started deploy [cxserver/deploy@8ce20ba]: Update cxserver to 391d7b6 (Fixing T196462)
  • 03:10 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Wed Jun 6 03:10:14 UTC 2018 (duration 10m 15s)
  • 02:59 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.7) (duration: 15m 31s)
  • 02:27 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.6) (duration: 08m 23s)
  • 01:07 krinkle@deploy1001: Synchronized php-1.32.0-wmf.7/composer.json: I13dbdba2b9d / T196496 (duration: 00m 57s)
  • 01:06 krinkle@deploy1001: Synchronized php-1.32.0-wmf.7/vendor/: I5a5d7d / T196496 (duration: 01m 35s)
  • 00:15 reedy@deploy1001: Synchronized php-1.32.0-wmf.7/extensions/WikimediaMessages/: respect watchlist preference feature flag (duration: 00m 58s)

2018-06-05

  • 23:54 reedy@deploy1001: Synchronized wmf-config/: Config! (duration: 00m 57s)
  • 23:42 reedy@deploy1001: Synchronized wmf-config/: Config! (duration: 00m 57s)
  • 23:41 reedy@deploy1001: Synchronized rpc/: code updates (duration: 00m 56s)
  • 23:26 reedy@deploy1001: Synchronized multiversion/submodules.json: minus unicodeconverter (duration: 00m 56s)
  • 23:24 reedy@deploy1001: Synchronized wmf-config/: bye to UnicodeConverter (duration: 00m 57s)
  • 23:21 reedy@deploy1001: Synchronized wmf-config/: page creation log on Beta Labs (duration: 00m 56s)
  • 23:07 reedy@deploy1001: Synchronized wmf-config/: Updates (duration: 00m 58s)
  • 23:05 reedy@deploy1001: Synchronized composer.json: minus x! (duration: 00m 56s)
  • 23:04 reedy@deploy1001: Synchronized rpc: 644 (duration: 00m 56s)
  • 23:01 ebernhardson: restore ferm on elastic1018 and logstash1009
  • 22:57 ebernhardson: temporarily disable ferm on elastic1018 and logstash1007 to test theory
  • 22:57 ebernhardson: temporarily disable ferm on elastic1018 to test theory
  • 22:50 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Simplify PHP_SAPI conditionals (duration: 00m 57s)
  • 22:20 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/437638/ (duration: 00m 57s)
  • 21:41 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.7
  • 21:29 thcipriani@deploy1001: Finished scap: testwiki to php-1.32.0-wmf.7 and rebuild l10n cache (duration: 65m 25s)
  • 20:23 thcipriani@deploy1001: Started scap: testwiki to php-1.32.0-wmf.7 and rebuild l10n cache
  • 20:03 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7ecc3b6]: Update mobileapps to 66727b7 (duration: 05m 21s)
  • 19:57 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7ecc3b6]: Update mobileapps to 66727b7
  • 19:35 urandom: upgrade Cassandra to 3.11.2, restbase1007-a (canary) - T178905
  • 17:45 urandom: upgrade Cassandra to 3.11.2, restbase2001 (canary) - T178905
  • 17:00 urandom: upgrade Cassandra to 3.11.2, restbase-dev1006 - T178905
  • 16:58 Trey314159: reindexing Slovak wikis on elastic@eqiad (T191545)
  • 16:56 urandom: upgrade Cassandra to 3.11.2, restbase-dev1005 - T178905
  • 16:10 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Remove if file_exists for /etc/wikimedia-scaler (duration: 00m 51s)
  • 16:07 thcipriani: starting branch cut for 1.32.0-wmf.7
  • 16:05 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1081 fully (duration: 00m 49s)
  • 15:58 urandom: upgrade Cassandra to 3.11.2, restbase-dev1004 - T178905
  • 15:42 Trey314159: reindexing Slovak wikis on elastic@codfw (T191545)
  • 15:41 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 fully (duration: 00m 49s)
  • 15:30 bstorm_: reboot labsdb1005 for upgrades T195313
  • 15:24 marostegui: Stop MySQL on labsdb1005
  • 14:55 marostegui: Downtime labsdb1005 and labsdb1004 for maintenance on labsdb1005 - T195313
  • 14:52 urandom: restarting restbase-dev1004 - T178905
  • 14:51 moritzm: installing fixed kernels/microcode for spectre v2 on labvirt*
  • 14:42 jynus: reenabling puppet on all databases
  • 14:09 jynus: disabling puppet on eqiad dbs for 435751 gerrit deploy
  • 13:55 Amir1: ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --sleep 2 --check-old
  • 13:26 mobrovac@deploy1001: Finished deploy [restbase/deploy@a39743d]: Fix transform monitoring tests (duration: 03m 15s)
  • 13:24 zeljkof: EU SWAT finished
  • 13:24 zfilipin@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: SWAT: Remove ruwiki from MFSpecialCaseMainPage (T196223) (duration: 00m 51s)
  • 13:22 mobrovac@deploy1001: Started deploy [restbase/deploy@a39743d]: Fix transform monitoring tests
  • 13:22 mobrovac@deploy1001: Finished deploy [restbase/deploy@a39743d]: Fix transform monitoring tests (duration: 08m 00s)
  • 13:14 mobrovac@deploy1001: Started deploy [restbase/deploy@a39743d]: Fix transform monitoring tests
  • 13:13 mobrovac@deploy1001: Finished deploy [restbase/deploy@a39743d]: Fix transform monitoring tests (duration: 18m 02s)
  • 12:55 mobrovac@deploy1001: Started deploy [restbase/deploy@a39743d]: Fix transform monitoring tests
  • 12:13 kartik@deploy1001: Finished deploy [cxserver/deploy@0a350c3]: Update cxserver to 7fb7671 (duration: 01m 15s)
  • 12:12 kartik@deploy1001: Started deploy [cxserver/deploy@0a350c3]: Update cxserver to 7fb7671
  • 12:06 XioNoX: disable graceful-switchover on dual-RE routers
  • 11:35 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1081 with low load (duration: 00m 49s)
  • 11:30 elukey: manually set net.netfilter.nf_conntrack_tcp_timeout_time_wait to 65 (was 120) on mw* hosts
  • 11:21 mobrovac@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Switch CirrusSearch jobs for all wikis except wp, wd, commons - T189137 (duration: 00m 51s)
  • 11:21 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@aa5e94b]: Enable cirrus jobs in kafka for everything except wikipedia, wikidata and commons T190327 (duration: 00m 41s)
  • 11:21 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@aa5e94b]: Enable cirrus jobs in kafka for everything except wikipedia, wikidata and commons T190327
  • 11:07 elukey: manually set net.netfilter.nf_conntrack_tcp_timeout_time_wait to 65 (was 120)
  • 10:51 jynus: stop db1081 for reimage
  • 10:45 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1081 (duration: 00m 50s)
  • 10:38 akosiaris: reboot acrab for spec_ctrl CPU flag addition
  • 10:20 akosiaris: reboot acrab for kernel upgrade and qemu upgrade
  • 10:18 akosiaris: upgrade qemu to 1:2.8+dfsg-6+deb9u4 on ganeti01.svc.codfw.wmnet
  • 10:14 moritzm: installing batik security updates
  • 10:08 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 with low load (duration: 00m 50s)
  • 09:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 after alter table (duration: 00m 50s)
  • 09:22 jynus: stop db1080 for reimage
  • 09:09 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 (duration: 00m 49s)
  • 09:01 marostegui: Deploy schema change on db1110 - T191316 T192926 T89737 T195193
  • 09:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1110 for alter table (duration: 00m 51s)
  • 08:52 mobrovac@deploy1001: Synchronized wmf-config/jobqueue.php: Switch video scaling jobs to EventBus - T190327 (duration: 00m 52s)
  • 08:52 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@63b30a6]: Enable videoscaler jobs in kafka T190327 (duration: 00m 49s)
  • 08:51 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@63b30a6]: Enable videoscaler jobs in kafka T190327
  • 08:16 ema: libvmod-re2 1.3.1-1 uploaded to apt.w.o T196355
  • 08:10 gehel: rebooting elastic10(41|43) for plugin update - T193734
  • 06:42 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2059 after reimage (duration: 00m 50s)
  • 05:52 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1100 after alter table (duration: 00m 50s)
  • 05:44 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2059 for reimage (duration: 00m 51s)
  • 05:14 marostegui: Deploy schema change on db1100 - T191316 T192926 T89737 T195193
  • 05:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1100 for alter table (duration: 00m 51s)
  • 03:25 pnorman@deploy1001: Finished deploy [tilerator/deploy@9e40702] (cleartables): Re-try 074d01a (duration: 06m 02s)
  • 03:19 pnorman@deploy1001: Started deploy [tilerator/deploy@9e40702] (cleartables): Re-try 074d01a
  • 03:12 pnorman@deploy1001: Finished deploy [tilerator/deploy@9e40702] (cleartables): Re-try 074d01a (duration: 04m 50s)
  • 03:07 pnorman@deploy1001: Started deploy [tilerator/deploy@9e40702] (cleartables): Re-try 074d01a
  • 02:51 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): enable v3view (duration: 00m 06s)
  • 02:51 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): enable v3view
  • 02:49 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): enable v3view
  • 02:45 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Disable more sources (duration: 00m 26s)
  • 02:45 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Disable more sources
  • 02:43 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Disable style without labels (duration: 02m 22s)
  • 02:41 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Disable style without labels
  • 02:30 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Tue Jun 5 02:30:57 UTC 2018 (duration 10m 15s)
  • 02:20 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.6) (duration: 08m 21s)
  • 00:08 thcipriani: restarting jenkins to finalize updates

2018-06-04

  • 23:57 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Enable page creation log on Test Wikipedia" T196400 (duration: 00m 49s)
  • 23:50 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable page creation log on Test Wikipedia T196400 (duration: 00m 50s)
  • 23:21 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@825863f]: Potential mitigation for T194325 (duration: 09m 39s)
  • 23:11 smalyshev@deploy1001: Started deploy [wdqs/wdqs@825863f]: Potential mitigation for T194325
  • 22:57 smalyshev@deploy1001: Started deploy [wdqs/wdqs@825863f]: Potential mitigation for T194325
  • 22:35 pnorman@deploy1001: Finished deploy [kartotherian/deploy@a588bf4] (cleartables): Deploy var name parameters (duration: 00m 21s)
  • 22:35 pnorman@deploy1001: Started deploy [kartotherian/deploy@a588bf4] (cleartables): Deploy var name parameters
  • 21:08 pnorman@deploy1001: Finished deploy [kartotherian/deploy@d8dcba3] (cleartables): Redeploy kartotherian to test (duration: 00m 21s)
  • 21:07 pnorman@deploy1001: Started deploy [kartotherian/deploy@d8dcba3] (cleartables): Redeploy kartotherian to test
  • 20:58 awight@deploy1001: Finished deploy [ores/deploy@bf182e2]: roll back ores2001 (duration: 01m 11s)
  • 20:57 awight@deploy1001: Started deploy [ores/deploy@bf182e2]: roll back ores2001
  • 20:50 awight@deploy1001: Finished deploy [ores/deploy@65e979f]: ores2001 canary of drafttopic; T176336 (take 3 after bumping revision) (duration: 01m 47s)
  • 20:48 awight@deploy1001: Started deploy [ores/deploy@65e979f]: ores2001 canary of drafttopic; T176336 (take 3 after bumping revision)
  • 20:40 awight@deploy1001: Finished deploy [ores/deploy@d77e52c]: ores2001 canary of drafttopic; T176336 (take 2 after init'ing LFS) (duration: 00m 22s)
  • 20:40 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@276ea43]: Update mobileapps to f579f0d (duration: 05m 54s)
  • 20:40 awight@deploy1001: Started deploy [ores/deploy@d77e52c]: ores2001 canary of drafttopic; T176336 (take 2 after init'ing LFS)
  • 20:40 awight@deploy1001: Started deploy [ores/deploy@d77e52c]: l ores2001.codfw.wmnet ores2001 canary of drafttopic; T176336 (take 2 after init'ing LFS)
  • 20:34 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@276ea43]: Update mobileapps to f579f0d
  • 20:34 awight@deploy1001: Finished deploy [ores/deploy@d77e52c]: ores2001 canary of drafttopic; T176336 (duration: 02m 35s)
  • 20:34 arlolra@deploy1001: Finished deploy [parsoid/deploy@828034c]: Updating Parsoid to bd5a840 (duration: 10m 54s)
  • 20:31 awight@deploy1001: Started deploy [ores/deploy@d77e52c]: ores2001 canary of drafttopic; T176336
  • 20:30 andrew@deploy1001: Finished deploy [horizon/deploy@12aa2d3]: fix for T192179 (duration: 03m 35s)
  • 20:27 andrew@deploy1001: Started deploy [horizon/deploy@12aa2d3]: fix for T192179
  • 20:23 arlolra@deploy1001: Started deploy [parsoid/deploy@828034c]: Updating Parsoid to bd5a840
  • 19:10 gehel: elasticsearch cluster restart on eqiad completed - T193734
  • 19:06 ottomata: bouncing kafka2003 one more time for T196077
  • 19:04 otto@deploy1001: Started restart [eventlogging/eventbus@3a5c395]: bouncing eventbus after upgrading to python-kafka 1.4.3 for T196077
  • 19:00 ottomata: bouncing kafka2003 again to test T196077 with python-kafka 1.4.3
  • 18:52 thcipriani@deploy1001: Synchronized docroot/noc/index.html: SWAT: Fix wrong link to Server Admin Log on noc.wikimedia.org T193848 (duration: 00m 50s)
  • 18:44 ottomata: bouncing kafka on kafka2003 to test T196077
  • 18:36 otto@deploy1001: Finished deploy [eventlogging/eventbus@3a5c395]: T196077 (duration: 04m 07s)
  • 18:35 ottomata: deploying eventlogging-service-eventbus for T196077
  • 18:32 otto@deploy1001: Started deploy [eventlogging/eventbus@3a5c395]: T196077
  • 18:31 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Redeploy to test (duration: 00m 25s)
  • 18:30 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Redeploy to test
  • 18:01 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Enable osm source (duration: 00m 25s)
  • 18:00 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Enable osm source
  • 17:56 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Enable osm-intl source (duration: 00m 25s)
  • 17:55 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Enable osm-intl source
  • 17:53 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Enable osm-pbf source (duration: 00m 25s)
  • 17:52 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Enable osm-pbf source
  • 17:51 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Disable configurations after v3view (duration: 00m 25s)
  • 17:51 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Disable configurations after v3view
  • 17:48 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Disable configurations after v3view (duration: 00m 24s)
  • 17:47 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Disable configurations after v3view
  • 17:44 tgr: disabled SUL+wikitech 2FA for MarkAHershberger (T196370)
  • 17:37 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Disable configurations after v3view (duration: 00m 26s)
  • 17:37 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Disable configurations after v3view
  • 17:30 gehel@deploy1001: Finished deploy [wdqs/wdqs@fd534fa]: WDQS: new GUI and blazegraph versions (duration: 08m 25s)
  • 17:30 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Deploy new meddo to test (duration: 02m 54s)
  • 17:27 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Deploy new meddo to test
  • 17:22 gehel@deploy1001: Started deploy [wdqs/wdqs@fd534fa]: WDQS: new GUI and blazegraph versions
  • 17:22 pnorman@deploy1001: Finished deploy [tilerator/deploy@074d01a] (cleartables): Deploy new meddo to test (duration: 03m 24s)
  • 17:18 pnorman@deploy1001: Started deploy [tilerator/deploy@074d01a] (cleartables): Deploy new meddo to test
  • 15:09 _joe_: performing rolling restart of authdns servers to pick up ip change for the videoscalers
  • 15:05 herron: gzipped large rotated log files in analytics1003:/var/log/hive to clear icinga disk space warning
  • 14:49 _joe_: restarting low-traffic pybals in eqiad,codfw for adding the videoscaler VIP
  • 14:38 ppchelko@deploy1001: Started restart [cpjobqueue/deploy@c6dc83d]: (no justification provided)
  • 14:35 Amir1: ladsgroup@terbium:~$ foreachwikiindblist large deleteAutoPatrolLogs.php --sleep 2 --check-old
  • 14:34 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010 - T190704
  • 14:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool all sanitarium masters - T190704 (duration: 00m 49s)
  • 14:28 moritzm: installing wireshark security updates
  • 14:27 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=videoscaler,name=eqiad
  • 14:21 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=videoscaler,service=nginx,dc=codfw,name=mw211.*
  • 14:21 oblivian@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=videoscaler,service=nginx,dc=codfw
  • 14:20 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=.*,service=mathoid,cluster=kubernetes,name=.*
  • 14:19 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=videoscaler,service=nginx,dc=eqiad
  • 14:10 herron: lithium:~# systemctl restart rsyslog.service
  • 14:10 marostegui: Stop replication on all sanitarium masters to move labsdb1010 to another sanitarium host - T190704
  • 14:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool all sanitarium masters - T190704 (duration: 00m 49s)
  • 14:02 moritzm: installing wireshark security updates
  • 14:02 zeljkof: EU SWAT finished
  • 13:49 zfilipin@deploy1001: Synchronized static/images/project-logos: SWAT: Change bewikiquote logo (T196134) (duration: 00m 49s)
  • 13:44 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgMetaNamespace to "Вікіцытатнік" on bewikiquote (T196230) (duration: 00m 49s)
  • 13:39 anomie: Running populateExternallinksIndex60.php on group 2 for T59176. FYI: this will probably take until next Friday to complete.
  • 13:34 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgProofreadPagePageSeparator to empty string on zhwikisource (T194875) (duration: 00m 49s)
  • 13:27 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgProofreadPagePageSeparator to empty string for jawikisource (T195873) (duration: 00m 49s)
  • 13:16 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Temporarily enable MFMobileMainPageCss in ruwiki (T195905) (duration: 00m 50s)
  • 13:11 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Assign movefile to autoreviewers and patrollers on zhwiki (T195247) (duration: 00m 52s)
  • 12:49 gehel: restart elastic2001 to enable G1 GC - T156137
  • 12:18 krinkle@deploy1001: Finished deploy [performance/navtiming@b229f75]: (no justification provided) (duration: 00m 05s)
  • 12:18 krinkle@deploy1001: Started deploy [performance/navtiming@b229f75]: (no justification provided)
  • 11:38 akosiaris: rebalance row_A, row_C nodegroups in ganeti01.svc.eqiad.wmnet cluster
  • 10:51 akosiaris: reimage ganeti1004, ganeti1008 to stretch
  • 10:39 _joe_: rolling restart of apache on the jobrunners to pick the changed privatetmp setting, rotating logs
  • 10:14 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
  • 10:13 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 51s)
  • 09:39 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 - https://phabricator.wikimedia.org/T190704
  • 09:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 after alter table (duration: 00m 49s)
  • 09:04 addshore: addshore@terbium:~$ for i in {1..2500}; do echo Lexeme:L$i; done | mwscript purgePage.php --wiki wikidatawiki
  • 08:56 marostegui: Deploy schema change on db1097:3315 - T191316 T192926 T89737 T195193
  • 08:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 for alter table (duration: 00m 49s)
  • 08:53 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2005 (duration: 00m 50s)
  • 08:10 jynus: restarting icinga due to ongoing check/downtime issues
  • 07:57 marostegui: Stop replication on db2094:3315 for testing
  • 07:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 after alter table (duration: 00m 51s)
  • 07:11 gehel: starting elasticsearch cluster restart on eqiad - T193734
  • 06:18 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2059, db2075 - T190704 (duration: 00m 49s)
  • 06:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1121 - T190704 (duration: 00m 49s)
  • 05:52 marostegui: Stop replication in sync on db1121 and db2051 - T190704
  • 05:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1121 - T190704 (duration: 00m 49s)
  • 05:29 marostegui: Deploy schema change on db1082 with replication (this will generate lag on labs for s5) - T191316 T192926 T89737 T195193
  • 05:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 for alter table (duration: 00m 53s)
  • 02:53 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Mon Jun 4 02:53:16 UTC 2018 (duration 10m 14s)
  • 02:43 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.6) (duration: 14m 33s)

2018-06-03

  • 02:18 andrewbogott: rebooting labservices1001; it seems to have crashed

2018-06-02

  • 07:36 legoktm@deploy1001: Synchronized php-1.32.0-wmf.6/skins/MonoBook/: Temporarily revert responsive MonoBook (T195625) (duration: 00m 58s)

2018-06-01

  • 19:21 ebernhar1son: enable query phase slow logging and increase thresholds for fetch phase slow logging for content/general indices on eqiad and codfw elasticsearch clusters
  • 19:14 mutante: zh.planet - fixed issue with corrupt state file and permissions - updated and using new design as well now
  • 17:34 mutante: deployment.eqiad/codfw DNS names switched from tin to deploy1001
  • 17:06 thcipriani@deploy1001: Synchronized README: noop test of new deployment server (duration: 00m 53s)
  • 16:39 mutante: deploy2001 - also fixing file permissions. files owned by 996 -> mwdeploy, files owned by 997 -> trebuchet
  • 16:21 mutante: deployment server has switched away from tin to deploy1001. set global scap lock on deploy1001, re-enabled puppet and ran puppet, disabled tin as deployment server (T175288)
  • 16:13 herron: enabled new logstash tcp input with TLS enabled for syslogs on port 16514 T193766
  • 15:51 gehel: elasticsearch cluster restart on codfw completed - T193734
  • 15:47 mutante: @deploy1001:/srv/deployment# find . -uid 997 -exec chown trebuchet {} \;
  • 15:41 mutante: root@deploy1001:/srv/mediawiki-staging# find . -uid 996 -exec chown mwdeploy {} \;
  • 15:17 mutante: [deploy1001:~] $ scap pull-master tin.eqiad.wmnet
  • 15:12 mutante: tin umask 022 && echo 'switching deploy servers' > /var/lock/scap-global-lock
  • 15:05 mutante: rsyncing /srv/mediawiki-staging to /srv/mediawiki-staging-before-backup/ on tin as a backup
  • 14:52 mutante: deploy1001 - scap pull
  • 14:30 elukey: killed pt-heartbeat-wikimedia after https://gerrit.wikimedia.org/r/436748 on db1107
  • 14:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 fully (duration: 01m 02s)
  • 14:08 reedy@tin: Synchronized php-1.32.0-wmf.6/extensions/FlaggedRevs: T196139 (duration: 01m 08s)
  • 13:32 marostegui: Deploy schema change on dbstore1002:s5 - T191316 T192926 T89737 T195193
  • 13:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 (duration: 01m 03s)
  • 10:59 _joe_: disabling puppet on all hosts with role::mediawiki::common while installing mcrouter everywhere
  • 10:35 marostegui: Deploy schema change on db1096:3315 - T191316 T192926 T89737 T195193
  • 10:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 (duration: 01m 03s)
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 db1096:3315 (duration: 01m 02s)
  • 10:04 Amir1: ladsgroup@terbium:~$ foreachwikiindblist medium deleteAutoPatrolLogs.php --sleep 2 --check-old
  • 09:53 volans@tin: Finished deploy [debmonitor/deploy@fe8df6e]: Release v0.1.1 (duration: 00m 33s)
  • 09:53 volans@tin: Started deploy [debmonitor/deploy@fe8df6e]: Release v0.1.1
  • 09:37 marostegui: Stop replication in sync on db1113:3315 and db1096:3315 for data checks
  • 09:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 db1096:3315 (duration: 01m 03s)
  • 09:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 with low load (duration: 01m 03s)
  • 08:38 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): reenable v3view on 2004 (duration: 04m 53s)
  • 08:33 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): reenable v3view on 2004
  • 08:30 jynus: reimage db1083
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 after alter table (duration: 01m 03s)
  • 08:24 jynus: temporarily reducing s7-codfw-master consistency to alleviate lag (binlog_sync, flush_log)
  • 08:22 joal@tin: Finished deploy [analytics/refinery@7a72241]: Regular weekly deploy (duration: 12m 31s)
  • 08:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 (duration: 01m 05s)
  • 08:10 joal@tin: Started deploy [analytics/refinery@7a72241]: Regular weekly deploy
  • 06:15 marostegui: Stop MySQL on db2059 to clone db2075 - T190704
  • 06:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2059 (duration: 00m 56s)
  • 05:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2092 and db2062 in s1 (duration: 00m 59s)
  • 05:27 marostegui: Deploy schema change on db1113:3315 - T191316 T192926 T89737 T195193
  • 05:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 for alter table (duration: 00m 57s)
  • 01:34 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 00m 33s)
  • 01:33 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 01:27 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 34s)
  • 01:24 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 01:03 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 24s)
  • 01:00 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 00:59 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 52s)
  • 00:56 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 00:53 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 26s)
  • 00:51 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 00:48 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 03m 11s)
  • 00:45 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 00:03 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 04m 26s)


Archives