Server Admin Log/Archive 32

2017-07-31

  • 23:34 thcipriani@tin: Synchronized dblists: SWAT: Revert "Make ptwikimedia a fishbowl wiki" T171501 (duration: 00m 42s)
  • 23:32 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Path for enwikiquote logo T171810 (duration: 00m 43s)
  • 23:17 thcipriani@tin: Synchronized dblists: SWAT: Make ptwikimedia a fishbowl wiki T171501 (duration: 00m 43s)
  • 23:11 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Cleanup old BC config for JsonUnitStorage T171107 (duration: 00m 42s)
  • 22:38 mobrovac@tin: Finished deploy [citoid/deploy@7ad598d]: Do not wait for PubMed requests to complete - T162886 (duration: 05m 52s)
  • 22:32 mobrovac@tin: Started deploy [citoid/deploy@7ad598d]: Do not wait for PubMed requests to complete - T162886
  • 20:49 mutante: restarting pdfrender service on sc1001 after icinga alert (T159922)
  • 20:33 cscott: Updated Parsoid to version 08114f35 (T43716, T154718, T166413)
  • 20:32 cscott@tin: Finished deploy [parsoid/deploy@c1cba48]: Updating Parsoid to 08114f35 (duration: 10m 50s)
  • 20:22 cscott@tin: Started deploy [parsoid/deploy@c1cba48]: Updating Parsoid to 08114f35
  • 19:43 ejegg: updated payments-wiki from 2d10807 to bd9f730
  • 19:29 ejegg: updated payments-wiki from c531e11 to 2d10807
  • 18:37 robh: we're migrating mr1-ulsfo, disregard mgmt icinga alerts
  • 18:30 ejegg: updated payments-wiki from 084d0f9 to c531e11
  • 18:19 thcipriani@tin: Synchronized php-1.30.0-wmf.11/extensions/OpenStackManager/special/SpecialNovaRole.php: SWAT: Do not clobber $out in local scope T172077 (duration: 00m 42s)
  • 18:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable archive search via Elastic everywhere except Wikidata T163235 (duration: 00m 42s)
  • 18:13 demon@tin: Pruned MediaWiki: 1.30.0-wmf.10 [keeping static files] (duration: 01m 18s)
  • 17:52 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(23).eqiad.wmnet
  • 17:52 gehel: un-banning and repooling elastic1023 - T168816
  • 17:25 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(20|21|22).eqiad.wmnet
  • 17:25 gehel: un-banning and repooling elastic102[012] - T168816
  • 17:12 gehel@tin: Finished deploy [wdqs/wdqs@bdf3494]: (no justification provided) (duration: 01m 48s)
  • 17:12 herron: scb1001 restarted pdfrender service - T159922
  • 17:10 gehel@tin: Started deploy [wdqs/wdqs@bdf3494]: (no justification provided)
  • 16:58 ejegg: updated Misc fundraising tools from 457bddb to 58bcbf3
  • 16:30 gehel: mistaken restart of elastic1030 as part of T168816
  • 16:15 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(20|21|22|23).eqiad.wmnet
  • 16:15 gehel: depooling and shutting down elastic102[0123] for thermal paste - T168816
  • 15:33 marostegui: Create index on u2041__ores_p.monthly_wp10_enwiki - T146718
  • 15:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Specify mariadb running version on db2072 (duration: 00m 43s)
  • 14:59 gehel: banning elastic10(22|23) - T168816
  • 14:37 zeljkof: EU SWAT finished
  • 14:37 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(17|18|19).eqiad.wmnet
  • 14:36 gehel: banning and repooling elastic10(20|21) - T168816
  • 14:36 gehel: un-banning and repooling elastic10(17|18|19) - T168816
  • 14:34 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add P279 to $wgPropertySuggesterClassifyingPropertyIds (T169060) (duration: 00m 42s)
  • 14:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove wm?gRevisionSliderBetaFeature (duration: 00m 42s)
  • 14:19 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Remove wm?gRevisionSliderBetaFeature (duration: 00m 42s)
  • 14:14 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Turn on reading from the term_full_entity_id in testwikidata (T165197) (duration: 00m 42s)
  • 14:08 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(17|18|19).eqiad.wmnet
  • 14:08 gehel: shutting down elastic10(17|18|19) for thermal paste - T168816
  • 14:04 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove wgRevisionSliderAlternateSlider (duration: 00m 42s)
  • 13:57 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Remove Wikibase vs Interwikisorting checks (T150183) (duration: 00m 43s)
  • 13:47 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove WMDE log Channel (T168635) (duration: 00m 43s)
  • 13:30 chasemp: disable puppet for cloud-y things
  • 12:56 gehel: un-banning elastic1020 since it seems to have an impact on cluster performance - T168816
  • 12:39 gehel: banning elastic10(17|18|19|20) to prepare for thermal paste - T168816
  • 11:12 marostegui: Compress s6 on db1102 - T153743
  • 09:28 marostegui: Stop replication on labsdb1009 and labsdb1010 for maintenance - T153743
  • 08:55 elukey: update nodejs* on aqs100[56789] to 6.11 - T170790
  • 08:45 marostegui: Rename table click_tracking and click_tracking_user_properties on db1089 (s1) - T115982
  • 08:35 marostegui: Drop table old_growth on s1 - T115982
  • 07:17 marostegui: Stop replication on s7 on db1102 for maintenance - T153743
  • 07:12 marostegui: Deploy alter table on db1055 - T166204
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T166204 (duration: 00m 52s)
  • 02:31 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jul 31 02:31:39 UTC 2017 (duration 6m 40s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.11) (duration: 07m 41s)
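The conftool entries above use selectors such as `name=elastic10(20|21|22).eqiad.wmnet`. A minimal sketch of how such a selector could pick out hosts, assuming the value is treated as a regular expression that must match the whole host name (this is an assumption, not taken from the log; `matching_hosts` and the host list are hypothetical and do not call conftool itself):

```python
import re


def matching_hosts(selector, hosts):
    """Return the hosts whose name fully matches a 'name=<regex>' selector.

    Hypothetical helper: it only illustrates regex-style selection and is
    not part of conftool.
    """
    key, _, pattern = selector.partition("=")
    if key != "name":
        raise ValueError("only 'name=' selectors are handled in this sketch")
    rx = re.compile(pattern)
    return [h for h in hosts if rx.fullmatch(h)]


# Illustrative host list: elastic1017 .. elastic1031
hosts = [f"elastic10{n}.eqiad.wmnet" for n in range(17, 32)]
selected = matching_hosts("name=elastic10(20|21|22).eqiad.wmnet", hosts)
print(selected)
```

Under that assumption the unescaped dots in the pattern match any character, so a stricter selector would escape them as `\.`.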

2017-07-29

  • 06:15 _joe_: restarting pdfrender on scb1003

2017-07-28

  • 23:32 mutante: puppetmaster2001 - git pulled in /var/lib/git/operations/puppet to sync with puppetmaster1001 - accidentally interrupted puppet-merge
  • 20:21 foks: removing 2FA from User:SPoore (WMF)
  • 19:40 mutante: releases2001 - OS install worked this time, could not reproduce grub error, signing puppet cert, initial puppet run (T171917)
  • 18:48 chasemp: enable and force puppet on labtestservices2001,labtestvirt2001,labtestcontrol2001,labservices1002,labcontrol1002,labnet1002,labvirt1014 and labtestneutron2001 to see a newly installed host get the change instead of a noop
  • 18:33 chasemp: disabling puppet for labs things for trying out refactor rollout
  • 17:18 herron: cleaned up core files in mw1209:/var/tmp/core to clear disk alert
  • 16:21 andrewbogott: apt-get install apache2 on labcontrol1001 and labcontrol1002 for security updates
  • 16:19 andrewbogott: apt-get install apache2 on silver for security updates
  • 16:18 andrewbogott: apt-get install apache2 on californium for security updates
  • 15:23 jynus: upgrading and restarting db1102
  • 14:11 paravoid: upgrading rhenium to stretch via dist-upgrade
  • 13:04 jynus: upgrading and restarting db1095
  • 10:31 jynus: upgrading and restarting labsdb1009 and labsdb1011
  • 09:41 elukey: re-enable irc-echo on einstenium
  • 09:07 moritzm: installing apache security updates on puppet masters
  • 08:41 elukey: stop ircecho on einstenium as puppet-error-shower countermeasure
  • 07:56 elukey: update nodejs to 6.11 on aqs1004 (testing prod node after beta qa) - T170790
  • 07:52 gehel: repooling wdqs1001 (data import completed)
  • 07:52 elukey: forced mii-tool -r eth0 on analytics1034 to get 1G negotiated speed
  • 07:52 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 06:35 moritzm: installing apache security updates on trusty systems
  • 02:26 mutante: scb1002 - systemctl restart pdfrender - was "connect to address 10.64.16.21 and port 5252: Connection refused" in Icinga since a couple hours (T159922) - recovered
  • 02:08 ottomata: stat1002: disabled puppet, umounted /tmp, /home and /a, poweroff
  • 00:51 mutante: releases1001 - rsynced reprepro db data from bromine
  • 00:27 mutante: bromine sudo -E reprepro clearvanished to delete unused precise-mediawiki causing reprepro errors

2017-07-27

  • 23:48 catrope@tin: Synchronized php-1.30.0-wmf.11/resources/src/mediawiki.rcfilters/: T171368 (duration: 00m 42s)
  • 23:39 eileen: update process-control to 24c7bbe (re-enable omnirecipient)
  • 23:36 eileen: update process-control to 2c1c8a3bcb0186 - new frequency on recipient load
  • 23:35 catrope@tin: Synchronized wmf-config/: Enable emails for minor edits everywhere but keep default prefs (T29884, T142727) (duration: 00m 45s)
  • away: disabled Omnimail recipient load job
  • 23:14 catrope@tin: Finished scap: wmf-config/InitialiseSettings.php Enable experimental RCFilters on group2 too (duration: 02m 55s)
  • 23:11 catrope@tin: Started scap: wmf-config/InitialiseSettings.php Enable experimental RCFilters on group2 too
  • 22:46 ejegg: enabled omnimail recipient load job, throttling inserts to 15,000 every 60 sec
  • 21:46 ejegg: updated CiviCRM from ceff739 to 23f2bbf
  • 21:40 urandom: Restarting Cassandra, restbase-dev1004-{a,b} to apply updated data directories list
  • 21:08 herron: bast3002 repointed mdadm at null alias to clear systemd degraded state alert
  • 20:31 gwicke: restarting all pdfrender instances on scb in eqiad; one of them was hanging & causing user requests to fail
  • 20:24 Krinkle: Un-dirtying state of /srv/deployment/jobrunner/jobrunner on tin (from T129148). Checking-out https://gerrit.wikimedia.org/r/367743 instead.
  • 20:22 moritzm: installing apache security updates on cobalt
  • 20:13 ppchelko@tin: Finished deploy [restbase/deploy@cfb9c46]: Correctly escape the quotes in PDF names (duration: 07m 48s)
  • 20:05 ppchelko@tin: Started deploy [restbase/deploy@cfb9c46]: Correctly escape the quotes in PDF names
  • 19:50 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.11
  • 19:43 mutante: switching https://releases.wikimedia.org backend from bromine to releases1001 - all files have been rsynced (T164030)
  • 19:37 robh: cp4021 shutting down for relocation in rack, will put in maint mode for next 2 hours
  • 19:21 catrope@tin: Synchronized wmf-config: T171556 (duration: 00m 47s)
  • 19:20 catrope@tin: Synchronized dblists: T171556 (duration: 00m 46s)
  • 19:11 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable namespace and tag filters in RCFilters on group0 and group1 (duration: 00m 46s)
  • 19:07 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T171751 (duration: 00m 46s)
  • 19:06 catrope@tin: Synchronized php-1.30.0-wmf.11/autoload.php: (no justification provided) (duration: 00m 45s)
  • 19:05 catrope@tin: Synchronized php-1.30.0-wmf.11/includes/: T168501, take two (duration: 01m 27s)
  • 19:05 mutante: added new misc::cache director "releases" for releases* servers, releases moving away from bromine (T164030)
  • 18:44 catrope@tin: Synchronized php-1.30.0-wmf.11/includes/: Temporarily revert patch for T168501 while I fix it (duration: 01m 27s)
  • 18:32 mutante: labpuppetmaster1001 - restarted ferm twice, DNS lookup for AAAA worked, error gone on second time. then did same on labpuppetmaster1002 (T171880)
  • 18:29 catrope@tin: Synchronized php-1.30.0-wmf.11/includes/: T168501 and T163380 (duration: 01m 31s)
  • 18:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable WikidataPageBanner on euwiki (T171763) (duration: 00m 46s)
  • 18:15 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Limit FeaturedFeed on dewiki to 7 days (T159664) (duration: 00m 47s)
  • 18:05 herron: installed libmail-spf-perl on fermium to address spamassassin "module not installed: Mail::SPF ('require' failed)" error
  • 17:05 ejegg: restarted recurring Ingenico charge job
  • 16:01 ejegg: updated CiviCRM from e83c012 to ceff739
  • 15:44 jynus: stopping mysql, upgrading and restarting labsdb1010
  • 15:03 moritzm: stopping jobrunner/jobchron on mw1260 to investigate a few failing ffmpeg2theora invocations (T145742)
  • 13:31 ema: lvs1009, lvs1010: upgrade to pybal 1.13.10 (one-packet-scheduling) T104442
  • 13:19 moritzm: installing apache updates on mendelevium and terbium
  • 13:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mapframe for cswiki - T171588 (duration: 00m 47s)
  • 10:32 moritzm: reimaging mw2246 to jessie (T145742)
  • 10:31 jynus: shutting down and rebooting db2016
  • 10:14 _joe_: restarting puppetdb on nihal, T170740
  • 10:06 jynus@tin: Synchronized wmf-config/db-codfw.php: Promote db2048 as the new codfw-s1 master (duration: 00m 46s)
  • 10:05 jynus: starting actual master failover s1-codfw db2016->db2048
  • 09:48 ema: pybal 1.13.10 (one-packet-scheduling) built and uploaded to apt.w.o T104442
  • 09:23 jynus: starting s1-codfw database topology changes
  • 09:02 godog: copy python-conftool to stretch-wikimedia for scap dep
  • 08:54 jynus: stopping mysql and restarting db2048
  • 08:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2048 (duration: 00m 46s)
  • 08:28 jynus: disable puppet on db2016 and db2048 to prepare for switchover
  • 08:21 godog: upload scap 3.6.0-1 - T127762
  • 08:12 moritzm: installing apache security updates on graphite*
  • 07:37 moritzm: upgrading apache on planet1001
  • 07:02 moritzm: installing spice security updates on trusty hosts (jessie already fixed)
  • 03:29 ejegg: disabled recurring Ingenico charges
  • 03:12 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jul 27 03:12:30 UTC 2017 (duration 7m 16s)
  • 03:05 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.11) (duration: 06m 00s)
  • 02:39 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.10) (duration: 14m 21s)
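Many entries above end with a `(duration: MMm SSs)` suffix, and the ResourceLoader lines write it without the colon. A minimal sketch of extracting that value in seconds, assuming only those two spellings occur (`duration_seconds` is a hypothetical helper, not part of scap):

```python
import re

# Matches "(duration: 06m 00s)" as well as "(duration 7m 16s)".
DURATION_RE = re.compile(r"\(duration:? (\d+)m (\d+)s\)")


def duration_seconds(entry):
    """Return the logged duration in seconds, or None if the entry has none."""
    match = DURATION_RE.search(entry)
    if match is None:
        return None
    minutes, seconds = int(match.group(1)), int(match.group(2))
    return minutes * 60 + seconds


example = "03:05 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.11) (duration: 06m 00s)"
print(duration_seconds(example))
```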

2017-07-26

  • 23:33 eileen: civicrm update from fb83798 to e83c012
  • 23:30 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Echo per-user blacklist on meta T150419 (duration: 00m 49s)
  • 23:29 mutante: tin rm /var/lock/scap.operations_mediawiki-config.lock
  • 22:46 ejegg: reactivated remaining fundraising queue consumers
  • 22:38 ejegg: reactivated antifraud / payment-init queue consumer
  • 22:34 ejegg: updated CiviCRM from 461900e to fb83798
  • 21:39 andrewbogott: restarting rabbitmq on labcontrol1001
  • 21:10 mobrovac@tin: Finished deploy [electron-render/deploy@8dd5f13]: (no justification provided) (duration: 01m 37s)
  • 21:08 mobrovac@tin: Started deploy [electron-render/deploy@8dd5f13]: (no justification provided)
  • 21:03 mobrovac@tin: Finished deploy [electron-render/deploy@8dd5f13]: (no justification provided) (duration: 02m 38s)
  • 21:00 mobrovac@tin: Started deploy [electron-render/deploy@8dd5f13]: (no justification provided)
  • 20:54 mforns@tin: Finished deploy [analytics/refinery@58176d0]: deploying refinery to use 0.0.49 jars (duration: 03m 12s)
  • 20:51 mforns@tin: Started deploy [analytics/refinery@58176d0]: deploying refinery to use 0.0.49 jars
  • 20:32 mobrovac@tin: Finished deploy [mathoid/deploy@44ea6d8]: Switch node_modules to Node v6.11 (duration: 03m 16s)
  • 20:31 mobrovac@tin: Started deploy [electron-render/deploy@8dd5f13]: Switch node_modules to node v6.11
  • 20:30 mobrovac@tin: Finished deploy [recommendation-api/deploy@e7adea0]: Switch node_modules to node v6.11 (duration: 02m 26s)
  • 20:29 mobrovac@tin: Started deploy [mathoid/deploy@44ea6d8]: Switch node_modules to Node v6.11
  • 20:28 mobrovac@tin: Finished deploy [eventstreams/deploy@a2a0f19]: Switch node_modules to Node v6.11 (duration: 02m 57s)
  • 20:27 mobrovac@tin: Started deploy [recommendation-api/deploy@e7adea0]: Switch node_modules to node v6.11
  • 20:27 mobrovac@tin: Finished deploy [trending-edits/deploy@22967f3]: Switch node_modules to node v6.11 (duration: 07m 01s)
  • 20:25 mobrovac@tin: Started deploy [eventstreams/deploy@a2a0f19]: Switch node_modules to Node v6.11
  • 20:24 mobrovac@tin: Finished deploy [changeprop/deploy@444223d]: Switch node_modules to Node v6.11 (duration: 01m 35s)
  • 20:22 mobrovac@tin: Started deploy [changeprop/deploy@444223d]: Switch node_modules to Node v6.11
  • 20:22 mobrovac@tin: Finished deploy [mobileapps/deploy@bb81d91]: Switch node_modules to Node v6.11 (duration: 04m 08s)
  • 20:20 mobrovac@tin: Started deploy [trending-edits/deploy@22967f3]: Switch node_modules to node v6.11
  • 20:19 mobrovac@tin: Finished deploy [graphoid/deploy@1707b3c]: Switch node_modules to node v6.11 (duration: 07m 50s)
  • 20:18 mobrovac@tin: Started deploy [mobileapps/deploy@bb81d91]: Switch node_modules to Node v6.11
  • 20:12 demon@tin: Finished scap: no-op, ideal timing scenario (duration: 03m 35s)
  • 20:12 mobrovac@tin: Finished deploy [citoid/deploy@43c2776]: Switch node_modules to Node v6.11 (duration: 02m 56s)
  • 20:11 mobrovac@tin: Started deploy [graphoid/deploy@1707b3c]: Switch node_modules to node v6.11
  • 20:10 mobrovac@tin: Finished deploy [cxserver/deploy@f43ef96]: Switch node_modules to node v6.11 (duration: 02m 36s)
  • 20:09 demon@tin: Started scap: no-op, ideal timing scenario
  • 20:09 mobrovac@tin: Started deploy [citoid/deploy@43c2776]: Switch node_modules to Node v6.11
  • 20:08 mobrovac@tin: Started deploy [cxserver/deploy@f43ef96]: Switch node_modules to node v6.11
  • 20:01 demon@tin: Finished scap: group1 to wmf.11 (duration: 13m 22s)
  • 19:48 demon@tin: Started scap: group1 to wmf.11
  • 19:47 demon@tin: scap failed: RuntimeError scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details) (duration: 12m 03s)
  • 19:47 demon@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 19:35 demon@tin: Started scap: group1 to wmf.11
  • 19:30 mutante: mx1001 - temp disable puppet to test adjusted sudo privileges for an icinga check
  • 19:24 ejegg: disabled queue consumers for CiviCRM update
  • 19:06 bblack: cp1074: run-no-puppet varnish-backend-restart (mailbox lag in icinga)
  • 19:01 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 19:01 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.wmnet
  • 19:01 gehel: depooling wdqs1001 for data reload - T166244
  • 18:50 niharika29@tin: Synchronized php-1.30.0-wmf.11/resources/src/mediawiki.rcfilters/: RCFilters: Followup I78e23f85c3: Don't disable RCFilters system when fetching results https://gerrit.wikimedia.org/r/#/c/367850/ (duration: 00m 46s)
  • 18:35 niharika29@tin: Synchronized php-1.30.0-wmf.11/resources/src/mediawiki.rcfilters/: RCFilters: Improve loading animation https://gerrit.wikimedia.org/r/#/c/367833/, RCFilters UI: Unbreak limit and days widgets in non-experimental mode https://gerrit.wikimedia.org/r/#/c/367837/ (duration: 00m 45s)
  • 18:20 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Create 'rollbacker' user group in frwiki https://gerrit.wikimedia.org/r/#/c/365538/ (duration: 00m 47s)
  • 17:46 bblack: nitrogen: disabled puppet agent, manually hacked puppetdb.service unit file, restarted puppetdb.service...
  • 17:12 moritzm: restarting gerrit to pick up Java security update
  • 17:11 moritzm: installing openjdk-8 security updates on cobalt and removing unused openjdk-7 packages
  • 16:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 00m 45s)
  • 16:27 jynus: upgrading and rebooting db2070
  • 16:22 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2069, depool db2070 (duration: 00m 45s)
  • 16:14 moritzm: upgraded nodejs on restbase*
  • 15:48 jynus: upgrade and reboot db2069
  • 15:45 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2068, depool db2069, pool db2072 with more weight (duration: 00m 46s)
  • 15:39 moritzm: rolling upgrade/service restarts of nodejs in eqiad
  • 15:32 andrewbogott: patching puppetmaster1001, possible puppet hiccups coming up
  • 15:29 moritzm: upgrade nodejs on remaining scb hosts (along with service restarts)
  • 15:20 moritzm: upgrade nodejs on scb2001 (currently depooled for testing)
  • 15:17 jynus: restarting and upgrading db2068
  • 15:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2068 (duration: 00m 46s)
  • 14:40 moritzm: installing spice security updates
  • 14:21 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2072 (duration: 00m 45s)
  • 13:44 gehel: restarting cassandra on maps clusters
  • 13:36 zeljkof: EU SWAT finished
  • 13:34 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert eswikisource paths due to oversized logos (T170604) (duration: 00m 46s)
  • 13:24 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: HD logos for eswikivoyage and added some missing paths to the config (T170604) (duration: 00m 46s)
  • 13:22 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: HD logos for eswikivoyage and added some missing paths to the config (T170604) (duration: 00m 54s)
  • 12:12 moritzm: installing xorg-server updates from jessie 8.9 point release
  • 11:02 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 09:58 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: mw2119.codfw.wmnet
  • 09:58 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: mw2119.codfw.wmnet
  • 09:58 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2119.codfw.wmnet
  • 09:57 moritzm: reimaging mw2152 to jessie (T145742)
  • 09:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after maintenance (duration: 00m 46s)
  • 08:47 elukey: rollout logster 0.0.10-2~jessie1 to the cache hosts
  • 08:46 elukey: upload logster 0.0.10-2~jessie1 to jessie-wikimedia
  • 08:42 jynus: upgrading and restarting db1066
  • 08:26 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for maintenance (duration: 00m 46s)
  • 08:26 moritzm: reimaging mw2119 to jessie (T145742)
  • 08:02 moritzm: installing Java security updates on jessie-based stat systems
  • 07:59 moritzm: restarting cassandra-metrics-collector on maps* to pick up openjdk security update
  • 07:56 moritzm: restarting cassandra-metrics-collector on restbase* to pick up openjdk security update
  • 07:53 jynus: start defragmenting on pc1* hosts T167784
  • 07:14 ema: cp1008: use sdb only in varnish.service, waiting for Chris to replace sda T171028
  • 05:53 _joe_: moving all conf* servers to the future puppet parser
  • 03:01 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 01:15 reedy@tin: Synchronized multiversion/: phpcs (duration: 01m 06s)
  • 01:13 reedy@tin: Synchronized phpcs.xml: phpcs (duration: 00m 46s)
  • 01:12 reedy@tin: Synchronized tests/multiversion/: phpcs (duration: 00m 46s)

2017-07-25

  • 23:13 reedy@tin: Synchronized wmf-config/abusefilter.php: Allow contentadmin/sysop to configure blocking AbuseFilters (duration: 00m 46s)
  • 22:14 reedy@tin: Synchronized php-1.30.0-wmf.11/includes/specials/SpecialUndelete.php: T171523 (duration: 00m 46s)
  • 22:12 reedy@tin: Synchronized php-1.30.0-wmf.10/includes/specials/SpecialUndelete.php: T171523 (duration: 00m 47s)
  • 21:13 urandom: Rolling restart of eqiad Cassandra instances (applying OpenJDK update)
  • 21:12 ejegg: updated SmashPig from 523d6dd to f4ca53c
  • 20:39 ejegg: updated fundraising process-control to adb3325
  • 19:31 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.11
  • 18:52 mobrovac@tin: Finished deploy [restbase/deploy@36ca85f]: (no justification provided) (duration: 03m 49s)
  • 18:48 mobrovac@tin: Started deploy [restbase/deploy@36ca85f]: (no justification provided)
  • 18:48 mobrovac@tin: Finished deploy [restbase/deploy@36ca85f]: (no justification provided) (duration: 00m 49s)
  • 18:47 mobrovac@tin: Started deploy [restbase/deploy@36ca85f]: (no justification provided)
  • 18:46 mobrovac@tin: Finished deploy [restbase/deploy@36ca85f]: Switch to Node v6.11 - T170548 (duration: 05m 17s)
  • 18:42 krinkle@tin: Synchronized wmf-config/InitialiseSettings.php: Enable jQuery 3 on testwikis - I37a68472cf (duration: 00m 50s)
  • 18:41 mobrovac@tin: Started deploy [restbase/deploy@36ca85f]: Switch to Node v6.11 - T170548
  • 18:28 mobrovac: restbase upgrading node to v6.11 - T170548
  • 18:14 demon@tin: Finished scap: bootstrap wmf.11 (x2) (duration: 19m 23s)
  • 17:55 demon@tin: Started scap: bootstrap wmf.11 (x2)
  • 17:54 demon@tin: scap failed: RuntimeError scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details) (duration: 16m 32s)
  • 17:53 demon@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 17:49 urandom: Rolling restart of codfw Cassandra instances (applying OpenJDK update)
  • 17:40 halfak@tin: Finished deploy [ores/deploy@835d848]: T171505 (duration: 34m 56s)
  • 17:37 demon@tin: Started scap: bootstrap wmf.11
  • 17:33 jynus: creating new database on m1 (rddmarc) T170158
  • 17:10 demon@tin: Pruned MediaWiki: 1.30.0-wmf.9 [keeping static files] (duration: 01m 39s)
  • 17:05 halfak@tin: Started deploy [ores/deploy@835d848]: T171505
  • 16:16 jynus: about to delete orphan files on einstenium T149557
  • 15:49 moritzm: installing imagemagick security updates on trusty hosts (jessie already fixed)
  • 15:31 cmjohnson1: updating firmware lvs1007 T167299
  • 15:03 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1015.eqiad.wmnet
  • 14:58 _joe_: enabled hyperthreading on restbase1015.eqiad.wmnet T162735, rebooting the server
  • 14:52 _joe_: shutting down restbase1015, T162735
  • 14:51 oblivian@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1015.eqiad.wmnet
  • 14:36 urandom: draining restbase1015.eqiad.wmnet T162735
  • 14:35 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2002.codfw.wmnet
  • 14:29 _joe_: enabled hyperthreading on restbase2002.codfw.wmnet T162735, rebooting the server
  • 14:23 _joe_: shutting down restbase2002, T162735
  • 14:14 oblivian@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 14:04 dcausse: restarting elastic on relforge100x servers to test new config
  • 13:32 moritzm: rebooting hydrogen for kernel update
  • 13:15 moritzm: rebooting achernar for kernel update
  • 13:13 hashar: Purged project-logos for eswikisource/eswikiquote high density logos T170604  : find static/images/project-logos -maxdepth 1 -type f| sed -e 's%^%https://en.wikipedia.org/%'
  • 13:11 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: High density logos for es.wikisource - T170604 (duration: 00m 43s)
  • 13:10 hashar@tin: Synchronized static/images/project-logos: High density logos for es.wikisource - T170604 (duration: 00m 44s)
  • 13:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: High density logos for es.wikiquote - T170604 (duration: 00m 49s)
  • 13:07 hashar@tin: Synchronized static/images/project-logos: High density logos for es.wikiquote - T170604 (duration: 00m 46s)
  • 13:05 dcausse: restarting elastic relforge100x servers to pick up new version of the ltr plugin
  • 12:56 elukey: rolling restart of aqs* for jvm upgrades
  • 12:45 moritzm: enabling mw1260 (jessie-based video scaler) for job processing
  • 12:44 gehel: restarting cassandra on maps clusters
  • 12:37 godog: powercycle ms-be1016, couldn't get getty output from console
  • 12:09 godog: upgrade diamond to 4.0.515 in eqiad - T97635
  • 12:08 aude: ran rebuildTermSqlIndex.php on test.wikidata
  • 11:59 jynus: testing defragmenting pc2004 - if lag is created, ignore
  • 11:58 moritzm: installing binutils update from jessie point release
  • 11:20 marostegui: Start a run of "timeout 10h purgeParserCache.php" on terbium, which will be killed at around 21:00 UTC so it doesn't overlap with the normal cron run - T167784
  • 11:16 moritzm: upgrading/restarting logstash* for openjdk security update
  • 11:13 marostegui: Killing old running instances of purgeParserCache.php in terbium - https://phabricator.wikimedia.org/T167784
  • 11:02 moritzm: installing openjdk security updates on elastic*
  • 10:50 moritzm: installing openjdk security updates on restbase*
  • 10:41 marostegui: Run mwscript purgeParserCache.php --wiki=aawiki --age=1900800 --msleep 500 from terbium - T167784
  • 10:28 marostegui@tin: Synchronized wmf-config/InitialiseSettings.php: Parsercache: Reduce expiration time to 22 days - T167784 (duration: 00m 44s)
  • 10:27 moritzm: upgrade restbase2010 to latest OpenJDK security update
  • 09:20 godog: upgrade diamond to 4.0.515 in codfw - T97635
  • 09:15 moritzm: upgrade restbase-test* and restbase-dev* to latest OpenJDK security update
  • 09:14 godog: upgrade diamond to 4.0.515 in ulsfo and esams - T97635
  • 07:27 moritzm: installing apache security updates on app servers in eqiad
  • 05:29 mattflaschen@tin: Synchronized wmf-config/CommonSettings-labs.php: Article reminder: Beta Cluster only (duration: 00m 44s)
  • 03:00 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
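The 10:41 entry above passes `--age=1900800` to purgeParserCache.php, and the 10:28 entry sets the parser cache expiration to 22 days. A short check that the two figures agree, assuming the age is expressed in seconds (the helper name is hypothetical):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_seconds(days):
    """Convert a whole number of days to seconds."""
    return days * SECONDS_PER_DAY


age_seconds = days_to_seconds(22)
print(age_seconds)  # 1900800
```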

2017-07-24

  • 23:29 eileen1: update process-control from 915bbf9 to 4eb053d
  • 23:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES on sqwiki and rowiki (T170723) (duration: 00m 44s)
  • 22:35 ejegg: updated payments-wiki from c3be2bf to 084d0f9
  • 21:43 eileen1: update CiviCRM from 382a189 to 461900e
  • 21:11 dcausse: unbanning elastic1027/elastic1017
  • 21:04 reedy@tin: Synchronized wmf-config/CommonSettings.php: Add a global email blacklist (duration: 00m 43s)
  • 20:59 dcausse: banning elastic1027 after elastic1017 to move shards around
  • 20:26 bsitzmann@tin: Finished deploy [mobileapps/deploy@2b4ca3b]: Update mobileapps to b608ec8 (duration: 04m 06s)
  • 20:22 bsitzmann@tin: Started deploy [mobileapps/deploy@2b4ca3b]: Update mobileapps to b608ec8
  • 19:55 ebernhardson: ban elastic1031 from elasticsearch cluster, it's overloaded
  • 19:30 otto@tin: Finished deploy [eventlogging/analytics@41e3418]: unique index only for id columns (duration: 00m 02s)
  • 19:30 otto@tin: Started deploy [eventlogging/analytics@41e3418]: unique index only for id columns
  • 19:26 reedy@tin: Synchronized tests/: phpcs (duration: 00m 43s)
  • 19:26 reedy@tin: Synchronized wmf-config/: phpcs (duration: 00m 44s)
  • 18:36 reedy@tin: Synchronized composer.lock: phpunit (duration: 00m 43s)
  • 18:35 reedy@tin: Synchronized composer.json: phpunit (duration: 00m 43s)
  • 18:30 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes (duration: 00m 12s)
  • 18:30 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes
  • 18:29 reedy@tin: Synchronized wmf-config/CommonSettings.php: T153271 (duration: 00m 43s)
  • 18:28 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes (duration: 00m 10s)
  • 18:28 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes
  • 18:27 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes (duration: 00m 14s)
  • 18:27 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes
  • 18:27 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes (duration: 00m 39s)
  • 18:26 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: statsd dns fixes
  • 18:25 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2003 (duration: 00m 17s)
  • 18:25 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2003
  • 18:25 reedy@tin: Synchronized wmf-config/CommonSettings.php: T169478 T169481 (duration: 00m 42s)
  • 18:24 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: T169478 T169481 (duration: 00m 43s)
  • 18:20 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: T159895 (duration: 00m 43s)
  • 18:19 reedy@tin: Synchronized php-1.30.0-wmf.10/includes/specials/pagers/UsersPager.php: T171332 (duration: 00m 43s)
  • 18:18 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2002 (duration: 01m 06s)
  • 18:17 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2002
  • 18:16 reedy@tin: Synchronized wmf-config/Wikibase.php: T125500 (duration: 00m 43s)
  • 18:15 reedy@tin: Synchronized php-1.30.0-wmf.10/extensions/Echo/modules/styles/mw.echo.ui.NotificationBadgeWidget.less: T171302 (duration: 00m 45s)
  • 18:15 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2001 (duration: 00m 04s)
  • 18:14 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2001
  • 18:13 reedy@tin: Synchronized php-1.30.0-wmf.10/extensions/Thanks/extension.json: T170917 (duration: 00m 43s)
  • 18:11 reedy@tin: Synchronized wmf-config/unitConversionConfig.json: T168582 (duration: 00m 43s)
  • 18:06 otto@tin: Finished deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2001 (duration: 01m 39s)
  • 18:05 otto@tin: Started deploy [eventlogging/eventbus@c1c2c39]: test deploy with scap depool on kafka2001
  • 18:03 reedy@tin: Synchronized phpcs.xml: phpcs (duration: 00m 43s)
  • 18:02 reedy@tin: Synchronized tests: phpcs (duration: 00m 43s)
  • 18:01 reedy@tin: Synchronized w: phpcs (duration: 00m 43s)
  • 18:00 reedy@tin: Synchronized docroot/: phpcs (duration: 00m 44s)
  • 17:59 reedy@tin: Synchronized wmf-config/: phpcs (duration: 00m 45s)
  • 17:40 gehel@tin: Finished deploy [wdqs/wdqs@c1b5c27]: (no justification provided) (duration: 01m 58s)
  • 17:38 gehel@tin: Started deploy [wdqs/wdqs@c1b5c27]: (no justification provided)
  • 16:20 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: service=pdfrender
  • 15:29 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: service=pdfrender
  • 14:29 marostegui: Run maintain-views on labsdb1009, labsdb1010 and labsdb1011 for s2 wikis - T153743
  • 14:15 marostegui: Global rename of Carrotkit - T171474
  • 14:11 zeljkof: EU SWAT finished
  • 14:07 zeljkof: extending EU SWAT until https://gerrit.wikimedia.org/r/#/c/367384/ is deployed
  • 14:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: pagePreviews: Increase i13n sampling rate for ruwiki (T171325) (duration: 00m 43s)
  • 14:03 hashar: mwdebug1002 ran scap pull
  • 13:57 zfilipin@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 13:53 moritzm: installing bind security updates (we're using client-side libs/tools only)
  • 13:45 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add some import sources for tawikisource (T171395) (duration: 00m 43s)
  • 13:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow flooders to remove themselves from the flood group on zhwiki (T171379) (duration: 00m 43s)
  • 13:28 paravoid: upgrading nagios-nrpe-server to 3.0.1-3+deb9u1 on all stretch hosts
  • 13:23 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: pagePreviews: Increase instrumentation sampling rate (T171325) (duration: 00m 43s)
  • 13:14 gehel: restarting elasticsearch on relforge for jvm upgrade
  • 13:12 aude@tin: Synchronized wmf-config/Wikibase.php: Bump cache epoch for wikidata (duration: 00m 43s)
  • 13:04 aude@tin: Synchronized php-1.30.0-wmf.10/extensions/Wikidata: Fix several Wikidata bugs (duration: 02m 10s)
  • 11:13 moritzm: updates for jessie 8.8 and stretch 9.1 point updates
  • 10:22 godog: roll restart thumbor to apply new memory limits
  • 10:09 moritzm: installing openjdk security updates on praseodymium/cerium/xenon
  • 09:30 moritzm: uploaded openjdk-8 8u145-b15 to apt.wikimedia.org/jessie-wikimedia
  • 09:04 godog: restart thumbor on thumbor1004 with MemoryLimit=8G
  • 08:29 godog: restart thumbor on thumbor1001 temporarily without memory cgroup limitations
  • 07:17 marostegui: Rename table old_growth on db1089 - T115982
  • 07:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Fix indenting (duration: 00m 43s)
  • 07:14 jynus@tin: Synchronized wmf-config/StartProfiler.php: Fix indenting (duration: 00m 43s)
  • 07:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Fix indenting (duration: 00m 45s)
  • 06:45 moritzm: installing apache security updates on appserver canaries
  • 05:53 eileen: CiviCRM updated from 38f246d to 382a189
  • 05:40 marostegui: Configure and start s2 replication on labsdb1011 - T153743
  • 03:05 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2017-07-23

  • 23:26 legoktm@tin: Synchronized php-1.30.0-wmf.10/includes/page/Article.php: [SECURITY] Restore ability to suppress pages while deleting - T171405 (duration: 00m 45s)
  • 12:06 hoo: Restarted hhvm and apache2 on mwdebug1001
  • 12:02 hashar: CI should self recover when the queue is processed. Will check again in an hour or so
  • 12:02 hashar: CI is overloaded due to a mass update of mediawiki-codesniffer to 0.10.1

2017-07-21

  • 22:21 eileen: civicrm updated from 74f9588 to 38f246d
  • 21:29 jynus_: dropping enwiki database from dbstore2002:3306 (default instance) - new s1 already imported on 3311
  • 19:58 hashar: Restarting Jenkins
  • 17:02 jynus: now that db2072 is compressed and fixed, stop it to finally clone it to dbstore2002 T171321
  • 16:36 Reedy: run namespaceDupes.php against tawikisource T165813
  • 15:16 jynus: restarting replication on db2072 after maintenance T151029
  • 15:02 moritzm: installing apache security updates on labmon1001 and netmon*
  • 14:59 moritzm: installing apache security updates on krypton and auth*
  • 14:54 hashar: Restarting Jenkins
  • 14:33 _joe_: ocg started again on ocg1003
  • 14:30 _joe_: stopping ocg temporarily on ocg1003, T162780
  • 13:39 moritzm: installing apache security updates on hafnium, bromine, krypton, rutherfordium
  • 13:29 moritzm: installing apache security updates on fermium/lists.wikimedia.org
  • 12:50 moritzm: rebooting cp* spares for kernel update
  • 11:51 godog: run compiler-update-facts
  • 10:55 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ocg1001.eqiad.wmnet
  • 09:42 godog: add 100G to graphite2002/graphite1003 vgs
  • 08:44 jynus: stopping replication on db2072 to fix some duplicate key errors
  • 08:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fix some indents (duration: 00m 43s)
  • 07:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T166204 (duration: 00m 44s)
  • 02:24 eileen: update CiviCRM from 2de7f2a to 74f9588
  • 00:48 reedy@tin: Synchronized search-redirect.php: phpcs (duration: 00m 43s)
  • 00:47 reedy@tin: Synchronized rpc/RunJobs.php: phpcs (duration: 00m 43s)
  • 00:46 reedy@tin: Synchronized w: phpcs (duration: 00m 43s)
  • 00:45 reedy@tin: Synchronized wmf-config/: phpcs (duration: 00m 44s)
  • 00:44 reedy@tin: Synchronized tests/: phpcs (duration: 00m 43s)
  • 00:22 reedy@tin: Synchronized wmf-config/wikitech.php: phpcs (duration: 00m 43s)
  • 00:21 reedy@tin: Synchronized docroot/noc/db.php: phpcs (duration: 00m 43s)
  • 00:11 reedy@tin: Synchronized phpcs.xml: phpcs (duration: 00m 43s)
  • 00:09 reedy@tin: Synchronized docroot/: phpcs (duration: 00m 44s)
  • 00:05 reedy@tin: Synchronized wmf-config/missing.php: phpcs (duration: 00m 43s)

2017-07-20

  • 23:57 reedy@tin: Synchronized wmf-config/: phpcs (duration: 00m 45s)
  • 23:56 reedy@tin: Synchronized w: phpcs (duration: 00m 43s)
  • 23:55 reedy@tin: Synchronized tests: phpcs.xml (duration: 00m 42s)
  • 23:54 reedy@tin: Synchronized errorpages/404.php: phpcs (duration: 00m 43s)
  • 23:53 reedy@tin: Synchronized docroot/: phpcs (duration: 00m 44s)
  • 23:36 reedy@tin: Synchronized php-1.30.0-wmf.10/extensions/Wikidata: Update Wikidata - fix uncaught exception in constraints (duration: 02m 09s)
  • 23:15 reedy@tin: Synchronized wmf-config/Wikibase-production.php: T169647 T168938 (duration: 00m 42s)
  • 23:10 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Author namespace for tawikisource T165813 (duration: 00m 43s)
  • 23:06 reedy@tin: Synchronized phpcs.xml: phpcs (duration: 00m 43s)
  • 23:05 reedy@tin: Synchronized w/health-check.php: phpcs (duration: 00m 43s)
  • 23:04 reedy@tin: Synchronized errorpages/hhvm-fatal-error.php: phpcs (duration: 00m 44s)
  • 23:03 reedy@tin: Synchronized docroot/search.wikimedia.org/index.php: phpcs (duration: 00m 43s)
  • 23:02 reedy@tin: Synchronized wmf-config/: phpcs (duration: 00m 45s)
  • 23:01 reedy@tin: Synchronized tests: phpcs (duration: 00m 44s)
  • 22:22 reedy@tin: Synchronized wmf-config/CommonSettings.php: phpcs (duration: 00m 43s)
  • 22:21 reedy@tin: Synchronized search-redirect.php: phpcs (duration: 00m 43s)
  • 22:20 reedy@tin: Synchronized phpcs.xml: phpcs (duration: 00m 43s)
  • 22:19 reedy@tin: Synchronized tests: phpcs (duration: 00m 44s)
  • 22:18 reedy@tin: Synchronized wmf-config/: phpcs (duration: 00m 46s)
  • 20:29 nuria@tin: Finished deploy [eventlogging/analytics@c1c2c39]: (no justification provided) (duration: 00m 02s)
  • 20:29 nuria@tin: Started deploy [eventlogging/analytics@c1c2c39]: (no justification provided)
  • 19:56 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1003.eqiad.wmnet
  • 19:14 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.10
  • 18:46 otto@tin: Finished deploy [eventlogging/analytics@36846d6]: auto add mysql indexes for meta style events (duration: 00m 04s)
  • 18:46 otto@tin: Started deploy [eventlogging/analytics@36846d6]: auto add mysql indexes for meta style events
  • 18:37 andrewbogott: upgraded mediawiki version on wikitech-static
  • 18:36 thcipriani@tin: Synchronized php-1.30.0-wmf.9/resources/src/mediawiki.widgets.visibleByteLimit/mediawiki.widgets.visibleByteLimit.js: SWAT: mw.widgets.visibleByteLimit: Temporarily disable whilst OOjs UI label bug is fixed T169982 (duration: 00m 47s)
  • 18:35 thcipriani@tin: Synchronized php-1.30.0-wmf.10/resources/src/mediawiki.widgets.visibleByteLimit/mediawiki.widgets.visibleByteLimit.js: SWAT: mw.widgets.visibleByteLimit: Temporarily disable whilst OOjs UI label bug is fixed T169982 (duration: 00m 48s)
  • 18:24 thcipriani@tin: Synchronized php-1.30.0-wmf.10/skins/MonoBook/main.css: SWAT: Revert "Remove `position: absolute` and z-index from #p-logo" T171195 (duration: 00m 47s)
  • 18:16 thcipriani@tin: Synchronized static/images/project-logos: SWAT: Fix hywiki big and medium logos (duration: 00m 47s)
  • 17:54 arlolra: Updated Parsoid to a89a9cc4 (T169293)
  • 17:48 arlolra@tin: Finished deploy [parsoid/deploy@97dbabb]: Updating Parsoid to a89a9cc4 (duration: 09m 09s)
  • 17:39 arlolra@tin: Started deploy [parsoid/deploy@97dbabb]: Updating Parsoid to a89a9cc4
  • 17:16 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikidata back to wmf.10
  • 17:14 ottomata: killed tranquility instances tranq-banners and tranq-netflow running on druid1003 in joal's screen sessions
  • 14:41 godog: upload diamond 4.0.515-4~bpo8+2 to jessie-wikimedia - T97635
  • 14:33 ema@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org
  • 14:31 ema@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org
  • 14:23 godog: upload diamond 4.0.515-4~bpo8+1 to jessie-wikimedia - T97635
  • 14:10 andrewbogott: upgrading apache on labs via "dpkg -s apache2 && apt-get -y install apache2"
  • 14:07 mobrovac@tin: Finished deploy [restbase/deploy@5aa7bc1]: Translation API bug fix (duration: 07m 58s)
  • 14:00 godog: test diamond 4.0.515-4~bpo8+1 on cp1008
  • 13:59 mobrovac@tin: Started deploy [restbase/deploy@5aa7bc1]: Translation API bug fix
  • 13:59 mobrovac@tin: Finished deploy [restbase/deploy@5aa7bc1] (staging): (no justification provided) (duration: 01m 31s)
  • 13:57 mobrovac@tin: Started deploy [restbase/deploy@5aa7bc1] (staging): (no justification provided)
  • 13:52 moritzm: upgrading nodejs on wtp*
  • 13:42 ema: cp1050 stuck at 'Initializing firmware interfaces...', trying to powerdown/powerup
  • 13:37 zeljkof: EU SWAT finished
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Revert "Stop RelatedArticles A/B test and clean up config"" (T169948) (duration: 00m 46s)
  • 13:29 cmjohnson1: downtimed restbase-dev100[1-3] to power off and move ssds to newly racked restbase-dev100[4-6] phab task: T166181
  • 13:29 ema: cp1050 stuck rebooting, power-cycling
  • 13:28 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Revert "Revert "Stop RelatedArticles A/B test and clean up config"" (T169948) (duration: 00m 47s)
  • 13:12 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T171146) (duration: 00m 48s)
  • 12:58 ema@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org
  • 12:55 ema@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org
  • 12:37 ema@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org
  • 12:25 ema@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org
  • 09:04 ema: eqiad cache_text/upload: upgrade to varnish 4.1.7-1wm1 and reboot for kernel updates
  • 09:04 hashar: Restored CI cache storage (castor) on a fresh new instance. Cache is empty though so jobs will be a bit slower until the cache is populated - T171148
  • 09:02 moritzm: uploaded apache2 2.4.10-10+deb8u10+wmf1 (rebase of WMF-specific patches on top of latest DSA) to apt.wikimedia.org/jessie
  • 08:34 marostegui: Force a BBU relearn on db1016 - T166344
  • 08:29 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3048.esams.wmnet
  • 08:25 hashar: CI is restored albeit in degraded mode (lack of Castor cache) - T171148
  • 08:01 marostegui: Stop replication on labsdb1011 for maintenance - T153743
  • 07:55 marostegui: Start importing s2 into labsdb1011 - T153743
  • 07:48 godog: restart diamond on serpens/seaborgium to pick up the updated CA
  • 07:41 elukey: powercycle cp3048 - mgmt reachable - T171145
  • 06:54 marostegui: Force a BBU relearn on db1016 - T166344
  • 06:24 mutante: netmon1002 - librenms: fix permissions on /srv/librenms/rrd data after rsyncing, mismatching UIDs vs netmon1001 and rsyncd in chroot-issue
  • 06:00 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=cp3048.esams.wmnet
  • 05:46 TimStarling: on contint1001 restarted zuul and zuul-merger
  • 05:30 TimStarling: on contint1001 restarted jenkins
  • 05:05 marostegui: Configure replication for s2 on labsdb1009 and labsdb1010 - T153743
  • 04:42 mutante: netmon1002 - restarted Apache for LDAP issue - librenms.wm.org switched back to it, after rsyncing rrd data, re-enabling puppet
  • 04:05 andrewbogott: restarting rabbitmq-server on labcontrol1001
  • 03:34 andrewbogott: service nova-network restart on labnet1001
  • 03:32 andrewbogott: service uwsgi-labspuppetbackend restart on labcontrol1001
  • 03:02 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 02:22 mutante: netmon1001 - rsyncing librenms rrd data to netmon1002 - T159756
  • 01:17 andrewbogott: restarting keystone on labcontrol1001
  • 01:14 twentyafterfour: phabricator upgrade complete
  • 01:10 twentyafterfour: begin (belated) phabricator upgrade, expect momentary downtime.
  • 00:09 dereckson@tin: Synchronized php-1.30.0-wmf.9/resources/src/mediawiki.widgets/mw.widgets.SearchInputWidget.js: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (3/3) (duration: 00m 46s)
  • 00:08 dereckson@tin: Synchronized php-1.30.0-wmf.9/resources/Resources.php: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (2/3) (duration: 00m 46s)
  • 00:08 dereckson@tin: Synchronized php-1.30.0-wmf.9/includes/widget/SearchInputWidget.php: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (1/3) (duration: 00m 46s)
  • 00:06 dereckson@tin: Synchronized php-1.30.0-wmf.10/resources/src/mediawiki.widgets/mw.widgets.SearchInputWidget.js: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (3/3) (duration: 00m 46s)
  • 00:04 dereckson@tin: Synchronized php-1.30.0-wmf.10/resources/Resources.php: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (2/3) (duration: 00m 46s)
  • 00:03 dereckson@tin: Synchronized php-1.30.0-wmf.10/includes/widget/SearchInputWidget.php: Revert "Make mw.widgets.SearchInputWidget extend OO.ui.SearchInputWidget" (1/3) (duration: 00m 46s)

2017-07-19

  • 23:29 dereckson@tin: Synchronized wmf-config/Wikibase-production.php: Use correct class name for JsonUnitStorage (T171107) (duration: 00m 48s)
  • 22:14 reedy@tin: Synchronized multiversion/: (no justification provided) (duration: 01m 11s)
  • 22:01 mutante: hafnium, labmon1001 - restarted apache
  • 22:00 demon@tin: Finished scap: all kinds of code style stuff for James_F & Reedy (duration: 05m 23s)
  • 21:59 mutante: bromine _transparency.wm.org - restarted apache
  • 21:59 mutante: dbmonitor2001 - restarted apache
  • 21:57 ejegg: re-enabled CiviCRM de-dupe job
  • 21:56 mutante: graphite200* - restarted apache
  • 21:54 demon@tin: Started scap: all kinds of code style stuff for James_F & Reedy
  • 21:52 mutante: netmon1003 - puppet run, restarted apache - fixed servermon.wikimedia.org
  • 21:50 mutante: tegmen - restarted apache
  • 21:47 mutante: netmon1001 - adding manual ferm rule for 80/443 - fixed librenms.wm.org
  • 21:44 mutante: netmon1001 (librenms) - re-enable puppet once to get new CA, restart Apache, disable puppet again
  • 21:39 jynus: reloading haproxy on dbproxy1005 for repooling db1009
  • 21:35 mutante: graphite1001 - restarted apache, ran puppet
  • 21:34 chasemp: labstore1004/1005 puppet agent --test && service nslcd restart
  • 21:31 RainbowSprinkles: running puppet & restarting gerrit/apache on cobalt/gerrit2001
  • 21:29 mutante: tungsten - restarted apache for CA change (xhgui)
  • 21:26 mutante: logstash1001/1002 - restarted apache for CA change (logstash/kibana back)
  • 21:25 RainbowSprinkles: Ran puppet and restarted apache on logstash100[1..3]
  • 21:23 madhuvishy: Ran puppet and restarted apache on thorium (Runs hue, yarn, and pivot)
  • 21:21 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: most of group1 back on wmf.10
  • 21:05 mutante: krypton - run puppet, restart apache, fixed grafana-admin
  • 20:19 andrewbogott: restarting slapd on seaborgium
  • 20:15 mobrovac@tin: Finished deploy [restbase/deploy@3bb90c9]: (no justification provided) (duration: 09m 19s)
  • 20:13 chasemp: seaborgium:~# service slapd restart
  • 20:12 chasemp: serpens:~# service slapd restart
  • 20:05 mobrovac@tin: Started deploy [restbase/deploy@3bb90c9]: (no justification provided)
  • 20:05 mobrovac@tin: Finished deploy [restbase/deploy@3bb90c9] (staging): (no justification provided) (duration: 01m 39s)
  • 20:03 mobrovac@tin: Started deploy [restbase/deploy@3bb90c9] (staging): (no justification provided)
  • 19:57 urandom: Restarting Cassandra; restbase-dev1001-a to apply additional data_file_directory (T170276)
  • 19:50 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: Abort wmf.10
  • 19:48 demon@tin: Finished scap: group1 to wmf.10 + symlink swap (duration: 21m 37s)
  • 19:38 ejegg: disabled civicrm dedupe job
  • 19:27 demon@tin: Started scap: group1 to wmf.10 + symlink swap
  • 19:16 mobrovac@tin: Finished deploy [restbase/deploy@c5938f4]: Expose the translation API end points and fix SwaggerUI - T107914 T170729 (duration: 08m 02s)
  • 19:08 mobrovac@tin: Started deploy [restbase/deploy@c5938f4]: Expose the translation API end points and fix SwaggerUI - T107914 T170729
  • 19:07 mobrovac@tin: Finished deploy [restbase/deploy@c5938f4] (staging): (no justification provided) (duration: 01m 42s)
  • 19:05 mobrovac@tin: Started deploy [restbase/deploy@c5938f4] (staging): (no justification provided)
  • 19:05 niharika29@tin: Synchronized php-1.30.0-wmf.10/maintenance/updateRestrictions.php: Set batch size to 1000 in updateRestrictions https://gerrit.wikimedia.org/r/#/c/366301/ (duration: 00m 47s)
  • 19:04 niharika29@tin: Synchronized php-1.30.0-wmf.9/maintenance/updateRestrictions.php: Set batch size to 1000 in updateRestrictions https://gerrit.wikimedia.org/r/#/c/366302/ (duration: 00m 46s)
  • 18:49 niharika29@tin: Synchronized php-1.30.0-wmf.10/includes/collation/IcuCollation.php: Update FIRST_LETTER_VERSION for rowiki changes https://gerrit.wikimedia.org/r/#/c/366299/ (duration: 00m 46s)
  • 18:47 niharika29@tin: Synchronized php-1.30.0-wmf.9/includes/collation/IcuCollation.php: Update FIRST_LETTER_VERSION for rowiki changes https://gerrit.wikimedia.org/r/#/c/366298/ (duration: 00m 46s)
  • 18:40 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Remove 'din' from wmgExtraLanguageNames [mediawiki-config] - https://gerrit.wikimedia.org/r/362876 (https://phabricator.wikimedia.org/T168523) (duration: 00m 47s)
  • 18:25 ariel@tin: Finished deploy [dumps/dumps@63705de]: write list of special dump files with no dump job content (duration: 00m 02s)
  • 18:25 ariel@tin: Started deploy [dumps/dumps@63705de]: write list of special dump files with no dump job content
  • 18:24 niharika29@tin: Synchronized php-1.30.0-wmf.10/includes/collation/IcuCollation.php: IcuCollation: Fix diacritic characters for Romanian (ro) headings https://gerrit.wikimedia.org/r/#/c/366296/ (duration: 00m 46s)
  • 18:23 niharika29@tin: Synchronized php-1.30.0-wmf.9/includes/collation/IcuCollation.php: IcuCollation: Fix diacritic characters for Romanian (ro) headings https://gerrit.wikimedia.org/r/#/c/366295/ (duration: 00m 47s)
  • 16:35 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1002.eqiad.wmnet
  • 16:27 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet
  • 16:27 robh@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1003.eqiad.wmnet
  • 16:23 robh@puppetmaster1001: conftool action : set/pooled=active; selector: name=wdqs1002.eqiad.wmnet
  • 16:16 marostegui: Compressing innodb on dbstore1002 for the following wikis: viwiki ukwiki kowiki huwiki hewiki frwiktionary fawiki eswiki cawiki arwiki - T168303
  • 16:10 robh@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1196.eqiad.wmnet
  • 16:04 robh: mw1196 has hardware failure and is being decommissioned
  • 16:04 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1196.eqiad.wmnet
  • 14:44 hashar: Restarting Jenkins
  • 14:31 moritzm: installing imagemagick security updates
  • 13:55 _joe_: running clear-host-cache.js for ocg1001 decommission T170886
  • 13:53 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: cluster=pdf,name=ocg1001.eqiad.wmnet
  • 13:46 marostegui: Compress database rowiki on dbstore1002 - T168303
  • 13:45 elukey: restart hive-server on analytics1003 - Java OOM issue due to a huge query
  • 13:34 hashar: European SWAT completed
  • 13:32 elukey: Limit the access to the conf* zookeeper ports via ferm rules - https://gerrit.wikimedia.org/r/366228
  • 13:29 hashar: Purged all 1685 project-logos ( find static/images/project-logos -maxdepth 1 -type f | sed -e 's%^%https://en.wikipedia.org/%' | mwscript purgeList.php --wiki=enwiki )
  • 13:26 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Change timezone on nl.wikinews to Europe/Berlin - T170985 (duration: 00m 44s)
  • 13:24 marostegui: Optimize EditConflict_8860941_15423246 and Echo_7731316 on dbstore1002 - T168303
  • 13:17 hashar@tin: Synchronized static/images/project-logos/nlwikinews.png: Change logo on nl.wikinews - T170984 (duration: 00m 47s)
  • 13:11 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: nescio.wikimedia.org
  • 13:10 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Update wikiversity logos to 2017 - T160491 (duration: 00m 46s)
  • 13:09 hashar@tin: Synchronized static/images/project-logos: Update wikiversity logos to 2017 - T160491 (duration: 00m 48s)
  • 13:05 hashar@tin: Synchronized wmf-config/throttle.php: Extend throttle rule - T170844 (duration: 00m 48s)
  • 12:56 moritzm: rebooting nescio (DNS recursor) for kernel update
  • 12:55 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: nescio.wikimedia.org
  • 12:49 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: maerlant.wikimedia.org
  • 12:40 Reedy: running foreachwiki updateRestrictions.php T166184
  • 12:34 moritzm: rebooting maerlant (DNS recursor) for kernel update
  • 12:29 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: maerlant.wikimedia.org
  • 12:10 ema@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org
  • 12:09 ema@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org
  • 12:07 ariel@tin: Finished deploy [dumps/dumps@f95292e]: fix api call bug, page range query min pages (duration: 00m 03s)
  • 12:07 ariel@tin: Started deploy [dumps/dumps@f95292e]: fix api call bug, page range query min pages
  • 11:40 kartik@tin: Finished deploy [cxserver/deploy@1029833]: Update cxserver to d28ad0c (duration: 03m 01s)
  • 11:37 moritzm: rebooting acamar (DNS recursor) for kernel update
  • 11:37 kartik@tin: Started deploy [cxserver/deploy@1029833]: Update cxserver to d28ad0c
  • 10:58 marostegui: Global rename of user Moros - T170941
  • 10:09 ema: ulsfo cache_text/upload: upgrade to varnish 4.1.7-1wm1 and reboot for kernel updates
  • 10:06 marostegui: Deploy alter table on s1 - db1051 - T166204
  • 10:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T166204 (duration: 00m 47s)
  • 10:01 filippo@tin: Finished deploy [statsv/statsv@0a86be8]: (no justification provided) (duration: 00m 03s)
  • 10:01 filippo@tin: Started deploy [statsv/statsv@0a86be8]: (no justification provided)
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T166204 (duration: 00m 47s)
  • 09:16 ema: finish up codfw cache_text/upload varnish/kernel upgrades
  • 09:05 oblivian@neodymium: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=(restbase-async|citoid)
  • 09:03 XioNoX: codfw repooled in dns - T170380
  • 09:01 hashar: restarting nodepool for upgrade 0.1.1-wmf7 -> 0.1.1-wmf8
  • 08:48 moritzm: uploaded nodepool 0.1.1+wmf8 to apt.wikimedia.org
  • 08:29 XioNoX: asw-c-codfw back online - T170380
  • 08:28 XioNoX: asw-c-codfw restarted 8min ago for switch upgrade - T170380
  • 08:09 akosiaris: disable librenms crons on netmon1002 for a while
  • 07:57 marostegui: Drop migrateuser_medium from s7 - T170310
  • 07:49 oblivian@neodymium: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=(restbase-async|citoid)
  • 05:26 _joe_: ran systemctl reset-failed on codfw jobrunners after the jobrunner process was activated by mistake running scap at 21.20 UTC yesterday
  • 03:03 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 01:27 mutante: netmon1001 - stopping all the services, killing snmpwalk, disarming keyholder
  • 00:35 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove rcs1001 and rcs1002 from CommonSettings wgRCFeeds. Stops a load of logspam T170157 (duration: 00m 48s)

2017-07-18

  • 23:53 mutante: netmon1002 - copied Letsencrypt cert/key for librenms from netmon1001 for migration after netmon1002 has been reinstalled and now has RAID. (T159756)
  • 23:40 thcipriani@tin: Synchronized wmf-config/InterwikiSortOrders.php: SWAT: Add din to InterwikiSortOrders T168518 (duration: 00m 46s)
  • 23:35 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Welsh mobile logo (just changes 'k' to 'c' PART II (duration: 00m 46s)
  • 23:34 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-cy.svg: SWAT: Add Welsh mobile logo (just changes 'k' to 'c' PART I (duration: 00m 47s)
  • 23:27 thcipriani@tin: Synchronized php-1.30.0-wmf.9/extensions/Thanks/extension.json: SWAT: Add missing jQueryMsg dependency for mobile diff view T170917 (duration: 00m 47s)
  • 23:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable OOjs UI EditPage buttons on all Wikipedias T162849 (duration: 00m 47s)
  • 23:13 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable CodeMirror on simplewiki for better testing and more exposure (duration: 00m 48s)
  • 22:58 thcipriani: restarted jobrunner on mw1299.eqiad.wmnet mw1168.eqiad.wmnet mw1164.eqiad.wmnet mw1305.eqiad.wmnet mw1304.eqiad.wmnet mw1301.eqiad.wmnet mw1259.eqiad.wmnet mw1166.eqiad.wmnet mw1300.eqiad.wmnet
  • 22:42 krinkle@tin: Finished deploy [jobrunner/jobrunner@5f6099f]: (no justification provided) (duration: 08m 18s)
  • 22:34 krinkle@tin: Started deploy [jobrunner/jobrunner@5f6099f]: (no justification provided)
  • 22:02 krinkle@tin: Finished deploy [jobrunner/jobrunner@5f6099f]: (no justification provided) (duration: 07m 58s)
  • 21:54 krinkle@tin: Started deploy [jobrunner/jobrunner@5f6099f]: (no justification provided)
  • 21:43 Krinkle: Attempt to deploy mediawiki/services/jobrunner – https://gerrit.wikimedia.org/r/#/c/349364/ - failed.
  • 19:56 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2202.codfw.wmnet
  • 19:48 robh: starting wipe on cp400[1-4] per T169020
  • 19:15 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.10
  • 18:59 demon@tin: Synchronized php-1.30.0-wmf.9/extensions/MobileFrontend/extension.json: One (more) last thing (duration: 02m 49s)
  • 18:51 demon@tin: Synchronized php-1.30.0-wmf.9/extensions/MobileFrontend/extension.json: One last thing (duration: 02m 55s)
  • 18:42 mutante: netmon1002 - reinstall OS - didn't use the right partman recipe - didn't have md0 - revoke old puppet cert , salt-key, scheduled downtime, services over at netmon2001
  • 18:36 mutante: mw2202 - scheduled downtime - mainboard replacement
  • 18:36 ejegg: updated payments-wiki from bdc5226 to c3be2bf
  • 18:29 demon@tin: Finished scap: mobilefrontend wmf.9 + forced l10n rebuild (duration: 20m 53s)
  • 18:26 mutante: mw2202 - remove /etc/udev/rules.d/70-persistent-net.rules for mainboard replacement - to detect new NICs with new MACs (T170307)
  • 18:24 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2202.codfw.wmnet
  • 18:08 demon@tin: Started scap: mobilefrontend wmf.9 + forced l10n rebuild
  • 18:02 ottomata: stopping kafka on kafka1012 again, i think we swapped the wrong disk T168927
  • 17:55 awight@tin: Finished deploy [ores/deploy@1d35aa5]: T170485 (duration: 35m 06s)
  • 17:47 mutante: smokeping - switched to netmon2001 - ping times to codfw hosts went down - ping times to eqiad hosts went up - since service is on both but data has been synced over
  • 17:41 demon@tin: Synchronized wmf-config/InitialiseSettings.php: labtest typofix for tgr (duration: 00m 46s)
  • 17:21 mobrovac@tin: Finished deploy [parsoid/deploy@1eaa07e]: Bring wtp2019 up to date and repool it - T146113 (duration: 01m 02s)
  • 17:20 mobrovac@tin: Started deploy [parsoid/deploy@1eaa07e]: Bring wtp2019 up to date and repool it - T146113
  • 17:20 awight@tin: Started deploy [ores/deploy@1d35aa5]: T170485
  • 17:18 demon@tin: Finished scap: testwiki to wmf.10 + l10n cache build (duration: 24m 23s)
  • 17:16 ottomata: stopping kafka broker on kafka1012 to replace disk T168927
  • 16:53 demon@tin: Started scap: testwiki to wmf.10 + l10n cache build
  • 16:45 oblivian@tin: Started deploy [search/MjoLniR@0140aed]: init
  • 16:44 oblivian@tin: Started deploy [search/MjoLniR@0140aed]: (no justification provided)
  • 16:40 demon@tin: Pruned MediaWiki: 1.30.0-wmf.7 [keeping static files] (duration: 06m 06s)
  • 16:31 godog: finish rollout of thumbor 1.1 in eqiad - T170677
  • 16:00 marostegui: Deploy alter table on s1 - labsdb1003 - T166204
  • 15:59 ema: power-cycle cp2017, stuck rebooting
  • 15:45 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T170863 deploy TemplateStyles to some non-content wikis (all target wikis) (duration: 00m 45s)
  • 15:37 tgr@tin: Finished scap: T170863 deploy TemplateStyles to some non-content wikis (first step: testwiki/labstestwiki only) (forcing; canary errors are unrelated) (duration: 10m 19s)
  • 15:26 tgr@tin: Started scap: T170863 deploy TemplateStyles to some non-content wikis (first step: testwiki/labstestwiki only) (forcing; canary errors are unrelated)
  • 15:14 marostegui: Stop MySQL and shutdown pc2006 for mainboard replacement - T170520
  • 15:08 tgr@tin: scap failed: RuntimeError scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details) (duration: 09m 42s)
  • 15:07 tgr@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 14:58 tgr@tin: Started scap: T170863 deploy TemplateStyles to some non-content wikis (first step: testwiki/labstestwiki only)
  • 14:55 godog: upload and roll-upgrade thumbor to 1.1 - T170677
  • 14:44 zeljkof: EU SWAT finished!
  • 14:42 awight@tin: Finished deploy [ores/deploy@1d35aa5]: T170485 (duration: 00m 26s)
  • 14:41 awight@tin: Started deploy [ores/deploy@1d35aa5]: T170485
  • 14:39 zfilipin@tin: Synchronized portals: (no justification provided) (duration: 00m 45s)
  • 14:38 zfilipin@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 44s)
  • 14:37 moritzm: installing apache updates on silver
  • 14:16 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Make maiwikimedias logo a little bit bigger (T170922) (duration: 00m 43s)
  • 14:10 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T170844) (duration: 00m 43s)
  • 14:07 zfilipin@tin: Synchronized static/images/project-logos/enwikiquote.png: SWAT: Update enwikiquotes logo (T170722) (duration: 00m 43s)
  • 14:01 zeljkof: continuing with EU SWAT
  • 13:51 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Provide HD logos for several Wikiquotes (T150618) (duration: 00m 43s)
  • 13:50 ema: codfw cache_text/upload: upgrade to varnish 4.1.7-1wm1 and reboot for kernel updates
  • 13:48 zfilipin@tin: Synchronized static/images/project-logos: SWAT: Provide HD logos for several Wikiquotes (T150618) (duration: 00m 44s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Enable mobile non-JavaScript editing on all MobileFrontend wikis (T125174) (duration: 00m 43s)
  • 13:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mobile non-JavaScript editing on all MobileFrontend wikis (T125174) (duration: 00m 43s)
  • 13:16 zfilipin@tin: Synchronized wmf-config/mobile.php: SWAT: Enable mobile non-JavaScript editing on all MobileFrontend wikis (T125174) (duration: 00m 44s)
  • 12:32 marostegui: Run maintain-views on labsdb1001,1003,1009,1010 and 1011 - T168788
  • 12:10 akosiaris: remove oresrdb.svc.eqiad.wmnet in scb1001's /etc/hosts, but do not restart/reload uwsgi-ores and ores-celery-worker
  • 11:56 akosiaris: add oresrdb.svc.eqiad.wmnet in scb1001's /etc/hosts, restart uwsgi-ores and ores-celery-worker
  • 11:25 ema: powercycle cp3034, not rebooting properly
  • 11:00 ema: lvs200[12] upgrade pybal to 1.13.9 T82747 T154759
  • 10:59 ema: lvs200[45] upgrade pybal to 1.13.9 T82747 T154759
  • 10:54 ema: lvs400[12] upgrade pybal to 1.13.9 T82747 T154759
  • 10:52 ema: lvs400[34] upgrade pybal to 1.13.9 T82747 T154759
  • 10:43 ema: lvs100[12] upgrade pybal to 1.13.9 T82747 T154759
  • 10:33 moritzm: rebooting oresrdb1002 for kernel update
  • 10:32 ema: lvs100[45] upgrade pybal to 1.13.9 T82747 T154759
  • 10:28 moritzm: rebooting oresrdb2002 for kernel update
  • 09:54 ema: esams cache_text/upload: upgrade to varnish 4.1.7-1wm1 and reboot for kernel updates
  • 09:29 ema: cp3030: upgrade to varnish 4.1.7-1wm1 and reboot for kernel update
  • 09:15 ema: lvs300[12] upgrade pybal to 1.13.9 T82747
  • 09:13 ema: lvs300[34] upgrade pybal to 1.13.9 T82747
  • 09:08 elukey: reboot conf1003 for kernel updates
  • 09:00 elukey: reboot conf1002 for kernel updates
  • 07:52 moritzm: upgrade wtp1001 to nodejs 6.11
  • 07:32 elukey: moved /home to /srv/home on stat1006 to free disk space (created symlink from /home -> /srv/home too) - T152712
  • 06:42 moritzm: upgrading restbase on the various test clusters to nodejs 6.11
  • 05:59 marostegui: Deploy alter table on s1 - db1065 - T166204
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T166204 (duration: 00m 43s)
  • 05:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T166204 (duration: 00m 44s)
  • 04:22 Jamesofur: remove 2FA from NativeForeigner per T170911
  • 02:45 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jul 18 02:45:25 UTC 2017 (duration 6m 36s)
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.9) (duration: 13m 25s)
  • 02:13 mutante: nitrogen/nihal - rm /usr/lib/ganglia/python_modules/postgresql.py ; rm /etc/ganglia/conf.d/* ; restart gmond (T169953)

2017-07-17

  • 23:46 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2201.codfw.wmnet
  • 23:33 demon@tin: Synchronized wmf-config/InitialiseSettings.php: all wikis to minervaneue (duration: 00m 44s)
  • 23:19 demon@tin: Synchronized wmf-config/InitialiseSettings.php: testwiki to minervaneue (duration: 00m 44s)
  • 22:16 thcipriani@tin: Finished scap: Add missing Minerva skin description message key prep for MinervaNeue deployment (duration: 18m 57s)
  • 21:57 thcipriani@tin: Started scap: Add missing Minerva skin description message key prep for MinervaNeue deployment
  • 21:21 mutante: ocg1001 - Type: General Protection Fault (13) Source: Software (UEFI0011) - depooled
  • 21:20 dzahn@neodymium: conftool action : set/pooled=no; selector: name=ocg1001.eqiad.wmnet
  • 21:19 mutante: ocg1001 - dead - "Exception Inside the Exception Handler"
  • 21:18 mutante: powercycling ocg1001 which went down and had no console output at all
  • 21:13 eileen1: update CiviCRM from 15831ac to 2de7f2a
  • 21:04 mutante: mw2202 - renew puppet cert that was accidentally revoked
  • 21:01 mutante: mw2201 - revoke old puppet cert, salt key, accept/sign new cert and key, initial puppet run .. T170307
  • 20:52 eileen1: revision for civicrm changed...
  • 20:39 eileen1: update civicrm from 8840b94 to e4824fb
  • 20:33 mutante: mw2201 - reinstalling OS after mainboard replacement (network interfaces became eth2/eth3 from eth0/eth1 so ferm failed etc) - T170307
  • 20:26 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.9
  • 20:19 thcipriani@tin: Synchronized php-1.30.0-wmf.9/extensions/RelatedArticles: Add limit via ResourceLoaderGetConfigVars (duration: 02m 38s)
  • 19:26 thcipriani@tin: Synchronized php-1.30.0-wmf.9/extensions/CirrusSearch: Add PoolCounter specifically for morelike T170648 (duration: 03m 02s)
  • 18:44 mutante: rebooting mw2201 for MAC address change
  • 18:36 niharika29@tin: Synchronized php-1.30.0-wmf.9/resources/src/mediawiki.rcfilters/: RCFilters: Allow experimental live update feature to be enabled with query string parameter https://gerrit.wikimedia.org/r/#/c/365413/ (duration: 02m 51s)
  • 18:20 mobrovac@tin: Finished deploy [restbase/deploy@f5ca520]: Activate dinwiki support (duration: 07m 39s)
  • 18:16 niharika29@tin: Synchronized wmf-config/PoolCounterSettings.php: Configure CirrusSearch-MoreLike pool counter [mediawiki-config] - https://gerrit.wikimedia.org/r/365406 (T170648) (duration: 02m 54s)
  • 18:12 mobrovac@tin: Started deploy [restbase/deploy@f5ca520]: Activate dinwiki support
  • 18:07 mobrovac@tin: Finished deploy [changeprop/deploy@f80c333]: (no justification provided) (duration: 01m 17s)
  • 18:06 mobrovac@tin: Started deploy [changeprop/deploy@f80c333]: (no justification provided)
  • 17:39 mobrovac@tin: Finished deploy [restbase/deploy@f5ca520]: Bringing restbase2001 up to date (duration: 01m 21s)
  • 17:38 mobrovac@tin: Started deploy [restbase/deploy@f5ca520]: Bringing restbase2001 up to date
  • 16:41 _joe_: trying to revive pdfrender on scb1002, the usual bug with its restarts
  • 16:12 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2201.codfw.wmnet
  • 15:47 marostegui: Stop MySQL on pc2006 - T170520
  • 15:35 ema: restart pybal on lvs100[36] T169765
  • 15:32 jynus: starting table compressing at db2072 (lag is possible)
  • 15:29 zeljkof: EU SWAT finished! (updateCollation.php still running in the background)
  • 15:21 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set collation for Romanian wikis to uca-ro-u-kn (T168711) (duration: 00m 47s)
  • 15:07 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow uploads to autoconfirmed-only at huwiki (T169438) (duration: 00m 47s)
  • 15:03 moritzm: uploaded Linux 4.9.30-2+deb9u2 backport to jessie-wikimedia
  • 14:58 ema: restart pybal on lvs100[12] T169765
  • 14:57 ema: restart pybal on lvs100[45] T169765
  • 14:56 marostegui: Deploy, manually, alter tables on enwiki on db1047 - T166204
  • 14:47 zfilipin@tin: Synchronized static/images/: SWAT: Run optipng -o7 at all PNGs (T170569) (duration: 00m 47s)
  • 14:46 elukey: reboot conf1001 for kernel updates
  • 14:39 Dereckson: Created account "Biplab Anand" at bureaucrat level on mai.wikimedia (T168782)
  • 14:34 andrewbogott: changing nodepool rate to '6' and restarting nodepool
  • 14:34 marostegui: Run maintain-views on labsdb1009,10 and 11 for s6 - T153743
  • 14:13 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add HD logos for several Wiktionaries (T150618) (duration: 00m 46s)
  • 14:11 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add HD logos for several Wiktionaries (T150618) (duration: 00m 49s)
  • 14:11 ottomata: decommissioning rcs100[12] to spare::system: T170157
  • 14:02 zeljkof: Extending EU SWAT
  • 13:59 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Provide HD logos for several Wikiversities (T150618) (duration: 00m 46s)
  • 13:58 zfilipin@tin: Synchronized static/images/project-logos: SWAT: Provide HD logos for several Wikiversities (T150618) (duration: 00m 47s)
  • 13:46 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Provide HD logos for several Wikipedias (T150618) (duration: 00m 46s)
  • 13:43 marostegui: Deploy alter table on s4 - dbstore1001 - T168661
  • 13:37 zfilipin@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 13:36 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Provide HD logos for several Wikipedias (T150618) (duration: 00m 48s)
  • 11:35 marostegui: Deploy alter table on s4 - dbstore1002 - T168661
  • 11:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 - T168661 (duration: 00m 46s)
  • 10:16 moritzm: installing apache updates on graphite hosts
  • 10:02 moritzm: installing apache updates on logstash
  • 10:01 moritzm: installing apache updates on otrs.wikimedia.org
  • 09:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 (duration: 00m 47s)
  • 09:24 marostegui: Disable puppet on labsdb1010 for maintenance - T153743
  • 09:22 marostegui: Stop replication on labsdb1009 and labsdb1010 for maintenance - T153743
  • 09:05 marostegui: Disable puppet on labsdb1009 for maintenance - T153743
  • 08:28 akosiaris: reboot helium/heze for kernel upgrades
  • 08:23 marostegui: Deploy alter table s1 - labsdb1001 - T166204
  • 08:20 marostegui: Increase expire_logs_days on db1069:3311 from 7 to 14 temporarily - T166204
  • 08:17 ema: lvs100[39]: upgrade pybal to 1.13.9 T82747 T154759
  • 08:06 ema: lvs2003: upgrade pybal to 1.13.9 T82747 T154759
  • 07:57 ema@neodymium: conftool action : set/pooled=inactive; selector: name=wdqs1002.eqiad.wmnet
  • 07:55 akosiaris: upgrade nodejs to 6.11 on etherpad1001
  • 07:32 moritzm: updating ruthenium to nodejs 6.11
  • 07:12 marostegui: Stop slave s2 on db1102 for maintenance - T153743
  • 07:09 marostegui: Deploy alter table s4 - db1056 - T168661
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 - T168661 (duration: 00m 46s)
  • 07:05 marostegui@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 07:00 marostegui: Rename labsdb1011 main replication thread to a specific one - T153743
  • 06:50 marostegui: Stop replication on db1095 for maintenance - T153743
  • 06:48 marostegui: Deploy alter table on s1 - db1073 - T166204
  • 06:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T166204 (duration: 01m 04s)
  • 05:21 marostegui: Add 50G to /srv on db1069
  • 05:09 marostegui: Restart MySQL on labsdb1009 for maintenance - T170657
  • 03:13 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jul 17 03:13:51 UTC 2017 (duration 7m 16s)
  • 03:06 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.9) (duration: 12m 48s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 09m 13s)

2017-07-15

  • 10:42 elukey: puppetdb restarted on nitrogen, puppet agents re-enabled - T170740
  • 10:06 akosiaris: disable puppet on the entire fleet for puppetdb debugging on nitrogen
  • 02:53 mutante: servermon - switched to netmon1003 backend (jessie ganeti)
  • 02:46 mutante: netmon1001 - stopping "make_updates" cron , migration to netmon1003, flipping cache::backend to netmon1003 (T170653)

2017-07-14

  • 21:48 mutante: netmon1003 - reinstalled with jessie - saw nothing on ganeti console at all which was a bit confusing, but install finished anyways - adding to puppet / signing cert (T170655)
  • 20:47 bblack: mailbox lag: restarting cp1074 backend
  • 19:50 mutante: wikitech-static: re-enabled HSTS - line was commented out in Apache config, activated it again
  • 18:54 herron: added exim from/subject filter for spam observed from qq.com - T170601
  • 16:36 herron: lowered mailman/lists spam_score exim acl to 6 - T170601
  • 11:41 marostegui: Add 50G to /srv/ on dbstore1002 - T168303
  • 11:35 jynus: stop db2062 and db2072 for cloning
  • 10:43 jynus: altering wmde_analytics_betafeature_users_today table to ENGINE=InnoDB
  • 10:17 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 (duration: 00m 47s)
  • 09:57 moritzm: uploaded nodejs_6.11.0~dfsg-1+wmf to apt.wikimedia.org (for jessie and stretch) (T170548)
  • 07:22 marostegui: Stop replication on labsdb1011 for maintenance - T153743
  • 06:59 marostegui: Create views for dinwiki on labsdb1009, 1010 and 1011 - T169193
  • 05:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T166204 (duration: 00m 46s)
  • 04:21 mutante: netmon1002/netmon2001 - change UID/GID for rancid to universal 445/445, use find -exec to chown existing files, for unmessy data syncing, define UID on wikitech page UID (T166180)

2017-07-13

  • 23:47 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: revert all wikis to php-1.30.0-wmf.9, again
  • 23:19 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to php-1.30.0-wmf.9
  • 23:08 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: revert all wikis to php-1.30.0-wmf.9
  • 22:57 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.9
  • 22:04 bd808: Stashbot working after backend ElasticSearch cluster upgrade
  • 21:31 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs2001.codfw.wmnet
  • 21:34 demon@tin: Locking from deployment [operations/mediawiki-config]: Nobody use this (planned duration: 60m 00s)
  • 21:36 demon@tin: Unlocked for deployment [operations/mediawiki-config]: Nobody use this (duration: 01m 23s)
  • 21:28 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs2002.codfw.wmnet
  • 21:28 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs2003.codfw.wmnet
  • 20:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: MinervaNeue on testwiki (duration: 00m 47s)
  • 20:01 smalyshev@tin: Finished deploy [wdqs/wdqs@a32dbeb]: Redeploy GUI due to breakage in T165228 (duration: 02m 19s)
  • 19:59 smalyshev@tin: Started deploy [wdqs/wdqs@a32dbeb]: Redeploy GUI due to breakage in T165228
  • 18:39 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2202.codfw.wmnet
  • 18:38 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2201.codfw.wmnet
  • 18:31 arlolra: Updated Parsoid to 71c07681 (T169293)
  • 18:29 bblack: upgrading nginx on +wmf1 hosts: conf[1001-1003].eqiad.wmnet,cp1048.eqiad.wmnet,cp3036.esams.wmnet,elastic2020.codfw.wmnet,hassaleh.codfw.wmnet,hassium.eqiad.wmnet
  • 18:22 arlolra@tin: Finished deploy [parsoid/deploy@d0041f2]: Updating Parsoid to 71c07681 (duration: 11m 12s)
  • 18:11 arlolra@tin: Started deploy [parsoid/deploy@d0041f2]: Updating Parsoid to 71c07681
  • 17:46 volans: re-enabling puppet and force run on 'R:Package = nginx-common'
  • 17:38 bblack: restarting varnish-be on cp1049 (mailbox lag)
  • 17:36 bblack: restarting puppetmasters, staggered
  • 17:06 volans: disabled puppet on nitrogen
  • 16:34 chasemp: labstore2001:~# systemctl disable lvm2-activation && systemctl disable lvm2-activation-early && systemctl reset-failed (slated to be reimaged by madhu -- this alert is non-actionable)
  • 16:19 urandom: Starting cassandra-a, restbase2007 (OOM)
  • 16:03 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2202.codfw.wmnet
  • 16:03 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2201.codfw.wmnet
  • 15:33 marostegui: Deploy alter table on s1 - labsdb1009 - T166204
  • 15:14 ejegg: updated civicrm from 0aa0f8f to 8840b94
  • 14:49 marostegui: Skip maiwikimedia database creation which is breaking dbstore2001 replication - T168788
  • 14:42 godog: roll-restart cassandra in services-test to pick up renewed certs
  • 14:21 marostegui: Run redact_sanitarium on db1069 and db1095 for maiwikimedia - T168788
  • 14:19 Reedy: `mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=maiwikimedia Translate` for T168782
  • 14:13 moritzm: rebooting graphite1* for kernel update
  • 13:11 zeljkof: EU SWAT finished
  • 13:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add enwiki as import source for specieswiki (T170094) (duration: 00m 47s)
  • 12:52 moritzm: installing nginx security updates on cp1*
  • 12:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1053 - T168661 (duration: 01m 03s)
  • 11:40 akosiaris: stop ircecho, icinga is misbehaving badly, no point in having it around
  • 11:28 akosiaris: restart icinga, it's reporting wrong stuff all over the place
  • 10:35 moritzm: installing nginx security updates on cp3*
  • 10:09 moritzm: rebooting graphite2* for kernel update
  • 10:04 ema: lvs[12]006: upgrade pybal to 1.13.9 T82747 T154759
  • 09:43 ema: lvs1010: upgrade pybal to 1.13.9 T82747 T154759
  • 09:41 ema: pybal 1.13.9 uploaded to apt.w.o
  • 09:21 moritzm: installing nginx security updates on cp2*
  • 08:32 moritzm: enabling jobrunner/jobchron on mw1260 (jessie video scaler)
  • 08:19 godog: upgrade grafana to 4.4.1 on krypton - T169773
  • 08:11 jynus: powercycle pc2006, was down
  • 08:09 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1260.eqiad.wmnet
  • 07:58 marostegui: Deploy alter table on s4 - db1053 - T168661
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 - T168661 (duration: 00m 47s)
  • 07:33 moritzm: rebooting netmon1001
  • 06:57 moritzm: installing apache security updates on remaining mw1* hosts
  • 06:51 moritzm: installing nginx security updates on cp4*
  • 06:51 marostegui: Manually deploy some alter tables on dbstore1001 for enwiki - T166204
  • 06:42 _joe_: rolling restart of pybal on low-traffic balancers
  • 06:14 XioNoX: restricting ssh algorithms on network devices - T170369
  • 06:11 moritzm: fixed salt setup for reimaged stat1006
  • 03:09 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jul 13 03:09:35 UTC 2017 (duration 7m 8s)
  • 03:02 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.9) (duration: 07m 57s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 09m 54s)
  • 02:26 mutante: labtestpuppetmaster2001 - flapping icinga alerts about salt-minion starting and stopping constantly - there is an accepted salt-key but it was rejected by the master, server was reinstalled but still old key - deleted old key, accepted new key (T167157)

2017-07-12

  • 23:28 thcipriani@tin: Synchronized php-1.30.0-wmf.9/extensions/CirrusSearch/resources/ext.cirrus.explore-similar.js: SWAT: Adding full URLs to Explore Similar API calls T149809 T164856 (duration: 00m 47s)
  • 23:08 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Index deletes everywhere T163235 (duration: 00m 47s)
  • 22:08 mobrovac@tin: Finished deploy [recommendation-api/deploy@d5076c2]: (no justification provided) (duration: 01m 50s)
  • 22:06 mobrovac@tin: Started deploy [recommendation-api/deploy@d5076c2]: (no justification provided)
  • 21:54 andrewbogott: restarting nodepool to pick up a config change
  • 21:42 mobrovac@tin: Finished deploy [recommendation-api/deploy@ca816ac]: (no justification provided) (duration: 02m 24s)
  • 21:40 mobrovac@tin: Started deploy [recommendation-api/deploy@ca816ac]: (no justification provided)
  • 21:29 demon@tin: Synchronized wmf-config/mobile.php: MinervaNeue config (duration: 00m 46s)
  • 21:28 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: MinervaNeue config (duration: 00m 46s)
  • 21:27 demon@tin: Synchronized wmf-config/InitialiseSettings.php: MinervaNeue config (duration: 00m 47s)
  • 21:22 demon@tin: Finished scap: Rebuilding l10n cache for new skin (duration: 38m 47s)
  • 20:43 demon@tin: Started scap: Rebuilding l10n cache for new skin
  • 20:16 bsitzmann@tin: Finished deploy [mobileapps/deploy@3f90bf1]: Update mobileapps to d30dae2 (T169930, T170225) (duration: 05m 00s)
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@3f90bf1]: Update mobileapps to d30dae2 (T169930, T170225)
  • 19:37 XioNoX: adding ignore-l3-incompletes to all peering/transit interfaces - T163542
  • 19:27 thcipriani@tin: Synchronized php: promote php symlink group1 wikis to 1.30.0-wmf.9 (duration: 00m 45s)
  • 19:25 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.9
  • 19:10 demon@tin: Synchronized php-1.30.0-wmf.9/skins/MinervaNeue/: Latest code (duration: 00m 47s)
  • 19:09 demon@tin: Synchronized php-1.30.0-wmf.7/skins/MinervaNeue/: Latest code (duration: 00m 48s)
  • 18:16 mobrovac@tin: Finished deploy [recommendation-api/deploy@7fd10f2]: Use the domain parameter as the target language - T170439 (duration: 00m 40s)
  • 18:15 mobrovac@tin: Started deploy [recommendation-api/deploy@7fd10f2]: Use the domain parameter as the target language - T170439
  • 18:15 andrewbogott: depooling labvirt1015, deleting a bunch of stuck contintcloud instances
  • 18:14 demon@tin: Synchronized static/images/project-logos/: Fixing srwikiquote logos (duration: 00m 48s)
  • 17:52 chasemp: labnodepool1001:~# sudo puppet agent --enable
  • 17:29 chasemp: labnodepool1001:~# service nodepool stop
  • 17:21 _joe_: rolling restart of pybal on low-traffic LVS in eqiad,codfw
  • 17:19 chasemp: labnet1001:~# service nova-api restart
  • 17:15 chasemp: labcontrol1001:~# service rabbitmq-server restart
  • 17:13 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs2002.codfw.wmnet
  • 17:13 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs2003.codfw.wmnet
  • 17:07 moritzm: upgrading nginx on mwdebug*
  • 16:52 godog: roll-restart and upgrade thumbor in eqiad
  • 16:47 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api
  • 16:36 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: service=recommendation-api,dc=eqiad
  • 16:15 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: service=recommendation-api,dc=codfw
  • 16:14 ema: downgrade pybal to 1.13.6 on lvs1010 T82747 T154759 (1.13.7 throwing exceptions)
  • 16:09 godog: upload thumbor 1.0-1 to install1002
  • 16:06 mobrovac@tin: Finished deploy [recommendation-api/deploy@eb2fef3]: (no justification provided) (duration: 00m 33s)
  • 16:06 ema: lvs1006, lvs1010: upgrade pybal to 1.13.7 T82747 T154759
  • 16:05 mobrovac@tin: Started deploy [recommendation-api/deploy@eb2fef3]: (no justification provided)
  • 15:57 mobrovac@tin: Finished deploy [recommendation-api/deploy@ed41fc4]: Initial deploy on canary scb2001, take #3 - T165760 (duration: 00m 46s)
  • 15:56 mobrovac@tin: Started deploy [recommendation-api/deploy@ed41fc4]: Initial deploy on canary scb2001, take #3 - T165760
  • 15:52 mobrovac@tin: Finished deploy [recommendation-api/deploy@ed41fc4]: Initial deploy on canary scb2001, take #2 - T165760 (duration: 00m 06s)
  • 15:52 mobrovac@tin: Started deploy [recommendation-api/deploy@ed41fc4]: Initial deploy on canary scb2001, take #2 - T165760
  • 15:51 mobrovac@tin: Finished deploy [recommendation-api/deploy@ed41fc4]: Initial deploy on canary scb2001 - T165760 (duration: 00m 15s)
  • 15:51 mobrovac@tin: Started deploy [recommendation-api/deploy@ed41fc4]: Initial deploy on canary scb2001 - T165760
  • 15:07 ema: lvs2006: upgrade pybal to 1.13.7 T82747 T154759
  • 14:56 marostegui: Run redact_sanitarium on db1069 and db1095 for maiwikimedia - T168788
  • 14:55 marostegui: Run redact_sanitarium on db1069 and db1095 for maiwikimedia - T169510
  • 14:37 moritzm: installing apache security updates on californium / horizon.wikimedia.org
  • 14:28 addshore@tin: Synchronized wmf-config/extension-list-labs: Add Newsletter to extension-list PT1/2 (duration: 00m 46s)
  • 14:27 addshore@tin: Synchronized wmf-config/extension-list: Add Newsletter to extension-list PT1/2 (duration: 00m 47s)
  • 14:17 jynus: restarting labsdb1005 (toolsdb)
  • 14:14 madhuvishy: Disable icinga notifications and event handler checks for labsdb1005
  • 14:07 marostegui: Run redact_sanitarium on db1069 for dinwiki - T169193
  • 13:23 marostegui: Deploy alter table s1 - db1072 - T166204
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T166204 (duration: 00m 46s)
  • 13:17 zeljkof: EU SWAT finished!
  • 13:14 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Temporarily set $wgPropertySuggesterClassifyingPropertyIds to [ 31 ]. (T169058) (duration: 00m 46s)
  • 13:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T166204 (duration: 00m 46s)
  • 12:54 dereckson@tin: Synchronized wmf-config/interwiki.php: Interwiki map update (duration: 00m 46s)
  • 12:53 marostegui: Deploy alter table s1 - db1095 - T166204
  • 12:26 moritzm: reimage mw1260 (video scaler) to jessie
  • 12:20 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Logos for mai.wikimedia (T168782) (duration: 00m 46s)
  • 12:18 dereckson@tin: Synchronized static/images/project-logos: Logos for mai.wikimedia (T168782) (duration: 00m 46s)
  • 12:14 godog: upgrade nginx on thumbor and prometheus machines
  • 12:04 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: +mai.wikimedia new subdomain (duration: 00m 46s)
  • 12:03 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for mai.wikimedia (T168782) (duration: 00m 46s)
  • 12:02 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: +maiwiki
  • 12:01 dereckson@tin: Synchronized dblists: +maiwikimedia (duration: 00m 46s)
  • 11:57 Dereckson: Run add wiki maintenance script for maiwikimedia database / mai.wikimedia.org (T168782)
  • 11:52 moritzm: installing apache security updates on mw*
  • 11:34 Dereckson: Run add wiki maintenance script for dinwiki database / din.wikipedia.org (T168518)
  • 11:28 moritzm: installing spice security updates
  • 11:21 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for din.wikipedia (thanks Urbanecm) (T168518) (duration: 00m 46s)
  • 11:20 dereckson@tin: Synchronized langlist: +din (duration: 00m 46s)
  • 11:18 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: +dinwiki
  • 11:11 moritzm: installing tomcat security updates
  • 11:04 dereckson@tin: Synchronized dblists: Create din.wikipedia (duration: 00m 49s)
  • 10:54 moritzm: installing nginx updates on ms1001/dataset1001
  • 10:46 _joe_: running namespaceDupes.php on eswiki, T170176
  • 10:45 moritzm: upgrading nginx on meitnerium/archiva.wikimedia.org
  • 10:31 jynus: stopping all mysql instances on dbstore2002 and doing an in-place upgrade
  • 10:04 moritzm: installing nginx security updates on mw* canaries
  • 09:19 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=(citoid|restbase-async)
  • 08:58 moritzm: uploading nginx 1.11.10-1+wmf3 for jessie-wikimedia/stretch-wikimedia
  • 08:32 XioNoX: asw-b-codfw back up - T169345
  • 08:20 XioNoX: restarting asw-b-codfw for upgrade
  • 08:11 marostegui: Stop MySQL on db2033 (x1) - T169510
  • 07:54 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=(citoid|restbase-async)
  • 07:39 XioNoX: depooled codfw for T169345
  • 07:34 marostegui: Rename table migrateuser_medium on db1094 and db1079 - T170310
  • 07:29 marostegui: Drop table localisation_file_hash from testwiki and drop database l10nwiki on s3 - T119811
  • 07:27 marostegui: Drop table localisation_file_hash from enwiki - T119811
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T168661 (duration: 00m 44s)
  • 06:49 _joe_: saved the current state of mediawiki-staging (in detached head) in the branch "wtf-live"; saved what is in master on tin in "wtf-master"; reset master to the latest commit in origin/master
  • 03:10 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jul 12 03:10:42 UTC 2017 (duration 6m 54s)
  • 03:03 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.9) (duration: 13m 57s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 09m 25s)

2017-07-11

  • 23:25 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Config changes for LoginNotify (T107707) (duration: 00m 47s)
  • 21:20 bblack: varnish backend restart on cp1072 (mailbox lag)
  • 20:25 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.30.0-wmf.9
  • 20:23 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: Can't use NS_MODULE constant T170317 (duration: 00m 43s)
  • 19:54 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: revert group0 to 1.30.0-wmf.9
  • 19:53 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.30.0-wmf.9
  • 19:32 cmjohnson1: powering off mw1199 to reset idrac
  • 19:29 thcipriani@tin: Finished scap: testwiki to php-1.30.0-wmf.9 and rebuild l10n cache (duration: 25m 11s)
  • 19:17 paravoid: shutting down sodium for iDRAC reset (T169360)
  • 19:17 ejegg: updated payments-wiki from f935c06 to bdc5226
  • 19:04 thcipriani@tin: Started scap: testwiki to php-1.30.0-wmf.9 and rebuild l10n cache
  • 18:47 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2154.codfw.wmnet
  • 18:43 thcipriani@tin: Pruned MediaWiki: 1.30.0-wmf.6 [keeping static files] (duration: 06m 28s)
  • 18:28 dcausse: T169498: elastic@eqiad huge but short load spike on 24+ nodes (despite the workaround on token_count_router deployed)
  • 18:27 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2154.codfw.wmnet
  • 18:25 mutante: mw2154 - depool for attempting IPMI fix
  • 18:17 mutante: mw2202 - repooled
  • 18:09 mutante: mw2201 - repooled
  • 17:20 mutante: mw2201, mw2202 - depool appservers for T169360 (drain flea power)
  • 17:19 thcipriani: starting branch cut for 1.30.0-wmf.9 T167893
  • 16:32 bblack: restarting varnish backend on cp1074 (mailbox lag)
  • 16:16 tzatziki: Removed 2FA for Arsog1985 SUL account (T168779)
  • 15:56 dcausse: restarting elastic on relforge100*.eqiad.wmnet to pickup a new version of the ltr plugin
  • 15:36 moritzm: rolling restart of thumbor to pick up tiff and expat security updates
  • 15:29 marostegui: Stop replication labsdb1010 for maintenance - T153743
  • 15:25 marostegui: Stop replication labsdb1009 for maintenance - T153743
  • 15:24 elukey: restart burrow on krypton
  • 15:21 moritzm: rebooting uranium for kernel update
  • 14:31 marostegui: Deploy alter table on db1064 - commonswiki and let it replicate to db1095 and labsdb1009, 1010 and 1011 - T168661
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T168661 (duration: 00m 43s)
  • 14:26 moritzm: installing apache security updates on mw2*
  • 14:24 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw2118.codfw.wmnet
  • 14:21 jynus: rebooting labsdb1004 for kernel upgrade T168584
  • 14:16 jynus: upgrade wmf-mariadb10 on labsdb1004
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1079 original weight (duration: 00m 42s)
  • 14:13 madhuvishy: Disable event handler icinga checks for labsdb1004
  • 14:10 madhuvishy: disabled icinga notifications for host and services for labsdb1004
  • 13:44 dereckson@tin: Synchronized php-1.30.0-wmf.7/extensions/EventLogging/modules/ext.eventLogging.subscriber.js: Don't subscribe EventLogging twice if window.onload fires twice (T170018) (duration: 00m 42s)
  • 13:35 Dereckson: Purged https://en.wikipedia.org/static/images/project-logos/eswikibooks.png
  • 13:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: High density logos for es.wikibooks (T170248, 2/2) (duration: 00m 42s)
  • 13:26 dereckson@tin: Synchronized static/images/project-logos/: High density logos for es.wikibooks (T170248, 1/2) (duration: 00m 43s)
  • 10:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 with 0 weight - T166204 (duration: 00m 41s)
  • 10:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1079 weight (duration: 00m 42s)
  • 10:37 akosiaris: enable puppet everywhere but on einsteinium (icinga host) for merge of https://gerrit.wikimedia.org/r/#/c/363295/5
  • 10:36 XioNoX: bump BFD timer from 300 to 600 on the eqiad-codfw link for T170131
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 - T168661 (duration: 00m 42s)
  • 10:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1079 weight (duration: 00m 42s)
  • 10:11 marostegui: Drop table localisation_file_hash from commonswiki - T119811
  • 10:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1079 weight (duration: 00m 42s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 with low weight - T153743 (duration: 00m 42s)
  • 09:41 moritzm: installing tiff security updates
  • 09:40 marostegui: Stop slave s6 on db1102 for exporting its content - T153743
  • 09:12 moritzm: reboot sarin for kernel update
  • 08:50 marostegui: Deploy alter table on s4 - db1081 - T168661
  • 08:49 moritzm: rebooting mw1169 for kernel update
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T168661 (duration: 00m 42s)
  • 08:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 - T168661 (duration: 00m 42s)
  • 08:25 akosiaris: disable puppet everywhere but on einsteinium (icinga host) for merge of https://gerrit.wikimedia.org/r/#/c/363295/5
  • 08:24 akosiaris: disable puppet on einsteinium (icinga host) for merge of https://gerrit.wikimedia.org/r/#/c/363295/5
  • 08:18 marostegui: Drop localisation_file_hash table from dewiki (s5) - T119811
  • 07:57 marostegui: Drop localisation_file_hash table from frwiki and jawiki (s6) - T119811
  • 07:38 marostegui: Stop MySQL db1102 for maintenance - T153743
  • 07:35 volans: amending previous SAL, I meant ircecho ofc
  • 07:34 volans: bouncing icinga-wm (tcpircbot) on einsteinium to get back its primary nick
  • 07:26 marostegui: Stop MySQL on db1079 for maintenance - T153743
  • 07:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T153743 (duration: 00m 41s)
  • 07:07 marostegui: Deploy alter table db1084 - T168661
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 - T168661 (duration: 00m 42s)
  • 06:58 marostegui: Deploy alter table on s1 - dbstore1002 - T166204
  • 06:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1059 - T168661 (duration: 00m 41s)
  • 05:14 marostegui: Deploy alter table on db1066 - T166204
  • 05:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T166204 (duration: 00m 43s)
  • 05:08 marostegui: Deploy alter table on enwiki - labsdb1011 - T166204
  • 02:32 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jul 11 02:32:17 UTC 2017 (duration 6m 39s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 08m 46s)
  • 00:07 legoktm: running mwscript refreshLinks.php --wiki=metawiki --namespace=2 on terbium (T145366)

2017-07-10

  • 23:25 thcipriani@tin: Synchronized php-1.30.0-wmf.7/extensions/Flow/Hooks.php: SWAT: Do not override other flags on enhanced recent changes T169181 (duration: 00m 42s)
  • 23:08 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-sr.svg: SWAT Compress srlogo for Wikipedia T165896 (duration: 00m 43s)
  • 22:28 twentyafterfour: reloaded apache2 config on iridium to activate the changes from https://gerrit.wikimedia.org/r/#/c/363356/
  • 22:28 bawolff@tin: Synchronized php-1.30.0-wmf.7/extensions/CentralAuth/includes/specials/SpecialCentralAutoLogin.php: T134931 (duration: 00m 44s)
  • 22:13 MaxSem: Re-ran cleanupTitles.php on Meta with live fix applied, works now (ref T61837)
  • 22:08 reedy@tin: Synchronized php-1.30.0-wmf.7/includes/: (no justification provided) (duration: 01m 33s)
  • 22:00 bblack: restart varnish backend on cp1099 (mailbox lag)
  • 21:06 volans: running IPMI auditing to update status of T150160
  • 19:48 MaxSem: Running cleanupTitles.php on Meta
  • 19:14 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/CirrusSearch/: Ignore archive records with null page_id (T169977) (duration: 00m 52s)
  • 19:09 niharika29@tin: Synchronized wmf-config/: Stop disabling MFTidyMobileViewSections (T168671) and Logo changes for various wiki projects (T165896) (duration: 00m 21s)
  • 19:07 niharika29@tin: Synchronized static/images/mobile/: Logo changes for various wiki projects [mediawiki-config] - https://gerrit.wikimedia.org/r/364241 (https://phabricator.wikimedia.org/T165896) (duration: 00m 20s)
  • 19:07 niharika29@tin: scap failed: average error rate on 2/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:58 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Logo and favicon changes for arbcom_dewiki (T166947) (duration: 00m 20s)
  • 18:57 niharika29@tin: Synchronized static/: Logo and favicon changes for arbcom_dewiki (T166947) (duration: 00m 20s)
  • 18:56 niharika29@tin: scap failed: average error rate on 2/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:47 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Add wgMetaNamespace / wgMetaNamespaceTalk for lv.wiktionary [mediawiki-config] - https://gerrit.wikimedia.org/r/364197 (T170065) (duration: 00m 20s)
  • 18:45 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:43 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:41 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:40 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:36 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Logo updates for sr.wikiquote (T168444) (duration: 00m 40s)
  • 18:35 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:34 niharika29@tin: Synchronized static/images/project-logos: Logo updates for sr.wikiquote (T168444) (duration: 00m 40s)
  • 18:32 niharika29@tin: Synchronized static/images/project-logos/srwikiquote-1.5x.png: Logo updates for sr.wikiquote (T168444) (duration: 00m 41s)
  • 18:23 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Remove Programs and Participation namespaces from meta.wikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/363745 (T61837) (duration: 00m 42s)
  • 18:22 niharika29@tin: scap aborted: wmf-config/InitialiseSettings.php Remove Programs and Participation namespaces from meta.wikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/363745 (T61837) (duration: 06m 19s)
  • 18:15 niharika29@tin: Started scap: wmf-config/InitialiseSettings.php Remove Programs and Participation namespaces from meta.wikimedia [mediawiki-config] - https://gerrit.wikimedia.org/r/363745 (T61837)
  • 17:55 otto@tin: Finished deploy [eventstreams/deploy@3d37f5d]: Redirect routes for RCStream deprecation (duration: 02m 41s)
  • 17:55 ottomata: disabling RCStream varnish routing: T170157
  • 17:52 otto@tin: Started deploy [eventstreams/deploy@3d37f5d]: Redirect routes for RCStream deprecation
  • 17:35 ejegg: updated payments-wiki from 8bdd706 to f935c06
  • 17:07 gehel@tin: Finished deploy [wdqs/wdqs@1b3b73e]: (no justification provided) (duration: 01m 42s)
  • 17:06 gehel@tin: Started deploy [wdqs/wdqs@1b3b73e]: (no justification provided)
  • 17:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T166204 (duration: 00m 42s)
  • 16:14 nuria@tin: Finished deploy [eventlogging/analytics@5e16da1]: (no justification provided) (duration: 00m 04s)
  • 16:14 nuria@tin: Started deploy [eventlogging/analytics@5e16da1]: (no justification provided)
  • 15:56 elukey@tin: Finished deploy [analytics/refinery@6da2774]: Update stat1002 with the last refinery deployment (duration: 00m 04s)
  • 15:55 elukey@tin: Started deploy [analytics/refinery@6da2774]: Update stat1002 with the last refinery deployment
  • 15:47 milimetric@tin: Finished deploy [analytics/refinery@6da2774]: Update Sqoop fix python error (duration: 00m 07s)
  • 15:47 milimetric@tin: Started deploy [analytics/refinery@6da2774]: Update Sqoop fix python error
  • 15:43 milimetric@tin: Finished deploy [analytics/refinery@6da2774]: Update Sqoop fix python error (duration: 01m 36s)
  • 15:42 milimetric@tin: Started deploy [analytics/refinery@6da2774]: Update Sqoop fix python error
  • 15:37 milimetric@tin: Finished deploy [analytics/refinery@6da2774]: Update Sqoop fix python error (duration: 02m 06s)
  • 15:35 milimetric@tin: Started deploy [analytics/refinery@6da2774]: Update Sqoop fix python error
  • 15:24 andrewbogott: adding two new hosts (labvirt1014 and labvirt1015) to the nova-compute scheduling pool. Possible nodepool side-effects, maybe good ones?
  • 15:14 marostegui: Drop ukwikimedia_p views from labsdb hosts - T169488
  • 15:00 moritzm: installing apache security updates on app server canaries
  • 14:40 andrewbogott: rebooting labvirt1015-1018 for kernel updates
  • 14:23 marostegui: Deploy alter table on db1059 - T168661
  • 14:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1059 - T168661 (duration: 00m 41s)
  • 14:20 marostegui@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 14:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 - T168661 (duration: 00m 42s)
  • 14:13 moritzm: reimaging mw2118 (video scaler) to jessie
  • 14:12 TabbyCat: mwscript updateCollation.php --wiki=frwiktionary --previous-collation=uppercase is being run by zfilipin to finish T169810
  • 14:11 zeljkof: EU SWAT finished
  • 14:01 dcausse: elastic@eqiad unbanning elastic1018 & elastic1021
  • 13:49 chasemp: labstore2003:~# umount -fl /srv/backup/tools (for T169774 recovery)
  • 13:44 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgCategoryCollation to uca-default for fr.wiktionary (T169810) (duration: 00m 42s)
  • 13:43 dcausse: elastic@eqiad banning elastic1018 & elastic1021 to rebalance heavy shards
  • 13:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add import sources for specieswiki (T170094) (duration: 00m 43s)
  • 13:29 moritzm: installing graphite2 security updates (image lib)
  • 13:28 marostegui: Disable puppet on db1102 to run check_private_data - T153743
  • 13:20 zfilipin@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: [cirrus] Enable the token_count_router only for chinese (T169498) (duration: 00m 43s)
  • 13:06 milimetric@tin: Finished deploy [analytics/refinery@c22eb93]: Update Sqoop with better parallelism (duration: 02m 54s)
  • 13:04 milimetric@tin: Started deploy [analytics/refinery@c22eb93]: Update Sqoop with better parallelism
  • 12:58 marostegui: Run redact_sanitarium on s2 and s6 - db1102 - T153743
  • 12:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1079 to become master for sanitarium3 - T153743 (duration: 00m 41s)
  • 12:23 marostegui@tin: scap failed: average error rate on 2/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 12:22 moritzm: installing xorg-server security updates
  • 12:22 marostegui@tin: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 12:12 marostegui: Deploy alter table on db1091 - T168661
  • 12:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T168661 (duration: 00m 42s)
  • 12:02 marostegui: Upgrade db1102 to 10.1 and enable rbr triggers - T153743
  • 12:02 moritzm: installing bind security updates (we only have client libs/tools installed)
  • 11:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1097 - T168661 (duration: 00m 42s)
  • 11:36 marostegui: Stop MySQL on db1102 for maintenance - T153743
  • 11:16 moritzm: installing libgcrypt and expat security updates
  • 11:16 kartik@tin: Finished deploy [cxserver/deploy@c209bec]: Update cxserver to 3375da5 (duration: 02m 49s)
  • 11:13 kartik@tin: Started deploy [cxserver/deploy@c209bec]: Update cxserver to 3375da5
  • 10:28 addshore: WMDE Summer campaign deploy slot DONE
  • 10:28 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: WMDE Summer campaign - Add logging (fix spacing) NOOP (duration: 00m 43s)
  • 10:22 addshore@tin: Synchronized php-1.30.0-wmf.7/extensions/WikimediaEvents: WMDE Summer campaign - Add hook (duration: 00m 42s)
  • 10:21 addshore@tin: Synchronized php-1.30.0-wmf.7/extensions/CentralAuth: CentralAuth (undeployed patches) gerrit:363892, gerrit:363893, gerrit:363891 & revert gerrit:364182 T169261 (duration: 00m 47s)
  • 10:13 addshore: reverting https://gerrit.wikimedia.org/r/#/c/363891 as it is sitting on tin undeployed T169261
  • 09:59 moritzm: rebooting mc2* servers for kernel update
  • 09:54 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: WMDE Summer campaign - Add logging (duration: 00m 45s)
  • 09:10 marostegui: Compress innodb on wikidata on dbstore2001
  • 09:00 moritzm: rebooting mw1168 (video scaler) for kernel update
  • 08:52 moritzm: rebooting mwlog2001 for kernel update
  • 08:35 moritzm: rebooting ms1001 for kernel update
  • 08:29 moritzm: rebooting francium for kernel update
  • 08:17 marostegui: Deploy alter table on db1097 - T168661
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 - T168661 (duration: 00m 46s)
  • 08:16 marostegui@tin: scap failed: average error rate on 2/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 08:03 marostegui: Drop database l10nwiki on s2 - T119811
  • 07:53 moritzm: rebooting hafnium for kernel update
  • 07:18 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=cp3009.*
  • 07:13 moritzm: reboot netmon1001 for kernel update
  • 06:11 marostegui: Deploy alter table on s1 - db1080 and db1067 - T166204
  • 06:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080, depool db1067 - T166204 (duration: 00m 42s)
  • 05:59 marostegui@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 02:27 l10nupdate@tin: scap failed: average error rate on 2/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)

2017-07-08

  • 22:14 bd808: Deleted ukwikimedia records in CentralAuth localuser and localnames tables for T170005.

2017-07-07

  • 21:54 legoktm@tin: Synchronized php-1.30.0-wmf.7/extensions/CentralAuth/: Fix handling of password hash upgrade on login - T169261 (duration: 00m 45s)
  • 21:52 demon@tin: Synchronized wmf-config/interwiki.php: Updating interwiki cache, T169979 (duration: 00m 43s)
  • 15:07 marostegui: Stop MySQL on db1102 for MariaDB upgrade
  • 15:00 dcausse: deleting commonswiki_file_1499379383 on elastic@eqiad (failed reindex)
  • 12:20 elukey: restart mysql on dbstore1002 - high swap used
  • 11:40 moritzm: rebooting rdb* servers in codfw for kernel update
  • 10:30 gehel: restarting elastic1043 (corrupted statistics)
  • 09:42 gehel: unbanning elastic1020 and 1026 from elasticsearch eqiad
  • 09:37 gehel: restarting elastic1036 (corrupted statistics)
  • 09:30 TabbyCat: Global rename of Idh0854 → Garam has finished (T167031)
  • 09:24 moritzm: installing NTP security updates on trusty hosts
  • 09:23 akosiaris: schedule a month's worth of downtime for ores100X
  • 08:56 moritzm: restarting HHVM on app server canaries to pick up libgcrypt and expat updates
  • 08:54 _joe_: reenabling puppet across the fleet
  • 08:52 _joe_: restarting apache on all puppetmaster, after a successful puppet run
  • 08:39 _joe_: disabling puppet across the fleet for enabling directory environments in puppet
  • 08:32 moritzm: installing expat security updates
  • 08:27 TabbyCat: Starting global rename of Idh0854 → Garam (T167031)
  • 08:23 gehel: banning elastic1020 and elastic1026 from elasticsearch eqiad cluster
  • 07:55 moritzm: installing libgcrypt security updates
  • 07:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 - T166204 (duration: 00m 42s)
  • 07:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2056 - T169510 (duration: 00m 43s)
  • 07:37 marostegui@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 06:49 moritzm: rebooting bast3002 for kernel update

2017-07-06

  • 23:26 demon@tin: Synchronized php-1.30.0-wmf.7/extensions/MobileFrontend/includes/MobileFrontend.hooks.php: Only message box styles should be loaded on editor (duration: 00m 43s)
  • 23:15 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Wikivoyage projects can show more than 3 related articles (duration: 00m 43s)
  • 21:52 ppchelko@tin: Finished deploy [changeprop/deploy@e1230e6]: Extend automatic blacklisting T169911 (duration: 01m 09s)
  • 21:51 ppchelko@tin: Started deploy [changeprop/deploy@e1230e6]: Extend automatic blacklisting T169911
  • 20:27 ejegg: turned all SmashPig jobs back on
  • 19:51 ejegg: updated SmashPig from d4458fa to 523d6dd
  • 19:46 ejegg: rolled back SmashPig to d4458fa
  • 19:44 ejegg: updated SmashPig from d4458fa to 523d6dd
  • 19:40 ejegg: disabled smashpig jobs and donation queue consumer
  • 19:23 chasemp: labstore2003 time bash restore.sh &> /tmp/restore_7_6_2017v1.log for T169774
  • 19:22 demon@tin: Finished scap: Forcing l10n rebuild for James_F, plus some wmf-config cleanup (duration: 17m 22s)
  • 19:05 demon@tin: Started scap: Forcing l10n rebuild for James_F, plus some wmf-config cleanup
  • 18:51 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/CodeMirror: (no justification provided) (duration: 00m 43s)
  • 18:38 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/CirrusSearch/: Fix metastore.php notices https://gerrit.wikimedia.org/r/#/c/363637/ (duration: 00m 53s)
  • 18:31 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/CirrusSearch/: Fix metastore.php notices https://gerrit.wikimedia.org/r/#/c/363637/ (duration: 00m 54s)
  • 18:11 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Add CodeMirror as a beta feature [mediawiki-config] - https://gerrit.wikimedia.org/r/363497 (duration: 00m 43s)
  • 18:03 herron: moved ununpentium to exim4-daemon-light - T169794
  • 16:50 demon@tin: Synchronized README: Testing testing 1 2 3 (duration: 00m 44s)
  • 16:45 godog: manually create mwdeploy's new home
  • 16:12 godog: bounce thumbor to apply https://gerrit.wikimedia.org/r/#/c/363626/
  • 16:02 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp4014.ulsfo.wmnet,service=varnish-be
  • 15:12 herron: extend mx[1,2]001 exim log retention to 60 days - T167333
  • 14:50 moritzm: rebooting prometheus1003/1004 for kernel update
  • 14:39 moritzm: rebooting prometheus2004 for kernel update
  • 14:39 ema: repool cp4006
  • 14:29 ema: restart pybal on lvs4001 T169765
  • 14:26 moritzm: rebooting prometheus2003 for kernel update
  • 14:20 ema: restart pybal on lvs4003 T169765
  • 14:16 godog: upgrade labmon to grafana 4.4.1 - T169773
  • 14:08 ema: restart pybal on lvs4002 T169765
  • 14:06 ema: restart pybal on lvs4004 T169765
  • 14:02 ema: depool cp4006
  • 13:59 moritzm: rebooting restbase2001 for kernel update
  • 13:29 ema: cp4013: upgrade to varnish 4.1.7-1wm1 and reboot for kernel update
  • 13:28 marostegui: Stop MySQL on db2056 for maintenance - T169510
  • 13:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2056 - T169510 (duration: 00m 44s)
  • 12:58 herron: changed lists.wikimedia.org spf to soft fail (~all) - T167703
  • 12:48 moritzm: rebooting mc* servers in codfw for kernel update
  • 12:20 moritzm: reboot lithium for kernel update
  • 12:14 marostegui: Deploy alter table db1083 - https://phabricator.wikimedia.org/T166204
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T166204 (duration: 00m 44s)
  • 11:15 elukey: reboot conf2003 for kernel updates
  • 11:07 moritzm: rebooting restbase1017 for kernel update
  • 10:52 elukey: reboot conf2002 for kernel update
  • 10:34 moritzm: rebooting ocg1001/1002 for kernel update
  • 10:18 moritzm: rebooting ocg1003 for kernel update
  • 08:52 moritzm: rebooting restbase-test cluster for kernel updates
  • 08:36 moritzm: rebooting restbase1014 for kernel update
  • 08:04 moritzm: rebooting restbase1013 for kernel update
  • 07:48 jynus@tin: Synchronized wmf-config/db-eqiad.php: Revert parsercaches to pc100[456] (duration: 00m 43s)
  • 07:42 moritzm: reboot wasat for kernel update
  • 07:15 marostegui: Stop MySQL on dbstore2002 for maintenance - T169510
  • 07:11 marostegui: Disable puppet on dbstore2002 - T169510
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T168661 (duration: 00m 44s)
  • 06:42 moritzm: rebooting wtp1* for kernel update
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083, depool db1089 - T168661 (duration: 00m 43s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T168661 (duration: 00m 42s)
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T168661 (duration: 00m 42s)
  • 06:19 marostegui@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 06:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T168661 (duration: 00m 42s)
  • 05:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T168661 (duration: 00m 42s)
  • 05:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T168661 (duration: 00m 42s)
  • 05:16 marostegui: Stop mysql on db2056 for maintenance - T148507 T169510
  • 05:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 - T168661 (duration: 00m 43s)
  • 05:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T168661 (duration: 00m 43s)
  • 04:56 marostegui: Deploy alter table on s1 eqiad hosts - T168661
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jul 6 02:37:44 UTC 2017 (duration 6m 39s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 09m 25s)
  • 01:26 ejegg: re-enabled fundraising jobs
  • 00:46 mutante: labcontrol1002 has multiple IPs, 208.80.154.102 (no DNS name) and 208.80.154.12 (labservices1002). labservices1002 is another host that ALSO has the 208.80.154.12 IP and 208.80.154.20 (lab-recursor1). Can the duplicate IP be removed from one of them? T169039
  • 00:21 twentyafterfour: phabricator deployment really finished this time. really.
  • 00:18 twentyafterfour: diffusion fatals resolved by restarting apache and clearing phabricator's bytecode cache
  • 00:16 twentyafterfour: restarting apache and clearing phabricator caches
  • 00:04 twentyafterfour: phabricator update completed.
  • 00:01 twentyafterfour: preparing to deploy phabricator release/2017-07-05/1 (Milestone: https://phabricator.wikimedia.org/project/view/2881/ )

2017-07-05

  • 23:34 eileen: civicrm updated from a9e3e0c to 8914782
  • 22:47 Reedy: running `mwscript updateArticleCount.php --wiki=commonswiki --update` on screen on terbium T169822
  • 22:29 mutante: subra/suhail: re-enabled puppet, now with role::spare, no more poolcounter, scheduled icinga downtimes for decom (T169506)
  • 22:27 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Syncing InitialiseSettings for CodeMirror deployment (take 4) (duration: 00m 42s)
  • 22:26 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 22:21 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Syncing InitialiseSettings for CodeMirror deployment (take 2) (duration: 00m 42s)
  • 22:20 demon@tin: Synchronized wmf-config/CommonSettings.php: apache_request_headers protection (duration: 00m 42s)
  • 22:16 mutante: subra/suhail: disabling puppet, stopping poolcounterd, stopping other services, first step of decom, replaced by poolcounter200[12] (T169506)
  • 22:14 niharika29@tin: Finished scap: Deploying Codemirror on testwiki- full scap (T169284) (duration: 20m 43s)
  • 22:09 eileen: upgrade CiviCRM from ea9e3af to a9e3e0c
  • 21:53 niharika29@tin: Started scap: Deploying Codemirror on testwiki- full scap (T169284)
  • 21:53 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Syncing InitialiseSettings for CodeMirror deployment (duration: 00m 42s)
  • 21:51 niharika29@tin: scap aborted: Deploying Codemirror on testwiki- full scap (T169284) (duration: 03m 10s)
  • 21:47 niharika29@tin: Started scap: Deploying Codemirror on testwiki- full scap (T169284)
  • 21:38 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/CodeMirror/: Deploying CodeMirror to testwiki (T169284) (duration: 00m 44s)
  • 21:03 chasemp: add madhuvishy to wmf-nda phab group
  • 20:58 eileen: update CiviCRM from e53d621 to ea9e3af
  • 20:14 RainbowSprinkles: commonswiki: nevermind that article count thing
  • 20:07 mutante: phab2001 - deleted /etc/systemd/system/phd.service (base::service_unit uses /lib/systemd/system/phd.service both have DIFFERENT content and conflicted, causing systemd degradation after reboot)
  • 19:48 RainbowSprinkles: commonswiki: running updateArticleCount.php (against the vslow slave)
  • 19:31 mutante: phab2001 - rebooting for kernel upgrade
  • 19:19 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/MobileApp/config/android.json: Syncing in hopes of invalidating cache (duration: 00m 42s)
  • 18:59 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Fix nowikisource template namespace subpages [mediawiki-config] - https://gerrit.wikimedia.org/r/362272 (duration: 00m 42s)
  • 18:46 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Add 'WP' namespace alias to ruwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/362267 (duration: 00m 42s)
  • 18:41 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Add H as wgNamespaceAlias to NS_HELP for en.wikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/362508 (duration: 00m 42s)
  • 18:36 niharika29@tin: Synchronized php-1.30.0-wmf.7/extensions/MobileApp/: Enable description editing for all wikis except enwiki. (T146705) (duration: 00m 43s)
  • 18:26 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgCategoryCollation to 'numeric' at he.wikisource [mediawiki-config] - https://gerrit.wikimedia.org/r/362592 (duration: 00m 43s)
  • 18:14 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable OOjs UI EditPage buttons on es/fr/it/ja/ru-wiki and meta [mediawiki-config] - https://gerrit.wikimedia.org/r/360370 (duration: 00m 45s)
  • 18:08 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mobile non-JavaScript editing on ptwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/361455 (duration: 00m 45s)
  • 18:00 moritzm: rebooting tungsten for kernel update
  • 17:53 moritzm: rebooting osmium for kernel update
  • 17:46 gehel: cleaning /srv/wdqs/import on all wdqs servers
  • 17:41 apergos: re-enabled puppet on stat1003 (last dataset nfs client), manually mounted /mnt/data because puppet run has an unrelated error
  • 16:33 jynus: restart mysql on db2062
  • 16:04 ema: restart pybal on lvs200[12] to make them reconnect to conf2001
  • 16:03 ema: restart pybal on lvs200[45] to make them reconnect to conf2001
  • 15:54 jynus: restart mysql on db2072
  • 15:30 apergos: re-enabled puppet on stat1002, did a manual run, dataset filesystem available again there
  • 15:09 apergos: re-enabled puppet on snapshot6,7, still watching dataset1001 performance
  • 15:09 ema: restart pybal on lvs2003 to make it reconnect to conf2001
  • 14:45 ema: bounce pybal on lvs2006, not synced with etcd information
  • 14:40 moritzm: rebooting restbase1012 for kernel update
  • 14:19 moritzm: rebooting logstash100[4-6] for kernel update
  • 14:00 moritzm: rebooting logstash100[1-3] for kernel update
  • 13:59 ema: cache_misc: upgrade to varnish 4.1.7-1wm1 and reboot for kernel update
  • 13:48 apergos: re-enabling puppet on snapshot1001, 1005 for testing
  • 13:46 moritzm: rebooting restbase1011 for kernel update
  • 13:44 zeljkof: EU SWAT finished!
  • 13:43 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Set Wikibase readFullEntityIdColumn setting to false (duration: 00m 42s)
  • 13:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WikiLove for ckbwiki (T169563) (duration: 00m 43s)
  • 13:24 zfilipin@tin: Synchronized dblists/closed.dblist: SWAT: Reopen nlwikinews (T168764) (duration: 02m 50s)
  • 13:21 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1196.eqiad.wmnet
  • 13:18 apergos: power cycled dataset1001, crashed, unresponsive on mgmt console
  • 13:18 zfilipin@tin: Synchronized dblists/closed.dblist: SWAT: Reopen nlwikinews (T168764) (duration: 02m 50s)
  • 13:16 elukey: reboot conf2001 for kernel updates
  • 13:09 moritzm: rebooting restbase1010 for kernel update
  • 12:49 marostegui: Force BBU relearn on db1016 - T166344
  • 12:36 marostegui: Move labsdb1010 main general replication thread to a named replication thread called db1095 - T153743
  • 12:33 marostegui: Stop all replication threads on db1095 for maintenance - T153743
  • 12:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 - T153743 (duration: 02m 49s)
  • 12:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T168661 (duration: 02m 50s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T168661 (duration: 02m 51s)
  • 12:11 apergos: puppet is currently disabled again on snapshots 1,5,6,7 and on dataset1001; we saw the same nfs issue shortly after reboot, with no dump processes going, as snapshots 5,6,7 had not remounted the filesystem
  • 11:20 moritzm: rebooting wtp2* servers for kernel update
  • 11:14 moritzm: rebooting restbase1009 for kernel update
  • 10:56 hashar: restarting Jenkins for plugin upgrades
  • 10:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T168661 (duration: 02m 59s)
  • 10:41 marostegui: Run redact_sanitarium on s6 databases db1102 - T153743
  • 10:41 moritzm: rebooting wtp1001 for kernel update
  • 10:37 moritzm: rebooting restbase1008 for kernel update
  • 10:32 apergos: rebooting snapshot hosts to clean up hung nfs client processes
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T168661 (duration: 02m 51s)
  • 10:24 apergos: rebooted dataset1001 to unstick nfsd and pick up new kernel, re-enabled puppet
  • 10:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T168661 (duration: 02m 50s)
  • 10:11 moritzm: rebooting restbase1007 for kernel update
  • 10:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T168661 (duration: 02m 50s)
  • 09:57 marostegui: Deploy alter table on s1 eqiad hosts - T168661
  • 09:48 godog: move 'instances' graphite hierarchy out of the way, do not delete yet - T143405
  • 09:27 marostegui: Stop MySQL on db1085 for maintenance - T153743
  • 09:21 godog: upload nginx_1.11.10-1+wmf2 to jessie-wikimedia and nginx_1.11.10-1+wmf2~stretch1 to stretch-wikimedia
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 - T153743 (duration: 02m 50s)
  • 08:44 apergos: puppet disabled and processes accessing dataset1001 exported filesystem shot, on: stat1002,3, snapshot1001,5,6,7, while investigation continues
  • 07:27 moritzm: rebooting restbase-dev* for kernel update
  • 07:13 moritzm: rebooting notebook* hosts
  • 05:18 marostegui: Deploy alter table on s3 master - db1075 - T168661
  • 05:13 marostegui: Deploy alter table on s7 master - db1062 - T168661
  • 05:08 marostegui: Force a relearn on db1046's BBU - T166141
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 10m 23s)

2017-07-04

  • 21:40 volans: ACK'ed puppet not running on stat100[2-3],snapshot100[1,5-7] due to NFS overloaded on dataset1001 - T169680
  • 16:54 jynus: dropping ukwikimedia from several labsdbhosts
  • 16:10 moritzm: rebooting radium for kernel update
  • 15:09 mobrovac@tin: Finished deploy [citoid/deploy@9d22567]: Fallback to crossRef (T165105) and use MarcXML (T165105) (duration: 02m 52s)
  • 15:06 mobrovac@tin: Started deploy [citoid/deploy@9d22567]: Fallback to crossRef (T165105) and use MarcXML (T165105)
  • 15:02 godog: set operations/debs/nginx as hidden and update description
  • 14:57 ema: pybal 1.13.7 uploaded to apt.w.o, testing it on pybal-test2001 T82747 T154759
  • 14:31 godog: copy nginx from jessie-wikimedia to stretch-wikimedia
  • 14:15 paravoid: reset db2038's iLO
  • 13:06 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2005.codfw.wmnet
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove comments from db1039 status - T166208 (duration: 02m 50s)
  • 11:25 joal@tin: Finished deploy [analytics/refinery@88cbb9e]: Regular weekly deploy (2) - Bug patch (duration: 03m 38s)
  • 11:21 joal@tin: Started deploy [analytics/refinery@88cbb9e]: Regular weekly deploy (2) - Bug patch
  • 11:15 elukey: powercycle elastic1018, host unreachable
  • 11:02 joal@tin: Finished deploy [analytics/refinery@12c5f57]: Regular weekly deploy (duration: 04m 47s)
  • 11:00 moritzm: rebooting kubernetes workers for kernel update
  • 10:58 godog: copy wikimedia-lvs-realserver from jessie-wikimedia to stretch-wikimedia
  • 10:57 joal@tin: Started deploy [analytics/refinery@12c5f57]: Regular weekly deploy
  • 10:53 gehel: killing stuck wmf-reimage on puppetmaster1001 for maps-test2001
  • 10:40 marostegui: Stop replication on db1102 (sanitarium3) on s2 shard for maintenance - T153743
  • 10:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 - T153743 (duration: 02m 49s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1035 - T168661 (duration: 02m 49s)
  • 10:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1035 - T168661 (duration: 02m 50s)
  • 09:58 marostegui: Move labsdb1009 main general replication thread to a named replication thread called db1095 - T153743
  • 09:54 marostegui: Stop replication on db1095 for maintenance - T153743
  • 09:38 moritzm: rebooting restbase2002-restbase2004 for kernel updates
  • 09:27 moritzm: rebooting thumbor1001/1002 for kernel updates
  • 08:54 marostegui: Run redact_sanitarium on db1102 (sanitarium3) - T153743
  • 08:39 moritzm: rebooting sca2* for kernel update
  • 08:25 elukey: restart redis 6380 (slave) jobqueue instance on rdb1004/2003 to force resync with master
  • 08:12 moritzm: powercycling mw1260, stuck in reboot
  • 07:56 moritzm: powercycling mw1259, stuck in reboot
  • 07:52 gehel: restart of relforge for kernel upgrade
  • 07:42 moritzm: rebooting video scalers in eqiad for kernel update
  • 07:15 marostegui: Deploy alter table on s3 hosts (eqiad) - T168661
  • 06:05 marostegui: Stop MySQL on db1060 for maintenance - T153743
  • 05:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 - T153743 (duration: 02m 51s)
  • 05:26 marostegui: Deploy alter table on s5 directly on s5 master (db1063) - T168661
  • 05:20 marostegui: Deploy alter table on s6 directly on s6 master (db1061) - T168661
  • 05:08 marostegui: Deploy alter table on s2 directly on s2 master (db1054) - T168661
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 10m 14s)
  • 01:30 mutante: releases1001: switching GID of reprepro and prometheus-node-exporter group (1000 vs 1001), changing reprepro UID to 13927. using find -exec to fix all the permissions and make it identical to bromine. prevent permissions snafu when rsyncing (T164030)

2017-07-03

  • 20:46 gehel: unbanning elastic1018 from elasticsearch eqiad cluster
  • 20:24 gehel: banning elastic1018 from elasticsearch eqiad cluster
  • 19:29 hashar: restarting jenkins
  • 19:10 nuria@tin: Finished deploy [eventlogging/analytics@328dea6]: (no justification provided) (duration: 00m 03s)
  • 19:09 nuria@tin: Started deploy [eventlogging/analytics@328dea6]: (no justification provided)
  • 17:35 chasemp: labvirt1003:~# service nova-compute restart
  • 16:55 bd808: Running maintain-views --all-databases --clean --replace-all --debug on labsdb1001
  • 16:51 mobrovac@tin: Finished deploy [mobileapps/deploy@58a5b19]: (no justification provided) (duration: 00m 41s)
  • 16:51 mobrovac@tin: Started deploy [mobileapps/deploy@58a5b19]: (no justification provided)
  • 16:02 chasemp: labvirt1002:~# service nova-compute restart
  • 15:43 mobrovac@tin: Finished deploy [mobileapps/deploy@58a5b19]: Remove pronunciation from the spec - T169299 (duration: 09m 30s)
  • 15:33 mobrovac@tin: Started deploy [mobileapps/deploy@58a5b19]: Remove pronunciation from the spec - T169299
  • 15:30 ema: cp1099: restart varnish-be
  • 15:16 chasemp: labcontrol1001 clean out admin-monitoring leaks
  • 15:12 chasemp: labvirt1001 service nova-compute restart
  • 14:40 elukey: running EventLogging alter tables on dbstore1002 (script in /home/elukey/dbstore1002.sql) - T167162
  • 14:33 akosiaris: set enable_notification=0 in icinga
  • 13:54 moritzm: rebooting ms-be2028 to ms-be2035 for kernel update
  • 13:43 marostegui: Global rename of Antero de Quintal → JMagalhães - T169527
  • 13:39 moritzm: uploaded apache2 2.4.10+deb8u9+wmf1 to apt.wikimedia.org/jessie-wikimedia
  • 12:40 marostegui: Compress innodb on db2056 - T169510
  • 12:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add comments about db2056 status - T169510 (duration: 02m 50s)
  • 12:26 elukey: reimage stat1005 with Debian Stretch
  • 12:03 moritzm: rebooting scb1004 for kernel update
  • 10:39 moritzm: rebooting mw1298 for kernel update
  • 09:37 marostegui: Global rename of Markos90 → Mαρκος - T169396
  • 09:24 marostegui: Deploy alter table on s1 directly on codfw master (db2016) and let it replicate - T166204
  • 09:07 _joe_: restarting the passenger app on puppetmasters in codfw serially with a sleep of 3 seconds for T169493
  • 08:58 _joe_: restarting the passenger app on puppetmaster1002 for T169493
  • 08:49 gehel: unbanning elastic1020 from elasticsearch eqiad
  • 08:30 marostegui: Compress dewiki on dbstore2001 - T168354
  • 08:25 gehel: banning elastic1020 from elasticsearch eqiad waiting for its recovery
  • 08:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments about db1039 status - T166208 (duration: 02m 49s)
  • 07:51 marostegui: Deploy alter table db1039 - s7 - T166208
  • 07:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 - T166208 (duration: 03m 00s)
  • 07:35 marostegui: Drop alter table s7 - labsdb1003 - T166208
  • 07:24 volans: bounced uwsgi-graphite-web on graphite1003, log stopped since Jul 2 10:23:45
  • 05:44 marostegui: Run redact sanitarium on db1069 - T160869
  • 05:31 marostegui: Run redact sanitarium on db1095 - T160869
  • 02:40 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 14m 45s)

2017-07-02

2017-07-01

  • 13:08 ppchelko@tin: Finished deploy [restbase/deploy@8ea07d6]: Manual blacklist for russian wiki (duration: 07m 59s)
  • 13:00 ppchelko@tin: Started deploy [restbase/deploy@8ea07d6]: Manual blacklist for russian wiki
  • 00:54 mutante: APT - importing php-net-ipv4 to stretch (for librenms) T159756

2017-06-30

  • 23:16 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw1196.eqiad.wmnet
  • 23:12 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Limit thanks for new users at pl.wikipedia to 3 per day - T169268 (duration: 02m 49s)
  • 23:05 mutante: librenms has been deployed on netmon1002 - works on stretch now - except Letsencrypt part, expected. not switched yet
  • 20:03 ariel@tin: Finished deploy [dumps/dumps@02c71bc]: permit batching of abstract jobs, fix a dryrun reporting typo, smaller stub/abstract queries (duration: 00m 03s)
  • 20:03 ariel@tin: Started deploy [dumps/dumps@02c71bc]: permit batching of abstract jobs, fix a dryrun reporting typo, smaller stub/abstract queries
  • 19:41 cmjohnson1: powering off mw1196 for unresponsive idrac
  • 19:37 cmjohnson1: powering off mw1191 for unresponsive idrac
  • 19:31 cmjohnson1: powering off mw1190 to reestablish idrac connection
  • 19:21 cmjohnson1: mw1182 powering down due to unresponsive idrac
  • 17:01 paravoid: rebooting mw1196
  • 15:39 gehel: unbanning elastic1019 from cluster and keeping an eye on it
  • 14:51 gehel: banning elastic1019 from cluster to move heavy shards around
  • 14:16 bblack: reboot cp4021
  • 13:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1028 - T166208 (duration: 00m 42s)
  • 13:32 bawolff: Reset 2FA of wikitech User:Samtar (T169332)
  • 12:47 jynus: just upgraded wmf-mariadb101-client on mariadb::client hosts
  • 12:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments to db1060 about its future usage as a sanitarium master - T153743 (duration: 00m 42s)
  • 12:08 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: Replace subra/suhail as poolcounters (duration: 00m 43s)
  • 12:07 akosiaris: replace subra and suhail as poolcounters in codfw
  • 11:22 _joe_: rebooting copper for kernel upgrade
  • 11:17 _joe_: purging varnish, varnish-dbg from copper
  • 11:08 jynus: removing leftover data on tegmen T149557
  • 10:55 elukey: deploy kafkatee 0.1.6-1 to oxygen
  • 10:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 - T168661 (duration: 00m 42s)
  • 10:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 - T168661 (duration: 00m 42s)
  • 09:54 elukey: uploaded kafkatee 0.1.6-1 to reprepro - T151748
  • 09:10 marostegui: Deploy alter table on s5 all eqiad hosts (primary master not included) - T168661
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1037 - T168661 (duration: 00m 42s)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1037 - T168661 (duration: 00m 42s)
  • 08:34 ayounsi@tin: Finished deploy [librenms/librenms@3f407a7]: (no justification provided) (duration: 00m 05s)
  • 08:34 ayounsi@tin: Started deploy [librenms/librenms@3f407a7]: (no justification provided)
  • 08:29 ayounsi@tin: Finished deploy [librenms/librenms@b10cc7c]: (no justification provided) (duration: 00m 02s)
  • 08:29 ayounsi@tin: Started deploy [librenms/librenms@b10cc7c]: (no justification provided)
  • 08:16 akosiaris: poweroff labcontrol1003. It was in the Debian installer
  • 07:20 akosiaris: restart pdfrender on scb1002
  • 06:45 _joe_: started manually burrow on krypton, could not start due to a stale pidfile
  • 06:38 marostegui: Deploy alter table on s6 all eqiad hosts (primary master not included) - T168661
  • 06:12 marostegui: Deploy alter table on db1018 on s2 - T168661
  • 06:12 marostegui: Deploy alter table on db1090 on s2 - T168661
  • 06:11 marostegui: Deploy alter table on db1076 on s2 - T168661
  • 06:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 - T168661 (duration: 00m 42s)
  • 06:10 marostegui: Deploy alter table on db1074 on s2 - T168661
  • 06:07 marostegui: Deploy alter table on db1060 on s2 - T168661
  • 06:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 - T168661 (duration: 00m 42s)
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 - T168661 (duration: 00m 43s)
  • 05:56 marostegui: Deploy alter table on db1047 on s2 - T168661
  • 05:56 marostegui: Deploy alter table on db1036 on s2 - T168661
  • 05:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T168661 (duration: 00m 42s)
  • 05:47 marostegui: Deploy alter table on db1021 on s2 - T168661
  • 05:44 marostegui: Deploy alter table on dbstore1002 on s2 - T168661
  • 05:43 marostegui: Deploy alter table on dbstore1001 on s2 - T168661
  • 05:37 marostegui: Deploy alter table on db1069 (and let it replicate) on s2 - T168661
  • 02:29 chasemp: labstore1005 start drbd
  • 02:14 chasemp: reboot labstore1005 (5m ago)
  • 01:25 chasemp: reboot labstore1005
  • 01:23 chasemp: fail nfs from labstore1005 to labstore1004 (I failed to log a previous failover to 1004 and back)

2017-06-29

  • 23:32 RoanKattouw: Sorry I meant T169163
  • 23:31 catrope@tin: Synchronized php-1.30.0-wmf.7/resources/src/mediawiki.rcfilters/: RCFilters fixes (T169169, T169107, T169042) (duration: 00m 42s)
  • 23:28 catrope@tin: Synchronized php-1.30.0-wmf.7/extensions/WikimediaEvents/: Add event logging for explore-similar on SRP (T149809) (duration: 00m 42s)
  • 23:26 catrope@tin: Synchronized php-1.30.0-wmf.7/extensions/CirrusSearch/: "Explore similar" widget for CirrusSearch (T149809) (duration: 00m 54s)
  • 23:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Stop reader surveys (T131949) (duration: 00m 43s)
  • 22:16 chasemp: set cfq scheduler on labstore1005
  • 21:51 mutante: APT - uploading python-django-south from jessie to wikimedia-stretch for servermon on stretch (T159756)
  • 21:40 chasemp: reboot labstore1004 with grub set to gnulinux-advanced-1773f282-5a1b-441e-865c-8b70a0ebc925>gnulinux-4.4.0-3-amd64-advanced-1773f282-5a1b-441e-865c-8b70a0ebc925
  • 21:10 mobrovac@tin: Finished deploy [restbase/deploy@bcb83f4]: (no justification provided) (duration: 04m 21s)
  • 21:06 mobrovac@tin: Started deploy [restbase/deploy@bcb83f4]: (no justification provided)
  • 21:06 mobrovac@tin: Finished deploy [restbase/deploy@bcb83f4]: (no justification provided) (duration: 01m 02s)
  • 21:05 mobrovac@tin: Started deploy [restbase/deploy@bcb83f4]: (no justification provided)
  • 21:04 mobrovac@tin: Finished deploy [restbase/deploy@bcb83f4]: Fix special char handling in PDF back-end requests - T169223 (duration: 03m 14s)
  • 21:00 mobrovac@tin: Started deploy [restbase/deploy@bcb83f4]: Fix special char handling in PDF back-end requests - T169223
  • 20:46 mutante: APT - reprepro copy stretch-wikimedia jessie-wikimedia prometheus-snmp-exporter (to make it available on stretch for netmon1002) (T159756)
  • 20:39 ppchelko@tin: Finished deploy [changeprop/deploy@350076c]: Config: Enable red links processing. T133221 (duration: 01m 01s)
  • 20:38 ppchelko@tin: Started deploy [changeprop/deploy@350076c]: Config: Enable red links processing. T133221
  • 19:23 demon@tin: Synchronized scap/plugins/clean.py: Because I need to learn basic python syntax before trying stuff (duration: 00m 42s)
  • 19:21 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.7 refs T167536
  • 19:16 twentyafterfour: deploying wmf/1.30.0-wmf.7 to all wikis refs T167536
  • 19:06 demon@tin: Pruned MediaWiki: 1.30.0-wmf.5 [keeping static files] (duration: 01m 16s)
  • 18:30 chasemp: restart nfs on labstore1004 (primary)
  • 18:10 demon@tin: Synchronized php-1.30.0-wmf.7/extensions/TextExtracts/extension.json: T107206 (duration: 00m 47s)
  • 17:28 arlolra: Updated Parsoid to b4187f18 (T168900, T168675, T168404, T153203)
  • 17:21 arlolra@tin: Finished deploy [parsoid/deploy@717df08]: Updating Parsoid to b4187f18 (duration: 09m 41s)
  • 17:12 arlolra@tin: Started deploy [parsoid/deploy@717df08]: Updating Parsoid to b4187f18
  • 17:08 mobrovac: scb2005 repooling back the services - T167763
  • 16:21 godog: temporarily stop ircecho, puppet spam
  • 16:05 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 00m 46s)
  • 15:40 akosiaris: disable puppet on all of eqiad/esams, problems with ganeti and puppetdb
  • 15:38 chasemp: restart nfs-exportd on labstore1004
  • 15:34 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 02m 54s)
  • 15:26 mobrovac: scb2005 depooled all services for T167763
  • 15:09 chasemp: set downtimes for labstore1004/1005 failover see https://etherpad.wikimedia.org/p/labstore_reboots
  • 15:02 akosiaris: purge d-i-test from puppet/salt
  • 14:57 akosiaris: reboot aluminium.wikimedia.org bromine.eqiad.wmnet etherpad1001.eqiad.wmnet d-i-test.eqiad.wmnet kubestagetcd1001.eqiad.wmnet mx1001.wikimedia.org seaborgium.wikimedia.org for kernel upgrades
  • 14:47 jynus: several restarts of db2072 services and host over the following hour
  • 14:30 ema: varnish 4.1.7-1wm1 uploaded to apt.w.o, cp1008 upgraded T164768
  • 14:08 marostegui: Deploy alter table on s7 on dbstore1001 - T166208
  • 13:54 godog: kick sdb out of mdadm arrays on bast3002 - T169035
  • 12:56 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 00m 46s)
  • 12:47 akosiaris: reboot argon.eqiad.wmnet, darmstadtium.eqiad.wmnet, dbmonitor1001.wikimedia.org, etcd1001.eqiad.wmnet, etcd1006.eqiad.wmnet, krypton.eqiad.wmnet, mendelevium.eqiad.wmnet, mwdebug1001.eqiad.wmnet, roentgenium.eqiad.wmnet, sca1003.eqiad.wmnet for kernel upgrades
  • 12:41 akosiaris: reboot poolcounter1001 for kernel upgrades
  • 12:38 marostegui: Stop replication on dbstore1002 - x1 - T169050
  • 12:29 akosiaris: reboot nitrogen for kernel upgrades
  • 12:23 gehel: forcing reindex of cirrus / elasticsearch after switch upgrade
  • 12:23 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 03m 05s)
  • 12:20 akosiaris: depool poolcounter1001 for kernel upgrades
  • 12:18 marostegui: Re-enable event scheduler on dbstore1001 - T169050
  • 11:58 marostegui: Stop replication on the same position for: dbstore1001 (s6) and db1050 - T169050
  • 11:51 godog: create xfs filesystems on fourth partition on ms-be machines - T151648
  • 11:48 ema: cp4015: restart varnish-be
  • 11:32 ema: route ulsfo back to codfw T168462
  • 11:09 ema@neodymium: conftool action : set/ttl=300; selector: dnsdisc=(citoid|restbase-async)
  • 11:06 ema: repool codfw in DNS after T168462
  • 11:03 ema@neodymium: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=citoid
  • 11:02 ema@neodymium: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=restbase-async
  • 11:02 ema@neodymium: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=restbase-async
  • 10:57 elukey@tin: Finished deploy [analytics/refinery@f6cccf9]: Weekly refinery deployment (duration: 02m 56s)
  • 10:54 elukey@tin: Started deploy [analytics/refinery@f6cccf9]: Weekly refinery deployment
  • 10:53 elukey@tin: Finished deploy [analytics/refinery@f6cccf9]: Weekly refinery deployment (duration: 00m 11s)
  • 10:53 elukey@tin: Started deploy [analytics/refinery@f6cccf9]: Weekly refinery deployment
  • 10:47 ema@neodymium: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=citoid
  • 10:46 ema@neodymium: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=restbase-async
  • 10:45 ema: switching citoid and restbase-async back to codfw after T168462
  • 10:34 ema: re-enable puppet and start pybal on lvs2001-2003 T168462
  • 10:30 ema@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 10:30 ema: repooling acamar T168462
  • 09:29 godog: silence paging alerts for *.svc.codfw.wmnet for two hours - T168462
  • 08:34 marostegui: Shutdown MySQL and reboot db1034 for maintenance
  • 08:29 XioNoX: asw-a-codfw upgrade started - T168462
  • 08:25 ema: failover codfw LVSs to secondaries T168462
  • 08:19 elukey: restart pdfrender on scb1004 - xpra issue
  • 08:16 volans@neodymium: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=restbase-async
  • 08:15 volans@neodymium: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=citoid
  • 08:14 volans@neodymium: conftool action : set/ttl=60; selector: dnsdisc=restbase-async
  • 08:14 volans@neodymium: conftool action : set/ttl=60; selector: dnsdisc=citoid
  • 08:08 volans@neodymium: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=citoid
  • 08:07 volans@neodymium: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=restbase-async
  • 08:05 ema: bounce pybal on codfw secondary LVSs (lvs2004-2006)
  • 07:57 volans: switching citoid and restbase-async temporarily to eqiad for T168462
  • 07:47 XioNoX: Route cache traffic around codfw - T168462
  • 07:46 ema@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 07:44 XioNoX: codfw depooled from DNS - T168462
  • 07:36 elukey: depooled kafka2001.codfw.wmnet for T168462
  • 07:19 elukey@tin: Finished deploy [analytics/refinery@f6cccf9]: Updated stat1002 with the last refinery deployment (duration: 02m 55s)
  • 07:18 marostegui: Disable event scheduler temporarily on dbstore1001 - T169050
  • 07:16 elukey@tin: Started deploy [analytics/refinery@f6cccf9]: Updated stat1002 with the last refinery deployment
  • 07:13 marostegui: Deploy alter table on s7 - db1028 - T166208
  • 07:12 elukey@tin: Finished deploy [analytics/refinery@f6cccf9]: Updated stat1002 with the last refinery deployment (duration: 02m 36s)
  • 07:10 elukey@tin: Started deploy [analytics/refinery@f6cccf9]: Updated stat1002 with the last refinery deployment
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1028 - T166208 (duration: 00m 47s)
  • 07:09 joal@tin: Finished deploy [analytics/refinery@f6cccf9]: (no justification provided) (duration: 05m 55s)
  • 07:03 joal@tin: Started deploy [analytics/refinery@f6cccf9]: (no justification provided)
  • 04:01 Krinkle: 'service hhvm restart' on mwdebug1001 and mwdebug1002 (T168540)
  • 03:56 Krinkle: 'service hhvm restart' on mwdebug1001 and mwdebug1002 to help investigate T168540
  • 03:09 kartik@tin: Finished deploy [cxserver/deploy@6f0e9a7]: Update cxserver to e69353b (duration: 02m 28s)
  • 03:07 kartik@tin: Started deploy [cxserver/deploy@6f0e9a7]: Update cxserver to e69353b
  • 02:52 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jun 29 02:52:57 UTC 2017 (duration 6m 52s)
  • 02:46 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 07m 43s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.6) (duration: 09m 24s)
  • 01:08 mutante: mwlog1001 - deleted /srv/xenon/logs from 2015 and 2016 as requested by Krinkle. Also merged https://gerrit.wikimedia.org/r/#/c/362114/ so now logs are retained for 14 days
  • 00:23 krinkle@tin: Synchronized wmf-config/InitialiseSettings.php: I8ce28a4ce7 - test2wiki config cleanup (duration: 00m 47s)

2017-06-28

  • 23:44 thcipriani@tin: Synchronized php-1.30.0-wmf.7/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Adding ssclick events for sister-search results T168916 (duration: 00m 46s)
  • 23:36 thcipriani@tin: Synchronized php-1.30.0-wmf.6/extensions/MobileFrontend/includes/specials/SpecialMobileDiff.php: SWAT: Revert "Run DiffViewHeader in mobile mode, too" T169024 (duration: 00m 46s)
  • 23:35 thcipriani@tin: Synchronized php-1.30.0-wmf.7/extensions/MobileFrontend/includes/specials/SpecialMobileDiff.php: SWAT: Revert "Run DiffViewHeader in mobile mode, too" T169024 (duration: 00m 47s)
  • 22:10 demon@tin: Synchronized wmf-config/InitialiseSettings.php: rm more stupid logging, wow this stuff has piled up (duration: 00m 46s)
  • 22:09 ppchelko@tin: Finished deploy [eventstreams/deploy@ba71a84]: redeploy to pick up config changes (duration: 02m 01s)
  • 22:07 ppchelko@tin: Started deploy [eventstreams/deploy@ba71a84]: redeploy to pick up config changes
  • 22:06 demon@tin: Synchronized wmf-config/InitialiseSettings.php: kill temp-debug (duration: 00m 46s)
  • 21:50 robh: wtp1025-1048 are having icinga reporting errors, they are new installs on stretch
  • 21:48 demon@tin: Synchronized wmf-config/InitialiseSettings.php: kill weird testwiki logging (duration: 00m 47s)
  • 21:38 ppchelko@tin: Finished deploy [eventstreams/deploy@05bcc8f]: redeploy to pick up config changes (duration: 00m 20s)
  • 21:37 ppchelko@tin: Started deploy [eventstreams/deploy@05bcc8f]: redeploy to pick up config changes
  • 21:34 demon@tin: Synchronized wmf-config/InitialiseSettings.php: kill oai logging channel (duration: 00m 47s)
  • 20:17 twentyafterfour@tin: Synchronized php-1.30.0-wmf.7/extensions/VisualEditor/VisualEditor.hooks.php: sync https://gerrit.wikimedia.org/r/#/c/361941/ refs T169132 T167536 (duration: 00m 47s)
  • 20:08 mutante: migrating servermon to stretch on netmon1002 is currently blocked by "python-django-south" package not existing anymore
  • 19:36 robh: puppet suspended on install1002 for robh to livehack the dhcp file for a single reboot of wtp1025
  • 19:26 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.7 refs T167536
  • 19:26 twentyafterfour@tin: Synchronized php-1.30.0-wmf.7/extensions/LoginNotify/includes/Hooks.php: deploy https://gerrit.wikimedia.org/r/#/c/361935/ to wmf.7 refs T168899 + T167536 (duration: 00m 45s)
  • 19:17 twentyafterfour: cherry-picked https://gerrit.wikimedia.org/r/#/c/361935/ to wmf.7 refs T168899 + T167536
  • 19:00 ebernhardson: starting load testing of elasticsearch in codfw
  • 18:31 joal@tin: Finished deploy [analytics/refinery@f6cccf9]: Regular deploy - One week late- Big changes (duration: 04m 49s)
  • 18:26 joal@tin: Started deploy [analytics/refinery@f6cccf9]: Regular deploy - One week late- Big changes
  • 18:13 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable autopatrol flag on ptwikivoyage T168981 (duration: 00m 47s)
  • 18:05 aaron@tin: Synchronized wmf-config/CommonSettings.php: Set $wgTrxProfilerLimits[PostSend] to avoid notices for now (duration: 00m 47s)
  • 18:04 kartik@tin: Finished deploy [cxserver/deploy@894e3fe]: (no justification provided) (duration: 02m 03s)
  • 18:02 kartik@tin: Started deploy [cxserver/deploy@894e3fe]: (no justification provided)
  • 15:52 marostegui: Temporarily ignore jawiki.watchlist table during replication on dbstore1001 - T169050
  • 15:47 kartik@tin: Finished deploy [cxserver/deploy@894e3fe]: (no justification provided) (duration: 02m 47s)
  • 15:44 kartik@tin: Started deploy [cxserver/deploy@894e3fe]: (no justification provided)
  • 15:29 jynus: slowly enabling puppet on pending database hosts, checking diff on each one
  • 14:42 hashar: pypi.python.org is back again - T169091
  • 14:06 hashar: pypi.python.org has an issue with its CDN . That would affect any CI jobs relying on tox/python - See https://status.python.org for updates and T169091
  • 14:03 hashar: pypi.python.org has an issue with its CDN . That would affect any CI jobs relying on tox/python - See https://status.python.org for updates
  • 13:51 XioNoX: tightening BGP configuration on cr* routers - T169048
  • 13:44 gehel: start reimage of the maps-test cluster - T169011
  • 13:30 akosiaris: renumber install1002
  • 12:47 marostegui: Deploy alter table on s3 directly on codfw master (db2018) and let it replicate - T168661
  • 12:42 jynus: starting enabling puppet on db2* hosts
  • 12:37 XioNoX: restricted inbound BGP to configured neighbors on pfw - T169048
  • 12:18 marostegui: Deploy alter table on s7 directly on codfw master (db2029) and let it replicate - T168661
  • 11:48 akosiaris: renumber dubnium fermium meitnerium ununpentium
  • 11:14 elukey: stop eventlogging_sync on db1047 - alter tables running
  • 11:04 jynus: restarting db2062's mysql
  • 10:52 jynus: restarting db2072's mysql for testing of new config
  • 09:05 legoktm@tin: Synchronized php-1.30.0-wmf.7/includes/parser/ParserCache.php: Add debug logging for T168040 (duration: 00m 46s)
  • 08:46 legoktm@tin: Synchronized php-1.30.0-wmf.6/includes/parser/ParserCache.php: Add debug logging for T168040 (duration: 00m 48s)
  • 07:49 jynus: disable puppet on all database hosts for deployment of gerrit:361456
  • 07:33 marostegui: Re-enable event scheduler on dbstore2001 - T168354
  • 07:01 elukey: stop jobrunner/jobchron on mw130[4,5,6] and reboot them for kernel updates
  • 06:43 elukey: stop jobrunner/jobchron on mw130[2,3] and reboot them for kernel updates
  • 06:37 elukey: restart pdfrender.service on scb1003 - xpra race condition
  • 06:35 elukey: executed sudo -u _graphite find /var/lib/carbon/whisper/eventstreams/rdkafka -type f -mtime +10 -delete on graphite1001 to free space
  • 06:34 marostegui: Stop Replication in sync on db2033 and dbstore2001 (x1) - T168354
  • 05:55 marostegui: Temporarily disable event scheduler on dbstore2001 - https://phabricator.wikimedia.org/T168354
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove comments from db1033 status - T166208 (duration: 00m 47s)
  • 05:24 marostegui: Stop MySQL and reboot db1034 for maintenance - T166208
  • 03:05 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jun 28 03:05:57 UTC 2017 (duration 7m 0s)
  • 02:58 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.7) (duration: 14m 50s)
  • 02:46 eileen: Update civicrm from d558df2 to e53d621
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.6) (duration: 07m 55s)
  • 01:43 demon@tin: Synchronized README: profiling (duration: 00m 47s)

2017-06-27

  • 23:26 demon@tin: Synchronized php-1.30.0-wmf.6/extensions/RelatedArticles/: Hygiene and stuff (duration: 00m 46s)
  • 23:22 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Only enable logging on enwiki for MobileFormatter#moveFirstParagraphBeforeInfobox (duration: 00m 46s)
  • 23:20 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Removing wgMFContentNamespace (duration: 00m 46s)
  • 23:14 demon@tin: Synchronized portals: (no justification provided) (duration: 00m 47s)
  • 23:13 demon@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 47s)
  • 23:05 demon@tin: Synchronized dblists/: ukwikimedia swapped from closed to deleted (duration: 00m 46s)
  • 22:44 demon@tin: Synchronized README: force co-master sync (duration: 00m 47s)
  • 21:58 bblack: pybal restarts on lvs4004,lvs4002 for misc@ulsfo
  • 21:50 bblack: removing cp4001-4 (cache_misc@ulsfo), expect a few minor related alerts from race conditions
  • 21:24 bblack: cp1074: restart backend (mailbox lag)
  • 21:03 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.30.0-wmf.7 refs T167536
  • 20:46 twentyafterfour@tin: Finished scap: sync 1.30.0-wmf.7 and promote to test wikis - refs T167536 (duration: 30m 44s)
  • 20:16 twentyafterfour@tin: Started scap: sync 1.30.0-wmf.7 and promote to test wikis - refs T167536
  • 18:41 godog: switch thumbor back on with a fix for T168949
  • 18:35 godog: upgrade thumbor to 0.1.41
  • 18:25 gehel: reduce cluster_concurrent_rebalance to 8 and node_concurrent_recoveries to 4 on elasticsearch eqiad
  • 18:05 hashar: Some CI jobs are broken with "tidy.so: cannot open shared object file: No such file or directory" see T169004
  • 17:52 twentyafterfour: branching 1.30.0-wmf.7 - T167536
  • 17:44 bblack: restart pybal on lvs4004
  • 16:37 mutante: releases1001 - setting boot parameters to network, rebooting
  • 16:26 mutante: rebooting ganeti instance releases1001 - which is down network-wise but was running
  • 16:23 godog: revert back to imagescalers for thumbs - T168949
  • 16:22 twentyafterfour: restarted apache on iridium, phabricator was running an old version of libphutil
  • 14:22 elukey: stop jobchron/jobrunner on mw1300 and mw1301 and reboot the hosts for kernel updates
  • 13:52 marostegui: Rename table enwiki.localisation_file_hash on db1089 - T119811
  • 12:35 marostegui: Deploy alter table on s4 directly on codfw master (db2019) to let it replicate - T168661
  • 12:19 marostegui: Deploy alter table on s5 directly on codfw master (db2023) to let it replicate - T168661
  • 12:06 elukey: stop jobchron/jobrunner on mw1167 and mw1299 and reboot the hosts for kernel updates
  • 11:58 marostegui: Deploy alter table on s6 directly on codfw master (db2028) to let it replicate - T168661
  • 11:54 elukey: stop nova-spiceproxy and neutron-metadata-agent on labtestnet2001 to keep the root partition from filling up
  • 11:48 akosiaris: upload apertium-spa-cat_2.1.0~r79717-1 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:36 elukey: stop jobchron/jobrunner on mw116[56] and reboot the hosts for kernel updates
  • 11:36 akosiaris: upload apertium-spa_1.1.0~r79716-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:36 akosiaris: upload apertium-cat_2.2.0~r79715-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 10:29 elukey: stop jobchron/jobrunner on mw116[34] and reboot the hosts for kernel updates
  • 10:25 elukey: re-enabled puppet and eventlogging_sync on db1047
  • 09:49 marostegui: executing alter tables to the log database on dbstore1002 for https://phabricator.wikimedia.org/T167162#3340421
  • 09:43 bawolff@tin: Synchronized php-1.30.0-wmf.6/api.php: Use redirect for api requests with pathinfo (duration: 00m 43s)
  • 09:24 gehel: restart of maps eqiad cluster completed
  • 08:59 elukey: stop puppet and eventlogging_sync on db1047
  • 08:46 elukey: executing alter tables to the log database on db1047 for https://phabricator.wikimedia.org/T167162#3340421
  • 08:44 gehel: reboot maps eqiad cluster
  • 08:33 gehel: restart of maps codfw cluster completed
  • 08:25 akosiaris: upload etherpad-lite_1.6.0-3 to apt.wikimedia.org/jessie-wikimedia/main
  • 08:18 elukey: stop jobchron/jobrunner on mw116[12] and reboot the hosts for kernel updates
  • 08:14 marostegui: Re-enable event scheduler on dbstore2001 - T168354
  • 08:08 godog: roll-restart swift-proxy on ms-fe1* to pick up thumbor changes
  • 07:57 gehel: reboot maps codfw cluster
  • 07:16 marostegui: Temporarily disable event scheduler on dbstore2001 - T168354
  • 07:11 marostegui: Deploy alter table db1034 - T166208
  • 06:48 marostegui: Deploy alter table s7 on labsdb1001 - T166208
  • 06:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 - T166208 (duration: 00m 43s)
  • 06:40 marostegui: Deploy alter table s7 - dbstore1002 - T166208
  • 05:58 elukey: restored rdb2004 as slave of rdb2003 (end of experiment)
  • 05:08 marostegui: Global rename of Green Cardamom → GreenC - T168776
  • 05:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T166208 (duration: 00m 43s)
  • 03:43 mutante: smokeping on stretch means 2.6.11-3 vs 2.6.9-1 we had before
  • 03:35 mutante: smokeping - stop/rsync/fix permissions/start one more time to minimize gaps in graphs - now fully migrated netmon1001->netmon1002, historic data has been copied (T159756)
  • 03:28 mutante: netmon1002 - ganglia apache_status.py broken in stretch (?), ganglia deprecated, stopping gmond, aggregator role got removed, was for torrus
  • 03:03 mutante: netmon1002 - fixing permissions on /var/lib/smokeping rrd files (rsynced, inconsistent UIDs)
  • 02:29 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jun 27 02:29:22 UTC 2017 (duration 6m 25s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.6) (duration: 07m 46s)
  • 00:39 mutante: netmon1001 - rsyncing smokeping data (/var/lib/smokeping) over to netmon1002

2017-06-26

  • 23:51 maxsem@tin: Synchronized php-1.30.0-wmf.6/extensions/Kartographer/: https://gerrit.wikimedia.org/r/#/c/361584/ (duration: 00m 44s)
  • 23:38 maxsem@tin: Synchronized fonts/: https://gerrit.wikimedia.org/r/361195 (duration: 00m 45s)
  • 23:24 twentyafterfour@tin: Synchronized php-1.30.0-wmf.6/extensions/Scribunto/engines/LuaSandbox/Engine.php: deploy https://gerrit.wikimedia.org/r/#/c/361508 (duration: 00m 43s)
  • 23:23 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/361508
  • 22:56 halfak@tin: Finished deploy [ores/deploy@82dfd56]: Unscheduled/urgent deploy (T168099) (duration: 30m 55s)
  • 22:49 bd808: Updated LDAP loginShell to /bin/bash for 969 accounts that were still set to /usr/local/bin/sillyshell (T86668)
  • 22:34 legoktm@tin: Synchronized php-1.30.0-wmf.6/extensions/Linter/includes/ApiRecordLint.php: Add debug logging for missing 'dsr' - T168900 (duration: 00m 43s)
  • 22:32 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Enable 'Linter' debug log channel (duration: 00m 44s)
  • 22:27 mutante: netmon1001 - deactivate rancid crons - now running on netmon1002 instead - avoid duplicate mails (T159756)
  • 22:25 halfak@tin: Started deploy [ores/deploy@82dfd56]: Unscheduled/urgent deploy (T168099)
  • 21:50 robh: shutting down and decommissioning mw117[0-9] per T168271
  • 21:27 bawolff: deployed patch for T128209
  • 21:00 robh: attempting firmware update on lvs1007, which is currently offline
  • 20:38 bsitzmann@tin: Finished deploy [mobileapps/deploy@07066c7]: Update mobileapps to 0b05026 (duration: 03m 41s)
  • 20:34 bsitzmann@tin: Started deploy [mobileapps/deploy@07066c7]: Update mobileapps to 0b05026
  • 19:56 herron: updated ops list accept_these_nonmembers regex (T168903)
  • 19:41 hashar: Restarted Jenkins to lower console log spam ( https://gerrit.wikimedia.org/r/#/c/359116/ )
  • 19:35 urandom: T160570: Upgrading restbase-dev1003 to Cassandra 3.11.0 (release)
  • 19:30 urandom: T160570: Upgrading restbase-dev1002 to Cassandra 3.11.0 (release)
  • 19:05 mobrovac@tin: Finished deploy [restbase/deploy@3975ab2]: Update Parsoid HTML version to 1.5.0 - T39902 (duration: 06m 16s)
  • 18:59 mobrovac@tin: Started deploy [restbase/deploy@3975ab2]: Update Parsoid HTML version to 1.5.0 - T39902
  • 18:51 arlolra: Updated Parsoid to b59045f2 (T39902, T149794)
  • 18:32 urandom: T160570: Upgrading restbase-dev1001 to Cassandra 3.11.0 (release)
  • 18:31 arlolra@tin: Finished deploy [parsoid/deploy@70538a6]: Updating Parsoid to b59045f2 (duration: 11m 13s)
  • 18:20 arlolra@tin: Started deploy [parsoid/deploy@70538a6]: Updating Parsoid to b59045f2
  • 18:18 niharika29@tin: Finished scap: wmf-config/InitialiseSettings.php Deploy Quiz extension on huwikibooks (https://gerrit.wikimedia.org/r/#/c/361084) (duration: 03m 14s)
  • 18:15 niharika29@tin: Started scap: wmf-config/InitialiseSettings.php Deploy Quiz extension on huwikibooks (https://gerrit.wikimedia.org/r/#/c/361084)
  • 18:14 niharika29@tin: scap failed: RuntimeError scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details) (duration: 02m 15s)
  • 18:14 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:11 niharika29@tin: Started scap: wmf-config/InitialiseSettings.php Deploy Quiz extension on huwikibooks (https://gerrit.wikimedia.org/r/#/c/361084)
  • 17:46 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.6
  • 17:36 twentyafterfour: Deploying 1.30.0-wmf.6 to all wikis refs T167535
  • 17:35 twentyafterfour: resuming the train for wmf.6 which was blocked at group 1
  • 17:12 gehel@tin: Finished deploy [wdqs/wdqs@f8b9294]: (no justification provided) (duration: 03m 42s)
  • 17:09 gehel@tin: Started deploy [wdqs/wdqs@f8b9294]: (no justification provided)
  • 16:59 elukey: EXPERIMENT - T163337 - set slaveof no one on rdb2004 to remove its dependency to rdb2003 (puppet disabled on rdb2004, to rollback just systemctl unmask redis-instance-tcp_6380.service, enable/run puppet and start redis if it is not up)
  • 16:55 elukey: stop neutron-server on labtestnet2001 to keep the root partition from filling up
  • 15:41 marostegui: Deploy alter table s7 - db1079 - T166208
  • 15:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T166208 (duration: 00m 46s)
  • 15:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T166208 (duration: 00m 46s)
  • 14:47 marostegui: Deploy alter table on silver and labtestweb2001 - T168661
  • 13:49 marostegui: Deploy alter table s7 - db1033 - T166208
  • 13:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments to db1033 status - T166208 (duration: 00m 48s)
  • 13:08 elukey: truncate /var/log/upstart/neutron-server.log on labtestnet2001 (root filled up, spam in logs for 'ERROR neutron.service OperationalError: (sqlite3.OperationalError) no such table:')
  • 12:58 marostegui: Deploy alter table on db2062 and db2055 - T168661
  • 12:55 elukey: reboot mw129[5,6,7,8] for kernel update (mw imagescalers, two at a time)
  • 12:02 marostegui: Deploy alter table on s2 codfw master (db2017) and let it replicate - T168661
  • 11:05 godog: roll-restart pybal in codfw to pick up thumbor.svc.codfw.wmnet
  • 10:28 elukey: reboot mw1288->90 for kernel updates (last batch of api-appservers)
  • 10:18 elukey: reboot mw128[4,5,6,7] for kernel updates (api-appservers)
  • 10:03 godog: roll-restart nginx on thumbor to disable te: chunked
  • 09:34 elukey: reboot mw128[0,1,2,3] for kernel updates (api-appservers)
  • 09:04 elukey: reboot mw127[6,7,8,9] for kernel updates (api-appservers)
  • 08:58 elukey: reboot mw127[3,4,5] for kernel updates (appservers)
  • 08:50 gehel: starting restart of elasticsearch codfw for kernel upgrade
  • 08:48 elukey: reboot mw1269 -> mw1272 for kernel updates (appservers)
  • 08:37 godog: roll-restart swift-proxy to use thumbor for commons
  • 08:28 elukey: reboot mw1258, 126[6,7,8] for kernel updates (appservers)
  • 08:11 elukey: reboot mw125[4,5,6,7] for kernel updates (appservers)
  • 07:55 marostegui: Stop replication on db1069:3313 (s3) and db1044 in the same position - T166546
  • 07:15 elukey: restart pdfrender on scb1002 for the xpra issue
  • 07:08 elukey: powercycle elastic1017 (stuck in console, no ssh access)
  • 06:57 marostegui: Drop table wikilove_image_log from silver - T127219
  • 06:56 elukey: truncated neutron-server.log files in /var/log on labtestnet2001 to free some space in root
  • 06:55 marostegui: Drop table wikilove_image_log from s1 - T127219
  • 06:51 marostegui: Drop table wikilove_image_log from s3 - T127219
  • 06:50 elukey: execute sudo -u _graphite find /var/lib/carbon/whisper/eventstreams/rdkafka -type f -mtime +15 -delete on graphite1001 to free some space for /var/lib/carbon
  • 06:49 marostegui: Drop table wikilove_image_log from s7 - T127219
  • 06:47 marostegui: Drop table wikilove_image_log from s2 - T127219
  • 06:45 marostegui: Drop table wikilove_image_log from s4 - T127219
  • 06:44 marostegui: Drop table wikilove_image_log from s6 - T127219
  • 06:36 marostegui: Deploy alter table s7 - db1086 - T166208
  • 06:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T166208 (duration: 00m 46s)
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove comments from db1041 long running alter status - T166208 (duration: 00m 47s)
  • 03:01 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jun 26 03:01:35 UTC 2017 (duration 6m 52s)
  • 02:54 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.6) (duration: 08m 04s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 08m 03s)

2017-06-25

  • 09:00 elukey: Executing 'sudo -u _graphite find /var/lib/carbon/whisper/eventstreams/rdkafka -type f -mtime +15 -delete' on graphite1001 to free some space (/var/lib/carbon filling up) - T1075
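The `find ... -type f -mtime +15 -delete` cleanup logged above removes files immediately, with no preview. A minimal Python sketch of the same selection rule, with a dry-run default, might look like the following. The function names `stale_files` and `delete_stale` are hypothetical and not part of any tooling mentioned in this log; the age test assumes GNU find semantics, where age is counted in whole 24-hour periods and `+N` means strictly more than N.

```python
import os
import stat
import time


def stale_files(root, days, now=None):
    """List regular files under root matching find's '-type f -mtime +days'.

    Assumption: GNU find semantics, i.e. age is truncated to whole
    24-hour periods and '+days' means strictly more than 'days'.
    """
    now = time.time() if now is None else now
    matches = []
    # os.walk does not follow directory symlinks by default, like find.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat: do not follow symlinks
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, and other non-regular files
            age_days = int((now - info.st_mtime) // 86400)
            if age_days > days:
                matches.append(path)
    return sorted(matches)


def delete_stale(root, days, dry_run=True, now=None):
    """Return the matching files; remove them only when dry_run is False."""
    matches = stale_files(root, days, now=now)
    if not dry_run:
        for path in matches:
            os.remove(path)
    return matches
```

Calling `delete_stale("/var/lib/carbon/whisper/eventstreams/rdkafka", 15)` would only list candidates; passing `dry_run=False` would delete them, and would need to run as a user with write access to that tree.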

2017-06-23

  • 23:42 akosiaris: bounce celery-ores-worker on scb1004
  • 19:38 ppchelko@tin: Finished deploy [changeprop/deploy@ffabd13]: Re-enable ORES rules back (duration: 01m 07s)
  • 19:37 ppchelko@tin: Started deploy [changeprop/deploy@ffabd13]: Re-enable ORES rules back
  • 19:34 akosiaris: restart celery-ores-workers on scb1001, scb1002, scb1003, leave scb1004 alone
  • 18:39 godog: roll restart celery-ores-worker in codfw
  • 17:01 mobrovac@tin: Finished deploy [changeprop/deploy@1f45fae]: Temporarily disable ORES (ongoing outage) (duration: 01m 19s)
  • 16:59 mobrovac@tin: Started deploy [changeprop/deploy@1f45fae]: Temporarily disable ORES (ongoing outage)
  • 16:44 mobrovac: scb1001 disabling puppet
  • 16:34 akosiaris: restart celery ores worker on scb1003
  • 15:54 hashar_: Restarted Jenkins
  • 15:45 godog: bounce celery-ores-worker on scb1001 with logging level INFO
  • 13:51 akosiaris: issue flushdb on oresrdb1001:6379
  • 13:21 akosiaris: issue flushdb on oresrdb1001:6379
  • 13:13 akosiaris: bump uwsgi-ores and celery-ores-worker on scb100*
  • 12:38 akosiaris: disable changeprop due to ORES issues
  • 12:26 Amir1: restarting celery and uwsgi on all scb nodes in eqiad
  • 11:55 Amir1: restarted uwsgi-ores and celery-ores-worker services in scb1003
  • 11:45 ema: scb1001: restart pdfrender.service
  • 09:55 elukey: reboot mw1250-53 for kernel updates
  • 09:27 jynus: reapplying dns change - small downtime on tendril until puppet deploy and run
  • 08:38 jynus: deploying dns change to tendril
  • 06:17 mutante: releases1001 - systemctl reset-failed to clear Icinga systemd status CRIT - service puppet
  • 06:17 marostegui: Deploy alter table on db1041 - s7 - T166208
  • 06:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments to db1041 long running alter status - T166208 (duration: 00m 46s)
  • 06:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 - T168354 (duration: 00m 46s)
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 - T166207 (duration: 00m 47s)
  • 00:15 mutante: RT (ununpentium) installing pending package upgrades

2017-06-22

  • 23:15 Dereckson: kbp.wikipedia wiki creation done.
  • 23:11 dereckson@tin: Synchronized wmf-config/interwiki.php: Add kbp.wikipedia to interwiki map (T160868) (duration: 00m 46s)
  • 23:07 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add kbp.wikipedia to interwiki map (T160868) (duration: 00m 47s)
  • 22:56 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for kbp.wikipedia (T160868) (duration: 00m 45s)
  • 22:54 dereckson@tin: Synchronized langlist: +kbp (T160868) (duration: 00m 46s)
  • 22:53 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: +kbpwiki (T160868)
  • 22:52 dereckson@tin: Synchronized dblists: (no justification provided) (duration: 00m 48s)
  • 22:51 Dereckson: Create tables for kbpwiki (T160868)
  • 21:43 RainbowSprinkles: gerrit: Stopping momentarily, reindexing accounts
  • 21:03 andrewbogott: restarting rabbitmq-server on labcontrol1001
  • 20:34 mutante: icinga - re-enabling disabled notifications for IPMI temp checks on some mc* and mw* hosts where check is fine and OK
  • 20:21 andrewbogott: labtestnet2001 turning neutron debug logs off because they're flooding the (very small) '/' partition
  • 19:52 twentyafterfour: the train is currently blocked by https://phabricator.wikimedia.org/T168681
  • 19:31 thcipriani@tin: Finished scap: SWAT: Translation updates for QuickSurveys T131949 (duration: 22m 10s)
  • 19:09 thcipriani@tin: Started scap: SWAT: Translation updates for QuickSurveys T131949
  • 19:04 thcipriani@tin: Synchronized wmf-config: SWAT: Create a FeaturedFeed for the Wikimag bulletin on frwiki T168005 (duration: 00m 54s)
  • 18:51 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Grant the "movefile" right to the "autopatrolled" group on rowiki T168192 (duration: 00m 48s)
  • 18:39 thcipriani@tin: Synchronized php-1.30.0-wmf.6/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Switch to data-attribute for sister-search sidebar results T164854 (duration: 00m 50s)
  • 18:29 thcipriani@tin: Synchronized wmf-config: SWAT: relatedArticles: SamplingRate -> BucketSize PART II (duration: 00m 48s)
  • 18:27 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: relatedArticles: SamplingRate -> BucketSize PART I (duration: 00m 53s)
  • 18:24 jynus: restart db2062
  • 17:51 jynus: testing in-place upgrade from jessie to stretch of db2062
  • 17:34 bsitzmann@tin: Finished deploy [mobileapps/deploy@7bfe571]: Update mobileapps to 21f771d (duration: 02m 54s)
  • 17:31 bsitzmann@tin: Started deploy [mobileapps/deploy@7bfe571]: Update mobileapps to 21f771d
  • 17:24 gehel: restarting logstash on logstash1001 to validate plugin deployment with scap3
  • 17:23 gehel@tin: Finished deploy [logstash/plugins@720b648]: (no justification provided) (duration: 00m 02s)
  • 17:23 gehel@tin: Started deploy [logstash/plugins@720b648]: (no justification provided)
  • 17:14 gehel: moving to scap for logstash plugin deployment
  • 17:13 jynus: disable puppet on db2062 before maintenance
  • 17:05 andrewbogott: rebooting labsdb1007
  • 17:04 bd808: Log events between 15:46 and 17:03 missed due to stashbot downtime
  • 17:03 andrewbogott: rebooting labsdb1007
  • 15:46 moritzm: repooling scb1003 after hardware maintenance
  • 15:31 otto@tin: Finished deploy [eventlogging/analytics@328dea6]: inserting eventlogging events into mysql based on topic name if it exists, falling back to schema name (duration: 00m 03s)
  • 15:31 otto@tin: Started deploy [eventlogging/analytics@328dea6]: inserting eventlogging events into mysql based on topic name if it exists, falling back to schema name
  • 15:21 moritzm: rebooting restbase2005 for kernel update
  • 14:37 gehel: restarting maps-test cluster for kernel upgrade
  • 14:22 gehel: restart wdqs servers completed
  • 13:55 gehel: restart wdqs servers for kernel upgrade
  • 13:45 akosiaris: reboot planet1001 for kernel upgrades and renumbering
  • 13:21 moritzm: rebooting restbase2006 for kernel update
  • 13:09 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Reader Survey using QuickSurveys - T131949 (duration: 01m 04s)
  • 12:17 moritzm: rebooting restbase2008 for kernel update
  • 11:21 moritzm: rebooting ms-be2026 to ms-be2030 for kernel update
  • 11:12 moritzm: rebooting restbase2009 for kernel update
  • 10:45 ema: cp1074: restart varnish backend
  • 10:25 moritzm: rebooting ms-be2022 to ms-be2025 for kernel update
  • 10:19 moritzm: rearmed keyholder on tin
  • 10:12 moritzm: rebooting restbase2010 for kernel update
  • 10:00 moritzm: depooled mw1228, broken disk causes boot failure
  • 09:50 moritzm: rebooting tin for kernel update
  • 09:46 jynus: reimage db2072
  • 09:42 moritzm: powercycling mw1228, stuck in reboot
  • 09:36 akosiaris: rebooting chlorine.eqiad.wmnet etcd1004.eqiad.wmnet etcd1005.eqiad.wmnet mwdebug1002.eqiad.wmnet neon.eqiad.wmnet sca1004.eqiad.wmnet for kernel upgrades
  • 09:25 moritzm: rebooting mw1221-mw1235 for kernel update
  • 09:15 moritzm: rebooting restbase2011 for kernel update
  • 09:11 marostegui: Deploy alter table s5 - labsdb1003 - T166207
  • 09:06 elukey: rebooting kafka100[23] for kernel updates (eventbus eqiad)
  • 09:01 moritzm: rebooting rhenium for kernel update
  • 08:55 marostegui: Stop MySQL and reboot labsdb1011 - T168584
  • 08:50 moritzm: rebooting restbase2012 for kernel update
  • 08:44 marostegui: Stop MySQL and reboot labsdb1010 - T168584
  • 08:40 moritzm: rearmed keyholder on naos
  • 08:32 akosiaris: reboot etcd1002 for kernel upgrades
  • 08:20 moritzm: rebooting naos for kernel update
  • 08:20 marostegui: Stop MySQL and reboot labsdb1009 - T168584
  • 08:07 moritzm: powercycling labtestservices2001 (didn't come up after reboot)
  • 07:26 moritzm: rebooting suhail/subra for kernel update
  • 07:24 elukey: reboot kafka1001 for kernel updates (eventbus eqiad)
  • 07:24 marostegui: Deploy alter table s5 - db1026 - T166207
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 - T166207 (duration: 00m 44s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 - T166207 (duration: 01m 03s)
  • 07:12 marostegui: Deploy alter table s5 - dbstore1001 - T166207
  • 06:53 moritzm: rebooting mw1205-mw1208 for kernel update
  • 06:38 moritzm: rebooting bast2001 for kernel update
  • 05:34 moritzm: rebooting mw1238-mw1249 for kernel update
  • 05:02 moritzm: rebooting ms-be2015-ms-be2020 for kernel update
  • 03:40 mutante: regarding my last log message: this is just true for stretch! ah!
  • 03:35 mutante: netmon1002 - installed psmisc to have 'killall' - will clean it up, but also suggest we add psmisc to base packages. it provides killall, fuser, pstree...
  • 02:49 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jun 22 02:49:58 UTC 2017 (duration 6m 53s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.6) (duration: 07m 23s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 08m 10s)
  • 00:15 twentyafterfour: finished phabricator deployments
  • 00:13 twentyafterfour: deploying https://phabricator.wikimedia.org/D687

2017-06-21

  • 23:19 twentyafterfour@tin: Synchronized static/images/project-logos/wikimania2017wiki.png: swat (duration: 00m 45s)
  • 23:10 twentyafterfour@tin: Synchronized static/images/project-logos/wikimania2017wiki.png: swat (duration: 00m 45s)
  • 22:38 mutante: new language din.wikipedia.org has been created in DNS - Dinka is a Nilotic dialect cluster spoken by the Dinka people, the major ethnic group of South Sudan. (T168518) - https://en.wikipedia.org/wiki/Dinka_language
  • 22:34 mutante: DNS - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones to trigger template recreation after edit to langs.tmpl
  • 22:31 chasemp: remove manual 10.64.37.26 definition from eth1 on labstore1005 in /etc/network/interfaces
  • 22:27 chasemp: reboot labstore1004 to reset network config from boot
  • 21:44 RainbowSprinkles: cobalt: updated to 2.13.8-11-gde96955fb2 (T168360, T161206)
  • 21:40 RainbowSprinkles: gerrit2001: updated to 2.13.8-11-gde96955fb2 (T168360, T161206)
  • 21:14 mutante: apt.wm.org - reprepro copy stretch-wikimedia jessie-wikimedia gerrit - make gerrit available in stretch
  • 21:05 mutante: apt.wm.org - reprepro, include gerrit_2.13.8+git1-wmf.6 for jessie-wikimedia
  • 21:02 mutante: install1002 - rsynced gerrit packages from copper, closed firewall again, cleaned up rsyncd config from old unused things
  • 20:57 arlolra: Updated Parsoid to 881ade32 (T127421, T167933, T167714)
  • 20:50 mutante: install1002 - allow rsync from copper (build host) to /srv/wikimedia/incoming , temp for package upload
  • 20:49 arlolra@tin: Finished deploy [parsoid/deploy@2c4c0de]: Updating Parsoid to 881ade32 (duration: 12m 02s)
  • 20:37 arlolra@tin: Started deploy [parsoid/deploy@2c4c0de]: Updating Parsoid to 881ade32
  • 20:35 mutante: install1002 - removing rsyncd config fragments from carbon migration, running puppet
  • 20:25 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.6
  • 20:15 bearND: rolled back deploy since scap could not connect to scb1003
  • 20:14 twentyafterfour@tin: Synchronized php-1.30.0-wmf.6/includes/gallery/ImageGalleryBase.php: deploy https://gerrit.wikimedia.org/r/#/c/360695/ refs T168479 to unblock the train (duration: 00m 56s)
  • 20:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@7bfe571]: Update mobileapps to 21f771d (duration: 08m 43s)
  • 20:13 andrewbogott: deleting the old IAD wikitech-static server so we stop paying rackspace for it
  • 20:11 ppchelko@tin: Finished deploy [changeprop/deploy@63e6a7b]: Actually start black-listing and rate-limiting articles. T161710 (duration: 01m 16s)
  • 20:09 ppchelko@tin: Started deploy [changeprop/deploy@63e6a7b]: Actually start black-listing and rate-limiting articles. T161710
  • 20:04 bsitzmann@tin: Started deploy [mobileapps/deploy@7bfe571]: Update mobileapps to 21f771d
  • 19:41 mutante: copper: building gerrit_2.13.8+git1-wmf.6 for stretch (experimental)
  • 19:39 mutante: copper: building gerrit_2.13.8+git1-wmf.6 for jessie
  • 19:30 twentyafterfour: The train for wmf.6 (T167535) is currently blocked by T168479
  • 19:13 madhuvishy: Rebooting labstore1004 (secondary in drbd pair)
  • 18:55 andrewbogott: rebooting labnet1001, which will cause a labs-wide network outage
  • 18:46 gehel: restarting wdqs-updater on all wdqs servers
  • 18:38 krinkle@tin: Synchronized static/images/: I737e6f9fce (duration: 00m 46s)
  • 18:20 gehel@tin: Finished deploy [wdqs/wdqs@d67d4a4]: (no justification provided) (duration: 01m 50s)
  • 18:18 gehel@tin: Started deploy [wdqs/wdqs@d67d4a4]: (no justification provided)
  • 18:17 gehel: deploying wdqs to fix missing lib
  • 18:14 andrewbogott: rebooting labnet1002
  • 18:09 andrewbogott: rebooting labnodepool1001
  • 18:02 andrewbogott: rebooting labcontrol1001
  • 18:02 andrewbogott: rebooting labservices1001
  • 18:02 andrewbogott: rebooting silver
  • 18:02 andrewbogott: rebooting californium
  • 17:59 andrewbogott: rebooting labservices1001
  • 17:58 andrewbogott: disabling the openstack scheduler so that we don't get new inconsistent VMs during some reboots
  • 17:53 andrewbogott: rebooting labcontrol1002
  • 17:53 andrewbogott: rebooting labservices1002
  • 17:37 twentyafterfour: phabricator is back online
  • 17:36 andrewbogott: rebooting labvirt1013
  • 17:35 herron: iridium - upgraded exim packages and rebooted to apply kernel upgrade
  • 17:35 ottomata: beginning reboots of kafka10(14|18|20|22) for kernel upgrade
  • 17:34 twentyafterfour: phabricator will be offline momentarily while iridium reboots
  • 17:25 andrewbogott: rebooting labvirt1012
  • 17:12 andrewbogott: rebooting labvirt1011
  • 16:59 andrewbogott: rebooting labvirt1010
  • 16:57 herron: reboot fermium (lists) for kernel upgrade
  • 16:42 andrewbogott: rebooting labvirt1009
  • 16:41 moritzm: rebooting video scalers in codfw for kernel update
  • 16:35 moritzm: rebooting mw1293/mw1294 for kernel update
  • 16:32 andrewbogott: rebooting labvirt1008
  • 15:53 godog: upgrade ms-be10[31-39] to swift 2.10
  • 15:46 ema: reboot lvs[4001-4002] (ulsfo primaries) for kernel update
  • 15:45 moritzm: upgrade ms-be2013/ms-be2014 to final stretch release and reboot for kernel update
  • 15:34 ema: reboot lvs[4003-4004] (ulsfo secondaries) for kernel update
  • 15:32 moritzm: reboot image scalers in codfw for kernel update
  • 15:32 andrewbogott: rebooting labvirt1007
  • 15:13 andrewbogott: rebooting labvirt1006
  • 15:04 moritzm: rebooting ruthenium for kernel update
  • 15:01 moritzm: reboot job runners in codfw for kernel update
  • 15:01 elukey: reboot kafka200[23] for kernel updates (eventbus codfw)
  • 14:53 andrewbogott: rebooting labvirt1005
  • 14:40 moritzm: reboot remaining scb* hosts for kernel update
  • 14:38 andrewbogott: rebooting labvirt1004
  • 14:32 ema: reboot lvs[3001-3002] (esams primaries) for kernel update
  • 14:25 andrewbogott: rebooting labvirt1003
  • 14:21 andrewbogott: rebooting labvirt1002
  • 14:18 herron: rebooting mx1001 for kernel upgrade
  • 14:08 ema: reboot lvs[3003-3004] (esams secondaries) for kernel update
  • 14:03 elukey: reboot eventlog2001 for kernel update
  • 14:02 andrewbogott: rebooting labvirt1001
  • 14:01 gehel: restarting wdqs1001 for kernel upgrade
  • 14:01 godog: reimage ms-be1020 / ms-be1021 with stretch
  • 13:52 gehel: install analysis-kuromoji plugin on relforge
  • 13:52 herron: install exim security updates on fermium (lists)
  • 13:51 elukey: rebooting eventlog1001 for kernel update (eventlogging host)
  • 13:50 moritzm: pruning old kernels on prometheus*
  • 13:48 addshore@tin: Synchronized php-1.30.0-wmf.6/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js: SWAT: Fix errors leading to wrong slider scroll positions T168299 (duration: 00m 44s)
  • 13:47 addshore@tin: Synchronized php-1.30.0-wmf.5/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js: SWAT: Fix errors leading to wrong slider scroll positions T168299 (duration: 00m 46s)
  • 13:44 elukey: reboot aqs100[89] for kernel updates
  • 13:39 ema: reboot lvs[2001-2003] (codfw primaries) for kernel update
  • 13:29 elukey: reboot aqs1007 for kernel update
  • 13:22 marostegui: Deploy alter table on s7 - directly on codfw master (db2029) - this will generate lag on codfw - T166208
  • 13:21 elukey: reboot kafka1013 for kernel updates
  • 13:16 marostegui: Deploy alter table s5 - labsdb1001 - T166207
  • 13:15 marostegui: Deploy alter table s5 - db1045 - T166207
  • 13:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 - T166207 (duration: 00m 44s)
  • 13:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 - T166207 (duration: 00m 46s)
  • 13:05 elukey: reboot analytics1003 (Hue, Camus, Oozie, Hive master) for kernel upgrade
  • 12:32 gehel: deploying T167871 and restarting kartotherian / tilerator on maps eqiad
  • 12:32 moritzm: rebooting mw1189-mw1199 for kernel update
  • 12:10 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=sca1004.eqiad.wmnet
  • 12:09 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mwdebug1002.eqiad.wmnet
  • 11:59 moritzm: rebooting mw1209-mw1220 for kernel update
  • 11:45 moritzm: rebooting mediawiki api servers in codfw for kernel update
  • 11:42 akosiaris: rollback change in asw-a-eqiad for ganeti interface range due to alerts
  • 11:23 akosiaris: reboot ganeti1007 for insertion into ganeti cluster
  • 11:14 elukey: reboot aqs1006 for kernel update
  • 11:04 moritzm: rebooting mw1180-mw1188 for kernel update
  • 11:02 akosiaris: starting up all instances on ganeti01.svc.codfw.wmnet
  • 11:01 godog: reimage ms-be1018 / 1019 with stretch
  • 10:58 ema: reboot lvs[2004-2006] (codfw secondaries) for kernel update
  • 10:50 akosiaris: rebooting all ganeti200X nodes
  • 10:47 akosiaris: shutdown all VMs on the ganeti01.svc.codfw.wmnet cluster
  • 10:43 elukey: reboot analytics1001 (Hadoop master) for kernel update
  • 10:35 akosiaris: rebooting the entire codfw ganeti cluster for kernel upgrades. Silenced hosts in icinga already. T167643
  • 10:30 moritzm: rebooting bast4001 for kernel update
  • 10:21 ema: reboot lvs[1001-1003] (eqiad primaries) for kernel update
  • 10:17 elukey: running a script in tmux on rdb[12]003 called "check" to dump periodically LLEN enwiki:jobqueue:enqueue:l-unclaimed and stopped the one on rdb2004
  • 10:07 ema: reboot lvs[1004-1006] (eqiad secondaries) for kernel update
  • 10:01 elukey: reboot analytics1002 (Hadoop master standby) for kernel update
  • 10:01 moritzm: rebooting auth* servers for kernel update
  • 09:48 ema: reboot lvs[1010-1012] for kernel update
  • 09:48 elukey: reboot aqs1005 for kernel update
  • 09:10 elukey: reboot kafka2001 for kernel update (eventbus codfw)
  • 09:06 moritzm: rebooting restbase1017 for kernel update
  • 08:52 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=restbase2001.codfw.wmnet,dc=codfw,service=restbase
  • 08:49 _joe_: correction: restarting pybal
  • 08:49 _joe_: restarting etcd on lvs2003/2006, connection lost to etcd
  • 08:34 elukey: reboot kafka1012 for kernel upgrades
  • 08:34 marostegui: Deploy alter table db1070 s5 - T166207
  • 08:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T166207 (duration: 00m 44s)
  • 08:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T166207 (duration: 00m 45s)
  • 08:26 godog: reimage ms-be1014 / 1015 with jessie
  • 07:37 marostegui: Stop and reset slave s5 on dbstore2001 - T168354
  • 06:23 mutante: planet2001 wget missing unpuppetized logo file from https://en.planet.wikimedia.org/images/planet-wm2.png - should fix puppet run
  • 06:19 marostegui: Stop replication and puppet on db2066 for maintenance - T168354
  • 06:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 - T168354 (duration: 00m 43s)
  • 06:08 elukey: reboot thorium for kernel upgrades (outage to all the analytics websites)
  • 06:05 marostegui: Deploy alter table s5 - db1082 - T166207
  • 06:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T166207 (duration: 00m 44s)
  • 06:04 marostegui: Deploy alter table s5 - dbstore1002 - T166207
  • 05:59 elukey: reboot stat100[2,3,4] for kernel upgrades
  • 05:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T166207 (duration: 00m 44s)
  • 05:54 marostegui: Deploy alter table s5 - labsdb1011 - T166207
  • 05:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1021 - T166205 (duration: 01m 00s)
  • 05:41 marostegui: Start relearn BBU cycle on db1016 - T166344
  • 03:13 mutante: planet - copying HTML files from docroot from planet1001 to planet2001 - (don't serve Debian default page)
  • 03:03 mutante: planet1001 - remove/purge all php5* packages
  • 02:57 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jun 21 02:57:19 UTC 2017 (duration 6m 41s)
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.6) (duration: 06m 06s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 06m 52s)
  • 01:45 mutante: planet1001 - remove php5 package
  • 00:34 mutante: planet2001 - revoke old puppet cert, salt-key, re-add new cert/key after reinstall
  • 00:24 mutante: planet2001 - scheduled downtime, reinstall with stretch
  • 00:06 mutante: tin (deployment): manually remove l10nupdate cron, let puppet re-create it after gerrit:350749. stops l10nupdate cron from running on weekends. naos didn't need an action. (T164035).

2017-06-20

  • 23:06 aude@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Remove temp wiktionary site link settings (duration: 00m 43s)
  • 23:05 aude@tin: Synchronized wmf-config/Wikibase-labs.php: Remove temp wiktionary site link settings (duration: 00m 44s)
  • 23:03 aude@tin: Synchronized wmf-config/Wikibase-production.php: Remove temp wiktionary site link settings for test wikidata (duration: 00m 43s)
  • 22:59 aude@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase (phase 1) on Wiktionary wikis (duration: 00m 44s)
  • 22:49 aude: created wbc_entity_usage table and updated sites table on wiktionary wikis
  • 21:36 legoktm@tin: Synchronized wmf-config: touch (duration: 00m 45s)
  • 21:29 arlolra@tin: Started restart [parsoid/deploy@4b60bf9]: (no justification provided)
  • 21:17 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to all wikis (try #2) - T148609 (duration: 00m 44s)
  • 21:17 andrewbogott: rebooting labvirt1014 as practice for tomorrow's security reboots
  • 21:13 mutante: labtestpuppetmaster2001 - install-console, activate puppet, sign cert, initial puppet run, add salt key (T167157)
  • 20:54 twentyafterfour: Finished train deployment for group0, train will resume tomorrow as scheduled.
  • 20:53 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to 1.30.0-wmf.6 refs T167535
  • 20:44 twentyafterfour@tin: Synchronized php-1.30.0-wmf.6/includes/changes/EnhancedChangesList.php: deploy bad7bde refs T167535 (duration: 00m 53s)
  • 20:37 twentyafterfour@tin: Finished scap: sync 1.30.0-wmf.6 refs T167535 (duration: 29m 16s)
  • 20:08 twentyafterfour@tin: Started scap: sync 1.30.0-wmf.6 refs T167535
  • 19:17 twentyafterfour: Prepping 1.30.0-wmf.6 - T167535
  • 18:09 mutante: netmon1002 - arm keyholder with rancid key
  • 18:06 ema: route ulsfo back to codfw T167274
  • 18:02 chasemp: ssh labsdb101[0|1].eqiad.wmnet 'sudo maintain-meta_p --all-databases --debug'
  • 17:53 mutante: cobalt (gerrit) - re-enabling puppet, running it. nothing should change, the systemd unit file mentioned in T168360#3362314 does not get installed by puppet, it comes from the deb
  • 17:49 subbu: Since arlolra noticed some unexpected warnings from the canaries, the Parsoid deploy was rolled back, so Parsoid was not updated to e2e2b5f6 (contrary to what scap said above).
  • 17:48 gehel@tin: Finished deploy [wdqs/wdqs@b60d224]: (no justification provided) (duration: 01m 41s)
  • 17:47 XioNoX: repool codfw - T167274
  • 17:46 gehel@tin: Started deploy [wdqs/wdqs@b60d224]: (no justification provided)
  • 17:45 gehel: deploying wdqs blazegraph and GUI updates
  • 17:43 mutante: RT - ununpentium - upgraded rt4-db-mysql
  • 17:42 arlolra@tin: Finished deploy [parsoid/deploy@4b60bf9]: Updating Parsoid to e2e2b5f6 (duration: 07m 57s)
  • 17:40 mutante: mwreleases1001 - puppet node clean, puppet node deactivate - was reinstalled as releases1001
  • 17:34 arlolra@tin: Started deploy [parsoid/deploy@4b60bf9]: Updating Parsoid to e2e2b5f6
  • 17:29 elukey: running a script in tmux on rdb200[34] called "check" to dump periodically LLEN enwiki:jobqueue:enqueue:l-unclaimed
  • 17:21 elukey: restart redis-instance-tcp_6380.service on rdb2003 to force sync with its master
  • 17:16 elukey: restart redis-instance-tcp_6380.service on rdb2004 to force sync with its master
  • 17:04 XioNoX: re-enable igmp-snooping on asw-d-codfw
  • 17:01 bd808: Ran maintain-meta_p --all-databases on labsdb1003
  • 16:55 bd808: Ran maintain-meta_p --all-databases on labsdb1001
  • 16:53 paravoid: updating the d-i image for stretch in puppet volatile
  • 16:09 chasemp: openstack server delete admin-monitoring openstack project instances (we have leaked 7)
  • 16:05 elukey: reboot kafka1013 for kernel upgrade
  • 15:08 XioNoX: starting asw-d-codfw switch upgrade - T167274
  • 14:47 elukey: rolling restart of druid100[123] for kernel upgrades
  • 14:32 XioNoX: depooled codfw - T167274
  • 14:27 moritzm: rebooting scb1001 for kernel update
  • 14:17 hashar: CI is fully back up (following reboot of contint1001 / labnodepool1001)
  • 14:16 hashar: Upgraded Jenkins plugins
  • 14:05 hashar: Starting Jenkins on contint1001
  • 14:05 elukey: reboot kafka2001 for kernel upgrade
  • 14:02 hashar: Rebooting contint1001
  • 14:00 hashar: Stopping Nodepool service to prevent new builds
  • 13:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T166207 (duration: 01m 41s)
  • 13:55 marostegui: Deploy alter table db1087 - s5 - T166207
  • 13:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T166207 (duration: 00m 41s)
  • 13:44 aude@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wiktionary site links on test.wikidata (duration: 00m 43s)
  • 13:42 _joe_: manually started nrpe on ms-be1016
  • 13:39 marostegui: Deploy alter table on db1049 - s5 - T166207
  • 13:39 moritzm: rebooting labnodepool1001 for kernel update
  • 13:37 hashar: Restarting Jenkins
  • 13:36 akosiaris@puppetmaster1001: conftool action : set/pooled=true; selector: name=sca1004.eqiad.wmnet
  • 13:36 akosiaris@puppetmaster1001: conftool action : set/pooled=true; selector: name=mwdebug1002.eqiad.wmnet
  • 13:33 godog: pool thumbor100[34] into service - T168297
  • 13:26 marostegui: Deploy alter table labsdb1010 - s5 - T166207
  • 13:14 moritzm: rebooting restbase staging cluster (cerium/praseodymium/xenon) for kernel update
  • 12:09 gehel: starting cluster restart elasticsearch eqiad
  • 12:00 elukey: reboot analytics1029 -> analytics1069 for kernel upgrades (Hadoop worker nodes)
  • 11:36 moritzm: installing libgcrypt security updates
  • 11:29 moritzm: rebooting mediawiki app servers in codfw for kernel update
  • 11:13 akosiaris: renumber sca1004, mwdebug1002. Downtime should be a few minutes
  • 11:08 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mwdebug1002.eqiad.wmnet
  • 10:56 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=sca1004.eqiad.wmnet
  • 10:07 moritzm: rebooting mwdebug servers for kernel update
  • 10:03 elukey: reboot kafka1012, analytics1028, aqs1004 for kernel upgrades (canary hosts)
  • 10:00 godog: reimage ms-be1016 with stretch
  • 09:53 godog: reset ms-be1014 idrac via ipmitool
  • 09:46 moritzm: rebooting app server canaries for kernel update
  • 09:40 godog: roll-restart thumbor to increase swift timeout
  • 09:29 marostegui: Rename table on db1089 enwiki.wikilove_image_log - T127219
  • 08:46 marostegui: Drop table titlekey from s1 - T164949
  • 08:35 godog: roll restart swift-proxy on ms-fe* to pick up thumbor changes
  • 08:30 _joe_: restarting gerrit T168360
  • 08:25 _joe_: manually patching gerrit's systemd unit file to allow more open files
  • 08:22 marostegui: Drop table titlekey from s3 - T164949
  • 08:15 marostegui: Drop table titlekey from s4 - T164949
  • 08:06 marostegui: Drop table titlekey from s7 - https://phabricator.wikimedia.org/T164949
  • 07:45 marostegui: Drop table titlekey from s5 - T164949
  • 07:35 gehel: restarting elastic1017 to validate upgrades
  • 07:27 marostegui: kill alter table on enwiki.revision db1047 after running for 13 days - T166452
  • 07:23 moritzm: installing glibc security updates
  • 07:22 marostegui: Stop MySQL dbstore2001 for maintenance - T168354
  • 07:20 marostegui: Deploy alter table s5 - db1071 - T166207
  • 07:10 marostegui: Deploy alter table s5 - db1095 - T166207
  • 06:57 moritzm: install remaining exim security updates

2017-06-19

  • 23:38 andrewbogott: are we logging?
  • 23:35 legoktm: legoktm@tin: Synchronized static/images/project-logos/: Upload logos for the Dinka Wikipedia (duration: 00m 42s)
  • 22:45 andrewbogott: removed some big dirs from /home/ori on install1002
  • 22:30 andrewbogott: find /srv/carbon/whisper/archived_metrics -mtime +730 -type f -delete on labmon1001
  • afk: Added non-voting operations-puppet-tests-docker job for operations/puppet repo, should (hopefully) be fast, and will timeout after 1 minute if it's not. More info https://gerrit.wikimedia.org/r/#/c/360091/ + T166888
  • afk: updated payments-wiki from 7a50542 to 8bdd706
  • 19:39 mepps: correction: updated civicrm from dfc26f0 to d558df2
  • 19:28 mepps: updated from dfc26f0 to d558df2
  • 18:35 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 41s)
  • 18:34 reedy@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 00m 42s)
  • 18:29 reedy@tin: Synchronized wmf-config/abusefilter.php: (no justification provided) (duration: 00m 41s)
  • 18:21 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: logos (duration: 00m 41s)
  • 18:20 reedy@tin: Synchronized static/favicon/wmf.ico: (no justification provided) (duration: 00m 41s)
  • 18:19 reedy@tin: Synchronized wmf-config/flaggedrevs.php: Remove old setting that does nothing (duration: 00m 41s)
  • 18:18 reedy@tin: Synchronized static/images: (no justification provided) (duration: 00m 41s)
  • 18:10 reedy@tin: Synchronized dblists/securepollglobal.dblist: (no justification provided) (duration: 00m 41s)
  • 18:02 reedy@tin: Synchronized wmf-config/InterwikiSortOrders.php: Add atjwiki (duration: 00m 41s)
  • 17:42 ejegg: updated fundraising tools from 585f546 to 457bddb
  • 17:21 moritzm: installing exim4 security updates
  • 15:48 moritzm: uploaded linux-meta_1.13 to apt.wikimedia.org (with this update the linux-meta package now also defaults to 4.9 (previously 4.4))
  • 15:47 moritzm: uploaded linux_4.9.25-1~bpo8+3 to apt.wikimedia.org
  • 15:25 volans: installed python-setuptools-scm on copper
  • 15:16 marostegui: Deploy alter table labsdb1009 - T166207
  • 15:12 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=chlorine.eqiad.wmnet
  • 15:03 mobrovac: restbase restbase2001 is out of rotation, performing experiments with the new cassandra driver v3.2.2 which seems to be causing problems only in production
  • 14:59 godog: cold reset ms-be1013 drac
  • 14:53 gehel: pausing cluster restart of elasticsearch eqiad
  • 14:24 godog: roll-upgrade swift to 2.10 on ms-be10[22-30] - T162609
  • 14:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 - T153743 (duration: 00m 41s)
  • 14:06 gehel: starting cluster restart on elasticsearch / cirrus / eqiad for ltr plugin deployment
  • 14:01 gehel: restarting elasticsearch / relforge for ltr plugin deployment
  • 13:58 gehel: remove decommissioned nodes from redis / trebuchet for elasticsearch/plugins
  • 13:48 gehel: deploying latest elasticsearch plugin (ltr plugin)
  • 13:48 moritzm: fixing salt minion setup on wtp1047
  • 13:44 hashar: European SWAT completed
  • 13:44 aude@tin: Synchronized wmf-config/Wikibase.php: Remove old constraints section config (duration: 00m 41s)
  • 13:42 aude@tin: Synchronized wmf-config/Wikibase-production.php: Add constraints section to property pages on test.wikidata (duration: 00m 41s)
  • 13:29 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [cleanup] remove old interwiki search config (duration: 00m 41s)
  • 13:28 dcausse@tin: Synchronized wmf-config/CirrusSearch-labs.php: [cleanup] remove old interwiki search config (duration: 00m 41s)
  • 13:21 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable OOjs UI buttons on EditPage for plwiki - T162849 (duration: 00m 42s)
  • 13:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add sandbox link for dtywiki - T168038 (duration: 00m 42s)
  • 12:54 dcausse: restarting elasticsearch on relforge1* to pick up new snapshot of the ltr plugin
  • 12:36 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=chlorine.eqiad.wmnet
  • 12:04 elukey: run 'echo "autoLearnMode=1" > /tmp/disable_learn && megacli -AdpBbuCmd -SetBbuProperties -f /tmp/disable_learn -a0' on all the analytics workers to disable BBU Auto learn - T167809
  • 11:33 marostegui: Rename user Smuconlaw → Sgconlaw - T168109
  • 11:31 jynus: restarting replication on dbstore1002:s3 and db1015
  • 11:19 moritzm: rebooting cp3007 for kernel update
  • 11:01 _joe_: depooling mw1170-mw1179 for decommissioning, T168271
  • 10:15 godog: roll-upgrade swift to 2.10 on ms-fe1* - T162609
  • 09:56 akosiaris: migrate neon.eqiad.wmnet to ganeti01.svc.eqiad.wmnet's row_A nodegroup
  • 09:55 dcausse: restarting elasticsearch on relforge1* to pick up new snapshot of the ltr plugin
  • 09:33 jynus: temporarily stop dbstore1002:s3 and db1015 to fix srwiki
  • 09:30 marostegui: Deploy alter table on s2 - dbstore1001 - T166205
  • 09:18 godog: swift eqiad-prod: remove ms-be1001 - ms-be1012 - T166489
  • 09:13 paravoid: rebooting achernar to address CPU throttling and apply the BIOS update
  • 09:11 paravoid: upgrading achernar's BIOS from 1.2.4 to 2.4.2 hoping it will address recurring CPU throttling issue (T162850)
  • 09:07 akosiaris: restart ircecho on einsteinium, was not notifying due to a thrown exception
  • 08:35 marostegui: Drop table titlekey from s2 - T164949
  • 08:16 marostegui: Drop table titlekey on s6 - T164949
  • 07:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool pc2004,5,6 after maintenance (duration: 00m 41s)
  • 07:42 moritzm: restarting app server canaries to pick up gnutls update
  • 07:13 marostegui: Reboot ms-be1010
  • 07:10 marostegui: Deploy alter table s5 - codfw master - db2023 (and will replicate) so this will generate lag on codfw slaves - T166207
  • 07:09 jynus: upgrade, reboot and clear data on pc2006
  • 07:05 jynus: upgrade, reboot and clear data on pc2005
  • 07:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool pc2005 & pc2006 (duration: 00m 41s)
  • 06:58 moritzm: installing gnutls security updates
  • 06:38 marostegui: Deploy alter table s2 - labsdb1001 - T166205
  • 06:37 jynus: force learning cycle to db1046 controller T166141
  • 06:23 marostegui: Deploy alter table on s2 - db1021 - T166205
  • 06:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1021 - T166205 (duration: 00m 41s)
  • 04:21 reedy@tin: Synchronized composer.lock: update (duration: 00m 41s)
  • 04:20 reedy@tin: Synchronized composer.json: update (duration: 00m 41s)
  • 04:19 reedy@tin: Synchronized multiversion/vendor/: Update! (duration: 01m 05s)
  • 04:05 reedy@tin: Synchronized wmf-config/CommonSettings.php: Fix comments minor code style (duration: 00m 42s)
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jun 19 02:26:06 UTC 2017 (duration 6m 8s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 07m 04s)

2017-06-18

  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jun 18 02:25:55 UTC 2017 (duration 6m 8s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 07m 27s)

2017-06-17

  • 19:30 ebernhardson: restarting elasticsearch on relforge to pick up new version of ltr-query
  • 16:51 volans: restarted pdfrender on scb200[2,4] T159922
  • 15:26 jynus: rebuild pc2004's (depooled) data from scratch
  • 02:29 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jun 17 02:29:51 UTC 2017 (duration 6m 8s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 07m 09s)

2017-06-16

  • 19:54 Reedy: disabled cluster 2fa for Chrissymad for T168064 (confirmed by email)
  • 19:26 ejegg: re-enabled paypal audit download and parse job
  • 19:13 ebernhardson: restarting elasticsearch on relforge to pick up new ltr-query plugin version
  • 18:14 mutante: ms-be1001: did not change config, tried again, now detected 13 drives again, coming back
  • 18:10 mutante: ms-be1001 - The following VDs are missing: 09
  • 18:08 mutante: ms-be1001 - powercycling crashed server - "[14076481.245487] general protection fault: 0000 [#4] SMP"
  • 13:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove comments from db1018 current status - T166205 (duration: 00m 41s)
  • 13:26 twentyafterfour: fixed phabricator "upgrade database" error.
  • 13:20 twentyafterfour: fixing phab database migrations
  • 13:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 after performance testing (duration: 00m 41s)
  • 10:18 jynus: running analyze on db1091 (depooled), may create lag
  • 10:11 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 for performance testing (duration: 00m 42s)
  • 09:52 moritzm: installing guile security updates
  • 09:13 moritzm: re-enabled puppet on mw2129 (no reason was given why it was disabled)
  • 08:50 jynus: bringing down pc1005 and pc1006 for maintenance T167567
  • 08:40 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1099 and db1001 hosts to config (duration: 00m 41s)
  • 08:23 jynus@tin: Synchronized wmf-config/db-eqiad.php: Switchover pc1005 and pc1006 to db1099 and db1001 (duration: 00m 45s)
  • 08:20 jynus: about to switchover pc1005 and pc1006 to db1099 and db1001
  • 05:45 ebernhardson: increase enwiki_content replicas on codfw from 2 to 3 to match eqiad
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jun 16 02:37:05 UTC 2017 (duration 6m 25s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 07m 02s)

2017-06-15

  • 23:37 mutante: added stretch support for jenkins (https://gerrit.wikimedia.org/r/#/c/359227/, https://gerrit.wikimedia.org/r/#/c/359356/) | 'reprepro copy stretch-wikimedia jessie-wikimedia jenkins' to make .deb available on stretch | releases1001 now running jenkins , icinga recovered | (hashar) (T164030)
  • 23:30 mutante: APT - reprepro copy stretch-wikimedia jessie-wikimedia jenkins (copy existing jenkins package to stretch, it can be used on both)
  • 23:18 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T166408: Remove dead config variable MinervaPrintStyles (duration: 00m 41s)
  • 23:15 ebernhardson@tin: Finished scap: wmf-config Scap: T162276: Enable crossproject search (duration: 03m 37s)
  • 23:11 ebernhardson@tin: Started scap: wmf-config Scap: T162276: Enable crossproject search
  • 23:10 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Scap: T162276: Enable crossproject search (duration: 00m 51s)
  • 22:59 mutante: mw2251 - repooled
  • 22:56 mutante: mw2251 - scap pull
  • 22:53 ebernhardson: restarting elasticsearch on relforge to pick up new ltr-query plugin
  • 22:30 ejegg: updated DjangoBannerStats from 9e6b117 to 5963e7c
  • 22:02 volans: restarted pdfrender on scb1001 T159922
  • 21:45 mutante: powercycling mw2251 (frozen console)
  • 21:39 volans@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2251.codfw.wmnet
  • 21:37 volans: re-enabled puppet and force run to re-enable ircecho on einsteinium
  • 21:29 demon@tin: Finished scap: Removing Cards extension (duration: 21m 49s)
  • 21:08 demon@tin: Started scap: Removing Cards extension
  • 20:57 mutante: upgrading RT (request tracker)
  • 19:35 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.5
  • 19:22 ladsgroup@tin: Finished deploy [ores/deploy@ab88a74]: Deploying gerrit:359224/1 for missing config variables (duration: 24m 15s)
  • 19:17 XioNoX: Re-enabled link between cr2-codfw and cr1-eqdfw - T167261
  • 18:58 ladsgroup@tin: Started deploy [ores/deploy@ab88a74]: Deploying gerrit:359224/1 for missing config variables
  • 18:44 paravoid: restarting all puppetmasters
  • 18:40 paravoid: temporarily stopping icinga-wm
  • 18:27 demon@tin: Synchronized wmf-config/CirrusSearch-common.php: Remove quirks and enable token_count_router thingie (duration: 00m 44s)
  • 18:16 demon@tin: Synchronized php-1.30.0-wmf.5/includes/libs/objectcache/MultiWriteBagOStuff.php: T167465 (duration: 00m 44s)
  • 18:14 demon@tin: Synchronized wmf-config/InitialiseSettings.php: T167617 (duration: 00m 44s)
  • 18:12 demon@tin: Synchronized wmf-config/FeaturedFeedsWMF.php: T167617 (duration: 00m 44s)
  • 17:50 mutante: install2002 - re-enabled puppet, reverted live hack, back to normal (issue seems to be NIC or other)
  • 17:28 mutante: install2002 - temp disabling puppet and applying hot fix to debug install issue for papaul
  • 17:27 bblack: disabling puppet on cp*wmnet to avoid puppet races on https://gerrit.wikimedia.org/r/#/c/341729 merge
  • 14:39 gehel: killing stuck replication on maps1001
  • 14:38 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op Ifc7b1ea80 - Remove EtcdConfig from beta (duration: 00m 45s)
  • 13:24 gehel: elasticsearch upgrade to 5.3.2 on relforge cluster completed, cluster still recovering - T163708
  • 13:23 aude@tin: Synchronized wmf-config/Wikibase.php: Add constraints statements section on Wikidata T167126 (duration: 00m 43s)
  • 13:19 dcausse: [cirrus] reindexing all zh wikis (eqiad & codfw)
  • 13:14 aude@tin: Synchronized wmf-config/InitialiseSettings.php: Enable BM25 for Chinese wikis (duration: 00m 44s)
  • 13:13 aude@tin: Synchronized tests/cirrusTest.php: (no justification provided) (duration: 00m 45s)
  • 13:02 gehel: starting elasticsearch upgrade to 5.3.2 on relforge cluster - T163708
  • 12:14 gehel: restart elasticsearch on relforge1001 to validate latest config changes
  • 10:16 moritzm: rollout remaining systemd updates from jessie point release
  • 09:14 jynus: shutting down and deleting data at pc1004 for cloning from db1096
  • 09:10 hashar: Jenkins back up and happy.
  • 09:05 moritzm: reenable puppet on notebook1002, was disabled for the merge of the zookeeper role refactor two days ago, can be re-enabled now
  • 09:04 hashar: Restarting Jenkins. It seems I managed to deadlock it
  • 08:52 ariel@tin: Finished deploy [dumps/dumps@1734c6d]: history dump rebalance script, fixup for extension script dumps, root logger for misc dumps (duration: 00m 02s)
  • 08:52 ariel@tin: Started deploy [dumps/dumps@1734c6d]: history dump rebalance script, fixup for extension script dumps, root logger for misc dumps
  • 08:40 gehel: restart relforge1001 to validate latest config changes
  • 08:16 akosiaris@tin: Finished deploy [citoid/deploy@ba0db9c]: Remove the bad PMCID test from spec (duration: 07m 44s)
  • 08:09 akosiaris@tin: Started deploy [citoid/deploy@ba0db9c]: Remove the bad PMCID test from spec
  • 08:02 moritzm: updating HHVM on terbium/wasat to 3.18
  • 07:57 akosiaris@tin: Finished deploy [citoid/deploy@ba0db9c]: Remove the bad PMCID test from spec (duration: 00m 38s)
  • 07:57 akosiaris@tin: Started deploy [citoid/deploy@ba0db9c]: Remove the bad PMCID test from spec
  • 07:48 akosiaris: schedule 2 hours downtime for all citoid endpoints health on scb boxes
  • 06:08 marostegui: Deploy alter table s2 - labsdb1003 - T166205
  • 05:50 marostegui: Deploy alter table s2 - db1018 - T166205
  • 05:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments to db1018 current status - T166205 (duration: 00m 43s)
  • 05:41 marostegui: Deploy alter table s4 - dbstore1001 - T166206
  • 05:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 - T166205 (duration: 00m 44s)
  • 02:50 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jun 15 02:50:16 UTC 2017 (duration 6m 48s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 07m 34s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 09m 15s)
  • 01:17 mutante: releases1001 - reinstalling with stretch
  • 00:15 mutante: dumpsdata1001 - was reported in icinga as CRIT systemdstate - reason was that the puppet service had failed with "Invalid value '"no"' for boolean parameter: daemonize" (it was ok on other hosts??). commented out the option, stopped puppet, systemctl reset-failed - which made it recover (T165368)
  • 00:02 twentyafterfour: Deploying phabricator update (tagged release/2017-06-14/1) details: https://phabricator.wikimedia.org/project/view/2831/

2017-06-14

  • 23:55 mutante: mwreleases: revoke puppet cert, delete salt key, remove from icinga. releases1001 still syncing disks for a while (50m), being created... T164030
  • 23:49 mutante: ganeti: removed instance mwreleases1001, created new instance releases1001 with same parameters (2 VCPUS,4G memory, 1 x 128G disk) (T164030)
  • 23:41 mutante: mwreleases1001 - scheduled downtime, shutdown, kill VM, re-install as releases1001 (T164030)
  • 23:33 catrope@tin: Synchronized php-1.30.0-wmf.5/includes/: Unbreak watchlist highlighting T167922 (duration: 01m 30s)
  • 23:30 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Send search traffic back to eqiad T149006 (duration: 00m 44s)
  • 23:23 catrope@tin: Synchronized wmf-config/: ORES config cleanups (duration: 00m 46s)
  • 22:43 reedy@tin: Synchronized php-1.30.0-wmf.5/extensions/WikimediaMaintenance/addWiki.php: Remove accountaudit (duration: 00m 44s)
  • 22:33 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: meta namespace talk for atjwiki (duration: 00m 44s)
  • 21:36 reedy@tin: Synchronized wmf-config/interwiki.php: Update interwiki map for atjwiki T167714 (duration: 00m 44s)
  • 21:29 reedy@tin: Synchronized langlist: Add atj T167714 (duration: 00m 43s)
  • 21:29 reedy@tin: Synchronized static/images/project-logos/: atjwiki T167714 (duration: 00m 43s)
  • 21:27 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: atjwiki T167714 (duration: 00m 43s)
  • 21:26 reedy@tin: rebuilt wikiversions.php and synchronized wikiversions files: Add atjwiki T167714
  • 21:25 reedy@tin: Synchronized dblists/: add atjwiki T167714 (duration: 00m 42s)
  • 21:22 reedy@tin: Synchronized php-1.30.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Remove accountaudit (duration: 00m 44s)
  • 21:15 reedy@terbium: scap aborted: (no justification provided) (duration: 00m 01s)
  • 21:15 reedy@terbium: Started scap: (no justification provided)
  • 20:06 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: noop (duration: 00m 43s)
  • 20:05 reedy@tin: Synchronized wmf-config/CommonSettings.php: CollaborationKit loader code (duration: 00m 43s)
  • 20:03 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Add CollaborationKit to testwiki (duration: 00m 44s)
  • 19:47 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op (duration: 00m 44s)
  • 19:42 Reedy: running mwscript initSiteStats.php srnwiki --update
  • 19:37 demon@tin: Synchronized wmf-config/extension-list-labs: No-op (duration: 00m 44s)
  • 19:23 demon@tin: Synchronized php: symlink bump (duration: 00m 43s)
  • 19:17 bblack: restart varnish backend on cp1074
  • 19:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.5
  • 18:50 otto@tin: Finished deploy [eventlogging/analytics@1ce446d]: (no justification provided) (duration: 00m 04s)
  • 18:49 otto@tin: Started deploy [eventlogging/analytics@1ce446d]: (no justification provided)
  • 18:34 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/358007/ Add wmgBabelMainCategory for many languages (duration: 00m 43s)
  • 18:32 niharika29@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:25 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Sort wmgBabelMainCategory alphabetically https://gerrit.wikimedia.org/r/#/c/358006/ (duration: 00m 44s)
  • 18:24 jynus: reimporting data from pc1004 to db1096
  • 18:17 niharika29@tin: Synchronized tests/cirrusTest.php: https://gerrit.wikimedia.org/r/#/c/358625/ Test elastic2020 does not fall out of cluster (duration: 00m 43s)
  • 18:13 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/358625/ Test elastic2020 does not fall out of cluster (duration: 00m 44s)
  • 18:06 moritzm: installing unzip security updates
  • 17:55 moritzm: restarting hhvm on mw1261-mw1265 to pick up libxslt update
  • 17:49 moritzm: installing mongodb update from jessie point release on tungsten
  • 16:03 godog: point varnish upload in esams back to eqiad
  • 16:00 mobrovac@tin: Finished deploy [restbase/deploy@4c1cdd0]: (no justification provided) (duration: 04m 51s)
  • 15:55 mobrovac@tin: Started deploy [restbase/deploy@4c1cdd0]: (no justification provided)
  • 15:44 godog: point varnish upload back to swift eqiad
  • 15:14 ema: restart varnish-backend on cp2017
  • 15:08 moritzm: installing systemd bugfix updates from jessie point update
  • 15:00 ema: restart varnish-backend on cp2014
  • 13:50 zeljkof: eu swat finished
  • 13:42 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove ContentTranslationTargetNamespace config (T167865) (duration: 00m 43s)
  • 13:41 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Remove unneeded ContentTranslationTargetNamespace (T167865) (duration: 00m 44s)
  • 13:35 zfilipin@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 12:24 jynus@tin: Synchronized wmf-config/db-codfw.php: Switchover pc2004 to db2072 (duration: 00m 43s)
  • 12:13 akosiaris: upload apertium-spa-ita_0.2.0~r78826-1+wmf to apt.wikimedia.org/jessie-wikimedia/main
  • 12:13 akosiaris: upload apertium-fra-cat_1.2.0~r78602-1+wmf to apt.wikimedia.org/jessie-wikimedia/main
  • 11:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Switchover pc1004 to db1096 (duration: 00m 54s)
  • 11:34 jynus: about to deploy performance-impacting change on the parsercache persistent storage T167567
  • 11:19 marostegui: Deploy alter table s4 - labsdb1011 - T166206
  • 09:46 marostegui: Rename table titlekey before dropping it on enwiki - db1089 - T164949
  • 09:18 godog: delete files older than 365d from 'servers' graphite hierarchy
  • 07:59 marostegui: Drop table updates on s3 - T139342
  • 07:32 moritzm: installing zziplib security updates on jessie
  • 07:04 elukey: restart pdfrender on scb200[2,4] (xpra race condition)
  • 07:03 elukey: restart pdfrender on scb1004 (xpra race condition)
  • 06:32 moritzm: installing remaining libtasn security updates
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jun 14 03:14:28 UTC 2017 (duration 6m 56s)
  • 03:07 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.5) (duration: 14m 52s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 07m 58s)
  • 01:48 mutante: netmon1002 - chown rancid:rancid /var/lib/rancid ; touch /var/lib/rancid/.gitconfig, let rancid write to config, then git config --global user.email and user.name as the rancid user | fix permissions on .git/objects files, let rancid user own them all | re-commit .gitignore change | SSH_AUTH_SOCK=/run/keyholder/proxy.sock /usr/lib/rancid/bin/rancid-run as user "rancid" runs clean
  • 01:20 mutante: netmon1002 - copied missing router.db, routers.all/.down/.up over from netmon1001 to /var/lib/rancid/core. routers.db is an untracked file, the others are in .gitignore. this is all like on netmon1001 as well. adding routers.db to .gitignore file on both, like the other router* files already were (T159756)
  • 01:00 mutante: netmon1002 - locally "git clone /var/lib/rancid/GIT/core" into /var/lib/rancid (I rsynced that, but it's a bare repository without a work tree; after this, the work tree is /var/lib/rancid/core) (T159756)
  • 00:44 mutante: naos: disarmed keyholder and armed it again to prove I didn't break anything on jessie by fixing keyholder on stretch with gerrit:358884
  • 00:39 demon@tin: Synchronized wmf-config/CommonSettings.php: extdist update (duration: 00m 44s)
  • 00:09 aaron@tin: Synchronized wmf-config/InitialiseSettings.php: Capture messages on 'autoloader' debug log channel (duration: 00m 44s)

2017-06-13

  • 23:29 RainbowSprinkles: gerrit: upgrading on master 2.13.4-13-gc0c5cc4742 -> 2.13.8-1-g7c438d37a2 (been running on slave for a week)
  • 23:13 mutante: contint1001 - started zuul using the old init script
  • 23:05 mutante: netmon1001/1002: rsynced /var/lib/rancid/CVS and /var/lib/rancid/GIT from 1001 to 1002 for rancid migration (T159756)
  • 23:04 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/OpenStackManager: Re-adding deleted special page (duration: 00m 45s)
  • 22:06 ejegg: updated fundraising tools from f2522cd to 585f546
  • 21:59 gwicke: restarted pdfrender on scb1003; was spinning on CPU & using 15G of memory (!)
  • 21:58 gwicke: restarted pdfrender on scb1002 and scb1004; was spinning on CPU
  • 21:56 hashar: Zuul back, running in an interactive terminal.
  • 21:46 mutante: netmon1002 - was able to "keyholder arm" after stretch install after applying https://gerrit.wikimedia.org/r/358884 as hotfix
  • 21:30 mobrovac@tin: Finished deploy [restbase/deploy@9a86d4c]: (no justification provided) (duration: 01m 06s)
  • 21:29 mobrovac@tin: Started deploy [restbase/deploy@9a86d4c]: (no justification provided)
  • 21:13 hashar: Gracefully restarting Zuul
  • 21:11 ppchelko@tin: Finished deploy [changeprop/deploy@4ba3c59]: Rate-limiter enhancements (duration: 01m 08s)
  • 21:10 ppchelko@tin: Started deploy [changeprop/deploy@4ba3c59]: Rate-limiter enhancements
  • 21:02 demon@tin: Synchronized php-1.30.0-wmf.5/extensions/CentralAuth/includes/CentralAuthHooks.php: Fix bad method name (duration: 00m 44s)
  • 20:37 hashar: Restarting Nodepool. Apparently confused in pool tracking and spawning too many Trusty nodes (7 instead of 4)
  • 20:02 demon@tin: Synchronized php-1.30.0-wmf.5/includes/api/ApiParse.php: T167826 (duration: 00m 44s)
  • 20:00 mobrovac@tin: Finished deploy [restbase/deploy@4c1cdd0]: (no justification provided) (duration: 04m 29s)
  • 19:56 mobrovac@tin: Started deploy [restbase/deploy@4c1cdd0]: (no justification provided)
  • 19:37 Amir1: restarting ores-related services in scb1001 (T167819)
  • 19:24 mutante: scb1001 - killed process 10971 (pdfrendering/electron)
  • 19:24 demon@tin: Synchronized php-1.30.0-wmf.5/extensions/CategoryTree/CategoryPageSubclass.php: Fix up variable visibility (duration: 00m 44s)
  • 19:12 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.5
  • 19:09 mobrovac@tin: Finished deploy [restbase/deploy@9a86d4c]: (no justification provided) (duration: 07m 33s)
  • 19:08 mutante: netmon1002 - reinstalled with stretch, revoked puppet cert, salt key, signing new cert, accepting new key, initial puppet run (T159756)
  • 19:01 mobrovac@tin: Started deploy [restbase/deploy@9a86d4c]: (no justification provided)
  • 18:56 mutante: reinstalling netmon1002 with stretch - scheduled icinga downtime
  • 18:54 legoktm: starting to delete all rows from linter tables on large wikis - T167758
  • 18:48 mobrovac@tin: Finished deploy [restbase/deploy@4c1cdd0]: (no justification provided) (duration: 04m 36s)
  • 18:43 mobrovac@tin: Started deploy [restbase/deploy@4c1cdd0]: (no justification provided)
  • 18:39 mobrovac@tin: Started deploy [restbase/deploy@4c1cdd0]: (no justification provided)
  • 18:37 mobrovac@tin: Finished deploy [restbase/deploy@4c1cdd0]: (no justification provided) (duration: 04m 19s)
  • 18:33 mobrovac@tin: Started deploy [restbase/deploy@4c1cdd0]: (no justification provided)
  • 18:27 demon@tin: Finished scap: testwiki to wmf.5 + l10n bootstrap (duration: 42m 16s)
  • 17:52 bblack: cp4021 reboot for bnx2x modparam change
  • 17:50 ottomata: merged removal of x_forwarded_for from all varnishkafka webrequest instances
  • 17:45 ladsgroup@tin: Finished deploy [ores/deploy@862aea9]: ORES deploy early June: T167223 (duration: 33m 52s)
  • 17:45 demon@tin: Started scap: testwiki to wmf.5 + l10n bootstrap
  • 17:42 demon@tin: Pruned MediaWiki: 1.30.0-wmf.2 [keeping static files] (duration: 01m 13s)
  • 17:40 demon@tin: Pruned MediaWiki: 1.30.0-wmf.1 [keeping static files] (duration: 05m 10s)
  • 17:39 bblack: restart varnish-be on cp2002 (mailbox lag, likely induced by swift traffic testing in codfw)
  • 17:11 ladsgroup@tin: Started deploy [ores/deploy@862aea9]: ORES deploy early June: T167223
  • 17:06 akosiaris: rebooting sca2003 for tests
  • 16:35 moritzm: upgrading osmium to HHVM 3.18
  • 16:08 moritzm: installing libnl security updates on trusty
  • 15:41 akosiaris: upload apertium-spa_1.0.0~r78827-1+wmf to apt.wikimedia.org/jessie-wikimedia/main
  • 15:41 akosiaris: upload apertium-ita_0.9.0~r78828-1+wmf to apt.wikimedia.org/jessie-wikimedia/main
  • 15:41 akosiaris: upload apertium-fra_1.1.0~r78695-1+wmf to apt.wikimedia.org/jessie-wikimedia/main
  • 15:41 akosiaris: upload apertium-cat_2.1.0~r78615-1+wmf to apt.wikimedia.org/jessie-wikimedia/main
  • 15:41 gehel: restart of relforge1001 to test https://gerrit.wikimedia.org/r/#/c/358353/
  • 15:09 gehel: applying new GC configuration on elastic1018 - T167636
  • 14:53 godog: update inter-routing for upload to point esams to codfw
  • 14:22 gehel: restarting elasticsearch on relforge to validate GC configuration - T167636
  • 14:17 ottomata: stopping puppet on cp1045, testing removal of xff from varnishkafka webrequest data
  • 14:14 godog: point upload varnish to swift in codfw - T162609
  • 14:11 moritzm: upgrading mw1299-mw1306 to HHVM 3.18
  • 14:10 urandom: T164865: Restart RESTBase dev; apply range delete probability of 1.0
  • 13:30 godog: Thumbor to group1 wikis + mediawiki.org - T167793
  • 13:15 hashar: European SWAT completed
  • 13:13 hashar@tin: Synchronized php-1.30.0-wmf.4/extensions/Popups: actions/rest: Use DB-key version of title - T167633 (duration: 00m 41s)
  • 13:08 hashar@tin: Synchronized php-1.30.0-wmf.4/includes/htmlform/OOUIHTMLForm.php: Do not try to parse empty argument in getErrorsOrWarnings in OOUI - T167644 (duration: 00m 41s)
  • 13:04 hashar@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wikidata echo notifications for all wikis (except enwiki, frwiki, dewiki) - T142102 (duration: 00m 42s)
  • 12:44 marostegui: Deploy alter table on s2 on db1036 - T166205
  • 12:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T166205 (duration: 00m 41s)
  • 12:12 marostegui: Deploy alter table on s2 on dbstore1002 - T166205
  • 12:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T166206 (duration: 00m 51s)
  • 11:56 godog: enable thumbor serving for group0 wikis with media files - T167782
  • 11:41 moritzm: upgrading HHVM on tin/naos to HHVM 3.18
  • 10:59 moritzm: upgrading mw1283-mw1290 to HHVM 3.18
  • 10:21 godog: reenable thumbor swift storage, same paths as mediawiki - T167783
  • 10:11 elukey: completed rollout of https://gerrit.wikimedia.org/r/354449
  • 09:54 moritzm: upgrading mw2248-mw2250 to HHVM 3.18
  • 09:37 godog: disable thumbor shadow requests, enable thumbor-only serving for testwiki - T167490
  • 09:28 moritzm: upgrading mw1276-mw1282 to HHVM 3.18
  • 09:27 elukey: puppet disabled on kafka*, analytics*, druid*, conf* for https://gerrit.wikimedia.org/r/354449 - incremental rollout
  • 09:13 marostegui: Deploy alter table s4 - db1095 - T166206
  • 08:56 moritzm: upgrading mw1165-mw1167 to HHVM 3.18
  • 08:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 - T166205 (duration: 00m 41s)
  • 08:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight - T166935 (duration: 00m 42s)
  • 08:21 gehel: restart OSM synchronisation on maps2001
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight (duration: 00m 42s)
  • 08:05 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2020.codfw.wmnet
  • 08:01 gehel: adding elastic2020 back in the elasticsearch cluster - T149006
  • 07:48 marostegui: Drop table updates on enwiki (s1) - T139342
  • 07:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight (duration: 00m 41s)
  • 07:30 moritzm: restarting HHVM on mw canaries to pick up libtasn update
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with less weight (duration: 00m 41s)
  • 07:12 marostegui: Reboot scb2005 - T167638
  • 06:55 elukey: executed "cumin 'mw2*.codfw.wmnet' 'find /var/log/hhvm/* -user root -exec chown www-data:www-data {} \;'" to fix the last occurrences of wrong root:adm owned hhvm error logs
  • 06:51 moritzm: installing libtasn security updates
  • 06:43 marostegui: Stop MySQL on db1089 to upgrade its raid controller firmware - T166935
  • 06:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T166935 (duration: 00m 42s)
  • 02:33 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jun 13 02:33:23 UTC 2017 (duration 6m 12s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 08m 00s)
  • 01:29 papaul: OS install on labtestnet2002
  • 00:40 andyrussg@tin: Finished scap: Update CentralNotice (duration: 20m 51s)
  • 00:19 andyrussg@tin: Started scap: Update CentralNotice

2017-06-12

  • 23:22 mutante: netmon1002 - keyholder arm - loaded rancid deploy key (uses separate passphrase from deployment key)
  • 22:01 mutante: netmon1002 - apt-get -t jessie-backports install rancid (upgrade from 2.3.8 to 3.6.2 to match version on netmon1001) - rancid version is not specified in puppet so even though backports gets enabled the older version gets installed and this manual step is needed unless we start specifying the version in the manifest (T159756)
  • 20:30 mutante: ns0, ns1 - same as before - gen zones, check zones, reload zones, to add "atj.wikipedia.org" (T167714)
  • 20:26 mutante: ns2 - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones to add new Wikipedia language "atj" (needed when editing langlist but not touching templates) (T167714)
  • 19:10 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Lift IP throttle for Wikipedia workshop (14 June 2017) T167011 + Fix throttle rule for Scotland university editathon (duration: 00m 41s)
  • 18:46 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Lift IP throttle for Editathon (13 June 2017) T167517 (duration: 00m 41s)
  • 18:40 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Lift IP throttle for Wikipedia Editathon (June 16th 2017) T167201 (duration: 00m 41s)
  • 18:30 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS:100 to wgNamespacesToBeSearchedDefault for enwikisource T167511 (duration: 00m 41s)
  • 18:27 thcipriani@tin: Synchronized php-1.30.0-wmf.4/resources/src/mediawiki.rcfilters/mw.rcfilters.Controller.js: SWAT: RCFilters: Retain extra url params when comparing url equivalency T167551 (duration: 00m 41s)
  • 18:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Setup the new wgPopupsGateway config variable. NOOP T165018 (duration: 00m 42s)
  • 17:24 joal@tin: Finished deploy [analytics/refinery@08fe129]: Bug correction on regular weekly deploy of refinery (2) (duration: 03m 00s)
  • 17:24 gehel: running stress + bonnie on elastic2020 to check new hardware - T149006
  • 17:21 joal@tin: Started deploy [analytics/refinery@08fe129]: Bug correction on regular weekly deploy of refinery (2)
  • 17:07 gehel@tin: Finished deploy [wdqs/wdqs@84557b8]: (no justification provided) (duration: 02m 32s)
  • 17:05 gehel@tin: Started deploy [wdqs/wdqs@84557b8]: (no justification provided)
  • 16:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with less weight (duration: 00m 41s)
  • 14:32 gehel: restart elasticsearch on relforge1001 to validate GC configuration
  • 14:14 moritzm: updating tor on radium to 0.2.9.11-1~d80.jessie+1
  • 14:14 hashar: European SWAT completed
  • 14:13 hashar@tin: Synchronized static/images/project-logos/: Update logo for the Norwegian Wikisource - T167192 (duration: 00m 41s)
  • 14:12 hashar@tin: Synchronized static/images/: Delete duplicate HD logos for the Punjabi Wikipedia (duration: 00m 41s)
  • 14:04 moritzm: updating tor in jessie-wikimedia to 0.2.9.11-1~d80.jessie+1 (via reprepro update from tor repository)
  • 13:59 moritzm: upgrading mw1296-mw1298 to HHVM 3.18
  • 13:53 marostegui: Shutdown db1089 for maintenance - T166935
  • 13:48 hashar@tin: Synchronized php-1.30.0-wmf.4/includes/specials/SpecialNewimages.php: SpecialNewimages: Do not add the module when the special page is included - T167601 (duration: 00m 41s)
  • 13:40 hashar: redoing all the fawiki* updateCollation.php since I ran them without deploying the IS.php change :(
  • 13:38 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Change Persian Wikis from uca-fa to xx-uca-fa - T139110 (duration: 00m 41s)
  • 13:35 moritzm: uploaded openssl 1.1.0f to apt.wikimedia.org
  • 13:31 joal@tin: Finished deploy [analytics/refinery@0dda4a9]: Bug correction for regular weekly deploy of refinery (duration: 03m 40s)
  • 13:30 aharoni: running mwscript updateCollation.php --wiki=bawikibooks
  • 13:28 joal@tin: Started deploy [analytics/refinery@0dda4a9]: Bug correction for regular weekly deploy of refinery
  • 13:25 hashar: terbium: for T139110 mwscript updateCollation.php --wiki=fawikiquote --previous-collation=uca-fa
  • 13:24 hashar: terbium: for T139110 mwscript updateCollation.php --wiki=fawikinews --previous-collation=uca-fa
  • 13:24 hashar: terbium: for T139110 mwscript updateCollation.php --wiki=fawikibooks --previous-collation=uca-fa
  • 13:24 aharoni: running mwscript updateCollation.php --wiki=bawiki
  • 13:23 hashar: terbium: for T139110 mwscript updateCollation.php --wiki=fawiktionary --previous-collation=uca-fa
  • 13:22 hashar: terbium: for T139110 mwscript updateCollation.php --wiki=fawikisource --previous-collation=uca-fa
  • 13:21 hashar: terbium: for T139110 mwscript updateCollation.php --wiki=fawiki --previous-collation=uca-fa
  • 13:17 moritzm: upgrading cp1008 to openssl 1.1.0f
  • 13:13 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set collation for Bashkir wikis to uppercase-ba - T162823 (duration: 00m 41s)
  • 13:10 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: update some logos 6974b9ab4..76939d15f (duration: 00m 41s)
  • 13:08 hashar@tin: Synchronized static/images/project-logos: (no justification provided) (duration: 00m 43s)
  • 12:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 for maintenance - T166935 (duration: 00m 41s)
  • 12:01 moritzm: upgrading mw1266-mw1275 to HHVM 3.18
  • 11:09 joal@tin: Finished deploy [analytics/refinery@d9c3419]: Regular weekly deploy of refinery (mostly unique_devices patches) (duration: 06m 18s)
  • 11:05 moritzm: upgrading job runners mw1162-mw1164 to HHVM 3.18
  • 11:03 joal@tin: Started deploy [analytics/refinery@d9c3419]: Regular weekly deploy of refinery (mostly unique_devices patches)
  • 10:59 marostegui: Drop table updates on commonswiki (s4) - T139342
  • 10:28 moritzm: upgrading mw1250-mw1258 to HHVM 3.18
  • 09:55 moritzm: upgrading mw1221-mw1235 to HHVM 3.18
  • 09:25 godog: swift eqiad-prod finish decom ms-be1005/6/7 - T166489
  • 09:13 moritzm: upgrading mw1236-mw1249 to HHVM 3.18
  • 09:12 marostegui: Drop table updates on dewiki and wikidatawiki (s5) - T139342
  • 08:31 godog: reboot ms-be1002, load avg slowly creeping up
  • 08:22 elukey: powercycle scb2005 (console frozen, host unresponsive)
  • 07:40 elukey: restarted citoid on scb1001 (kept failing health checks for Error: write EPIPE)
  • 07:38 marostegui: Reboot ms-be1008 as xfs is failing
  • 07:31 marostegui: Deploy alter table s2 - db1060 - T166205
  • 07:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 - T166205 (duration: 00m 41s)
  • 07:26 elukey: ran restart-pdfrender on scb1001 (OOM errors in the dmesg from hours ago)
  • 07:22 elukey: ran restart-pdfrender on scb1002 (OOM errors in the dmesg from hours ago)
  • 07:21 marostegui: Deploy alter table s4 - db1064 - https://phabricator.wikimedia.org/T166206
  • 07:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T166206 (duration: 00m 41s)
  • 06:53 moritzm: upgrade remaining app servers running HHVM 3.18 to 3.18.2+wmf5
  • 05:38 marostegui: Deploy alter table s4 - labsdb1003 - T166206
  • 02:14 l10nupdate@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)

2017-06-11

  • 14:14 elukey: executed cumin 'mw22[51-60].codfw.wmnet' 'find /var/log/hhvm/* -user root -exec chown www-data:www-data {} \;' to reduce cron-spam (new hosts added in March) - T146464
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jun 11 02:25:53 UTC 2017 (duration 6m 6s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 07m 37s)

2017-06-10

  • 11:54 andrewbogott: cleared leaked instances out of the nova fullstack test. Six were up and running and reachable, one had a network failure.
  • 10:19 TimStarling: on terbium: running purgeParserCache.php prior to cron job due to observed disk space usage increase
  • 10:00 marostegui: Purge binary logs on pc1006-pc2006
  • 09:58 marostegui: Purge binary logs on pc1004-pc2004 and pc1005-pc2005
  • 02:22 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jun 10 02:22:22 UTC 2017 (duration 6m 13s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 05m 33s)

2017-06-09

  • 21:18 mobrovac@tin: Finished deploy [restbase/deploy@4e5cb35]: (no justification provided) (duration: 01m 40s)
  • 21:17 mobrovac@tin: Started deploy [restbase/deploy@4e5cb35]: (no justification provided)
  • 21:07 mobrovac@tin: Finished deploy [restbase/deploy@4e5cb35]: Ensure the extract field is always present in the summary response - T167045 (take #2) (duration: 05m 23s)
  • 21:02 mobrovac@tin: Started deploy [restbase/deploy@4e5cb35]: Ensure the extract field is always present in the summary response - T167045 (take #2)
  • 21:01 mobrovac@tin: Finished deploy [restbase/deploy@4e5cb35]: Ensure the extract field is always present in the summary response - T167045 (duration: 04m 57s)
  • 20:56 mobrovac@tin: Started deploy [restbase/deploy@4e5cb35]: Ensure the extract field is always present in the summary response - T167045
  • 20:54 mobrovac@tin: Finished deploy [restbase/deploy@4e5cb35] (staging): Ensure the extract field is always present in the summary response (duration: 03m 39s)
  • 20:50 mobrovac@tin: Started deploy [restbase/deploy@4e5cb35] (staging): Ensure the extract field is always present in the summary response
  • 20:12 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/CirrusSearch/includes/Job/DeleteArchive.php: Really fix it this time (duration: 00m 43s)
  • 19:49 mutante: fermium: $ sudo /usr/local/sbin/disable_list wikino-bureaucrats (T166848)
  • 19:46 RainbowSprinkles: mw1299: running scap pull, maybe out of date?
  • 18:12 gehel: retry allocation of failed shards on elasticsearch eqiad
  • 15:47 _joe_: installed python-service-checker 0.1.3 on einsteinium,tegmen T167048
  • 15:44 _joe_: uploaded service-checker 0.1.3
  • 15:11 _joe_: upgraded python-service-checker to 0.1.2 on tegmen,einsteinium
  • 13:18 godog: upgrade thumbor to 0.1.40 - T167462
  • 12:36 gehel: reducing high watermark on elasticsearch eqiad to rebalance shards
  • 07:51 elukey: run megacli -LDSetProp -Direct -LALL -aALL on analytics[1058-1068] - T166140
  • 07:40 moritzm: upgrade app servers in codfw running HHVM 3.18 to +wmf5
  • 07:26 elukey: run megacli -LDSetProp ADRA -LALL -aALL on analytics[1058-1068] - T166140
  • 07:15 elukey: deleted /etc/logrotate.d/nova-manage from labtestvirt2003 to reduce cronspam (same solution used in T132422#2679434)
  • 06:58 moritzm: updating mw117* to HHVM 3.18+wmf5
  • 06:41 moritzm: updating mw1161 to HHVM 3.18
  • 05:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 - T166206 (duration: 00m 41s)
  • 05:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 - T166205 (duration: 00m 42s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jun 9 02:25:29 UTC 2017 (duration 6m 27s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 06m 04s)
  • 00:36 ejegg: disabled banner impressions loader
  • 00:15 mutante: mw1275 depooled (T124956)
  • 00:08 ejegg: updated CiviCRM from 5a83ee1 to dfc26f0
  • 00:01 mutante: seeing "php: Lost parent, LightProcess exiting" in syslog on mw1275 today (T124956)

2017-06-08

  • 23:48 mutante: mw1275 - restarted hhvm (php: Lost parent, LightProcess exiting in syslog)
  • 23:37 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: remaining wikis to wmf.4
  • 23:16 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/CirrusSearch/includes/Job/DeleteArchive.php: Fix array access bug (duration: 00m 43s)
  • 23:15 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/GeoData/includes/Searcher.php: Temp hax to point GeoData at codfw DC (duration: 00m 43s)
  • 22:56 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/RevisionSlider/src/RevisionSliderHooks.php: Re-syncing with permanent committed fix (duration: 00m 44s)
  • 22:36 ejegg: updated civicrm from c70ae65 to 5a83ee1
  • 22:29 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/RevisionSlider/src/RevisionSliderHooks.php: Livehack/test (duration: 00m 44s)
  • 22:17 demon@tin: Synchronized php-1.30.0-wmf.4/extensions/MobileFrontend/includes/specials/SpecialMobileDiff.php: (no justification provided) (duration: 00m 44s)
  • 22:15 mobrovac@tin: Finished deploy [changeprop/deploy@836b070]: Rate limiting, attempt #2 (duration: 01m 23s)
  • 22:13 mobrovac@tin: Started deploy [changeprop/deploy@836b070]: Rate limiting, attempt #2
  • 21:56 mobrovac@tin: Finished deploy [changeprop/deploy@dc1948f]: (no justification provided) (duration: 01m 39s)
  • 21:54 mobrovac@tin: Started deploy [changeprop/deploy@dc1948f]: (no justification provided)
  • 21:54 mobrovac@tin: Finished deploy [changeprop/deploy@56f7511]: (no justification provided) (duration: 01m 32s)
  • 21:52 mobrovac@tin: Started deploy [changeprop/deploy@56f7511]: (no justification provided)
  • 21:50 mobrovac@tin: Finished deploy [changeprop/deploy@56f7511]: (no justification provided) (duration: 00m 34s)
  • 21:50 mobrovac@tin: Started deploy [changeprop/deploy@56f7511]: (no justification provided)
  • 21:42 urandom: T160570: Rolling Cassandra restart, restbase-dev
  • 21:35 ppchelko@tin: Finished deploy [changeprop/deploy@56f7511]: Revert previous deploy (duration: 01m 07s)
  • 21:34 ppchelko@tin: Started deploy [changeprop/deploy@56f7511]: Revert previous deploy
  • 21:31 ppchelko@tin: Started deploy [changeprop/deploy@56f7511]: dc1948f6bc7b1 Revert previous deploy
  • 21:29 ppchelko@tin: Finished deploy [changeprop/deploy@56f7511]: dc1948f6bc7b1 (duration: 00m 16s)
  • 21:29 ppchelko@tin: Started deploy [changeprop/deploy@56f7511]: dc1948f6bc7b1
  • 21:24 ppchelko@tin: Finished deploy [changeprop/deploy@56f7511]: Rate limiting code and config. T161710 (duration: 01m 46s)
  • 21:23 ppchelko@tin: Started deploy [changeprop/deploy@56f7511]: Rate limiting code and config. T161710
  • 20:23 RainbowSprinkles: gerrit2001: upgraded to 2.13.8+git1-wmf.5 / 2.13.8-1-g7c438d37a2
  • 20:12 mutante: imported gerrit_2.13.8+git1-wmf.5_amd64 on apt.wikimedia.org (T158946)
  • 19:26 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.4
  • 19:13 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: mw.org -> wmf.4
  • 19:05 demon@tin: Synchronized wmf-config/InitialiseSettings.php: New wordmark for mk/srwiki (duration: 00m 57s)
  • 19:03 demon@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-sr.svg: new wordmark (duration: 00m 46s)
  • 18:59 maxsem@tin: Synchronized php-1.30.0-wmf.4/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/357846/ (duration: 00m 49s)
  • 18:55 maxsem@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 18:49 urandom: Restarting Cassandra, restbase-dev1001-a to test alternative disk access mode
  • 18:42 mutante: built gerrit_2.13.8+git1-wmf.5 on copper (T158946)
  • 18:40 maxsem@tin: Synchronized php-1.30.0-wmf.4/extensions/LoginNotify/: https://gerrit.wikimedia.org/r/#/c/357743/ (duration: 00m 44s)
  • 18:36 maxsem@tin: Synchronized php-1.30.0-wmf.4/includes/EditPage.php: https://gerrit.wikimedia.org/r/#/c/357855/ (duration: 00m 45s)
  • 18:25 maxsem@tin: Synchronized multiversion/submodules.json: https://gerrit.wikimedia.org/r/#/c/352985/3 (duration: 00m 43s)
  • 18:17 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/356881/4 (duration: 00m 44s)
  • 18:09 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/354731/6 (duration: 00m 44s)
  • 17:55 arlolra: Updated Parsoid to 108eed81 (T136653, T167081)
  • 17:46 arlolra@tin: Finished deploy [parsoid/deploy@f82cb4f]: Updating Parsoid to 108eed81 (duration: 10m 12s)
  • 17:36 arlolra@tin: Started deploy [parsoid/deploy@f82cb4f]: Updating Parsoid to 108eed81
  • 16:44 nuria@tin: Finished deploy [analytics/refinery@2fbed63]: (no justification provided) (duration: 04m 08s)
  • 16:40 nuria@tin: Started deploy [analytics/refinery@2fbed63]: (no justification provided)
  • 16:33 godog: delete net.ifnames for ms-be2001 and ms-be2013 - T158429
  • 16:24 bblack: cp1074: varnish-backend-restart for mailbox lag
  • 15:22 moritzm: updating mw1262-mw1265 to HHVM 3.18.2+wmf5
  • 15:11 XioNoX: Upgrading rancid to 3 - T167288
  • 14:56 moritzm: updating mw1261 to HHVM 3.18.2+wmf5
  • 14:54 XioNoX: 2 blackhole IPs pushed to cr* routers
  • 14:02 aude@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Do not enable Wikibase data access yet on beta wiktionary (duration: 00m 43s)
  • 13:47 aude@tin: Synchronized php-1.30.0-wmf.4/extensions/RevisionSlider: Fix fatal error: T167359 (duration: 00m 44s)
  • 13:41 aude@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 13:33 aude@tin: Synchronized php-1.30.0-wmf.4/extensions/Wikidata: Fix warning in date formatting T167360 (duration: 02m 16s)
  • 13:31 XioNoX: blackhole v4 IPs removed from all cr* routers
  • 12:39 moritzm: updating mwdebug* to HHVM 3.18.2+wmf5
  • 12:17 moritzm: uploaded hhvm 3.18.2-dfsg-1+wmf5 to apt.wikimedia.org
  • 12:17 moritzm: updated hhvm 3.18.2-dfsg-1+wmf5 to apt.wikimedia.org
  • 11:41 marostegui: Drop table updates on s7 - T139342
  • 11:41 moritzm: powercycling mw1294, mgmt is unresponsive
  • 09:41 moritzm: updating mysql-connector-java on hadoop cluster
  • 09:05 elukey: upgrade zookeeper packages to 3.4.5+dfsg-2+deb8u2 on conf100[123], conf200[23] and druid100[123]
  • 08:58 godog: swift eqiad-prod: decom ms-be1005/6/7 - T166489
  • 08:50 TabbyCat: Rename user "Mlpearc" to "FlightTime" on Central Auth is now finished (T166028)
  • 08:36 godog: temporarily stop ircecho on tegmen, puppet spam
  • 08:22 TabbyCat: Starting big global rename as requested in T166028
  • 07:00 marostegui: Drop table updates on s6 - T139342
  • 05:59 _joe_: uploading new service-checker version to reprepro, T167048
  • 05:54 marostegui: Deploy alter table s2 - db1074 - T166205
  • 05:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 - T166205 (duration: 00m 43s)
  • 05:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 - T166205 (duration: 00m 45s)
  • 02:56 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jun 8 02:56:27 UTC 2017 (duration 6m 26s)
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 05m 07s)
  • 02:40 twentyafterfour: deploying hotfix for T166958
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 08m 41s)
  • 01:45 mutante: manually running mediawiki maintenance job "echo_mail_batch" (on terbium as www-data, just like cron). did _NOT_ get denied by DB (T167373)
  • 01:37 maxsem@tin: Synchronized php-1.30.0-wmf.2/extensions/GeoData/includes/Searcher.php: Livehack to stop exceptions (duration: 00m 46s)
  • 00:54 mutante: cp4019 - powercycled (same as others) | lvs1007 - sits at installer - waiting for IP to be configured (T150256)
  • 00:47 mutante: cp1059 - same thing - powercycle after failed boot after reimaging script
  • 00:41 mutante: cp4011 - like cp4010 - powercycling (host down, console sat at initramfs). it had the "did not detect disk by uid" issue but boots normally after powercycle
  • 00:34 mutante: cp4020 - powercycling (host down, console sat at initramfs)
  • 00:31 mutante: cp2012 - fixed salt key issue as for cp3005 (delete key, stop/start minion, accept new key)
  • 00:25 mutante: salt-master: deleted salt-key for cp3005, stopped/started minion cp3005 - key got accepted again (was: Salt Master has rejected this minion's public key)

2017-06-07

  • 23:33 ppchelko@tin: Finished deploy [trending-edits/deploy@e0a8716]: Include reverts from bots to get rid of false positives (duration: 07m 00s)
  • 23:30 catrope@tin: Synchronized php-1.30.0-wmf.4/extensions/RelatedArticles/resources/ext.relatedArticles.readMore.eventLogging/index.js: T167236 (duration: 00m 43s)
  • 23:28 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Relaunch related pages A/B test to 98% of users on enwiki (T167310) (duration: 00m 44s)
  • 23:26 ppchelko@tin: Started deploy [trending-edits/deploy@e0a8716]: Include reverts from bots to get rid of false positives
  • 22:24 bblack: reimaging ex-cache_maps hosts (fresh role::spare::system installs)
  • 22:18 bblack: puppet node clean+deactivate for cp3003
  • 22:15 bblack: lvs4002 - restarting pybal to remove old maps table entries
  • 22:14 bblack: lvs3002 - restarting pybal to remove old maps table entries
  • 22:13 bblack: lvs2002 - restarting pybal to remove old maps table entries
  • 22:13 bblack: lvs1002 - restarting pybal to remove old maps table entries
  • 22:12 bblack: lvs4004 - restarting pybal to remove old maps table entries
  • 22:11 bblack: lvs3004 - restarting pybal to remove old maps table entries
  • 22:09 bblack: lvs2005 - restarting pybal to remove old maps table entries
  • 22:07 bblack: lvs1005 - restarting pybal to remove old maps table entries
  • 21:32 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.2
  • 21:31 twentyafterfour: rolling back to wmf.2 due to error spike and popups no longer working refs T166829
  • 21:25 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.4
  • 21:23 twentyafterfour@tin: Synchronized php-1.30.0-wmf.4/: sync 3248a17 refs T167343 (duration: 07m 52s)
  • 20:26 twentyafterfour@tin: Synchronized php-1.30.0-wmf.4/extensions/MobileFrontend: Deploy 66ef9cb refs T167216 (duration: 00m 46s)
  • 20:04 twentyafterfour: Preparing to deploy the MediaWiki train for group1 wikis, 1.30.0-wmf.4 refs T166829
  • 18:22 thcipriani@tin: Synchronized wmf-config: SWAT: Enable archive indexing on delete for select wikis T162302 (duration: 00m 47s)
  • 18:14 thcipriani@tin: Synchronized portals: SWAT: Updating portals stats T128546 (duration: 00m 44s)
  • 18:13 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Updating portals stats T128546 (duration: 00m 44s)
  • 17:14 elukey: restart nutcracker on thumbor1002 (too many connections approaching the 1024 ulimit)
  • 15:37 akosiaris: disable puppet on puppetmaster1001, depool rhodium for tests
  • 14:51 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2007.codfw.wmnet
  • 14:48 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1007.codfw.wmnet
  • 14:11 dcausse: eu swat done
  • 12:56 aude@tin: Synchronized php-1.30.0-wmf.4/extensions/Wikidata: Fix parser function registration T167238 (duration: 02m 20s)
  • 12:43 marostegui: Drop table updates on s2 - T139342
  • 12:40 aude@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable Wikibase Client on beta wiktionary sites T158323 (duration: 00m 43s)
  • 12:40 elukey: upgrade zookeeper packages on conf2002 to 3.4.5+dfsg-2+deb8u2
  • 12:32 bblack: cp1072, cp1063 restarting varnish backend for mailbox lag
  • 12:26 aude@tin: Synchronized wmf-config/Wikibase.php: Site links for non-main namespace wiktionary pages T158323 (duration: 00m 43s)
  • 12:19 aude@tin: Synchronized wmf-config/Wikibase-labs.php: Site links for non-main namespace wiktionary pages (duration: 00m 44s)
  • 11:08 gehel: restarting cron on logstash cluster
  • 10:29 moritzm: installing tiff regression security update on trusty
  • 10:26 ema: upgrade lvs1*/lvs2* to jessie 8.8 point release T164703
  • 09:49 ema: upgrade lvs[3001-3004] to jessie 8.8 point release T164703
  • 09:28 gehel: upgrading kibana to v5.3.3 on logstash cluster - T167266
  • 09:15 ema: upgrade lvs4001-4004 to jessie 8.8 point release T164703
  • 08:58 marostegui: Deploy alter table on s2 - db1076 - T166205
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 - T166205 (duration: 00m 43s)
  • 08:50 marostegui: Deploy alter table s4 - db1056 - T166206
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 - T166206 (duration: 00m 43s)
  • 08:02 marostegui: Run redact_sanitarium on db1095 for dewiki - T153743
  • 07:22 marostegui: Deploy alter table on db1047 enwiki.revision - T162807
  • 06:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 - T166206 (duration: 00m 44s)
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1053, depool db1056 - T166206 (duration: 01m 03s)
  • 03:11 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jun 7 03:11:40 UTC 2017 (duration 6m 54s)
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.4) (duration: 14m 29s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 57s)
  • 00:21 RainbowSprinkles: gerrit: rolled back to 2.13.4-13-gc0c5cc4742 from 2.13.8. T152640 rearing its ugly head again (login issues)

2017-06-06

  • 23:59 thcipriani@tin: Synchronized php-1.30.0-wmf.2/extensions/Flow/includes/Content/BoardContentHandler.php: SWAT: Revert "Throw when unserializing invalid Flow workflow metadata JSON" T166100 T156813 (duration: 00m 43s)
  • 23:58 thcipriani@tin: Synchronized php-1.30.0-wmf.4/extensions/Flow/includes/Content/BoardContentHandler.php: SWAT: Revert "Throw when unserializing invalid Flow workflow metadata JSON" T166100 T156813 (duration: 00m 45s)
  • 23:56 RainbowSprinkles: gerrit: back from reindexing
  • 23:55 RainbowSprinkles: gerrit: force stopping for a second to reindex accounts
  • 23:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable page previews on wikispecies T166894 (duration: 00m 44s)
  • 23:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update ContentNamespaces for Commons Wiki T167077 (duration: 00m 46s)
  • 21:57 RainbowSprinkles: gerrit: restarting last time, didn't work like I wanted
  • 21:53 RainbowSprinkles: gerrit: restarting to test a config tweak
  • 21:41 mutante: contint1001 - graceful'ed Apache to deploy gerrit:351391
  • 21:19 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: unbreak mw.org pref page
  • 20:21 RainbowSprinkles: gerrit: Down for just a moment, finally doing point release on cobalt
  • 19:57 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.4
  • 19:45 demon@tin: Finished scap: testwiki to wmf.4 + prepping l10n. again (x2) (duration: 20m 25s)
  • 19:36 mutante: cobalt - removed systemd unit file (that has issues with ulimit and isn't used yet) - ran "systemctl reset-failed" which cleared the "systemctl status" which made the Icinga check recover
  • 19:24 demon@tin: Started scap: testwiki to wmf.4 + prepping l10n. again (x2)
  • 19:23 demon@tin: scap failed: RuntimeError scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details) (duration: 13m 32s)
  • 19:23 demon@tin: scap failed: average error rate on 1/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/3888cca979647b9381a7739b0bdbc88e for details)
  • 19:10 demon@tin: Started scap: testwiki to wmf.4 + prepping l10n. again
  • 19:08 demon@tin: Synchronized README: No-op, just forcing co-master sync (duration: 01m 27s)
  • 19:01 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: testwiki back to wmf.2
  • 18:55 maxsem@tin: Finished scap: LoginNotify to testwiki - rebuild messages (duration: 38m 19s)
  • 18:16 maxsem@tin: Started scap: LoginNotify to testwiki - rebuild messages
  • 18:15 maxsem@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/357317/2 (duration: 00m 44s)
  • 18:10 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/357317/2 (duration: 00m 44s)
  • 18:03 demon@tin: Finished scap: testwiki to wmf.3, prepping l10n cache (duration: 31m 58s)
  • 17:31 demon@tin: Started scap: testwiki to wmf.3, prepping l10n cache
  • 16:53 moritzm: installing wireshark security updates on trusty (jessie already fixed)
  • 16:41 bblack: rebooted lvs1007 (kernel update)
  • 16:35 bblack: rebooted lvs1007 (kernel update)
  • 15:21 otto@tin: Finished deploy [eventlogging/analytics@37233cd]: (no justification provided) (duration: 00m 04s)
  • 15:21 otto@tin: Started deploy [eventlogging/analytics@37233cd]: (no justification provided)
  • 14:58 moritzm: installing libsndfile security updates on trusty
  • 14:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1094 original weight (duration: 00m 40s)
  • 13:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1094 weight (duration: 00m 40s)
  • 13:39 elukey: shutdown analytics1033 and analytics1039 to replace their BBU - T166140
  • 13:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 with low weight (duration: 00m 40s)
  • 12:58 marostegui: Shutdown db1094 for maintenance - T166518
  • 12:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 for maintenance - T166518 (duration: 00m 39s)
  • 12:51 godog: upgrade scap to 3.5.8 - T127762
  • 12:41 mobrovac@tin: Finished deploy [changeprop/deploy@e92dd66]: Bump src to bc8abf3 (duration: 01m 45s)
  • 12:40 mobrovac@tin: Started deploy [changeprop/deploy@e92dd66]: Bump src to bc8abf3
  • 12:16 bblack: cp1049 - restart varnish backend for mailbox lag
  • 12:08 gehel: kill stuck osm replication on maps1001
  • 11:28 akosiaris@tin: Finished deploy [servermon/servermon@4a2288f]: (no justification provided) (duration: 00m 04s)
  • 11:28 akosiaris@tin: Started deploy [servermon/servermon@4a2288f]: (no justification provided)
  • 11:17 moritzm: uploaded ferm 2.3.2+wmf1 to apt.wikimedia.org/stretch-wikimedia (T166653)
  • 11:02 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Enabling writing in full entity id in testwikidatawiki (T165197) (duration: 00m 39s)
  • 10:22 moritzm: installing NSS security updates
  • 09:43 moritzm: installing perl security updates
  • 09:41 akosiaris: stop jobchron/jobrunner processes across jobrunner and videoscalers in codfw
  • 09:35 akosiaris: restart jobchron service across videoscalers T129148
  • 09:33 akosiaris: restart jobchron service across jobrunners T129148
  • 09:32 akosiaris@tin: Finished deploy [jobrunner/jobrunner@161c84c]: (no justification provided) (duration: 01m 17s)
  • 09:31 akosiaris@tin: Started deploy [jobrunner/jobrunner@161c84c]: (no justification provided)
  • 09:29 akosiaris: running puppet on jobrunners T129148
  • 09:25 akosiaris: running puppet on videoscalers T129148
  • 09:25 akosiaris: moving around jobrunner/jobrunner was probably not required T129148
  • 09:19 akosiaris: running puppet again on tin, after moving /serv/deployment/jobrunner/jobrunner T129148
  • 09:12 akosiaris: running puppet on mw1161 T129148
  • 09:11 akosiaris: git pull and scap deploy --init for jobrunner T129148
  • 09:08 akosiaris: running puppet on tin T129148
  • 09:04 akosiaris: disable puppet on all jobrunners T129148
  • 09:04 akosiaris: disable puppet on all jobrunners
  • 08:54 dcausse: restarting elastic2014 to reclaim free space on deleted log file
  • 08:43 jynus: stopping db2035 and preparing for reimage
  • 08:39 gehel: raise log level to WARN for TransportShardBulkAction on elasticsearch cirrus - T167091
  • 07:53 gehel: starting upgrade to elasticsearch 5.3.2 on cirrus eqiad cluster - T163708
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments about current status of db1089 - T166935 (duration: 00m 39s)
  • 05:56 marostegui: Deploy alter table s3 on db1075 (eqiad master) - T166278
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jun 6 02:27:37 UTC 2017 (duration 6m 3s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 32s)

2017-06-05

  • 23:33 thcipriani: running on terbium: mwscript extensions/ORES/maintenance/CheckModelVersions.php frwiki && mwscript extensions/ORES/maintenance/PopulateDatabase.php frwiki
  • 23:32 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES review tool in frwiki T165044 (duration: 00m 40s)
  • 23:23 thcipriani: frwiki create tables ores_model and ores_classification T165044
  • 22:03 bblack: cp1074 - varnish-backend-restart (mailbox lag)
  • 22:02 bblack: cp1099 - varnish-backend-restart (mailbox lag)
  • 21:34 bawolff: deployed patch for T165846
  • 21:01 reedy@tin: Synchronized wmf-config/CommonSettings.php: Run Pdf Processors in firejails T164145 T164000 (duration: 00m 40s)
  • 20:16 subbu: updated parsoid to 141fc07d (T166655)
  • 20:10 ssastry@tin: Finished deploy [parsoid/deploy@bb0613c]: Updating Parsoid to 141fc07d (duration: 07m 02s)
  • 20:03 ssastry@tin: Started deploy [parsoid/deploy@bb0613c]: Updating Parsoid to 141fc07d
  • 18:52 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/357169/2 (duration: 00m 39s)
  • 18:43 MaxSem: ran mwscript maintenance/namespaceDupes.php --wiki=etwiki --fix
  • 18:41 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/357025/2 (duration: 00m 39s)
  • 18:36 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/355594/2 (duration: 00m 39s)
  • 18:29 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/357186/2 (duration: 00m 42s)
  • 18:25 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/357026/2 (duration: 00m 38s)
  • 18:11 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/356437/4 (duration: 00m 40s)
  • 16:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight - T166935 (duration: 00m 38s)
  • 16:19 jynus: stopping db2037 and preparing for reimage
  • 15:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight - T166935 (duration: 00m 39s)
  • 15:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight - T166935 (duration: 00m 38s)
  • 14:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low weight - T166935 (duration: 00m 39s)
  • 13:47 bblack: rebooting lvs1010 again
  • 13:27 zeljkof: eu swat finished
  • 13:16 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Lift IP throttle for Wikimedia Chile editathon (T166788) (duration: 00m 39s)
  • 13:02 bblack: rebooting lvs1010 (post-reinstall)
  • 12:54 marostegui: Stop MySQL db1047 - T166452
  • 09:06 marostegui: Stop replication on db1070 for maintenance - T153743
  • 08:10 godog: swift eqiad-prod decom ms-be1009 / 10 / 11 - T166489
  • 07:43 marostegui: Stop labsdb1011 to take a backup - T153743
  • 07:41 jynus: stopping db2038 mysql and preparing for reimage
  • 07:15 marostegui: Deploy alter table in s2 (codfw master) this will generate lag in codfw - T166205
  • 06:20 marostegui: Deploy alter table s4 - on labsdb1001 - T166206
  • 06:15 marostegui: Deploy alter table on s3 - db1069 - T166278
  • 06:13 marostegui: Deploy alter table on s4 - db1053 - T166206
  • 06:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 - T166206 (duration: 00m 39s)
  • 05:58 marostegui: Stop MySQL on db1095 to take a backup - T153743
  • 05:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add comments to db1089's current status (duration: 00m 39s)
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jun 5 02:27:53 UTC 2017 (duration 6m 2s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 08m 14s)

2017-06-04

  • 10:31 ema: mw2256 down, console stuck on 'Starti'. power cycled.
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 09m 12s)

2017-06-03

  • 05:20 marostegui: Reboot db1089 - T166933
  • 05:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - it is broken (duration: 00m 41s)
  • 02:30 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jun 3 02:30:27 UTC 2017 (duration 6m 24s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 47s)
  • 00:04 mutante: wikitech-static-iad: mv /etc/acme/cert/wikitech-static-iad-signed.csr /etc/acme/cert/wikitech-static-iad.chained.crt ; wikitech-static-ord: copy wiki logo: /srv/mediawiki/images# wget https://wikitech-static-iad.wikimedia.org/w/images/labswiki.png

2017-06-02

  • 23:53 demon@tin: Synchronized wmf-config/throttle.php: pruning some old throttle exceptions (duration: 00m 40s)
  • 23:46 mutante: wikitech-static-iad: edited acme_tiny.py to adjust URL to agreement PDF, to fix "Provided agreement URL [1] does not match current agreement URL [2]"
  • 23:45 mutante: wikitech-static-iad: create new cert for "iad" hostname, using acme-setup/acme-tiny: /usr/local/sbin# acme-setup -i "wikitech-static-iad" -s "wikitech-static-iad.wikimedia.org" ; python acme_tiny.py --account-key /etc/acme/acct/acct.key --csr /etc/acme/csr/wikitech-static-iad.pem --acme-dir /var/acme/challenge/ > /etc/acme/cert/wikitech-static-iad-signed.csr  ; had to hack acme_tiny.py
  • 23:22 mutante: wikitech-static-ord copied Lets-Encrypt intermediate certs from /usr/local/share/ca-certificates on old server
  • 23:19 mutante: wikitech-static (iad): adjust Apache config to use wikitech-static-iad
  • 23:18 mutante: wikitech-static-ord: installed package upgrades, installed vim, removing "ord" from Apache config after DNS change ..
  • 23:14 mutante: maintenance on status.wikimedia.org and wikitech-static.wikimedia.org
  • 20:08 ejegg: re-enabled AstroPay/dLocal payment methods
  • 19:36 ejegg: updated payments-wiki from 5edd788 to 7a50542
  • 19:23 ejegg: updated CiviCRM from 9c06bd2 to c70ae65
  • 18:29 mobrovac@tin: Finished deploy [restbase/deploy@4b14527]: (no justification provided) (duration: 00m 41s)
  • 18:29 mobrovac@tin: Started deploy [restbase/deploy@4b14527]: (no justification provided)
  • 18:28 mobrovac@tin: Started deploy [restbase/deploy@4b14527]: h
  • 17:01 bblack: starting wmf-auto-reimage on lvs1007-10
  • 16:16 RainbowSprinkles: gerrit2001: gerrit updated to 2.13.8+git1-wmf.4
  • 16:03 bblack: start wmf-auto-reimage of lvs1011, lvs1012
  • 15:01 jynus: restarting ircecho on tegmen
  • 14:32 mobrovac@tin: Finished deploy [restbase/deploy@4b14527]: Add the extract_html property to the summary end point for T165017 (duration: 06m 43s)
  • 14:25 mobrovac@tin: Started deploy [restbase/deploy@4b14527]: Add the extract_html property to the summary end point for T165017
  • 13:28 gehel: restart elastic2003 to reload logging configuration
  • 12:11 hashar: restarting Jenkins to upgrade the logstash plugin
  • 09:49 jynus: stopping db2041 to prepare it for reimage
  • 09:18 marostegui: Deploy alter table s3 - db1015 - T166278
  • 09:12 marostegui: Deploy alter table s3 - labsdb1003 - T166278
  • 07:47 marostegui: Resume alter table on db1047 enwiki.revision - T166452
  • 07:45 moritzm: uploaded gerrit 2.13.8+git1-wmf4 to apt.wikimedia.org
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1059 - T166206 (duration: 00m 39s)
  • 07:36 marostegui: Deploy alter table on s4 - labsdb1009 - T166206
  • 07:02 akosiaris: starting fleet wide PCC for gerrit change 356030. Should take a while to complete
  • 05:25 jynus@tin: Synchronized wmf-config/db-eqiad.php: Emergency pool of db1049 (duration: 00m 48s)
  • 04:42 elukey: removed some old scap revs for the Analytics refinery on stat1002 to free space (git fat jars replicating after each deployment, known issue)
  • 02:46 bd808: Loadavg on mw1198 very high (44+) and nginx/hhvm checks flapping

2017-06-01

  • 23:33 twentyafterfour: phabricator upgrade complete.
  • 23:29 twentyafterfour: Performing phabricator update, expect momentary downtime.
  • 23:25 twentyafterfour: Preparing phabricator update to tag release/2017-06-01/1 [ https://phabricator.wikimedia.org/project/view/2802/ ]
  • 23:20 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: T163463: apply sister search restrictions requested by enwiki (duration: 00m 39s)
  • 23:18 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T163463: apply sister search restrictions requested by enwiki (duration: 00m 40s)
  • 21:59 RainbowSprinkles: gerrit2001: Upgraded to 2.13.8, seems to be running fine this time.
  • 20:37 mobrovac@tin: Finished deploy [citoid/deploy@ba0db9c]: Update spec to minimise alert noise - T163986 (duration: 05m 20s)
  • 20:32 mobrovac@tin: Started deploy [citoid/deploy@ba0db9c]: Update spec to minimise alert noise - T163986
  • 20:23 bsitzmann@tin: Finished deploy [mobileapps/deploy@2a8e648]: Update mobileapps to c4dc72d (duration: 05m 18s)
  • 20:18 bsitzmann@tin: Started deploy [mobileapps/deploy@2a8e648]: Update mobileapps to c4dc72d
  • 19:30 mepps: updated SmashPig from 4f84d88 to d4458fa
  • 19:25 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.2
  • 19:23 gehel@tin: Finished deploy [wdqs/wdqs@3936e36]: (no justification provided) (duration: 01m 20s)
  • 19:22 gehel@tin: Started deploy [wdqs/wdqs@3936e36]: (no justification provided)
  • 19:08 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: Revert "Add RejectParserCacheValue handler for mw-parser-output invalidation" T166345 (duration: 00m 43s)
  • 18:21 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1002.eqiad.wmnet
  • 18:20 gehel: wdqs1002 back in LVS - thermal paste added - T166524
  • 17:42 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1002.eqiad.wmnet
  • 17:41 gehel: shutting down wdqs1002 for maintenance - T166524
  • 17:02 elukey: stop mysql, eventlogging_sync and shutdown db1047 (analytics-store) for maintenance - T159266
  • 16:22 jynus: retrying reimage of db2044
  • 15:03 elukey: restart kafka100[23] for jvm upgrades
  • 14:21 mforns@tin: Finished deploy [analytics/refinery@7540403]: (no justification provided) (duration: 02m 50s)
  • 14:18 mforns@tin: Started deploy [analytics/refinery@7540403]: (no justification provided)
  • 14:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2048 after maintenance (duration: 00m 44s)
  • 13:18 marostegui: Deploy alter table s3 revision on labsdb1001 - T166278
  • 13:15 marostegui: Deploy alter table s3 revision on labsdb1011 - T166278
  • 13:11 gilles: restored original configuration on mwdebug1001
  • 11:33 godog: test upgrade of swift 2.10 on ms-fe2005 - T162609
  • 10:24 gilles: Point nutcracker to localhost on mwdebug1001
  • 10:06 godog: run puppet to blacklist acpi_power_meter across the fleet and rmmod the module
  • 09:51 _joe_: refreshing facts on the puppet compiler
  • 08:15 godog: upgrade grafana to 4.3.2 on labmon1001 / krypton
  • 07:49 gilles: editing wikiversions.php manually on mwdebug1001 to point enwiki to wmf.2
  • 06:08 marostegui: Deploy alter table on s3, labsdb1010 - T166278
  • 06:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1035 - T166278 (duration: 00m 57s)
  • 06:04 marostegui: Deploy alter table on s3, db1044 - T166278
  • 06:02 marostegui: Deploy alter table on s3, dbstore1001 - T166278
  • 05:58 elukey: powercycle cp3032 - T166758
  • 05:43 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet
  • 02:52 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jun 1 02:52:25 UTC 2017 (duration 6m 42s)
  • 02:45 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 02s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 29s)

2017-05-31

  • 23:59 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rules for 2017-06-01 Fortaleza event (T166619) (duration: 00m 41s)
  • 23:03 ejegg: disabled d*local payment methods
  • 22:37 ejegg: updated payments-wiki from 4786e7c to 5edd788
  • 22:14 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikipedias back to 1.30.0-wmf.1
  • 21:41 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: touch InitialiseSettings.php (duration: 00m 39s)
  • 21:37 ejegg: reverted payments-wiki to 4786e7c
  • 21:32 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikipedias to 1.30.0-wmf.2
  • 21:29 AaronSchulz: Restored mwdebug1001 to wmf1 with normal nutcracker/memcached and puppet running
  • 21:23 ejegg: updated payments-wiki from 4786e7c to d467d3b
  • 20:17 RainbowSprinkles: gerrit: bringing offline for a few minutes for point release (2.13.4 -> 2.13.8, T158946)
  • 20:15 mobrovac@tin: Finished deploy [citoid/deploy@7d69554]: Relaxing date validation - T132308 (duration: 02m 32s)
  • 20:13 mobrovac@tin: Started deploy [citoid/deploy@7d69554]: Relaxing date validation - T132308
  • 19:31 demon@tin: Synchronized scap/plugins/clean.py: cleanup r us (duration: 00m 42s)
  • 19:13 gehel@tin: Finished deploy [wdqs/wdqs@af495a2]: (no justification provided) (duration: 01m 29s)
  • 19:11 gehel@tin: Started deploy [wdqs/wdqs@af495a2]: (no justification provided)
  • 17:30 godog: swift eqiad-prod decom ms-be100[128] - T166489
  • 16:53 ema: restart varnish-backend on cp1074
  • 16:53 ema: merge cache_maps into cache_upload: finished moving LVS IPs T164608
  • 16:33 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Index article placeholders up to Q16956 on cywiki (T162244) (duration: 00m 42s)
  • 15:58 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 15:31 ema: merge cache_maps into cache_upload: move LVS IPs T164608
  • 14:34 XioNoX: init7 fixed the issue, ping works from the init7 interface, reenabling the BGP session - T166663
  • 14:02 moritzm: upgrading install2002 to reprepro 5.1.1
  • 13:26 hoo@tin: Synchronized wmf-config/Wikibase-production.php: WikibaseClient: Don't persist Statement usages (T151717) (duration: 00m 41s)
  • 13:21 ema: cache_eqiad: upgrade to jessie 8.8 point release T164703
  • 13:20 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Log "api-readonly" errors (T164191, T123867) (duration: 00m 43s)
  • 13:15 ema: cache_codfw: upgrade to jessie 8.8 point release T164703
  • 13:10 ema: cache_esams: upgrade to jessie 8.8 point release T164703
  • 13:08 marostegui: Stop MySQL on db1048 and shutdown the host for maintenance - T160731
  • 13:08 moritzm: uploaded zookeeper 3.4.5+dfsg-2+deb8u2 to apt.wikimedia.org
  • 12:36 ema: cache_ulsfo: upgrade to jessie 8.8 point release T164703
  • 12:35 marostegui: Deploy alter table on s3 revision table - db1035 - https://phabricator.wikimedia.org/T166278
  • 12:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077, depool db1035 - T166278 (duration: 00m 41s)
  • 12:22 ema: cp1008: upgrade to jessie 8.8 point release T164703
  • 12:11 XioNoX: Disable v6 BGP session with Init7 in knams because of routing loop on their network
  • 12:04 volans: merged stringify_facts=false for production hosts T166372
  • 10:59 jynus: preparing for backup and reimage to jessie of db2044
  • 10:35 moritzm: updated reprepro on install1002 to 5.1.1 from backports (for support of dbgsym and buildinfo files)
  • 10:29 godog: remove salt-minion salt-common from stretch-wikimedia - T166646
  • 09:30 marostegui: Deploy alter table on s3 revision table - db1078 - https://phabricator.wikimedia.org/T166278
  • 09:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078, depool db1077 - T166278 (duration: 00m 42s)
  • 09:24 _joe_: etcd in eqiad in read-write mode
  • 09:22 _joe_: started etcd replica eqiad => codfw
  • 09:15 _joe_: etcd replica codfw => eqiad now stopped
  • 09:09 _joe_: etcd in read-only mode for switchover to eqiad
  • 08:27 godog: complete linux 4.9 upgrade on Debian ms-be2* machines
  • 08:24 moritzm: installing imagemagick security updates on trusty (jessie already fixed)
  • 07:47 elukey: restart kafka on kafka10[14,22,20] for jvm upgrades
  • 06:45 moritzm: installing sudo security updates
  • 06:45 marostegui: Deploy alter table s3 revision table - dbstore1002 - T166278
  • 06:31 marostegui: Deploy alter table on s4 - db1059 - T166206
  • 06:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081, depool db1059 - T166206 (duration: 00m 41s)
  • 06:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T166278 (duration: 00m 43s)
  • 06:04 marostegui: Deploy alter table on s3 revision table - db1078 - T166278
  • 06:04 marostegui: Deploy alter table on s3 revision table - db1095 - T166278
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 45s)

2017-05-30

  • 23:15 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Page images can come outside the lead for all projects except Wikipedia (duration: 00m 41s)
  • 23:09 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Add Wikipedia wordmark in Serbian/Macedonian (duration: 00m 45s)
  • 23:08 demon@tin: Synchronized static/images/mobile/copyright/: Compressed + new images (duration: 00m 42s)
  • 22:43 Reedy: created securepoll_elections.el_owner on testwiki T166568
  • 22:20 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Make Flow default in all namespaces on cawikiquote (T165497) (duration: 00m 43s)
  • 22:20 mutante: Welcome new root shell user herron (T166587)
  • 22:10 RoanKattouw: Running populateContentModel.php on all talk namespaces for all tables on cawikiquote
  • 21:28 RoanKattouw: Running Flow/convertNamespaceFromWikitext.php on all discussion namespaces on cawikiquote (T165497)
  • 21:21 mobrovac@tin: Finished deploy [zotero/translators@f051fe7]: Translators update for T95128 and T166292 (duration: 00m 05s)
  • 21:21 mobrovac@tin: Started deploy [zotero/translators@f051fe7]: Translators update for T95128 and T166292
  • 20:36 AaronSchulz: Set all wikis to wmf.2 via wikiversions.php on mwdebug1001 only; manual nutcracker running in a screen to use local memcached for debugging
  • 20:18 mutante: LDAP - added uid=herron to groups "ops" and "wmf" for ops onboarding of Keith (T166587)
  • 20:09 gilles: Restarting nutcracker on mwdebug1001
  • 20:06 gilles: Overwriting nutcracker.yml on mwdebug1001 to point memcache cluster only to memcached on localhost
  • 20:05 gilles: Manually installed memcached on mwdebug1001, running on default port 11211
  • 20:04 gilles: Disabled puppet on mwdebug1001
  • 18:37 urandom: T160570: Upgrading dev env to Cassandra 3.11 (snapshot)
  • 17:55 thcipriani: branching 1.30.0-wmf.3 T165957
  • 17:28 arlolra: Updated Parsoid to d07dfe1a (T161151, T136653)
  • 17:17 arlolra@tin: Finished deploy [parsoid/deploy@744f719]: Updating Parsoid to d07dfe1a (duration: 08m 41s)
  • 17:09 arlolra@tin: Started deploy [parsoid/deploy@744f719]: Updating Parsoid to d07dfe1a
  • 16:40 moritzm: installing shadow regression update
  • 15:33 marostegui: Deploy alter table on s3.revision on labsdb1009 - T166278
  • 15:14 moritzm: installing bash security updates on trusty (jessie already fixed)
  • 15:03 moritzm: installing mysql-connector-java security update on analytics1031
  • 14:53 _joe_: failing citoid over to codfw, T165105
  • 14:48 moritzm: updating mw2140-mw2147, mw2251-mw2253 to HHVM 3.18
  • 14:27 _joe_: restarting squid on aluminium.
  • 13:58 moritzm: updating mw2240-mw2242, mw2254-mw2260 to HHVM 3.18
  • 13:47 aude@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgPageImagesAPIDefaultLicense for wikidata (duration: 00m 41s)
  • 13:44 elukey: restart kafka on kafka1013 for jvm upgrades
  • 13:35 aude@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wikibase echo notifications on Wikipedia, except enwiki, dewiki, frwiki T142102 (duration: 00m 42s)
  • 13:21 elukey: restart kafka on kafka1001 for jvm upgrades
  • 13:14 ema: upgrade prometheus-node-exporter to 0.14.0~git20170523-0 on ubuntu systems
  • 12:43 elukey: restart kafka on kafka200[123] for jvm upgrades (main-codfw, eventbus)
  • 12:10 moritzm: installing jbig2dec security updates
  • 12:07 elukey: restart kafka on kafka1012 for jvm upgrades
  • 12:01 moritzm: installing jbig2dec security updates
  • 11:48 marostegui: Rename update table on enwiki on db1089 host - T139342
  • 11:31 moritzm: installing fop security updates
  • 11:14 godog: upgrade grafana to 4.3.1 on krypton
  • 10:44 gilles: run refreshFileHeaders for group 0 wikis on Terbium
  • 10:32 akosiaris: enable calico IPv6 BGP peering for cr1-eqiad
  • 10:18 jynus: stopping and backing up db2048 in preparation for reimage
  • 09:50 ema: upgrade prometheus-node-exporter to 0.14.0~git20170523-0 on debian systems
  • 09:43 jynus: restarting db2055 for mariadb and kernel upgrade
  • 08:23 elukey: restart jmxtrans on all the kafka brokers (analytics+main-codfw/eqiad) for jvm upgrades
  • 08:17 elukey: restart kafka on kafka1018 for jvm upgrades
  • 07:38 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1002.eqiad.wmnet
  • 07:38 gehel: wdqs1002 back in LVS - T166524
  • 07:09 marostegui: Deploy alter table on enwiki.revision on db1047 - T166452
  • 06:45 marostegui: Deploy alter table on s3 db1038 - T166278
  • 06:41 marostegui: Deploy alter table on s3 dbstore1002 - https://phabricator.wikimedia.org/T166278
  • 06:35 marostegui: Deploy alter table s4 - db1081 - https://phabricator.wikimedia.org/T166206
  • 06:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084, depool db1081 - T166206 (duration: 00m 59s)
  • 06:23 marostegui: Deploy alter table on s3 dbstore2001 - T166278
  • 02:49 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 30 02:49:20 UTC 2017 (duration 6m 44s)
  • 02:42 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 54s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 08m 22s)

2017-05-29

  • 20:04 mobrovac@tin: Started restart [zotero/translation-server@50f216a]: Memory at 50%
  • 19:56 gehel: removing wdqs1002 from LVS pending investigation of T166524
  • 19:55 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1002.eqiad.wmnet
  • 18:57 gehel: restarting wdqs-updater on wdqs1002
  • 17:40 volans: re-enabled puppet on tegmen and re-enabled raid_handler T163998
  • 17:29 volans: disabled puppet on tegmen and disabled raid_handler temporarily T163998
  • 15:02 gehel: restarting wdqs-updater on wdqs1002
  • 14:33 moritzm: rebooting multatuli for systemd modules-load.d debugging
  • 14:24 godog: upgrade prometheus-hhvm-exporter to 0.3-1 in codfw/eqiad with less verbose logging - T158286
  • 14:15 gehel: reset remote for elasticsearch/plugins deployment - T163708
  • 14:14 marostegui: Stop MySQL labsdb1009 to take a backup - T153743
  • 14:04 gehel: starting upgrade to elasticsearch 5.3.2 on cirrus codfw cluster - T163708
  • 14:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2036 - T166278 (duration: 00m 41s)
  • 14:01 marostegui: Deploy alter table s3 on codfw master db2018 - T166278
  • 13:42 moritzm: updating gdb on mw* servers
  • 13:10 marostegui: Stop replication on db1070 to flush tables for export - T153743
  • 13:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T153743 (duration: 00m 41s)
  • 13:02 akosiaris: enable puppet across eqiad/esams after puppetmaster upgrade.
  • 12:52 akosiaris: disable puppet across eqiad/esams for puppetmaster upgrade. This should avoid any irc spam about failed puppet agent runs
  • 12:52 akosiaris: enable puppet across codfw/ulsfo after puppetmaster upgrade
  • 12:41 akosiaris: disable puppet across codfw/ulsfo for puppetmaster upgrade. This should avoid any irc spam about failed puppet agent runs
  • 12:36 moritzm: installing imagemagick security updates on jessie
  • 12:31 akosiaris: update kubernetes policy-options on cr{1,2}-{eqiad,codfw}. T165732
  • 10:39 moritzm: installing fop security updates
  • 10:18 ema: upgrade nginx to 1.11.10-1+wmf1 on hassium and hassaleh
  • 09:53 moritzm: upgrade remaining mw* hosts already running HHVM 3.18 to 3.18.2+dfsg-1+wmf4
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 (duration: 00m 41s)
  • 09:01 marostegui: Drop gather tables from: testwiki, test2wiki, enwikivoyage, hewiki, enwiki - T166097
  • 08:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1023 - T166486 (duration: 00m 41s)
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1023 - T166486 (duration: 00m 42s)
  • 07:38 marostegui: Stop MySQL on db1095 to take a backup - this will make labsdb1009,10 and 11 break replication while it is down - T153743
  • 07:01 _joe_: reenabling scap on mw2140, T166328
  • 06:45 _joe_: restarting changeprop on scb1002, using 15 gigs of RAM
  • 06:42 marostegui: Deploy alter table s3 - dbstore2002 - T166278
  • 06:41 marostegui: Deploy alter table s4 - dbstore1002 - T166206
  • 06:33 _joe_: trying to restart pdfrender on scb1002
  • 06:32 marostegui: Deploy alter table s3 - db2036 - T166278
  • 06:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2043, depool db2036 - T166278 (duration: 01m 44s)
  • 06:29 _joe_: powercycling mw1294
  • 06:11 marostegui: Deploy alter table on s4 db1084 - T166206
  • 06:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091, depool db1084 - T166206 (duration: 02m 45s)
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T166206 (duration: 03m 01s)
  • 05:54 marostegui: Restart MySQL on db1047 - T166452
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 08m 20s)

2017-05-28

  • 13:19 jynus: restart db1069:3313 mysql instance, stuck on replication
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 08m 46s)

2017-05-27

  • 02:51 l10nupdate@tin: ResourceLoader cache refresh completed at Sat May 27 02:51:13 UTC 2017 (duration 6m 49s)
  • 02:44 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 05s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 08m 41s)

2017-05-26

  • 14:29 marostegui: Stop pt-table-checksum on s1 - T162807
  • 14:04 marostegui: Deploy alter table on s3 revision table db2043 - T166278
  • 14:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2050, depool db2043 - T166278 (duration: 00m 41s)
  • 13:57 _joe_: consuming the backlog of htmlCacheUpdate jobs for enwiktionary
  • 13:19 gehel: restart wdqs-updater on all wdqs nodes - T166378
  • 12:55 marostegui: Deploy alter table s4 on db1097 - T166206
  • 12:44 elukey: Restart Hadoop daemons on analytics100[12] (Hadoop master nodes) for jvm upgrades
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 - T166206 (duration: 00m 41s)
  • 10:56 gehel: restart wdqs-updater on all wdqs nodes - T166378
  • 09:30 volans: slowly testing if puppet stringify_facts=false is a noop across the fleet T166372
  • 08:45 volans: killed daemonized puppet on tegmen, lvs1006 T166203
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097 - T166206 (duration: 00m 40s)
  • 06:10 marostegui: Deploy alter table on s3 - db2050 - T166278
  • 06:09 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2057, depool db2050 - T166278 (duration: 00m 56s)
  • 06:05 marostegui: Resume pt-table-checksum on s1 - T162807
  • 02:58 l10nupdate@tin: ResourceLoader cache refresh completed at Fri May 26 02:58:48 UTC 2017 (duration 6m 37s)
  • 02:52 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 59s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 08m 31s)
  • 01:33 bsitzmann@tin: Finished deploy [mobileapps/deploy@a8d0c91]: Update mobileapps to db6493c (duration: 03m 45s)
  • 01:29 bsitzmann@tin: Started deploy [mobileapps/deploy@a8d0c91]: Update mobileapps to db6493c
  • 00:16 thcipriani@tin: Finished scap: SWAT: Fix version of DonationInterface deployed to donatewiki T166302 (duration: 19m 44s)

2017-05-25

  • 23:56 thcipriani@tin: Started scap: SWAT: Fix version of DonationInterface deployed to donatewiki T166302
  • 23:44 thcipriani@tin: Synchronized php-1.30.0-wmf.2/resources/src/jquery/jquery.makeCollapsible.js: SWAT: jquery.makeCollapsible: Restore considering empty <a> as part of toggle T166298 (duration: 00m 42s)
  • 23:20 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Revert "Revert "Add Code of Conduct footer links to wikitech and mw.o"" PART II (duration: 00m 41s)
  • 23:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Revert "Add Code of Conduct footer links to wikitech and mw.o"" PART I (duration: 00m 43s)
  • 22:58 thcipriani: mw1170 wikipedias back to 1.30.0-wmf.1
  • 22:26 thcipriani: mw1170 running wmf.2 for all wikis for troubleshooting T166345
  • 22:24 thcipriani: mw1161 wikipedias back to running wmf.1
  • 22:20 thcipriani: mw1161 running wmf.2 for all wikis for troubleshooting T166345
  • 22:17 papaul: ores200[1-9] - signing puppet certs, salt-key, initial run
  • 21:43 papaul: OS install on ores200[1-9]
  • 21:31 arlolra: Updated Parsoid to 5b52d07b (T166068)
  • 21:25 arlolra@tin: Finished deploy [parsoid/deploy@4a2c3f4]: Updating Parsoid to 5b52d07b (duration: 07m 43s)
  • 21:18 arlolra@tin: Started deploy [parsoid/deploy@4a2c3f4]: Updating Parsoid to 5b52d07b
  • 20:30 urandom: T164865: RESTBase dev, disable revision range deletes
  • 20:25 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: wikipedias back to 1.30.0-wmf.1
  • 19:48 chasemp: restart redises on rdb2003
  • 19:44 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: revert SWAT: Add Code of Conduct footer links to wikitech and mw.o Part II (duration: 00m 38s)
  • 19:43 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: revert SWAT: Add Code of Conduct footer links to wikitech and mw.o Part I (duration: 00m 39s)
  • 19:23 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add Code of Conduct footer links to wikitech and mw.o Part II (duration: 00m 39s)
  • 19:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Code of Conduct footer links to wikitech and mw.o Part I (duration: 00m 39s)
  • 19:09 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.2
  • 18:47 volans: completed upgrade of facter across the fleet T166203 (apart from a few hosts that are down)
  • 18:39 volans: forcing BBU learn on db1016
  • 18:34 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Remove special Math extension settings for hewiki Remove UseMathJax from CommonSettings.php T165475 (duration: 00m 43s)
  • 18:27 urandom: T164865: RESTBase dev, re-enable render range deletes
  • 18:12 thcipriani: mwscript namespaceDupes.php hewiki --fix
  • 18:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespace aliases for Hebrew Wikipedia T164858 (duration: 00m 47s)
  • 17:51 volans@sarin: conftool action : set/pooled=inactive; selector: name=mw2140.codfw.wmnet
  • 17:31 jynus@neodymium: conftool action : set/pooled=no; selector: name=mw2140.codfw.wmnet
  • 17:30 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 after maintenance, 2nd try (duration: 02m 42s)
  • 17:25 jynus@neodymium: conftool action : set/pooled=inactive; selector: name=mw2140.codfw.wmnet
  • 17:14 bsitzmann@tin: Finished deploy [mobileapps/deploy@614d752]: Update mobileapps to 946fe1f (duration: 04m 04s)
  • 17:12 jynus: powercycling mw2140
  • 17:10 bsitzmann@tin: Started deploy [mobileapps/deploy@614d752]: Update mobileapps to 946fe1f
  • 17:07 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 after maintenance (duration: 02m 43s)
  • 16:27 urandom: T164865: RESTBase dev, re-enable revision range deletes
  • 15:43 godog: delete thumbnails with > 2000px for wikivoyage / wikiversity / wikisource / wikiquote - T162796
  • 15:28 jynus: restarting and upgrading db2055 for kernel downgrade
  • 14:40 bblack: restart cp1074 backend (mailbox)
  • 14:08 godog: shut ms-be1021 for BBU replacement - T163777
  • 13:39 jynus: restarting and upgrading db2055 for maintenance
  • 13:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2055 for maintenance (duration: 00m 41s)
  • 13:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 back to high load after maintenance (duration: 00m 41s)
  • 13:04 elukey: restart cassandra-a on aqs1004 to test https://gerrit.wikimedia.org/r/354107
  • 12:41 akosiaris: cordon kubernetes100{2,3,4} for testing calico-node on kubernetes1001
  • 10:01 elukey: restart HDFS datanode daemons on all the hadoop worker nodes for jvm upgrades
  • 09:39 elukey: reimage analytics1030 to Debian Jessie - T165529
  • 09:35 elukey: restart Yarn nodemanager daemons on all the hadoop worker nodes for jvm upgrades
  • 09:28 godog: ban commons object on request in ulsfo
  • 09:07 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 after maintenance with low weight (duration: 00m 41s)
  • 08:25 jynus: stopping and restarting db1077
  • 08:03 volans: resuming slow upgrade of facter across the fleet checking is a noop T166203
  • 07:58 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for maintenance and upgrade (duration: 00m 41s)
  • 07:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2057 - T166278 (duration: 00m 41s)
  • 07:28 godog: roll-restart jessie ms-be2* for linux 4.9 update - T162029
  • 06:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 - T166206 (duration: 00m 55s)
  • 05:58 marostegui: Start pt-table-checksum on s1 - T162807
  • 02:51 l10nupdate@tin: ResourceLoader cache refresh completed at Thu May 25 02:51:53 UTC 2017 (duration 6m 38s)
  • 02:45 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 07m 18s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 09m 46s)
  • 00:18 aaron@tin: Synchronized wmf-config/ProductionServices.php: Enable HTTPs for Swift usage (duration: 00m 41s)
  • 00:15 aaron@tin: Synchronized wmf-config/filebackend.php: Enable HTTPs for Swift usage (duration: 00m 41s)
  • 00:10 twentyafterfour: phabricator upgrade complete, service is online
  • 00:06 twentyafterfour: upgrading phabricator, expect momentary downtime

2017-05-24

  • 23:53 ejegg: updated payments-wiki from 5fa4a70 to 4786e7c
  • 23:49 XenoRyet: updated civicrm from 9b7a74c to 9c06bd2
  • 23:28 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Allow page images outside the lead on Wikivoyage wikis (T166251) (duration: 00m 41s)
  • 23:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable related pages for everyone (T155079) (duration: 00m 42s)
  • 23:18 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable print styles in Minerva (T163287) (duration: 00m 42s)
  • 23:10 catrope@tin: Synchronized multiversion/MWMultiVersion.php: Allow absolute script path for getMediaWikiCli() (duration: 00m 44s)
  • 22:33 krinkle@tin: Synchronized php-1.30.0-wmf.2/extensions/wikihiero: Fix styles queue warning - T92459 (duration: 00m 42s)
  • 22:02 mutante: terbium: dbtree: git stash and git pull origin to fix unclean repo state, deploy fix to syntax error
  • 21:53 urandom: T164865: Disabling range delete-based render culling, dev env
  • 21:34 Dereckson: Run fixProofreadIndexPagesContentModel.php new version (with Gerrit:355534 fix) to every wikisource
  • 21:10 Dereckson: Fixed wikisource Index: content model for ta.wikisource, en.wikisource and not wikisource databases (frrwiki + test2 + sourceswiki)
  • 21:10 demon@tin: Synchronized php-1.30.0-wmf.2/extensions/ProofreadPage/maintenance/fixProofreadIndexPagesContentModel.php: Now with proper batch support (duration: 00m 41s)
  • 20:38 demon@tin: Synchronized scap/plugins/clean.py: cleanups (duration: 00m 41s)
  • 20:29 Dereckson: Run fixProofreadIndexPagesContentModel on vec.wikisource (requested by Tpt), aborted after 50k (as that's greater than the expected number of rows)
  • 20:08 ejegg: reverted payments-wiki to 5fa4a70
  • 20:04 ejegg: updated payments-wiki from 5fa4a70 to 4786e7c
  • 19:25 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.2
  • 19:20 otto@tin: Finished deploy [eventlogging/analytics@c90a609]: (no justification provided) (duration: 00m 01s)
  • 19:20 otto@tin: Started deploy [eventlogging/analytics@c90a609]: (no justification provided)
  • 19:14 otto@tin: Finished deploy [eventlogging/analytics@c90a609]: (no justification provided) (duration: 00m 02s)
  • 19:14 otto@tin: Started deploy [eventlogging/analytics@c90a609]: (no justification provided)
  • 19:12 otto@tin: Finished deploy [eventlogging/analytics@c90a609]: (no justification provided) (duration: 00m 01s)
  • 19:12 otto@tin: Started deploy [eventlogging/analytics@c90a609]: (no justification provided)
  • 19:12 otto@tin: Finished deploy [eventlogging/analytics@c90a609]: (no justification provided) (duration: 00m 02s)
  • 19:12 otto@tin: Started deploy [eventlogging/analytics@c90a609]: (no justification provided)
  • 19:12 demon@tin: Synchronized wmf-config/: Dropping old ExtensionMessages (duration: 00m 42s)
  • 19:11 otto@tin: Finished deploy [eventlogging/analytics@c90a609]: (no justification provided) (duration: 00m 02s)
  • 19:11 otto@tin: Started deploy [eventlogging/analytics@c90a609]: (no justification provided)
  • 19:11 otto@tin: Finished deploy [eventlogging/analytics@c90a609]: (no justification provided) (duration: 00m 02s)
  • 19:11 otto@tin: Started deploy [eventlogging/analytics@c90a609]: (no justification provided)
  • 19:07 demon@tin: Synchronized wmf-config/: Dropping old contribution-tracking-setup.php -- finally (duration: 00m 42s)
  • 19:03 demon@tin: Synchronized wmf-config/CommonSettings.php: Dropping old ContribTracking config (duration: 00m 41s)
  • 19:02 demon@tin: Synchronized .gitignore: Completeness (duration: 00m 41s)
  • 19:00 thcipriani@tin: Finished scap: SWAT: Use file width/height instead of metadata for getContentHeaders Batch/pipeline backend operations in refreshFileHeaders T150741 (duration: 03m 12s)
  • 18:57 thcipriani@tin: Started scap: SWAT: Use file width/height instead of metadata for getContentHeaders Batch/pipeline backend operations in refreshFileHeaders T150741
  • 18:56 thcipriani@tin: Synchronized php-1.30.0-wmf.2/extensions/TimedMediaHandler/handlers: SWAT: Make getContentHeaders rely on fallback width/height T150741 (duration: 00m 41s)
  • 18:55 thcipriani@tin: Synchronized php-1.30.0-wmf.2/extensions/PagedTiffHandler/PagedTiffHandler_body.php: SWAT: Update getContentHeaders signature T150741 (duration: 00m 42s)
  • 18:54 thcipriani@tin: Synchronized php-1.30.0-wmf.2/extensions/PdfHandler/PdfHandler_body.php: SWAT: Update getContentHeaders signature T150741 (duration: 00m 40s)
  • 18:31 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: mobileFrontend: Move first paragraph before infobox T150325 (duration: 00m 41s)
  • 18:18 thcipriani: running mwscript namespaceDupes.php trwiki --fix
  • 18:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create a new namespace "Vikiproje" for trwiki T166102 (duration: 00m 41s)
  • 18:10 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgUploadNavigationUrl on srwiki T165901 (duration: 00m 42s)
  • 18:00 urandom: T164865: Upgrading Cassandra from 3.7.3-instaclustr to 3.10
  • 17:45 ottomata: rolling druid back to 0.9.0
  • 16:58 moritzm: installing ghostscript regression update on trusty (jessie security update was not affected)
  • 16:56 jynus: restarting and upgrading db2047
  • 16:54 volans: pause slowly upgrading facter across the fleet, resuming tomorrow T166203
  • 16:37 marostegui: Stop pt-table-checksum on s1 - T162807
  • 16:26 bblack: restarting varnish backend on cp1099 (mailbox lag)
  • 15:45 godog: test-upgrade grafana 4.3.1 on labmon1001
  • 15:35 krinkle@tin: Synchronized php-1.30.0-wmf.2/resources/Resources.php: Restore mediawiki.page.watch.ajax dependency - Iebfda85c7 (duration: 00m 42s)
  • 15:00 godog: deploy thumbor 0.1.39 for memcache-based throttling - T151065
  • 14:54 moritzm: uploaded gerrit 2.13.8+wmf2 to apt.wikimedia.org
  • 14:04 moritzm: installing jasper security updates on trusty (jessie already fixed)
  • 13:59 marostegui: Start running pt-table-checksum on s1 (will not run over night for now) - T162807
  • 13:59 paravoid: cr2-esams: enabling netflows experimentally
  • 13:54 elukey: upgrade Druid daemons on druid100[123] to 0.10 - T164008
  • 13:28 volans: slowly upgrading facter across the fleet checking is a noop T166203
  • 13:14 godog: upload prometheus-hhvm-exporter 0.3-1 to jessie-wikimedia - T158286
  • 12:20 moritzm: upgrade application servers using HHVM 3.18 to the latest 3.18.2+wmf4 build
  • 12:09 moritzm: updating puppet on puppetmaster2002
  • 12:08 godog: bounce pybal on lvs1003 - T134893
  • 11:52 XioNoX: progressively adding "remove-private" to ix4/6 and transit4/6 bgp groups on cr2-esams T83037
  • 11:36 moritzm: uploaded puppet_3.8.5-2~bpo8+2 to apt.wikimedia.org
  • 10:50 akosiaris: repool esams T133387
  • 10:46 volans: stopped temporarily ircecho to avoid alert spam
  • 10:43 ema: upgrade prometheus-node-exporter on lvs hosts to 0.14.0~git20170523-0 T160156
  • 10:43 ema: upgrade prometheus-node-exporter on cache hosts to 0.14.0~git20170523-0 T160156
  • 10:05 volans: forcing puppet run on failed hosts only in esams T133387
  • 09:59 XioNoX: asw-esams back up (T133387)
  • 09:53 XioNoX: rebooting asw-esams for upgrade (T133387)
  • 09:49 ema: upgrade prometheus-node-exporter on cache hosts to 0.14.0~git20170523-0 T147569
  • 09:26 godog: upload prometheus-node-exporter 0.14.0~git20170523-0 to jessie-wikimedia - T160156
  • 09:15 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: cp3036.esams.wmnet
  • 09:10 akosiaris: drain esams for network tests for T133387
  • 08:52 marostegui: Deploy alter table on codfw master (db2019 and let it replicate) on s4 - T166206
  • 08:51 joal@tin: Finished deploy [analytics/refinery@9377d9c]: Deploying to fix yesterday's deploy bugs (duration: 02m 44s)
  • 08:49 akosiaris: depool cp3036 for T133387 testing
  • 08:49 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: cp3036.esams.wmnet
  • 08:48 joal@tin: Started deploy [analytics/refinery@9377d9c]: Deploying to fix yesterday's deploy bugs
  • 07:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T164530 (duration: 00m 41s)
  • 07:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086, depool db1094 - T164530 (duration: 00m 41s)
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079, depool db1086 - T164530 (duration: 00m 42s)
  • 06:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T164530 (duration: 00m 54s)
  • 06:34 marostegui: Deploy alter table on s2.fawiki directly on codfw master (db2029) after running the clean up duplicates script - https://phabricator.wikimedia.org/T164530
  • 06:04 marostegui: Run pt-table-checksum on s7.frwiktionary - https://phabricator.wikimedia.org/T163190
  • 06:02 marostegui: Deploy alter table on s2 db1047 - https://phabricator.wikimedia.org/T162611
  • 03:04 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 24 03:04:03 UTC 2017 (duration 6m 45s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.2) (duration: 13m 38s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 30s)

2017-05-23

  • 21:00 mepps: deployed payment wiki 0c06f8e
  • 20:24 bblack: enable BBR for all caches - T147569
  • 20:20 bblack: enable BBR for all caches @ codfw - T147569
  • 20:10 bblack: enable BBR for all caches @ ulsfo - T147569
  • 20:06 bblack: disabling puppet on all caches for BBR deploy control
  • 19:52 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.30.0-wmf.2
  • 19:34 thcipriani@tin: Finished scap: testwiki to php-1.30.0-wmf.2 and rebuild l10n cache (duration: 27m 52s)
  • 19:07 bblack: resetting cp1074 queues again: "fq flow_limit 200 buckets 10240"
  • 19:06 thcipriani@tin: Started scap: testwiki to php-1.30.0-wmf.2 and rebuild l10n cache
  • 18:43 bblack: resetting cp1074 queues again: "fq flow_limit 200 buckets 4096"
  • 17:40 bblack: fq on cp1074 reset to flow_limit 200 (resets counters)
  • 17:24 ladsgroup@tin: Finished deploy [ores/deploy@4874809]: Trying again with deploying ores (duration: 21m 30s)
  • 17:09 thcipriani: starting branch cut for 1.30.0-wmf.2 T163512
  • 17:03 ladsgroup@tin: Started deploy [ores/deploy@4874809]: Trying again with deploying ores
  • 16:50 volans: upgrading facter on mw[2250-2259] as a test batch
  • 16:49 bblack: BBR: enabling bbr on cp1074 - T147569
  • 16:43 bblack: BBR: enabling mq+fq on cp1074 - T147569
  • 16:26 bblack: puppet re-enabled on caches
  • 16:24 demon@tin: Synchronized README: testing (duration: 00m 38s)
  • 16:17 bblack: disabled puppet on all cp* for RPS-related deployments (just in case!)
  • 16:16 bblack: disabled puppet on all lvs* for RPS-related deployments
  • 16:15 ema: cp1074: enable prometheus node_exporter qdisc collector T147569
  • 15:50 marostegui: Stop replication on dbstore1002 s7 thread for maintenance - T163190
  • 15:23 volans: re-enabled raid_handler and puppet on tegmen
  • 15:02 otto@tin: Finished deploy [eventlogging/analytics@UNKNOWN]: (no justification provided) (duration: 00m 02s)
  • 15:01 otto@tin: Started deploy [eventlogging/analytics@UNKNOWN]: (no justification provided)
  • 14:56 otto@tin: Finished deploy [eventlogging/analytics@25f8096]: (no justification provided) (duration: 00m 04s)
  • 14:56 otto@tin: Started deploy [eventlogging/analytics@25f8096]: (no justification provided)
  • 14:42 volans: temporarily disabled raid_handler and puppet on tegmen
  • 14:25 jynus: deploying new check_raid monitoring write policy for megacli T166108
  • 14:21 Dereckson: EU SWAT done
  • 14:21 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage on dty.wikipedia (T166121) (duration: 00m 38s)
  • 14:09 XioNoX: re-enabling BGP session to Init7 - T165288
  • 14:03 moritzm: installing nutcracker update in codfw (T163795)
  • 13:37 marostegui: Run CleanDuplicateScores script to clean up possible duplicates on fawiki before starting to create the UNIQUE keys - https://phabricator.wikimedia.org/T164530
  • 13:23 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add *.esa.int to CopyUploadsDomains (T164643) (duration: 00m 39s)
  • 12:47 elukey@tin: Finished deploy [analytics/refinery@679aeea]: Updated stat1002 with the last refinery deployment (duration: 00m 42s)
  • 12:46 elukey@tin: Started deploy [analytics/refinery@679aeea]: Updated stat1002 with the last refinery deployment
  • 12:46 elukey@tin: Finished deploy [analytics/refinery@679aeea]: (no justification provided) (duration: 00m 01s)
  • 12:45 elukey@tin: Started deploy [analytics/refinery@679aeea]: (no justification provided)
  • 12:39 joal@tin: Finished deploy [analytics/refinery@679aeea]: Weekly deploy (2 weeks late, big deploy)-2 (duration: 01m 35s)
  • 12:38 joal@tin: Started deploy [analytics/refinery@679aeea]: Weekly deploy (2 weeks late, big deploy)-2
  • 12:24 joal@tin: Finished deploy [analytics/refinery@679aeea]: Weekly deploy (with 2 weeks late, big deploy) (duration: 04m 24s)
  • 12:20 moritzm: upgrading mw1261-mw1265 to hhvm 3.18.2+dfsg-1+wmf4
  • 12:20 joal@tin: Started deploy [analytics/refinery@679aeea]: Weekly deploy (with 2 weeks late, big deploy)
  • 12:13 joal@tin: Finished deploy [analytics/refinery@222d0c0]: (no justification provided) (duration: 03m 56s)
  • 12:09 joal@tin: Started deploy [analytics/refinery@222d0c0]: (no justification provided)
  • 12:09 moritzm: uploaded hhvm 3.18.2+dfsg-1+wmf4 to apt.wikimedia.org (contains extended upstream fix for XML reader crash) (T162586)
  • 11:56 elukey: set vm.dirty_background_bytes=25165824 on aqs1004 as part of testing for https://gerrit.wikimedia.org/r/#/c/354107 (Rollback: set vm.dirty_background_ratio=10)
  • 11:51 _joe_: uploaded calico-cni 1.8.3-1~wmf1 to jessie-wikimedia
  • 11:51 _joe_: uploaded calicoctl 1.2.0-1~wmf1 to jessie-wikimedia
  • 11:44 _joe_: pushed calico/node:1.2.0 to the docker registry
  • 11:42 _joe_: pushed calico/kube-policy-controller:0.6.0 to the docker registry
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T164530 (duration: 00m 38s)
  • 11:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088, depool db1093 - T164530 (duration: 00m 38s)
  • 10:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085, depool db1088 - T164530 (duration: 00m 38s)
  • 10:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 - T164530 (duration: 00m 38s)
  • 10:27 godog: upload kafkatee 0.1.5 to jessie-wikimedia, remove unused kafkatee 0.1.4 from trusty-wikimedia - T149451
  • 10:14 marostegui: Run pt-table-checksum on s7.frwiktionary - T165743
  • 09:56 moritzm: restarting cassandra on restbase1013, restbase1014, restbase1015, restbase1017 to pick up Java security updates
  • 09:49 godog: swift eqiad-prod: ms-be1028/ms-be1039 object weight 3500 - T160640
  • 09:46 addshore: addshore@terbium:~$ ~/mymwscriptwikiset extensions/Cognate/maintenance/purgeDeletedCognatePages.php et+wiktionary.dblist --batch-size=1000 >> ~/purge.201705161230.log T164407
  • 09:24 moritzm: restarting cassandra on restbase1007, restbase1009, restbase1012 to pick up Java security updates
  • 09:16 hashar: Restarting Jenkins on contint1001
  • 09:15 elukey: reverted manual hack on mw1161 with scap pull
  • 08:15 elukey: apply manually https://gerrit.wikimedia.org/r/#/c/351854/2/wmf-config/jobqueue.php (persistent connections between hhvm and redis) to mw1161 as production test
  • 08:13 marostegui: Force WB as a default policy on db1031 because of degraded BBU
  • 08:00 addshore: the last script I started is now stopped
  • 07:48 addshore: addshore@terbium:~$ ~/mymwscriptwikiset extensions/Cognate/maintenance/purgeDeletedCognatePages.php et+wiktionary.dblist --batch-size=1000 >> ~/purge.201705161230.log T164407
  • 07:25 moritzm: installing openjdk security updates on maps and wdqs clusters
  • 07:13 marostegui: Deploy schema change on ruwiki.ores_classification directly on codfw master (db2028) - T164530
  • 07:07 marostegui: Rename gather_list gather_list_flag gather_list_item on db1078 db1094 and db1089 - T166097
  • 06:29 marostegui: Deploy alter table on s7.frwiktionary db2040 and db1034 - T165743
  • 06:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1021 - T162611 (duration: 00m 38s)
  • 06:20 marostegui: Deploy alter table on s2 eqiad master db1054 - T162611
  • 02:29 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 23 02:29:17 UTC 2017 (duration 6m 16s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 25s)

2017-05-22

  • 23:51 aaron@tin: Synchronized wmf-config: Move swift auth URL to ProductionServices (duration: 00m 52s)
  • 23:49 aaron@tin: Synchronized static/images/project-logos/hywiki-2x.png: Fix hy.wikipedia high resolution logos (duration: 00m 38s)
  • 23:48 aaron@tin: Synchronized static/images/project-logos/hywiki-1.5x.png: Fix hy.wikipedia high resolution logos (duration: 00m 38s)
  • 23:34 demon@tin: Synchronized wmf-config/ProductionServices.php: I4b19b4 (duration: 00m 38s)
  • 23:33 demon@tin: Synchronized wmf-config/filebackend.php: I4b19b4 (duration: 00m 38s)
  • 23:20 aaron@tin: Synchronized wmf-config/filebackend.php: Move swift auth URL to ProductionServices (duration: 00m 38s)
  • 23:19 aaron@tin: Synchronized wmf-config/ProductionServices.php: Move swift auth URL to ProductionServices (duration: 00m 38s)
  • 23:15 aaron@tin: Synchronized wmf-config/logging.php: Include DB shard in production SPI log entries (duration: 00m 38s)
  • 21:11 bblack: BBR: cp1065: reverted back to cubic+pfifo_fast - T147569
  • 21:10 bblack: BBR: cp1074: reverted back to cubic+pfifo_fast - T147569
  • 20:56 ladsgroup@tin: Finished deploy [ores/deploy@4874809]: Second deploy of ores for enabling frwiki damaging (duration: 05m 23s)
  • 20:50 ladsgroup@tin: Started deploy [ores/deploy@4874809]: Second deploy of ores for enabling frwiki damaging
  • 20:46 arlolra: Updated Parsoid to ebac1890 (T165139)
  • 20:43 ladsgroup@tin: Finished deploy [ores/deploy@263255a]: (no justification provided) (duration: 29m 07s)
  • 20:40 arlolra@tin: Finished deploy [parsoid/deploy@a9f2229]: Updating Parsoid to ebac1890 (duration: 07m 54s)
  • 20:32 arlolra@tin: Started deploy [parsoid/deploy@a9f2229]: Updating Parsoid to ebac1890
  • 20:14 ladsgroup@tin: Started deploy [ores/deploy@263255a]: (no justification provided)
  • 20:14 Amir1: starting deploy of ores:68cca85 to prod
  • 19:30 bblack: BBR: cp1074: switching congestion control to bbr manually - T147569
  • 19:29 bblack: BBR: cp1074: switching qdisc to mq+fq manually - T147569
  • 19:25 bblack: BBR: cp1065: switching congestion control to bbr manually - T147569
  • 19:16 bblack: BBR: cp1065: switching qdisc to mq+fq manually - T147569
  • 18:57 demon@tin: Synchronized README: forcing co-master sync (duration: 00m 42s)
  • 18:56 demon@tin: Pruned MediaWiki: 1.29.0-wmf.20 (duration: 01m 21s)
  • 18:22 ejegg: updated payments-wiki from 3b84521 to 5fa4a70
  • 18:18 bblack: rebooting acamar
  • 18:06 mepps: updated thank you send drush command
  • 18:01 mepps: updated civicrm 9b7a74c
  • 18:00 mepps: updated process control for new thank you send drush command 7c9572b
  • 17:49 ejegg: turned off paypal audit parser
  • 16:06 akosiaris: re-enable notifications in icinga
  • 15:27 _joe_: restarted puppetmasters in codfw
  • 13:23 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: Use wikitech db group instead of labswiki+ labtestwiki (duration: 00m 39s)
  • 13:22 akosiaris: silence icinga
  • 13:17 dcausse@tin: Synchronized wmf-config/CommonSettings.php: Enable TimedMediaHandler's new video player Beta Feature in Labs (duration: 00m 43s)
  • 13:02 _joe_: restarted etcdmirror on conf1002, consequence of https://gerrit.wikimedia.org/r/354095
  • 09:59 moritzm: repooled mw2221 (was down for hardware error)
  • 09:37 marostegui: Deploy alter table s7.frwiktionary on db1039 - https://phabricator.wikimedia.org/T165743
  • 09:15 marostegui: Drop table MediaWikiInstallPingback_15732959 from db1046, db1047 and dbstore1002 - T165836
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026, depool db1045 - T164530 (duration: 00m 39s)
  • 08:55 marostegui: Restart mysql on db1069 to apply new replication filters - T165977
  • 08:50 marostegui: Restart mysql on db1095 to apply new replication filters - T165977
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 - T164530 (duration: 00m 38s)
  • 08:02 marostegui: Deploy alter table on s2 (revision table) db1021 - https://phabricator.wikimedia.org/T162611
  • 08:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1021 - T162611 (duration: 00m 38s)
  • 07:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2035 - T162611 (duration: 00m 38s)
  • 07:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db2035 - T162611 (duration: 00m 38s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 - T162611 (duration: 00m 39s)
  • 07:22 moritzm: installing openjdk-7 security updates on jessie
  • 07:14 marostegui: Deploy alter table on s5 wikidatawiki.ores_classification directly on codfw master - T164530
  • 07:07 marostegui: Run CleanDuplicateScores script to clean up possible duplicates on wikidatawiki before starting to create the UNIQUE keys - T164530
  • 06:56 marostegui: Deploy alter table s7.frwiktionary on dbstore1001 - https://phabricator.wikimedia.org/T165743
  • 06:53 marostegui: Deploy alter table s7.frwiktionary on db2029 (codfw master) - https://phabricator.wikimedia.org/T165743
  • 06:47 marostegui: Deploy alter table on db2035 and db1036 for s2. bgwiktionary,eowiki, idwiki - T162611
  • 06:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2035 - T162611 (duration: 00m 38s)
  • 06:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T162611 (duration: 00m 39s)
  • 06:02 smalyshev@tin: Finished deploy [wdqs/wdqs@e4301da]: Redeploy GUI due to breakage in T165228 (duration: 01m 50s)
  • 06:00 smalyshev@tin: Started deploy [wdqs/wdqs@e4301da]: Redeploy GUI due to breakage in T165228
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Mon May 22 02:26:59 UTC 2017 (duration 6m 0s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 18s)

2017-05-21

  • 09:42 Reedy: force ran puppet on deployment-tin to pickup dbname in wmf-beta-update-database.py
  • 09:07 smalyshev@tin: Finished deploy [wdqs/wdqs@227ab25]: Redeploy GUI due to breakage in T165228 (duration: 00m 19s)
  • 09:06 smalyshev@tin: Started deploy [wdqs/wdqs@227ab25]: Redeploy GUI due to breakage in T165228
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Sun May 21 02:27:46 UTC 2017 (duration 6m 3s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 30s)

2017-05-20

  • 21:54 Dereckson: Run namespaceDupe on fr.wikisource and en.wikisource
  • 17:29 addshore: addshore@terbium:/srv/mediawiki/php-1.30.0-wmf.1$ mwscriptwikiset extensions/Cognate/maintenance/purgeDeletedCognatePages.php wiktionary.dblist --batch-size=1000 >> ~/purge.201705161230.log T164407
  • 17:29 addshore: addshore@terbium:/srv/mediawiki/php-1.30.0-wmf.1$ mwscriptwikiset extensions/Cognate/maintenance/purgeDeletedCognatePages.php wiktionary.dblist --batch-size=1000 >> ~/purge.201705161230.log
  • 09:08 thcipriani: restarting jenkins on contint1001
  • 08:24 smalyshev@tin: Finished deploy [wdqs/wdqs@227ab25]: Whitelist update (duration: 02m 32s)
  • 08:22 smalyshev@tin: Started deploy [wdqs/wdqs@227ab25]: Whitelist update
  • 07:52 gehel: restart wdqs-updater on all wdqs clusters (stuck on too large update)
  • 02:29 l10nupdate@tin: ResourceLoader cache refresh completed at Sat May 20 02:29:14 UTC 2017 (duration 6m 13s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 17s)

2017-05-19

  • 16:44 reedy@tin: Synchronized wmf-config/throttle.php: Wikimedia Vienna Hackathon (duration: 00m 39s)
  • 15:40 mutante: planet10001 - manually deleting cron job for deleted sr.planet (should puppetize the "absence" too)
  • 13:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 - T165743 (duration: 00m 38s)
  • 13:47 marostegui: Deploy alter table s7.frwiktionary db1033 - T165743
  • 13:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 - T165743 (duration: 00m 39s)
  • 13:09 moritzm: downgraded mw1161 to HHVM 3.12 (crashes often compared to app servers, downgrade over the weekend)
  • 12:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2068 - T165743 (duration: 00m 39s)
  • 12:40 marostegui: Deploy alter table s7.frwiktionary on db2068 - T165743
  • 12:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2068 - T165743 (duration: 00m 40s)
  • 11:39 marostegui: Deploy alter table s2.revision table on labsdb1003 - T162611
  • 11:05 moritzm: uploaded nutcracker 0.4.1-1+wm3~jessie1 to apt.wikimedia.org (T163795)
  • 10:31 ebernhardson: restarting elasticsearch on relforge1001 to pull in remote reindex
  • 10:19 moritzm: powercycling mw2221, stuck in reboot and serial console unresponsive
  • 10:08 _joe_: moved stale repos to /srv/deployment/STALE on tin, T129290
  • 10:07 moritzm: rebooting mw2220/mw2221 for update to Linux 4.9 / HHVM 3.18 / nutcracker tests
  • 09:15 reedy@tin: Synchronized dblists/: Update size dblists (duration: 00m 39s)
  • 09:01 reedy@tin: Synchronized php-1.30.0-wmf.1/extensions/WikimediaMaintenance/makeSizeDBLists.php: Catch a silly error (duration: 00m 39s)
  • 08:14 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2048 for reimage (duration: 00m 39s)
  • 07:36 akosiaris: reboot kubernetes2001 for tests
  • 06:51 moritzm: installing openjdk-7/trusty regression update
  • 06:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T159753 T164530 (duration: 00m 38s)
  • 06:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T159753 T164530 (duration: 00m 39s)
  • 06:09 marostegui: Deploy alter table s2.revision table - db1018 - https://phabricator.wikimedia.org/T162611
  • 06:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 - T162611 (duration: 00m 40s)
  • 05:56 jynus: shutting down db2049 and preparing it for reimage
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Fri May 19 02:28:08 UTC 2017 (duration 6m 0s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 53s)

2017-05-18

  • 20:47 mutante: wasat - git pull - bring to latest, the last change had never been deployed here like on terbium, but it's also not a backend for dbtree yet (T163141)
  • 20:44 mutante: terbium / dbtree - deploying gerrit:353388 (sudo -u mwdeploy git pull origin in /srv/dbtree) (T163143)
  • 20:03 urandom: T164865: restarting RESTBase-dev, range delete-based render retention
  • 19:52 urandom: T164865: restarting RESTBase-dev to apply range delete-based render retention
  • 19:06 urandom: T164865: configure RESTBase tables for size-tiered compaction (dev env only)
  • 18:37 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/SecurePoll/includes/pages/DumpPage.php: Revert "Dump should return decrypted votes" (T145695) (duration: 00m 48s)
  • 17:10 robh: mr1-ulsfo having oob connection re-routed at ulsfo, will flap a bit from 1700-1730 gmt
  • 17:09 moritzm: upgrading mw2130-mw2139 to Linux 4.9 and HHVM 3.18
  • 16:28 moritzm: restarting cassandra on restbase1010, restbase1011, restbase1016, restbase1018 to pick up OpenJDK security updates
  • 16:11 elukey: upgraded cassandra-tools-wmf on aqs hosts
  • 15:54 _joe_: uploaded package cni to jessie-wikimedia
  • 15:34 marostegui: Deploy alter table s2.revision table - db1060 - https://phabricator.wikimedia.org/T162611
  • 15:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074, depool db1060 - T162611 (duration: 00m 39s)
  • 14:46 moritzm: rebooting restbase1008 for update to Linux 4.9 and to pick up OpenJDK security updates
  • 14:32 XioNoX: rebooting mr1-ulsfo for software upgrade - T164970
  • 14:12 akosiaris: perform a final reboot on kubernetes200X
  • 13:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 - T159753 T164530 (duration: 00m 39s)
  • 13:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T159753 T164530 (duration: 01m 03s)
  • 13:33 jynus: stopping mariadb and preparing for reimage at db2051
  • 13:14 elukey: AMEND prev: reloaded kafkatee on oxygen
  • 13:14 elukey: reloaded kafkatee to test T151748
  • 12:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T159753 T164530 (duration: 00m 38s)
  • 12:51 moritzm: upgrading mw1209-mw1219 to Linux 4.9 and HHVM 3.18
  • 12:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072, depool db1066 - T159753 T164530 (duration: 00m 38s)
  • 12:44 marostegui: Deploy alter table s2.revision table - db1074 - https://phabricator.wikimedia.org/T162611
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076, depool db1074 - T159753 T164530 (duration: 00m 39s)
  • 12:42 moritzm: upgrading mw1161 (job runner) to HHVM 3.18
  • 11:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073, depool db1072 - T159753 T164530 (duration: 00m 39s)
  • 11:10 marostegui: Run pt-table-checksum on s7.metawiki - T163190
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080, depool db1073 - T159753 T164530 (duration: 00m 39s)
  • 09:47 moritzm: upgrading image scalers in codfw to Linux 4.9 and HHVM 3.18
  • 09:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083, depool db1080 - T159753 T164530 (duration: 00m 38s)
  • 09:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T159753 T164530 (duration: 00m 39s)
  • 09:07 moritzm: upgrading image scalers mw1294/mw1295 to Linux 4.9 and HHVM 3.18
  • 09:06 marostegui: Deploy alter table s2.revision table - db1076 - https://phabricator.wikimedia.org/T162611
  • 09:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090, depool db1076 - T162611 (duration: 00m 39s)
  • 08:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T159753 T164530 (duration: 00m 39s)
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T159753 T164530 (duration: 00m 39s)
  • 08:32 moritzm: upgrading mw1180-mw1188, mw1200-mw1208 to new hhvm-luasandbox/hhvm-luasandbox-dbg packages
  • 08:16 apergos: reboot dataset1001 for kernel update
  • 08:09 marostegui: Deploy alter table on s1.enwiki directly on codfw master (db2016) after running the clean up duplicates script - https://phabricator.wikimedia.org/T164530
  • 08:01 moritzm: reboot rhenium for update to Linux 4.9
  • 07:36 moritzm: installing freetype security updates on trusty (jessie already fixed)
  • 07:27 akosiaris: restart nagios-nrpe-server on dbstore2001
  • 07:01 marostegui: Deploy alter table on s2.plwiki directly on codfw master (db2017) after running the clean up duplicates script - https://phabricator.wikimedia.org/T164530
  • 06:43 moritzm: installing tiff security updates
  • 06:24 marostegui: Deploy alter table on s2.ptwiki directly on codfw master (db2017) after running the clean up duplicates script - https://phabricator.wikimedia.org/T164530
  • 06:21 marostegui: Deploy alter table s2.revision table - labsdb1001 - T162611
  • 06:10 marostegui: Deploy alter table s2.revision table - dbstore1001 - T162611
  • 06:10 marostegui: Deploy alter table s2.revision table - db1090 - T162611
  • 06:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 - T162611 (duration: 00m 38s)
  • 06:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T116557 (duration: 00m 39s)
  • 05:01 Jamesofur: insert decryption key for WMF Board Election
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Thu May 18 02:26:11 UTC 2017 (duration 5m 59s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 14s)

2017-05-17

  • 23:45 cwd: re-enabled p-c jobs
  • 23:06 cwd: disabled p-c jobs
  • 22:27 ejegg: updated SmashPig from 0145e2d to 4f84d88
  • 22:00 urandom: T164865: altering compaction strategy to sizetiered, local_group_wikipedia_T_parsoid_html.data (in RESTBase dev)
  • 21:50 ejegg: rolled back SmashPig to 0145e2d
  • 21:47 ejegg: updated SmashPig from 0145e2d to 1affad1
  • 20:36 ejegg: updated paypal EC fallback currency in payments-wiki config
  • 19:21 robh: mr1-ulsfo replacement underway
  • 18:54 urandom: T164865: restarting RESTBase in dev env to apply range-delete probability bug-fix
  • 18:30 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.init.js: Do not check for visual editor availability when loading source editor (Gerrit:354126) (duration: 00m 39s)
  • 18:24 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgCiteResponsiveReferences on ilo. and ms.wikipedia (T164230, T165247) (duration: 00m 39s)
  • 18:08 paravoid: reprepro include facter 2.4.6 to jessie-wikimedia/trusty-wikimedia
  • 16:52 bblack: restarting varnish backend on cp1099 (mailbox)
  • 16:42 moritzm: upgrading mw2120-mw2129 to Linux 4.9 and HHVM 3.18
  • 15:08 moritzm: upgrading mw1189-mw1199 to new hhvm-luasandbox/hhvm-luasandbox-dbg packages
  • 14:50 marostegui: Deploy alter table on s2.revision table on db1069 - T162611
  • 14:26 demon@tin: Synchronized README: No-op, forcing co-master sync (duration: 00m 40s)
  • 14:20 demon@tin: Pruned MediaWiki: 1.29.0-wmf.21 [keeping static files] (duration: 00m 22s)
  • 14:20 moritzm: upgrading mw1170-mw1179 to new hhvm-luasandbox/hhvm-luasandbox-dbg packages
  • 14:19 demon@tin: Pruned MediaWiki: 1.29.0-wmf.19 (duration: 01m 07s)
  • 14:17 demon@tin: Pruned MediaWiki: 1.29.0-wmf.19 [keeping static files] (duration: 00m 12s)
  • 13:47 cmjohnson1: replacing optics on cr1-3/1/2 and/or asw-c-eqiad:xe-8/0/38 T165008
  • 13:47 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/TwoColConflict/modules/: SWAT Fix issues with column alignment T165129 (duration: 00m 39s)
  • 13:44 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT Take RevisionSlider out of beta on all sites NOOP PT 2/2 (duration: 00m 39s)
  • 13:42 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Take RevisionSlider out of beta on all sites T163685 PT 1/2 (duration: 00m 40s)
  • 13:42 elukey: shutdown analytics1030 for T165529
  • 13:41 moritzm: upgrading mw1261-mw1265 to new hhvm-luasandbox/hhvm-luasandbox-dbg packages
  • 13:23 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Harden zerowiki config (T162771) (duration: 00m 41s)
  • 12:42 marostegui: Deploy alter table on s2.trwiki directly on codfw master (db2017) after running the clean up duplicates script - T164530
  • 11:27 moritzm: uploaded php-luasandbox_2.0.12~jessie3 to apt.wikimedia.org (adds a separate debug package hhvm-luasandbox-dbg)
  • 11:17 moritzm: rebooting restbase2012 for update to Linux 4.9 and to pick up openjdk security updates
  • 10:58 moritzm: rebooting restbase2011 for update to Linux 4.9 and to pick up openjdk security updates
  • 10:47 jynus: stopping db2052 and preparing it for reimage
  • 10:26 moritzm: rebooting restbase2010 for update to Linux 4.9 and to pick up openjdk security updates
  • 09:58 moritzm: rebooting restbase2009 for update to Linux 4.9 and to pick up openjdk security updates
  • 09:31 moritzm: rebooting restbase2008 for update to Linux 4.9 and to pick up openjdk security updates
  • 08:50 marostegui: Deploy alter table on codfw master (db2016) and let it replicate - T159753
  • 06:56 marostegui: Drop already renamed tables from labtestweb2001 (labtestwiki) - T164887
  • 06:54 marostegui: Drop already renamed tables from silver (labswiki) - T164887
  • 06:52 marostegui: Deploy alter table on s2 (revision table) dbstore1002 - T162611
  • 06:26 marostegui: Deploy alter table on s2 (revision table) db2017 (codfw master) - https://phabricator.wikimedia.org/T162611
  • 06:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2041 and db2049 - T162611 (duration: 00m 39s)
  • 06:01 marostegui: Resume pt-table-checksum on s7.centralauth - https://phabricator.wikimedia.org/T163190
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 17 02:25:59 UTC 2017 (duration 6m 1s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 06m 58s)

2017-05-16

  • 23:25 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Update wmde-policy RSS feed on meta. (T165285) (duration: 00m 39s)
  • 22:42 Dereckson: Tin now has an up-to-date /srv/mediawiki-staging HEAD, with operations/mediawiki-config repo = prod = staging
  • 20:22 mobrovac@tin: Started restart [restbase/deploy@d98af6f] (dev-cluster): Apply the revision range deletion algorithm, take 2 - T164865
  • 20:10 mobrovac@tin: Started restart [restbase/deploy@d98af6f] (dev-cluster): Apply the revision range deletion algorithm - T164865
  • 18:49 jynus: rolled back to HEAD~2 on tin to leave things the way I found them
  • 18:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after reimage (duration: 00m 39s)
  • 18:26 bblack: cp1074: run-no-puppet varnish-backend-restart (has high mailbox lag, causing small 503 spikes)
  • 17:23 cmjohnson1: swapping optics asw-c-eqiad xe-8/0/38 T165008
  • 17:05 moritzm: upgrading mw2017/mw2099 to Linux 4.9 and HHVM 3.18
  • 16:40 moritzm: upgrading mw2190-mw2199 to Linux 4.9 and HHVM 3.18
  • 16:22 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2098.codfw.wmnet
  • 15:48 jynus: restarting and upgrading db1095
  • 15:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove old comment (duration: 00m 39s)
  • 14:53 marostegui: Deploy alter table on s2 (revision table) db2041 - https://phabricator.wikimedia.org/T162611
  • 14:53 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2056, depool db2041 - T162611 (duration: 00m 41s)
  • 14:50 mobrovac@tin: Started restart [restbase/deploy@d98af6f]: Apply new puppet role/profile paradigm
  • 14:36 kartik@tin: Finished deploy [cxserver/deploy@6118dda]: Update cxserver to 740641f (duration: 02m 21s)
  • 14:34 kartik@tin: Started deploy [cxserver/deploy@6118dda]: Update cxserver to 740641f
  • 14:27 moritzm: upgrading mw2180-mw2189 to Linux 4.9 and HHVM 3.18
  • 14:08 jynus: rolling restart labsdb1009,10,11 for mariadb upgrade (and kernel upgrade)
  • 14:06 moritzm: rebooting restbase2007 for update to Linux 4.9 and to pick up openjdk security updates
  • 13:53 moritzm: upgrading mw2170-mw2179 to Linux 4.9 and HHVM 3.18
  • 13:48 addshore: SWAT done
  • 13:48 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/QuickSurveys/extension.json: SWAT: Explicitly add mediawiki.cookie dependency (duration: 00m 39s)
  • 13:40 moritzm: rebooting restbase2006 for update to Linux 4.9 and to pick up openjdk security updates
  • 13:39 addshore@tin: Synchronized wmf-config/throttle.php: SWAT: Raise the account creation limit for www.enwp.org/WP:Meetup/Eugene/WikiAPA T165421 (duration: 00m 39s)
  • 13:36 addshore@tin: Synchronized wmf-config/: SWAT: #1 T164502, #2, #3 (duration: 00m 41s)
  • 13:19 moritzm: upgrading mw2163-mw2169 to HHVM 3.18
  • 13:07 moritzm: upgrading mw2110-mw2117 to HHVM 3.18
  • 12:55 marostegui: Run pt-table-checksum on s7.centralauth - https://phabricator.wikimedia.org/T163190
  • 12:06 marostegui: Deploy alter table on s2 (revision table) db2049 - https://phabricator.wikimedia.org/T162611
  • 12:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2049 - T162611 (duration: 00m 39s)
  • 11:37 moritzm: upgrading mw1190-mw1208 to Linux 4.9 and HHVM 3.18
  • 11:28 jynus: stopping db1055 before reimage for backup
  • 11:27 Amir1: ladsgroup@terbium:~$ mwscript extensions/ORES/maintenance/CleanDuplicateScores.php --wiki=enwiki
  • 11:25 Amir1: cleaning up is completely done; current number of rows: 9,261,264 T159753
  • 11:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 for reimage (duration: 00m 39s)
  • 10:56 moritzm: upgrading codfw app servers already using HHVM 3.18 to 3.18.2+wmf3
  • 10:49 marostegui: Deploy schema change on testwikidatawiki.wb_terms on s3 codfw master - T165246
  • 10:36 jynus: upgrading and restarting db2062's mariadb service
  • 10:30 moritzm: installing openjdk-7 security updates on trusty hosts
  • 10:28 addshore: T164407 addshore@terbium mwscriptwikiset extensions/Cognate/maintenance/populateCognatePages.php wiktionary.dblist --batch-size=1000
  • 10:27 addshore: addshore@terbium mwscriptwikiset extensions/Cognate/maintenance/populateCognatePages.php wiktionary.dblist --batch-size=1000
  • 10:14 moritzm: upgrading mw1185-mw1189 to Linux 4.9 and HHVM 3.18
  • 09:26 moritzm: upgrading mw1189 / mw1293 from HHVM 3.18.2+wmf2 to 3.18.2+wmf3
  • 08:59 moritzm: upgrading mw1170-mw1184 from HHVM 3.18.2+wmf2 to 3.18.2+wmf3
  • 08:45 moritzm: upgrading git packages on tin/naos from local 2.11 backport to the version from jessie-backports
  • 08:22 moritzm: installing git security updates on trusty (jessie already fixed)
  • 07:39 godog: upload prometheus-mysqld-exporter 0.10.0 to jessie-wikimedia - T161296
  • 07:10 moritzm: upgrading mw1261-mw1265 to HHVM 3.18.2+wmf3
  • 07:06 Amir1_: start of cleaning up ores_classification table in enwiki last round (four hours) (T159753)
  • 06:58 moritzm: restarted hhvm on mw1165 (stuck in HPHP::Treadmill deadlock)
  • 06:37 marostegui: Stop replication at the same position on db1044 and db2018 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 06:32 marostegui: Disable replication codfw > eqiad on s3 https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 06:08 marostegui: Run pt-table-checksum on s7.viwiki - T163190
  • 06:02 marostegui: Deploy alter table on s2 (revision table) db2056 - T162611
  • 06:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2063, depool db2056 - T162611 (duration: 00m 40s)
  • 05:17 XioNoX: fyi, one of the links between codfw and eqiad is down for a scheduled Zayo maintenance. No outage, traffic routed around.
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 16 02:26:19 UTC 2017 (duration 6m 3s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 12s)
  • 00:30 ejegg: updated payments-wiki from 57451de to 3b84521
  • 00:10 ejegg: updated CiviCRM from 061cd61 to 4ece34c

2017-05-15

  • 23:41 bd808@tin: Synchronized php-1.30.0-wmf.1/resources/src/mediawiki.rcfilters/mw.rcfilters.Controller.js: RCFilters: Actually read/write highlight parameter (T165107) (duration: 00m 40s)
  • 22:23 mobrovac@tin: Finished deploy [restbase/deploy@d98af6f]: Wt2lint bug fix - T163091 (duration: 06m 44s)
  • 22:16 mobrovac@tin: Started deploy [restbase/deploy@d98af6f]: Wt2lint bug fix - T163091
  • 21:19 mobrovac@tin: Finished deploy [restbase/deploy@c52add0]: Expose the new /transform/wikitext/to/lint end point to the public - T163091 (duration: 06m 32s)
  • 21:13 mobrovac@tin: Started deploy [restbase/deploy@c52add0]: Expose the new /transform/wikitext/to/lint end point to the public - T163091
  • 20:48 gilles: run refreshImageMetadata --force for group1 + group2 wikis except commons on terbium T150741
  • 20:20 subbu: Updated Parsoid to a182c227 (T141226, T164792, T37247, T153107, T163091, T164006, T161151, T162920, T163549)
  • 20:11 ssastry@tin: Finished deploy [parsoid/deploy@132d0e5]: Updating Parsoid to a182c227 (duration: 07m 21s)
  • 20:04 ssastry@tin: Started deploy [parsoid/deploy@132d0e5]: Updating Parsoid to a182c227
  • 19:42 catrope@tin: Synchronized php-1.30.0-wmf.1/includes/api/ApiQueryRevisions.php: T165100 (duration: 00m 40s)
  • 18:45 catrope@tin: Synchronized php-1.30.0-wmf.1/extensions/MobileFrontend/: Revert "Use csrf token for watching" (T165209) (duration: 00m 41s)
  • 18:45 RoanKattouw: Canary failing on mw1279 due to Wikimedia\Rdbms\Database::makeList: empty input for field rev_id from ApiQueryRevisions
  • 18:20 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable test reader QuickSurveys (T131949, T164769, T164894, T164960, T164943) (duration: 00m 40s)
  • 17:24 mobrovac@tin: Finished deploy [restbase/deploy@c70a1e1] (dev-cluster): Bring RESTBase up to date in the Dev Cluster (duration: 01m 51s)
  • 17:22 mobrovac@tin: Started deploy [restbase/deploy@c70a1e1] (dev-cluster): Bring RESTBase up to date in the Dev Cluster
  • 15:39 akosiaris: upgrade pybal to 1.13.6 across the LVS fleet
  • 15:10 mobrovac@tin: Finished deploy [citoid/deploy@3ed34ef]: Better publishing date extraction support - T132308 (duration: 02m 49s)
  • 15:07 mobrovac@tin: Started deploy [citoid/deploy@3ed34ef]: Better publishing date extraction support - T132308
  • 14:24 mobrovac@tin: Started restart [restbase/deploy@c70a1e1] (dev-cluster): Restart after applying https://gerrit.wikimedia.org/r/#/c/352851/
  • 13:50 moritzm: upgrading mwdebug servers to 3.18.2+wmf3
  • 13:48 addshore@tin: Synchronized php-1.30.0-wmf.1/includes/media/DjVu.php: SWAT: Add X-Content-Dimensions support to DjVu T150741 (duration: 00m 39s)
  • 13:47 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/TimedMediaHandler/handlers: SWAT: Fix X-Content-Dimensions support T150741 (duration: 00m 40s)
  • 13:37 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/VisualEditor: SWAT: #1 #2 T165238 T165238 VisualEditor (duration: 00m 41s)
  • 13:27 moritzm: uploaded HHVM 3.18.2+dfsg-1+wmf3 to apt.wikimedia.org (addresses segfault in XML reader, T162586, T165074)
  • 13:20 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/Cognate/maintenance/populateCognatePages.php: SWAT: Add a clear-first option to populatePages script T164407 PT 2/2 (duration: 00m 39s)
  • 13:19 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/Cognate/src/CognateStore.php: SWAT: Add a clear-first option to populatePages script T164407 PT 1/2 (duration: 00m 40s)
  • 13:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add QuickSurvey for reader segmentation research T131949 T164769 T164894 T164960 T164963 (duration: 00m 40s)
  • 12:54 akosiaris: upload pybal 1.13.6 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:33 aude@tin: Synchronized wmf-config/Wikibase-production.php: Enable data type for tabular data (duration: 00m 41s)
  • 11:09 Amir1_: cleaning up ores_classification has finished 18M rows deleted, current number of rows 38,937,217 (T159753)
  • 10:36 moritzm: rebooting mw2224-mw2242 for update to Linux 4.9
  • 10:18 moritzm: installing batik security updates on trusty
  • 10:14 moritzm: installing fop security updates on trusty
  • 09:34 moritzm: installing bind security updates (we're using client-side libs/tools only)
  • 09:10 godog: swift codfw-prod: more ms-be2001/ms-be2012 decom - T162785
  • 08:29 godog: swift eqiad-prod: ms-be1028/ms-be1039 object weight 3000 - T160640
  • 08:26 moritzm: installing rtmpdump security updates on jessie
  • 08:17 Amir1_: start of cleaning up ores_classification table
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Mon May 15 02:27:02 UTC 2017 (duration 5m 59s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 42s)
  • 01:25 bblack: depooled cp1053 from all services (possible hardware issues)

2017-05-14

  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Sun May 14 02:26:33 UTC 2017 (duration 6m 2s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 17s)

2017-05-13

  • 12:27 gehel: restarting wdqs updater on wdqs cluster
  • 02:33 l10nupdate@tin: ResourceLoader cache refresh completed at Sat May 13 02:33:34 UTC 2017 (duration 6m 9s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 07m 07s)
  • 01:18 mobrovac: zotero restart as mem is above 50%
  • 00:54 urandom: T165139: Truncating RESTBase feed_aggregated tables (corruption)
  • 00:31 urandom: T165139: Truncating RESTBase summary tables (corruption)

2017-05-12

  • 20:55 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Touch (duration: 00m 39s)
  • 20:49 demon@tin: Synchronized wmf-config/: Swapping DynamicSidebar to normal extension registration (duration: 00m 19s)
  • 19:20 thcipriani@tin: Synchronized php-1.30.0-wmf.1/extensions/TextExtracts/includes/ApiQueryExtracts.php: API: Change memcache key to clear cache T165161 (duration: 00m 39s)
  • 19:02 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: Add RejectParserCacheValue handler for mw-parser-output T165161 (duration: 00m 40s)
  • 18:47 bblack: starting spaced-out ~4h run of "run-no-puppet varnish-frontend-restart" on cache_upload+cache_text to re-set transient storage levels (in screen on neodymium)
  • 18:10 thcipriani@tin: Finished scap: Revert "Wrap parser output in" 4/4 (duration: 19m 13s)
  • 17:51 thcipriani@tin: Started scap: Revert "Wrap parser output in" 4/4
  • 17:51 thcipriani@tin: Synchronized php-1.30.0-wmf.1/includes/api/ApiParse.php: Revert "Wrap parser output in" 3/4 (duration: 00m 42s)
  • 17:50 thcipriani@tin: Synchronized php-1.30.0-wmf.1/includes/cache/MessageCache.php: Revert "Wrap parser output in" 2/4 (duration: 00m 39s)
  • 17:49 thcipriani@tin: Synchronized php-1.30.0-wmf.1/includes/parser/Parser.php: Revert "Wrap parser output in" 1/4 (duration: 00m 39s)
  • 17:38 ema: cp4010: upgrade varnish back to 4.1.6-1wm1, transient storage issues are unrelated
  • 17:33 krinkle@tin: Synchronized php-1.30.0-wmf.1/includes/resourceloader/ResourceLoaderClientHtml.php: (no justification provided) (duration: 00m 40s)
  • 16:53 moritzm: powercycling mw1294 (machine inaccessible/locked up)
  • 16:23 moritzm: repooled mw1172 after scap pull (was down with hardware error)
  • 14:10 moritzm: rebooting mw2163-mw2179 for update to Linux 4.9
  • 13:47 moritzm: rebooting mw2110-mw2117 for update to Linux 4.9
  • 13:06 moritzm: repooled mw2098 (was down with hardware error)
  • 12:53 moritzm: downgrading mw1161 (job runner) to HHVM 3.12, some known instabilities and fix for one HHVM 3.18 will likely be available next week, so going the conservative way over the weekend
  • 11:35 gehel: cleaning old elasticsearch and logstash logs on logstash cluster
  • 10:38 _joe_: moved hpssacli.tar.gz to /root on puppetmaster1001
  • 09:59 hashar@tin: Synchronized php-1.30.0-wmf.1/extensions/MobileFrontend: Correctly handle the mw-parser-output wrapper - T164733 (duration: 00m 43s)
  • 09:02 akosiaris: move planet2001 to ganeti nodegroup row_A
  • 08:58 marostegui: Rename semantic tables before dropping them on wikitech hosts (silver and labtestweb2001) - T164887
  • 06:05 marostegui: Deploy alter table on s2 (revision table) db2063 - https://phabricator.wikimedia.org/T162611
  • 06:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2064, depool db2063 - T162611 (duration: 00m 39s)
  • 05:53 marostegui: Stop MySQL dbstore2001 for testing - T165033
  • 02:30 l10nupdate@tin: ResourceLoader cache refresh completed at Fri May 12 02:30:11 UTC 2017 (duration 6m 16s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 06m 49s)

2017-05-11

  • 23:49 thcipriani@tin: Synchronized wmf-config/CommonSettings-labs.php: SWAT: Enable saving RC Filters on Beta Cluster (beta-only-change) (duration: 00m 39s)
  • 23:39 thcipriani@tin: Synchronized php-1.30.0-wmf.1/resources/src/mediawiki.rcfilters: SWAT: Gate option to save RC filters to default false 3/3 (duration: 00m 39s)
  • 23:39 thcipriani@tin: Synchronized php-1.30.0-wmf.1/includes/specials/SpecialRecentchanges.php: SWAT: Gate option to save RC filters to default false 2/3 (duration: 00m 39s)
  • 23:38 thcipriani@tin: Synchronized php-1.30.0-wmf.1/includes/DefaultSettings.php: SWAT: Gate option to save RC filters to default false 1/3 (duration: 00m 39s)
  • 23:30 thcipriani@tin: Synchronized php-1.30.0-wmf.1/extensions/TemplateData/extension.json: SWAT: Fix styles queue violation for "ext.templateData" T92459 (duration: 00m 39s)
  • 23:23 twentyafterfour: restart apache on iridium to apply hotfix for T163967
  • 23:21 thcipriani@tin: Synchronized php-1.30.0-wmf.1/resources/src/mediawiki/mediawiki.Upload.Dialog.js: SWAT: mw.Upload.Dialog: Define .static.name T164999 (duration: 00m 40s)
  • 23:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable OOUI for EditPage on MW.org (duration: 00m 40s)
  • 23:09 Amir1: clean up for ores_classification is finished for now, 9M rows cleaned, current number of rows: 55,959,017 (T159753)
  • 21:19 twentyafterfour@tin: Synchronized php-1.30.0-wmf.1/includes/specials/SpecialSearch.php: hotfix T165091 (duration: 00m 39s)
  • 21:02 Amir1: start of cleaning up ores_classification in enwiki for two hours (T159753)
  • 20:57 hashar: CI Phpunit jobs were segfaulting due to an upgrade of HHVM to 3.18. Got rolled back to 3.12 - T165074
  • 20:06 demon@tin: Synchronized scap/plugins/prep.py: scap prep is fast now (duration: 00m 44s)
  • 19:41 demon@tin: Synchronized scap/plugins/clean.py: no-op, completeness (duration: 00m 42s)
  • 19:35 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.30.0-wmf.1
  • 18:53 thcipriani@tin: Synchronized php-1.30.0-wmf.1/extensions/Gadgets/includes/GadgetResourceLoaderModule.php: SWAT: Revert "Move gadget styles from main stylesheet request to site request" T165040 T165031 (duration: 00m 42s)
  • 18:47 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable OOUI in EditPage for fawiki T162849 (duration: 00m 42s)
  • 18:39 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 18:23 thcipriani@tin: Synchronized php-1.30.0-wmf.1/extensions/WikimediaEvents/modules/ext.wikimediaEvents.recentChangesClicks.js: SWAT: RecentChangesClicks: Address minor performance concerns T158458 (duration: 00m 42s)
  • 15:35 ladsgroup@tin: Synchronized wmf-config: Set oresDamagingPref default to values that actually exist (T165011) (duration: 00m 44s)
  • 15:35 Amir1: starts of ladsgroup@tin:/srv/mediawiki-staging$ scap sync-dir wmf-config 'Set oresDamagingPref default to values that actually exist (T165011)'
  • 15:30 chasemp: rotate novaadmin in /labtest/ ldappasswd -H ldap://labtestservices2001.wikimedia.org -x -D "uid=novaadmin,ou=people,dc=wikimedia,dc=org" -W -A -S
  • 14:37 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=sca2004.codfw.wmnet
  • 14:36 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=sca2004.codfw.wmnet
  • 14:10 gehel@tin: Finished deploy [wdqs/wdqs@bc30531]: (no justification provided) (duration: 01m 23s)
  • 14:08 gehel@tin: Started deploy [wdqs/wdqs@bc30531]: (no justification provided)
  • 14:07 gehel: deploying WDQS to fix T165029
  • 14:01 mobrovac@tin: Started restart [zotero/translation-server@50f216a]: Zotero unresponsive
  • 13:59 aude@tin: Synchronized php-1.30.0-wmf.1/extensions/Wikidata: Update quality constraints (duration: 02m 14s)
  • 13:56 mobrovac@tin: Started restart [zotero/translation-server@6a4a828]: (no justification provided)
  • 13:48 addshore@tin: Synchronized wmf-config/jobqueue-labs.php: SWAT: LABS ONLY Re-enable persistent connection to Redis for jobrunners in lab (duration: 00m 41s)
  • 13:33 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: (notask) wgRevisionSliderAlternateSlider true everywhere PT 2/2 (duration: 00m 42s)
  • 13:33 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: (notask) wgRevisionSliderAlternateSlider true everywhere PT 1/2 (duration: 00m 43s)
  • 13:31 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T162796 Stop prerendering thumbs at 2560/2880 pixels (duration: 00m 41s)
  • 13:23 moritzm: rebooting restbase2005 for update to Linux 4.9 / new openjdk
  • 13:21 addshore@tin: Synchronized php-1.30.0-wmf.1/extensions/Cognate/src/CognateStore.php: SWAT: T165005 Dont pass ConnectionRefs to ConnectionManager::releaseConnection (duration: 00m 42s)
  • 13:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T164888 Correct alias(es) from es.wikisource to eo.wikisource (duration: 00m 42s)
  • 12:55 akosiaris: migrate sca2004 to ganeti nodegroup row_A
  • 12:33 marostegui: Run pt-table-checksum on s7.ukwiki - https://phabricator.wikimedia.org/T163190
  • 12:19 elukey: reboot kafka100[23] for kernel upgrades (kafka main-eqiad, eventbus eqiad)
  • 11:03 marostegui: Deploy alter table on s2 (revision table) db2064 - T162611
  • 11:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2064 - T162611 (duration: 00m 42s)
  • 10:15 akosiaris: reboot ganeti200{5,6,7,8} for network reconfiguration
  • 10:10 marostegui: Run pt-table-checksum on s7.rowiki - https://phabricator.wikimedia.org/T163190
  • 10:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1056 original load (duration: 00m 49s)
  • 09:46 ema: cp4010: downgrade varnish to 4.1.5-1wm4 and check frontend transient memory usage
  • 09:12 ayounsi@puppetmaster1001: conftool action : set/pooled=yes; selector: name=logstash1003.eqiad.wmnet
  • 09:12 ayounsi@puppetmaster1001: conftool action : set/pooled=yes; selector: name=logstash1002.eqiad.wmnet
  • 09:12 ayounsi@puppetmaster1001: conftool action : set/pooled=yes; selector: name=logstash1001.eqiad.wmnet
  • 09:10 moritzm: upgrading mw1170-mw1188 to HHVM 3.18 / Linux 4.9 (also pruning HHVM CLI bytecode since downtimed anyway)
  • 08:55 moritzm: migrating mw1161 (job runner) to HHVM 3.18 and Linux 4.9
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1056 load (duration: 00m 43s)
  • 08:35 marostegui: Run pt-table-checksum on s7.kowiki - https://phabricator.wikimedia.org/T163190
  • 08:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1056 load (duration: 00m 42s)
  • 08:26 moritzm: migrating mw1189 (API server) to HHVM 3.18 and Linux 4.9
  • 07:53 godog: roll-restart ms-fe1* for linux 4.9 upgrade - T162029
  • 06:50 moritzm: migrating mw1293 (image scaler) to HHVM 3.18 and Linux 4.9
  • 06:30 marostegui: Drop mira user on wikitech database - T164968
  • 06:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 with less load (duration: 00m 43s)
  • 05:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T147166 T130067 (duration: 00m 57s)
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu May 11 03:14:51 UTC 2017 (duration 6m 44s)
  • 03:08 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 13m 33s)
  • 02:46 Jamesofur: all election emails out
  • 02:41 Jamesofur: Sending English and all other language election emails via terbium
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 13m 22s)
  • 02:21 Jamesofur: sending Chinese election emails via terbium
  • 02:18 Jamesofur: sending uk and vi election emails via terbium
  • 02:10 Jamesofur: sending pt,pt-br and ru election emails via terbium
  • 01:55 Jamesofur: sending polish and dutch election emails via terbium
  • 01:32 Jamesofur: sending Italian and Japanese election emails via terbium
  • 01:21 Jamesofur: sending he, hi and id election emails via terbium
  • 01:08 Jamesofur: sending French election emails via terbium
  • 01:05 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.1
  • 01:00 Jamesofur: sending farsi election emails via terbium
  • 00:50 Jamesofur: sending Spanish election emails via terbium
  • 00:36 Jamesofur: sending german election emails via terbium
  • 00:29 Jamesofur: sending bg and bn election emails via terbium
  • 00:11 Jamesofur: sending arabic election emails via terbium
  • 00:03 maxsem@tin: Finished deploy [kartotherian/deploy@9401f38]: Try https://gerrit.wikimedia.org/r/#/c/352886/ and https://gerrit.wikimedia.org/r/#/c/353184/ on test hosts (duration: 145m 42s)

2017-05-10

  • 23:50 twentyafterfour@tin: Finished scap: Sync fix for T164983 plus i18n files leftover from swat. refs T162954 (duration: 30m 37s)
  • 23:19 twentyafterfour@tin: Started scap: Sync fix for T164983 plus i18n files leftover from swat. refs T162954
  • 23:13 catrope@tin: Synchronized php-1.30.0-wmf.1/extensions/WikimediaEvents/: T164617 (duration: 00m 42s)
  • 23:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable archive search on select wikis (T162302) (duration: 00m 41s)
  • 21:38 twentyafterfour@tin: Synchronized php-1.30.0-wmf.1/extensions/ORES/includes/Hooks.php: sync fix for T164984 refs T162954 (duration: 00m 42s)
  • 21:38 maxsem@tin: Started deploy [kartotherian/deploy@9401f38]: Try https://gerrit.wikimedia.org/r/#/c/352886/ and https://gerrit.wikimedia.org/r/#/c/353184/ on test hosts
  • 20:55 elukey: restart hhvm on mw1268 (HHVM 3.12, HPHP::Treadmill::getAgeOldestRequest issue)
  • 20:37 demon@tin: Synchronized README: no-op, comaster sync (duration: 00m 42s)
  • 20:36 Dereckson: Run namespaceDupes.php on es.wikisource (T164195)
  • 20:35 bsitzmann@tin: Finished deploy [mobileapps/deploy@5d3b34a]: Update mobileapps to 75b135e (duration: 03m 55s)
  • 20:33 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Restore Autor: and Portal: namespaces on es.wikisource (T164195) (duration: 00m 42s)
  • 20:31 bsitzmann@tin: Started deploy [mobileapps/deploy@5d3b34a]: Update mobileapps to 75b135e
  • 19:51 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.21
  • 19:51 twentyafterfour: rolling group1 back to 1.29.0-wmf.21 due to T164984
  • 19:45 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.30.0-wmf.1
  • 19:33 twentyafterfour: deploying 1.30.0-wmf.1 to group1 wikis. refs T162954
  • 19:29 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/TimedMediaHandler/: Store original media dimensions as additional header (T150741) (duration: 00m 43s)
  • 19:28 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/PdfHandler/PdfHandler_body.php: Store original media dimensions as additional header (T150741) (duration: 00m 42s)
  • 19:27 dereckson@tin: Synchronized wmf-config/interwiki.php: Interwiki map update (disable __list sorting, T145337) (duration: 00m 41s)
  • 19:26 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/PagedTiffHandler/PagedTiffHandler_body.php: Store original media dimensions as additional header (T150741) (duration: 00m 42s)
  • 19:17 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/TwoColConflict/: Add "oojs-ui" dep to ext.TwoColConflict.filterOptionsJs (duration: 00m 42s)
  • 18:57 paravoid: mr1-ulsfo: request system snapshot media internal slice alternate; request system reboot
  • 18:53 dereckson@tin: Synchronized php-1.29.0-wmf.21/extensions/TwoColConflict/: Add "oojs-ui" dep to ext.TwoColConflict.filterOptionsJs (duration: 00m 42s)
  • 18:30 dereckson@tin: Synchronized php-1.30.0-wmf.1/extensions/CirrusSearch/maintenance/forceSearchIndex.php: Fix index usage on archive indexing (duration: 00m 42s)
  • 18:14 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Put Cognate in write mode for all wiktionaries (T164407) (duration: 00m 42s)
  • 17:46 jynus: setting db1056's cpu scaling_governor to performance, rather than powersave
  • 17:20 moritzm: installing groovy security updates
  • 17:03 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Revert "Create Autor and Portal namespaces on Spanish Wikisource" (T164195) (duration: 00m 43s)
  • 16:30 godog: roll-restart swift object servers to apply https://gerrit.wikimedia.org/r/#/c/353078
  • 15:44 moritzm: installing git security updates on jessie systems
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T147166 T130067 (duration: 01m 43s)
  • 15:18 moritzm: uploaded HHVM 3.18.2 and HHVM extensions to apt.wikimedia.org/main (previously only in experimental)
  • 15:03 jynus: shutting down db1056 for physical maintenance T164944
  • 14:57 elukey: reboot kafka1001 for kernel upgrades (kafka main-eqiad, eventbus eqiad)
  • 14:50 marostegui: Stop replication at the same position on db1067 and db2016 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T147166 T130067 (duration: 00m 43s)
  • 14:43 marostegui: Run pt-table-checksum on s7.huwiki - https://phabricator.wikimedia.org/T163190
  • 14:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097 (duration: 00m 43s)
  • 14:39 jynus: disabling puppet to solve disk mount issues T164915
  • 14:36 godog: roll-restart swift-proxy to apply https://gerrit.wikimedia.org/r/#/c/353078/
  • 14:36 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1005.eqiad.wmnet
  • 14:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 (duration: 00m 43s)
  • 14:27 hashar: European SWAT completed
  • 14:21 moritzm: upgrading mw1263-mw1265 to latest HHVM package (including the redis QUIT patch)
  • 14:19 hashar@tin: Finished scap: Store original media dimensions as additional header - T150741 (duration: 03m 53s)
  • 14:15 hashar@tin: Started scap: Store original media dimensions as additional header - T150741
  • 14:15 hashar@tin: scap aborted: Store original media dimensions as additional header - T150741 (duration: 00m 00s)
  • 14:15 hashar@tin: Started scap: Store original media dimensions as additional header - T150741
  • 14:15 hashar@tin: scap aborted: (no justification provided) (duration: 00m 00s)
  • 14:15 hashar@tin: Started scap: (no justification provided)
  • 14:13 hashar: ValueError: /srv/mediawiki-staging/php-1.30.0-wmf.1/extensions/Collection/.eslintrc.json is an invalid JSON file
  • 13:53 elukey: reboot kafka200[23] for kernel upgrades (kafka main-codfw cluster, eventbus codfw)
  • 13:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up inappropriate usages of wmg - T151891 (duration: 00m 42s)
  • 13:34 hashar@tin: Synchronized wmf-config/CommonSettings.php: Clean up inappropriate usages of wmg - T151891 (duration: 00m 42s)
  • 13:28 marostegui: Disable replication codfw > eqiad on s1 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 13:24 hashar@tin: Synchronized php-1.29.0-wmf.21/extensions/Popups: eventLogging: Discard events with duplicate tokens - T161769 T163198 (duration: 00m 43s)
  • 13:19 hashar@tin: Synchronized php-1.30.0-wmf.1/extensions/Popups: eventLogging: Discard events with duplicate tokens - T161769 T163198 (duration: 01m 08s)
  • 13:17 hashar@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ar.svg: (no justification provided) (duration: 00m 42s)
  • 13:13 hashar@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ar.svg: Add new Arabic Wikipedia logo - T164648 (duration: 00m 44s)
  • 13:12 akosiaris: restart pybal on lvs1006, lvs1009, lvs1012 to pick up the kubemaster LVS service
  • 13:09 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add new Arabic Wikipedia logo - T164648 && Disable page previews beta features on various projects - T164740 (duration: 00m 42s)
  • 13:07 marostegui: Run pt-table-checksum on s7.hewiki - https://phabricator.wikimedia.org/T163190
  • 13:04 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Import sources on dty.wikipedia - T164573 (duration: 00m 43s)
  • 12:47 moritzm: installing irqbalance updates from jessie point update
  • 12:45 akosiaris: rebooting ganeti2007, ganeti2008 for networking config update
  • 12:34 moritzm: installing logback security updates
  • 11:27 jynus: stopping mariadb and preparing db1056 for reimage
  • 11:22 marostegui: Stop replication at the same position on db1049 and db2023
  • 11:14 marostegui: Stop replication at the same position on db1050 and db2028
  • 10:50 marostegui: Stop replication at the same position on db1033 and db2029 - T147166 T130067
  • 10:44 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 for reimage (duration: 00m 43s)
  • 10:43 marostegui: Disable replication codfw > eqiad on s7 - T147166 T130067
  • 09:36 godog: roll-restart ms-fe2* for linux 4.9 upgrade - T162029
  • 09:11 moritzm: installing vim security updates on jessie
  • 09:05 volans: updated CI puppet compiler facts from production
  • 08:59 moritzm: installing wget security updates on jessie
  • 08:35 moritzm: rebooting mx2001 for update to Linux 4.9
  • 08:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: wmgUseTwoColConflict true for all wikis (duration: 00m 54s)
  • 07:30 marostegui: Stop replication at the same position on db10418 and db2017 - T147166 https://phabricator.wikimedia.org/T130067
  • 07:16 marostegui: Disable replication codfw > eqiad on s2 - T147166 T130067
  • 07:13 Amir1: another round of cleaning up ores_classification is done, 12M rows deleted. Current number of rows: 64,902,521 (T159753)
  • 06:36 moritzm: installing rtmpdump security updates on trusty
  • 06:15 marostegui: Deploy alter table wikidatawiki.wb_terms on dbstore1001 - T162539 T163190
  • 06:08 marostegui: Run pt-table-checksum on s7.frwiktionary - T163190
  • 05:04 Amir1: start of cleaning up ores_classification rows for three hours
  • 04:49 kartik@tin: Finished deploy [cxserver/deploy@533b4f4]: Update cxserver to 534619c (duration: 02m 38s)
  • 04:46 kartik@tin: Started deploy [cxserver/deploy@533b4f4]: Update cxserver to 534619c
  • 03:02 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 10 03:02:23 UTC 2017 (duration 6m 37s)
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.30.0-wmf.1) (duration: 06m 50s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 08m 09s)
  • 00:29 maxsem@tin: Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/#/c/352980/3 (duration: 00m 42s)
  • 00:12 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES on fiwiki (T163011) (duration: 00m 43s)
  • 00:10 RoanKattouw: Running extensions/ORES/maintenance/PopulateDatabase.php on fiwiki

2017-05-09

  • 23:54 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RCFilters beta feature on all remaining wikis (T144458) (duration: 00m 44s)
  • 23:33 mutante: db1040 - remove from puppet, puppet node clean/deactivate, deleted salt-key, remove from icinga by running puppet on tegmen after that (T164057)
  • 23:23 demon@tin: Finished scap: rebuilding l10n for extension-list swap (duration: 34m 10s)
  • 23:13 mutante: analytics1027 - decom: revoke puppet cert, delete salt key, puppet node clean/deactivate, check icinga removal (T161597)
  • 22:49 demon@tin: Started scap: rebuilding l10n for extension-list swap
  • 22:46 reedy@tin: Synchronized wmf-config/extension-list-wikitech: Consistency (duration: 00m 42s)
  • 22:20 reedy@tin: Synchronized wmf-config/wikitech.php: Disable Semantic extensions (duration: 00m 42s)
  • 22:03 reedy@tin: scap aborted: (no justification provided) (duration: 00m 03s)
  • 22:03 reedy@tin: Started scap: (no justification provided)
  • 21:40 twentyafterfour: Mediawiki train group0 finished, will resume tomorrow with group 1 wikis. refs T162954
  • 21:32 twentyafterfour: group0 wikis to 1.30.0-wmf.1 refs T162954
  • 21:32 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.30.0-wmf.1
  • 20:52 twentyafterfour@tin: Finished scap: MediaWiki sync new branch wmf/1.30.0-wmf.1 + localization cache and deploy to testwikis refs T162954 (duration: 29m 41s)
  • 20:22 twentyafterfour@tin: Started scap: MediaWiki sync new branch wmf/1.30.0-wmf.1 + localization cache and deploy to testwikis refs T162954
  • 19:47 maxsem@tin: Finished deploy [kartotherian/deploy@740235c]: https://gerrit.wikimedia.org/r/#/c/352886/ (duration: 05m 35s)
  • 19:42 maxsem@tin: Started deploy [kartotherian/deploy@740235c]: https://gerrit.wikimedia.org/r/#/c/352886/
  • 18:39 bblack: varnish: manually setting runtime lru_interval / nuke_limit via varnishadm for all clusters' backends to match start-time change in https://gerrit.wikimedia.org/r/#/c/352827/
  • 18:26 subbu: updated Parsoid to 9d8badc8 (T151277)
  • 18:22 ssastry@tin: Finished deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8 (duration: 07m 09s)
  • 18:16 mepps: updated SmashPig from 200f63e to 0145e2d
  • 18:15 ssastry@tin: Started deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8
  • 17:29 elukey: executing varnish-backend-restart on cp1072 as attempt to mitigate "FetchError Could not get storage" and "ExpKill LRU_Fail" - T145661
  • 17:25 elukey: executing varnish-backend-restart on cp1074 as attempt to mitigate "FetchError Could not get storage" and "ExpKill LRU_Fail" - T145661
  • 17:23 twentyafterfour: Preparing to branch 1.30.0-wmf.1 [ T162954 ]
  • 16:08 elukey: playing with mw2146 for T163674
  • 16:00 elukey: stopping Hadoop daemons and shutting down analytics[1032-1033,1040].eqiad.wmnet - T132256
  • 15:20 moritzm: installing rpcbind/libtirpc security updates on ms1001
  • 15:15 moritzm: uploaded kubernetes 1.5.5-1+wmf1 to stretch-wikimedia/experimental
  • 15:02 urandom: starting instances restbase2005
  • 14:55 moritzm: repooled mw1264 after hardware error has been fixed (and scap pull)
  • 14:45 hashar: European SWAT completed
  • 14:39 bblack: varnish: varnishadm runtime set default_ttl=86400 for text+upload fe+be layers via cumin, to match deployed start-time changes in https://gerrit.wikimedia.org/r/#/c/352826/
  • 14:22 hashar@tin: Finished scap: (no justification provided) (duration: 03m 10s)
  • 14:19 hashar@tin: Started scap: (no justification provided)
  • 14:16 elukey: correction: reboot kafka2001 for kernel upgrades (eventbus codfw)
  • 14:16 elukey: reboot kafka1001 for kernel upgrades (eventbus codfw)
  • 14:10 hashar@tin: Finished scap: TwoColConflict update (duration: 19m 30s)
  • 14:09 marostegui: Stop MySQL and shutdown db1048 (phabricator slave) to replace BBU - T160731
  • 14:06 marostegui: Run pt-table-checksum on s7.fawiki - T163190
  • 13:51 hashar@tin: Started scap: TwoColConflict update
  • 13:49 hashar@tin: Synchronized php-1.29.0-wmf.21/extensions/TwoColConflict: BACKPORTS from master - T162806 T163886 (duration: 00m 41s)
  • 13:47 hashar@tin: Synchronized wmf-config/Wikibase-production.php: Enable sending Wikidata notification on Wikivoyage - T142103 (duration: 00m 39s)
  • 13:46 gehel: upgrade deployment-prep cluster to elasticsearch 5.3.2 - T163707
  • 13:44 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Create Autor and Portal namespaces on Spanish Wikisource - T164195 (duration: 00m 39s)
  • 13:39 gehel: cancel upgrading elasticsearch on relforge (plugin under test is missing a release for 5.3.2) - T163703
  • 13:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Allow page move only autopatrolled at hiwiki - T164239 (duration: 00m 42s)
  • 13:33 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Allow new page patrol for autoconfirmed users on bnwiki - T164159 (duration: 00m 40s)
  • 13:26 ayounsi@tin: Finished deploy [librenms/librenms@b10cc7c]: (no justification provided) (duration: 00m 04s)
  • 13:26 ayounsi@tin: Started deploy [librenms/librenms@b10cc7c]: (no justification provided)
  • 13:25 hashar@tin: Synchronized php-1.29.0-wmf.21/extensions/ContentTranslation/modules/tools/ext.cx.tools.template.js: Fix the container calculation for template editor - T163105 (duration: 00m 40s)
  • 13:23 gehel: upgrading elasticsearch on relforge - T163703
  • 13:11 reedy@tin: Synchronized wmf-config/extension-list: PageTriage to extension.json in extension-list (duration: 00m 39s)
  • 13:08 reedy@tin: Synchronized wmf-config/mobile.php: wfLoadExtension for ZeroBanner (duration: 00m 41s)
  • 13:02 moritzm: rebooting restbase2004 for update to Linux 4.9 and new OpenJDK
  • 12:34 gehel: upgrade ELK on deployment-logstash2
  • 12:19 moritzm: rebooting restbase2003 for update to Linux 4.9 and new OpenJDK
  • 11:47 marostegui: Stop replication at the same position on db1049 and db2023 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 11:45 marostegui: Disable replication codfw > eqiad on s5 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 11:35 moritzm: rebooting restbase2002 for update to Linux 4.9 and new OpenJDK
  • 11:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1097 - T147166 T130067 (duration: 00m 39s)
  • 11:03 elukey: forced net.netfilter.nf_conntrack_tcp_timeout_time_wait = 65 to all the kafka brokers
  • 10:39 ayounsi@tin: Finished deploy [librenms/librenms@259e998]: (no justification provided) (duration: 00m 09s)
  • 10:39 ayounsi@tin: Started deploy [librenms/librenms@259e998]: (no justification provided)
  • 10:35 akosiaris@tin: Finished deploy [librenms/librenms@259e998]: (no justification provided) (duration: 00m 02s)
  • 10:35 akosiaris@tin: Started deploy [librenms/librenms@259e998]: (no justification provided)
  • 10:34 elukey: reboot kafka1022 for kernel upgrades
  • 10:09 elukey: reboot kafka1020 for kernel upgrades
  • 09:57 moritzm: restarting hhvm on mw1190, deadlocked in HPHP::Treadmill::getAgeOldestRequest
  • 09:41 marostegui: Stop replication at the same position on db1097 and db2019 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 09:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 - T147166 T130067 (duration: 00m 41s)
  • 09:21 marostegui: Disable replication codfw > eqiad on s4 - https://phabricator.wikimedia.org/T147166 https://phabricator.wikimedia.org/T130067
  • 09:12 marostegui: Run pt-table-checksum on s7.eswiki - T163190
  • 09:07 hoo: Removed 2fa from global account Jcornelius (T164682)
  • 08:05 godog: roll-restart swift proxy for ratelimit middleware - T162793
  • 07:53 moritzm: uploaded kubernetes 1.4.2-6 for stretch-wikimedia to apt.wikimedia.org
  • 07:34 moritzm: removing unneeded rpcbind/nfs-common packages (T106477)
  • 07:31 marostegui: Stop replication at the same position on db1050 and db2028 - T147166 T130067
  • 07:27 marostegui: Disable replication codfw > eqiad on s6 - T147166 T130067
  • 07:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 - T149526 (duration: 00m 39s)
  • 07:11 elukey: reboot kafka1014 for kernel upgrades
  • 07:01 _joe_: installing the new version of python-service-checker across the fleet
  • 06:37 marostegui: Run pt-table-checksum on s7.cawiki - T163190
  • 06:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 - T162539 T163548 (duration: 00m 41s)
  • 05:54 marostegui: Deploy alter table on wikidatawiki.wb_terms on codfw master db2023 - https://phabricator.wikimedia.org/T162539 - https://phabricator.wikimedia.org/T163548
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 9 02:27:46 UTC 2017 (duration 5m 58s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 08m 17s)

2017-05-08

  • 23:21 bd808@tin: Synchronized wmf-config/wikitech.php: Revert "Disable creation of new forms on wikitech" (T53642) (duration: 01m 10s)
  • 22:54 bd808@tin: Finished deploy [striker/deploy@00e8545]: openstack: Role modifications require global admin rights (T164787) (duration: 00m 27s)
  • 22:54 bd808@tin: Started deploy [striker/deploy@00e8545]: openstack: Role modifications require global admin rights (T164787)
  • 22:17 bd808: Deleted 2fa for user Mdann52 on wikitech after verifying account ownership via ssh file creation. T164804
  • 22:01 andrewbogott: rebooting labservices1002 to mess with the bios
  • 21:55 bblack: depooled cp3035 (memory issues - already scheduled for FE restart to fix, which will repool when it's reached in the list...)
  • 21:25 mobrovac@tin: Finished deploy [graphoid/deploy@a288409]: Switched to npm-stored graph-shared, fix mapsnapshot - T164046 (duration: 01m 39s)
  • 21:24 mobrovac@tin: Started deploy [graphoid/deploy@a288409]: Switched to npm-stored graph-shared, fix mapsnapshot - T164046
  • 21:22 mobrovac@tin: Finished deploy [graphoid/deploy@a288409]: Switched to npm-stored graph-shared, fix mapsnapshot - T164046 (duration: 03m 51s)
  • 21:18 mobrovac@tin: Started deploy [graphoid/deploy@a288409]: Switched to npm-stored graph-shared, fix mapsnapshot - T164046
  • 21:17 mobrovac@tin: Finished deploy [graphoid/deploy@a288409]: Switched to npm-stored graph-shared, fix mapsnapshot - T164046 (duration: 00m 38s)
  • 21:17 mobrovac@tin: Started deploy [graphoid/deploy@a288409]: Switched to npm-stored graph-shared, fix mapsnapshot - T164046
  • 21:12 mobrovac@tin: Finished deploy [restbase/deploy@c70a1e1]: Remove the mobile-text end point - T158128 (duration: 06m 23s)
  • 21:06 mobrovac@tin: Started deploy [restbase/deploy@c70a1e1]: Remove the mobile-text end point - T158128
  • 21:05 arlolra@tin: Finished deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8 (duration: 02m 43s)
  • 21:02 arlolra@tin: Started deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8
  • 20:48 arlolra@tin: Finished deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8 (duration: 01m 36s)
  • 20:47 arlolra@tin: Started deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8
  • 20:37 gehel: silencing elasticsearch shard icinga check, recovery after upgrade is going to take a long time - T161908
  • 20:34 arlolra@tin: Finished deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8 (duration: 04m 50s)
  • 20:30 arlolra@tin: Started deploy [parsoid/deploy@0459ae3]: Updating Parsoid to 9d8badc8
  • 20:27 gehel: restarted kibana on logstash cluster - T161908
  • 20:21 gehel: upgrading kibana on logstash cluster - T161908
  • 20:02 gehel: restarting elasticsearch on logstash cluster after upgrade - T161908
  • 19:47 gehel: logstash / elasticsearch downtime coming up - T161908
  • 19:34 bd808: Deployment of Striker for T162508 complete; will continue debugging the keystone issue that is preventing Tool Labs membership requests from being approved
  • 19:34 bblack: restarted varnishxcache service on cp3031, was malfunctioning and sending crazy stats to grafana...
  • 19:28 gehel: starting ELK (logstash) upgrade - T161908
  • 19:17 bd808@tin: Finished deploy [striker/deploy@3836477]: Implement Tool Labs membership application and processing (T162508) (duration: 00m 32s)
  • 19:17 bd808@tin: Started deploy [striker/deploy@3836477]: Implement Tool Labs membership application and processing (T162508)
  • 19:15 bd808: Forced puppet run on californium to provision new striker config settings
  • 19:07 bd808: Applied database migration for T162508 to striker database on m5-master
  • 18:58 MaxSem: Restarted tilerator and tileratorui across the cluster
  • 18:52 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T164621 (duration: 00m 39s)
  • 18:50 bblack: running varnish frontend restarts to fix memory sizing on 256G+ hosts over the next ~4.5 h (mostly text+upload hosts)
  • 18:49 bblack: cp4006 repooled (frontend restarted)
  • 18:45 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T164498 (duration: 00m 39s)
  • 18:44 bblack: running varnish frontend restarts to fix memory sizing on 96GB and 192GB hosts over the next ~45m (mostly maps+misc hosts)
  • 18:41 catrope@tin: Synchronized php-1.29.0-wmf.21/extensions/Popups/: T163198 (duration: 00m 39s)
  • 18:40 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp4006.ulsfo.wmnet
  • 18:40 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp4006.ulsfo.wmnet
  • 18:38 catrope@tin: Synchronized php-1.29.0-wmf.21/extensions/VisualEditor/extension.json: T164472 (duration: 00m 39s)
  • 18:36 catrope@tin: Synchronized php-1.29.0-wmf.21/extensions/VisualEditor/modules/ve-mw/dm/metaitems/ve.dm.MWFlaggedMetaItem.js: T164054 (duration: 00m 38s)
  • 18:33 catrope@tin: Synchronized php-1.29.0-wmf.21/includes: T100999 (duration: 01m 24s)
  • 18:33 maxsem@tin: Finished deploy [tilerator/deploy@001811e]: 001811e, was in testing for 3 weeks (duration: 00m 20s)
  • 18:32 maxsem@tin: Started deploy [tilerator/deploy@001811e]: 001811e, was in testing for 3 weeks
  • 18:30 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T164614 (duration: 00m 40s)
  • 18:26 catrope@tin: Synchronized static/images/project-logos/: T163048 (duration: 00m 39s)
  • 18:15 gehel: restarting wdqs-updater
  • 17:08 gehel@tin: Finished deploy [wdqs/wdqs@e637cf0]: (no justification provided) (duration: 01m 36s)
  • 17:07 gehel@tin: Started deploy [wdqs/wdqs@e637cf0]: (no justification provided)
  • 16:27 _joe_: installing the new service-checker on restbase2001,scb2001
  • 16:01 papaul: ganeti200[7-8] - signing puppet certs, salt-key, initial run
  • 15:40 papaul: OS install on ganeti200[7-8]
  • 15:28 bblack: cp4016 repooled
  • 14:23 _joe_: uploading new version of service-checker to reprepro
  • 14:20 zeljkof: eu swat finished!
  • 14:19 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: $wmgRelatedArticlesShowInSidebar is now undefined (duration: 00m 39s)
  • 14:19 marostegui: Run pt-table-checksum on s7.arwiki - T163190
  • 14:15 chasemp: touch /forcefsck && /sbin/reboot labservices1002
  • 14:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Bengali logo to mobile site (T164652) (duration: 00m 39s)
  • 14:08 zfilipin@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-bn.svg: SWAT: Add Bengali logo to mobile site (T164652) (duration: 00m 39s)
  • 14:02 zeljkof: extending eu swat for a few minutes
  • 13:55 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: pagePreviews: Fix NavPopups gadget detection (T164044) (duration: 00m 39s)
  • 13:47 chasemp: labservices1002 'touch /forcefsck && sudo reboot'
  • 13:45 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Wikivoyage should show related pages in footer of skin (T164391) (duration: 00m 39s)
  • 13:44 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Wikivoyage should show related pages in footer of skin (T164391) (duration: 00m 39s)
  • 13:42 moritzm: depooled mw1264 (set to inactive), since the host is down (T164725)
  • 13:07 moritzm: restarting cassandra on restbase2001 to pick up openjdk security updates
  • 11:15 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Use redis lockManager for change dispatching (T159826) (duration: 00m 56s)
  • 11:14 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging$ scap sync-file wmf-config/Wikibase-production.php 'Use redis lockManager for change dispatching (T159826)'
  • 09:54 moritzm: upgrading mw1261-mw1264 to Linux 4.9
  • 09:30 godog: swift eqiad-prod: ms-be1028/ms-be1039 object weight 2000 - T160640
  • 09:25 elukey: rolling restart of cassandra on aqs* hosts to pick up new jvm upgrades
  • 09:17 godog: swift codfw-prod: more ms-be2001/ms-be2012 decom - T162785
  • 08:55 elukey: restart Kafka mirror maker on kafka101[24]
  • 08:47 elukey: reboot kafka1013 for kernel upgrades
  • 08:25 godog: swift eqiad-prod: ms-be1028/ms-be1039 container/account full weight - T160640
  • 08:06 Amir1: clean up party of ores_classification is done now (T159753) 10M rows deleted. Current number of rows: 76,586,043
  • 06:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Decommission db1024 - T162699 (duration: 00m 39s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-codfw.php: Decommission db1024 - T162699 (duration: 00m 39s)
  • 06:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2045, depool db2038 - T162539 T163548 (duration: 00m 40s)
  • 05:09 Amir1: start of cleaning up ores_classification rows for two hours (T159753)
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Mon May 8 02:27:37 UTC 2017 (duration 5m 58s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 08m 25s)

2017-05-07

  • 21:09 elukey: depooled cp4016.ulsfo.wmnet (sudo -i depool from localhost) due to issues with vhtcpd (segfaults in dmesg).
  • 17:20 andrewbogott: clearing out broken instances in the nova fullstack queue and restarting the tests.
  • 17:12 andrewbogott: rebooting labservices1002 in hopes of getting its IO unstuck
  • 16:52 andrewbogott: switching primary designate server from labservices1002 to labservices1001
  • 16:07 andrewbogott: restarted designate-central on labservices1002 due to many log messages like 'Deadlock detected. Retrying...'
  • 16:05 andrewbogott: restarted pdns and pdns-recursor on labcontrol1002
  • 09:08 ema: cp4018: restart vhtcpd and varnish services; repool
  • 08:43 elukey: depooled cp4018.ulsfo.wmnet (sudo -i depool from localhost) due to issues with HTCP
  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Sun May 7 02:27:14 UTC 2017 (duration 5m 59s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 07m 49s)

2017-05-06

  • 02:30 l10nupdate@tin: ResourceLoader cache refresh completed at Sat May 6 02:30:10 UTC 2017 (duration 6m 2s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 07m 38s)

2017-05-05

  • 20:21 demon@tin: Synchronized scap/scap.cfg: no-op (duration: 00m 39s)
  • 18:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 after maintenance (duration: 00m 40s)
  • 18:21 mutante: ocg1002 - apt-get clean'ed for disk space
  • 16:09 jynus: shutting down db1070 for hw maintenance T160392
  • 16:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 for hw maintenance (duration: 00m 39s)
  • 15:30 jynus: running schema change on puppet.fact_values (m1)
  • 15:28 marostegui: Deploy alter table on wikidatawiki.wb_terms - db2045 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 15:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2052, depool db2045 - T162539 T163548 (duration: 00m 41s)
  • 15:18 elukey: increase nginx error log verbosity on mw2146 as test for T163674 (correct task)
  • 15:13 elukey: increase nginx error log verbosity on mw2146 as test for T164586
  • 15:04 bblack: nginx upgrading to 1.11.10-1+wmf1 on cache_upload
  • 14:59 bblack: nginx upgrading to 1.11.10-1+wmf1 on cache_text
  • 14:41 bblack: restarting all maps+misc varnish frontends for mem sizing update (spread over the next ~1.5h)
  • 14:30 bblack: restarting varnish frontend on cp4010 (text) for mem size update
  • 13:45 moritzm: installing remaining freetype security updates
  • 13:40 akosiaris@tin: Finished deploy [librenms/librenms@c0aa3ca]: Deploy WMF specific pages to librenms (duration: 00m 03s)
  • 13:39 akosiaris@tin: Started deploy [librenms/librenms@c0aa3ca]: Deploy WMF specific pages to librenms
  • 13:28 urandom: T163292: bootstrapping Cassandra on restbase1008-c
  • 13:25 chasemp: labstore1005/1004 'dpkg -i /home/jmm/*deb' for rpcbind fix (these are new security packages from moritzm)
  • 12:34 akosiaris@tin: Finished deploy [librenms/librenms@9fa1391]: (no justification provided) (duration: 00m 07s)
  • 12:34 akosiaris@tin: Started deploy [librenms/librenms@9fa1391]: (no justification provided)
  • 12:16 elukey: reboot kafka1018 for kernel upgrades
  • 11:30 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 03s)
  • 11:30 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:29 moritzm: installing openjdk-8 security updates/cassandra restarts on restbase staging clusters
  • 11:26 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 02s)
  • 11:26 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:17 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 02s)
  • 11:17 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:09 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 14s)
  • 11:08 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:02 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 23s)
  • 11:02 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:01 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 01s)
  • 11:01 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:01 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 02s)
  • 11:00 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 11:00 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 03s)
  • 11:00 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 10:58 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 02s)
  • 10:58 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 10:57 akosiaris@tin: Finished deploy [librenms/librenms@b25a5e9]: (no justification provided) (duration: 00m 13s)
  • 10:57 akosiaris@tin: Started deploy [librenms/librenms@b25a5e9]: (no justification provided)
  • 09:00 elukey: re-arm keyholder on mira (new scap key added for librenms)
  • 08:48 elukey: re-arming keyholder on naos
  • 08:46 godog: swift codfw-prod: ms-be2001 - ms-be2012 weight 700 - T162785
  • 07:49 marostegui: Deploy alter table on wikidatawiki.wb_terms - dbstore1002 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 07:11 marostegui: Deploy alter table on wikidatawiki.wb_terms - dbstore2001 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 06:45 ema: starting cache_upload upgrades to varnish 4.1.6-1wm1
  • 05:55 marostegui: Deploy alter table on wikidatawiki.wb_terms - db2052 - T162539 T163548
  • 05:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2059, depool db2052 - T162539 T163548 (duration: 00m 40s)
  • 04:21 mutante: scheduled long downtime for mailman I/O stats on fermium - until we find better ways to deal with the normal spikes causing alerts
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Fri May 5 02:38:35 UTC 2017 (duration 5m 14s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.21) (duration: 08m 29s)
  • 01:26 urandom: T163292: starting bootstrap of restbase1018-b

2017-05-04

  • 20:06 maxsem@tin: Synchronized php-1.29.0-wmf.21/extensions/JsonConfig: https://gerrit.wikimedia.org/r/#/c/351749/ (duration: 00m 40s)
  • 19:01 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T164407 wgCognateReadOnly false for medium wikis (duration: 00m 39s)
  • 18:18 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T164407 wgCognateReadOnly false for small wikis (duration: 00m 40s)
  • 17:32 bblack: nginx upgrading to 1.11.10-1+wmf1 on cache_maps
  • 17:30 thcipriani@tin: Synchronized php-1.29.0-wmf.21/extensions/Cognate: Add stats tracking for CognateRepo method usage (duration: 00m 39s)
  • 17:01 thcipriani@tin: Synchronized wmf-config: Revert revert Enable Cognate for Wiktionary in Read Only mode T164407 (duration: 00m 40s)
  • 16:59 thcipriani@tin: Synchronized php-1.29.0-wmf.21/extensions/Cognate/src/CognateStore.php: Construct DBReadOnlyError with null db (duration: 00m 39s)
  • 16:55 urandom: T163292: Starting bootstrap of restbase1018-a
  • 16:49 thcipriani@tin: Synchronized wmf-config: Revert Enable Cognate for Wiktionary in Read Only mode T164407 (duration: 00m 40s)
  • 16:42 thcipriani@tin: Synchronized wmf-config: Enable Cognate for Wiktionary in Read Only mode T164407 (duration: 00m 40s)
  • 16:29 thcipriani@tin: Synchronized php-1.29.0-wmf.21/extensions/Cognate: SWAT: Add read only mode T164407 (duration: 00m 56s)
  • 16:18 bblack: nginx upgraded to 1.11.10-1+wmf1 on all cache_misc
  • 16:14 thcipriani@tin: Synchronized README: test tin is back (duration: 01m 06s)
  • 16:09 filippo@tin: scap aborted: README (duration: 00m 28s)
  • 16:09 filippo@tin: Started scap: README
  • 16:03 urandom: T160759: restoring default Cassandra tombstone_threshold in eqiad
  • 16:00 godog: switch deployment server back to tin.eqiad.wmnet
  • 15:57 jynus@naos: Synchronized wmf-config/db-eqiad.php: Remove all read traffic from x1, es2 & es3-master-eqiad (duration: 01m 08s)
  • 15:45 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=swift-rw,name=codfw
  • 15:45 bblack: nginx upgraded to 1.11.10-1+wmf1 on cp1051 (cache_misc)
  • 15:42 bblack: nginx upgraded to 1.11.10-1+wmf1 on cp1045 (cache_misc)
  • 15:36 godog: run-puppet-agent on cache_upload in codfw/swift for swift a/p in codfw
  • 15:34 chasemp: add cwd to acl*procurement-review for phab S4
  • 15:32 godog: run-puppet-agent on cache_upload in codfw/swift for swift a/a
  • 15:31 oblivian:: Setting swift-rw in eqiad UP
  • 15:31 oblivian:: Setting swift-rw in codfw DOWN
  • 15:16 marostegui: Deploy alter table on wikidatawiki.wb_terms - db2059 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 15:15 chasemp: labsdb1003 maintain-views --databases ptwikimedia,pawikisource,wbwikimedia,dtywiki --replace-all --debug T164103
  • 15:14 marostegui@naos: Synchronized wmf-config/db-codfw.php: Repool db2066, depool db2059 - T162539 T163548 (duration: 01m 06s)
  • 15:03 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Restore db1070 original weight - T160392 (duration: 00m 57s)
  • 14:46 oblivian:: Setting wdqs in codfw UP
  • 14:44 oblivian:: Setting restbase-async in eqiad DOWN
  • 14:43 oblivian:: Setting restbase in codfw DOWN
  • 14:43 _joe_: forcing a puppet run on cache (text,maps, misc) in eqiad/codfw to complete the switchback
  • 14:40 oblivian:: Setting restbase in eqiad UP
  • 14:39 oblivian:: Setting restbase-async in codfw UP
  • 14:36 moritzm: installing mysql-connector-java security updates on hadoop cluster
  • 14:35 _joe_: running puppet on varnishes in eqiad (text,misc,maps) to pick up the a/a traffic to services
  • 14:29 jynus: dropping and recreating user for maintain-views on labsdb1001 T164103
  • 14:24 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Increase db1070 weight - T160392 (duration: 01m 10s)
  • 14:23 chasemp: maintain-meta_p --databases dtywiki,pawikisource,ptwikimedia,wbwikimedia --debug labsdb1003 for T164103
  • 14:16 chasemp: maintain-meta_p --all-databases --purge --debug labsdb1001 for T164103
  • 14:09 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1070 with less weight - T160392 (duration: 01m 16s)
  • 14:03 chasemp: maintain-meta_p --all-databases --purge --debug labsdb1009/1010/1011 for T164103
  • 13:31 gehel: restart services on maps eqiad
  • 13:21 dereckson@naos: Synchronized wmf-config/throttle.php: Lift Account registration limit for cywiki for an event / T164482 (duration: 01m 08s)
  • 13:18 gehel: restart services on maps codfw
  • 13:15 gehel: restart services on maps-test
  • 12:42 marostegui: Stop MySQL db1070 for maintenance - T160392
  • 12:40 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool db1070 for maintenance - T160392 (duration: 01m 35s)
  • 12:28 marostegui: Deploy alter table enwiki.revision on dbstore1001 - T132416
  • 11:56 moritzm: installing mysql-connector-java security updates
  • 11:45 ema: starting cache_text upgrades to varnish 4.1.6-1wm1
  • 11:38 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Remove db1022 from config files as it will be decommissioned - T163778 (duration: 01m 06s)
  • 11:36 marostegui@naos: Synchronized wmf-config/db-codfw.php: Remove db1022 from config files as it will be decommissioned - T163778 (duration: 01m 25s)
  • 10:48 moritzm: installing tomcat security updates
  • 10:22 elukey: executed DEL ocg_job_status on rdb1007:6379 (new ocg_job_status hash is stored on the ocg* hosts) - T159850
  • 10:11 moritzm: restarting hhvm on mediawiki canaries to pick up freetype security update
  • 10:05 ema: restart varnish-be on cp2024 without RT experiment
  • 09:40 elukey: stop kafka on kafka1012 and reboot the host for kernel upgrade
  • 09:16 joal@naos: Finished deploy [analytics/refinery@9d35029]: (no justification provided) (duration: 02m 58s)
  • 09:13 joal@naos: Started deploy [analytics/refinery@9d35029]: (no justification provided)
  • 08:50 marostegui@naos: Synchronized wmf-config/db-codfw.php: Remove db1040 from config files as it will be decommissioned - T164057 (duration: 00m 48s)
  • 08:49 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Remove db1040 from config files as it will be decommissioned - T164057 (duration: 00m 55s)
  • 08:23 gehel: restart elasticsearch on relforge for JDK update
  • 07:59 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Remove tempdb2001 from config files as it will be decommissioned - T161712 (duration: 01m 07s)
  • 07:58 marostegui@naos: Synchronized wmf-config/db-codfw.php: Remove tempdb2001 from config files as it will be decommissioned - T161712 (duration: 01m 25s)
  • 07:25 _joe_: restarted cp3043 backend varnish at 7:13 UTC while trying to debug issues
  • 06:58 moritzm: installing freetype security updates
  • 06:26 marostegui@naos: Synchronized wmf-config/db-codfw.php: Depool tempdb2001, no longer needed - T161712 (duration: 01m 08s)
  • 06:17 marostegui: Stop MySQL on tempdb2001 to take a backup and prepare to decommission - T161712
  • 06:10 marostegui: Deploy alter table on wikidatawiki.wb_terms - db2066 - T162539 T163548
  • 06:10 marostegui@naos: Synchronized wmf-config/db-codfw.php: Depool db2066 - T162539 T163548 (duration: 01m 25s)
  • 06:09 Dereckson: CentralAuth: Removed MediaWiki 2FA for Alexsh (T164265)
  • 06:03 marostegui: Deploy alter table on wikidatawiki.wb_terms - dbstore2002 - T162539 T163548
  • 02:31 l10nupdate@naos: ResourceLoader cache refresh completed at Thu May 4 02:31:22 UTC 2017 (duration 5m 21s)
  • 02:26 l10nupdate@naos: scap sync-l10n completed (1.29.0-wmf.21) (duration: 08m 02s)
  • 02:08 mobrovac@naos: Finished deploy [restbase/deploy@4d04dfd]: blacklist dewiki page, take 3a (duration: 08m 37s)
  • 02:00 urandom: T160759: lowering tombstone threshold to 1000 on all eqiad nodes
  • 01:59 mobrovac@naos: Started deploy [restbase/deploy@4d04dfd]: blacklist dewiki page, take 3a
  • 01:58 mobrovac@naos: Finished deploy [restbase/deploy@4d04dfd]: blacklist dewiki page, take 3 (duration: 03m 29s)
  • 01:54 mobrovac@naos: Started deploy [restbase/deploy@4d04dfd]: blacklist dewiki page, take 3
  • 01:51 mobrovac@naos: Finished deploy [restbase/deploy@4d04dfd]: Blacklist a page on dewiki (duration: 03m 28s)
  • 01:47 mobrovac@naos: Started deploy [restbase/deploy@4d04dfd]: Blacklist a page on dewiki
  • 01:47 mobrovac@naos: Finished deploy [restbase/deploy@4d04dfd]: Blacklist a page on dewiki (duration: 04m 12s)
  • 01:42 mobrovac@naos: Started deploy [restbase/deploy@4d04dfd]: Blacklist a page on dewiki
  • 01:22 urandom_: T160759: lowering tombstone_threshold on restbase1013 & restbase1014
  • 01:09 urandom_: T160759: starting restbase1012-a

2017-05-03

  • 22:59 RainbowSprinkles: gerrit: Quick restart to pick up logging config change
  • 22:47 ejegg: updated fundraising tools from 20afe9d to f2522cd
  • 22:23 ejegg: updated fundraising tools from a1e9342 to 20afe9d
  • 21:06 demon@naos: Synchronized README: No-op, forcing co-master sync (duration: 02m 28s)
  • 20:35 mutante: mw1167 - same as mw1166 (jobrunners) - there was a hhvm[12547]: Fatal error: unknown exception followed by mysql slow query, SELECT MASTER_TID_WAIT... | systemctl restart hhvm recovers it
  • 20:30 mutante: mw1166 - restart hhvm service (Fatal error: request has exceeded memory limit)
  • 20:13 urandom: T160759: restoring default tombstone thresholds, restbase101{3,4,6}
  • 19:57 mutante: mw1287 - also restarting hhvm (with systemctl restart)
  • 19:56 mutante: mw1287 - restarted crashed apache (proxy_fcgi:error)
  • 19:48 demon@naos: Finished scap: Cleaning up some unused branches, no-op (duration: 15m 13s)
  • 19:33 demon@naos: Started scap: Cleaning up some unused branches, no-op
  • 19:32 demon@naos: Pruned MediaWiki: 1.29.0-wmf.18 (duration: 00m 19s)
  • 19:30 demon@naos: Pruned MediaWiki: 1.29.0-wmf.20 [keeping static files] (duration: 00m 44s)
  • 19:27 ppchelko@naos: Finished deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 attempt #2 - checks timeout (duration: 01m 39s)
  • 19:26 ppchelko@naos: Started deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 attempt #2 - checks timeout
  • 19:25 ppchelko@naos: Finished deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759 (duration: 07m 39s)
  • 19:18 ppchelko@naos: Started deploy [restbase/deploy@76d909f]: Blacklist a title to fix cassandra OOMs T160759
  • 18:48 papaul: db2084 - signing puppet certs, salt-key, initial run
  • 18:48 urandom: T160759: reducing tombstone threshold to 1000, restbase1014
  • 18:46 urandom: T160759: reducing tombstone threshold to 1000, restbase1016
  • 18:39 urandom: T160759: reducing tombstone threshold to 1000, restbase1013
  • 18:35 urandom: restarting restbase1016-c
  • 18:34 urandom: restarting restbase1013-b
  • 18:00 bblack: restart cp2005 backend (lag)
  • 17:34 moritzm: uploaded openjdk-8 u131 to apt.wikimedia.org
  • 17:14 jynus@naos: Synchronized wmf-config/InitialiseSettings.php: Disable cognate- it is causing an outage on x1 (duration: 01m 06s)
  • 16:30 jynus@naos: Synchronized wmf-config/db-eqiad.php: Fine-tune per-server load to reduce db connection errors (duration: 01m 27s)
  • 16:17 mutante: install2002 / db2084 - reverting live hack, re-enabling puppet. db2084 doesn't even talk to DHCP; all other new db servers are fine, just this one out of 22 is not. Seems to be an actually broken NIC: cable was switched, switch config was checked too
  • 16:08 mutante: install2002 - temp stop puppet to debug dhcp issue of db2084
  • 15:13 catrope@naos: Synchronized php-1.29.0-wmf.21/includes/logging/LogPager.php: Replace FORCE INDEX(ls_field_val) with IGNORE INDEX(ls_log_id) (https://gerrit.wikimedia.org/r/#/c/351653/ for T17441) (duration: 01m 14s)
  • 15:09 RoanKattouw: Live-hacked (cherry-picked) https://gerrit.wikimedia.org/r/#/c/351653/ onto naos and synced to mwdebug1002 for testing
  • 14:54 gehel: restart of elasticsearch on relforge
  • 14:43 END: (PASS) - Rolling restart of parsoid in codfw and eqiad - t09_restart_parsoid (switchdc/oblivian@neodymium)
  • 14:27 START: - Rolling restart of parsoid in codfw and eqiad - t09_restart_parsoid (switchdc/oblivian@neodymium)
  • 14:26 END: (PASS) - Update Tendril tree to start from the core DB masters in eqiad - t09_tendril (switchdc/oblivian@neodymium)
  • 14:25 START: - Update Tendril tree to start from the core DB masters in eqiad - t09_tendril (switchdc/oblivian@neodymium)
  • 14:25 godog: start swiftrepl on ms-fe1005
  • 14:24 END: (PASS) - Start MediaWiki jobrunners, videoscalers and maintenance in eqiad - t09_start_maintenance (switchdc/oblivian@neodymium)
  • 14:22 START: - Start MediaWiki jobrunners, videoscalers and maintenance in eqiad - t09_start_maintenance (switchdc/oblivian@neodymium)
  • 14:21 END: (PASS) - Restore the TTL of all the MediaWiki read-write discovery records and cleanup confd stale files - t09_restore_ttl (switchdc/oblivian@neodymium)
  • 14:21 START: - Restore the TTL of all the MediaWiki read-write discovery records and cleanup confd stale files - t09_restore_ttl (switchdc/oblivian@neodymium)
  • 14:20 END: (PASS) - Set MediaWiki in read-write mode in eqiad (db-eqiad config already merged and git pulled) - t08_stop_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 14:20 MediaWiki: read-only period ends at: 2017-05-03 14:20:28.286697 (switchdc/oblivian@neodymium)
  • 14:20 root@naos: Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-write mode in datacenter eqiad (duration: 00m 32s)
  • 14:19 START: - Set MediaWiki in read-write mode in eqiad (db-eqiad config already merged and git pulled) - t08_stop_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 14:19 END: (PASS) - Set core DB masters in read-write mode in eqiad, ensure masters in codfw are read-only - t07_coredb_masters_readwrite (switchdc/oblivian@neodymium)
  • 14:19 START: - Set core DB masters in read-write mode in eqiad, ensure masters in codfw are read-only - t07_coredb_masters_readwrite (switchdc/oblivian@neodymium)
  • 14:19 END: (PASS) - Switch the Redis masters from codfw to eqiad and invert the replication - t06_redis (switchdc/oblivian@neodymium)
  • 14:19 START: - Switch the Redis masters from codfw to eqiad and invert the replication - t06_redis (switchdc/oblivian@neodymium)
  • 14:18 END: (PASS) - Switch traffic flow to the appservers from codfw to eqiad - t05_switch_traffic (switchdc/oblivian@neodymium)
  • 14:17 START: - Switch traffic flow to the appservers from codfw to eqiad - t05_switch_traffic (switchdc/oblivian@neodymium)
  • 14:16 END: (FAIL) - Switch MediaWiki master datacenter and read-write discovery records from codfw to eqiad - t05_switch_datacenter (switchdc/oblivian@neodymium)
  • 14:16 root@naos: Synchronized wmf-config/CommonSettings.php: Switch MediaWiki active datacenter to eqiad (duration: 00m 31s)
  • 14:15 START: - Switch MediaWiki master datacenter and read-write discovery records from codfw to eqiad - t05_switch_datacenter (switchdc/oblivian@neodymium)
  • 14:15 END: (PASS) - Wipe and warmup caches in eqiad - t04_cache_wipe (switchdc/oblivian@neodymium)
  • 14:12 elukey: restart kafka-mirror-main-eqiad_to_analytics.service on kafka1012
  • 14:12 END: (PASS) - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium)
  • 14:09 START: - Wipe and warmup caches in eqiad - t04_cache_wipe (switchdc/oblivian@neodymium)
  • 14:08 START: - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium)
  • 14:08 END: (PASS) - Set core DB masters in read-only mode in codfw, ensure all masters are read-only - t03_coredb_masters_readonly (switchdc/oblivian@neodymium)
  • 14:08 START: - Set core DB masters in read-only mode in codfw, ensure all masters are read-only - t03_coredb_masters_readonly (switchdc/oblivian@neodymium)
  • 14:08 END: (PASS) - Set MediaWiki in read-only mode in codfw (db-codfw config already merged and git pulled) - t02_start_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 14:07 root@naos: Synchronized wmf-config/db-codfw.php: Set MediaWiki in read-only mode in datacenter codfw (duration: 00m 45s)
  • 14:07 MediaWiki: read-only period starts at: 2017-05-03 14:07:08.261300 (switchdc/oblivian@neodymium)
  • 14:07 START: - Set MediaWiki in read-only mode in codfw (db-codfw config already merged and git pulled) - t02_start_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 14:06 END: (PASS) - Stop MediaWiki jobrunners, videoscalers and cronjobs in codfw - t01_stop_maintenance (switchdc/oblivian@neodymium)
  • 14:01 START: - Stop MediaWiki jobrunners, videoscalers and cronjobs in codfw - t01_stop_maintenance (switchdc/oblivian@neodymium)
  • 14:00 godog: stop swiftrepl on ms-fe1005
  • 13:59 END: (PASS) - Reduce the TTL of all the MediaWiki read-write discovery records - t00_reduce_ttl (switchdc/oblivian@neodymium)
  • 13:59 START: - Reduce the TTL of all the MediaWiki read-write discovery records - t00_reduce_ttl (switchdc/oblivian@neodymium)
  • 13:59 END: (PASS) - Disabling puppet on selected hosts in codfw and eqiad - t00_disable_puppet (switchdc/oblivian@neodymium)
  • 13:58 START: - Disabling puppet on selected hosts in codfw and eqiad - t00_disable_puppet (switchdc/oblivian@neodymium)
  • 13:16 hashar: Restarting Jenkins
  • 13:06 marostegui: db1028: Increased /srv/ by 20G to clear the warning
  • 11:59 moritzm: rebooted kubernetes1002, not 1003
  • 11:59 moritzm: rebooting kubernetes1003 for update to Linux 4.9
  • 11:39 moritzm: rebooting kubernetes1001 for update to Linux 4.9
  • 11:37 oblivian@naos: Synchronized wmf-config: Changing the read-only reason for the DC switchover (T164177) (duration: 01m 20s)
  • 11:25 moritzm: uploaded nodepool 0.1.1+wmf7 to apt.wikimedia.org
  • 11:23 hashar: Upgrading Jenkins 2.46.1 -> 2.46.2 - T144106
  • 11:16 jynus: restarting replication on s*, and x1 eqiad -> codfw
  • 11:02 hashar: Restarting Nodepool
  • 10:58 moritzm: upgrading nodepool on labnodepool1001 to a package including https://gerrit.wikimedia.org/r/351608
  • 10:18 END: (PASS) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/oblivian@neodymium)
  • 10:17 START: - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/oblivian@neodymium)
  • 10:14 END: (PASS) - Set MediaWiki in read-write mode in codfw (db-codfw config already merged and git pulled) - t08_stop_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 10:14 START: - Set MediaWiki in read-write mode in codfw (db-codfw config already merged and git pulled) - t08_stop_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 10:14 END: (PASS) - Set MediaWiki in read-only mode in eqiad (db-eqiad config already merged and git pulled) - t02_start_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 10:13 START: - Set MediaWiki in read-only mode in eqiad (db-eqiad config already merged and git pulled) - t02_start_mediawiki_readonly (switchdc/oblivian@neodymium)
  • 10:13 _joe_: testing reverted steps of switchdc, non-dry-run --dc-from eqiad --dc-to codfw (should be noop)
  • 10:05 moritzm: installing icu security updates on trusty (jessie already fixed)
  • 09:50 marostegui: Restart db1097 to change its binlog to STATEMENT - T155099
  • 09:19 elukey: reboot mc[1019-1036].eqiad.wmnet for kernel upgrades
  • 09:18 moritzm: rebooting restbase1018 for update to Linux 4.9
  • 09:05 godog: rebuild mismounted FSes on ms-be1035 - T163673
  • 08:53 _joe_: rebooting restbase1018 T163280
  • 08:24 _joe_: deactivating restbase1018-vg for RAID failover and rebuild T163280
  • 08:01 hashar: Rolling back Jenkins 2.46.2 -> 2.46.1 - T144106
  • 07:53 hashar: Upgrading Jenkins 2.46.1 -> 2.46.2 - T144106
  • 07:42 _joe_: rebuilding RAIDs on restbase1018 T163280
  • 07:35 hashar: Restarting Nodepool to catch up with python-jenkins 0.4.14
  • 07:35 moritzm: updated python-jenkins on labnodepool1001 to 0.4.14 (needed by latest Jenkins LTS)
  • 02:48 l10nupdate@naos: ResourceLoader cache refresh completed at Wed May 3 02:48:33 UTC 2017 (duration 5m 21s)
  • 02:43 l10nupdate@naos: scap sync-l10n completed (1.29.0-wmf.21) (duration: 14m 02s)
  • 01:41 mutante: kubernetes - puppet fails because "E: Unable to locate package cni"
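
The read-only window for the 2017-05-03 switchback can be worked out from the two switchdc timestamps logged above (start 14:07:08.261300, end 14:20:28.286697). The following is a minimal sketch of that arithmetic; the helper name `read_only_duration` is illustrative only and is not part of switchdc.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the "read-only period starts at / ends at" entries.
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def read_only_duration(start: str, end: str) -> timedelta:
    """Return the elapsed time between two read-only period timestamps."""
    return datetime.strptime(end, STAMP_FORMAT) - datetime.strptime(start, STAMP_FORMAT)


# Values copied from the 14:07 and 14:20 entries of 2017-05-03.
window = read_only_duration("2017-05-03 14:07:08.261300", "2017-05-03 14:20:28.286697")
print(window)                  # 0:13:20.025397
print(window.total_seconds())  # seconds as a float, roughly 800
```

Under these inputs the read-only period lasted about 13 minutes and 20 seconds.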

2017-05-02

  • 23:42 TimStarling: EtcdConfig changes all reverted
  • 23:17 tstarling@puppetmaster1001: conftool action : set/@read-only.yaml; selector: name=ReadOnly,scope=eqiad
  • 23:07 TimStarling: scap pull on mw2017 and mwdebug1001 for etcd testing
  • 23:00 TimStarling: locking scap on naos for deployment of EtcdConfig https://gerrit.wikimedia.org/r/#/c/351132/
  • 22:57 _joe_: upgrading python-conftool across the fleet
  • 22:38 mutante: gerrit (cobalt/gerrit2001) - deployed firewall change to allow ssh between gerrit servers for clustering, new iptables rules exist now (T152525)
  • 21:52 jynus: running previously failed alter tables on s3-eqiad T163912
  • 21:33 jynus: creating missing math table on bdwikimedia (s3)
  • 20:04 hashar: Restarting Jenkins for plugin rollback
  • 17:51 bblack: codfw->eqiad switchback: end-user edge traffic back to normal @ eqiad ( https://gerrit.wikimedia.org/r/#/c/351330/ ) - 10 minute TTL for bulk traffic pattern shift starts now.
  • 17:50 mobrovac@naos: Finished deploy [restbase/deploy@6adb0f2]: Include displaytitle and page_id in the summary output and bump the content type version - T163729 T164079 (duration: 06m 04s)
  • 17:48 papaul: new db servers: signing puppet certs, salt-key, initial run
  • 17:44 mobrovac@naos: Started deploy [restbase/deploy@6adb0f2]: Include displaytitle and page_id in the summary output and bump the content type version - T163729 T164079
  • 17:40 END: (PASS) - Start MediaWiki jobrunners, videoscalers and maintenance in codfw - t09_start_maintenance (switchdc/volans@neodymium)
  • 17:39 mobrovac@naos: Finished deploy [restbase/deploy@6adb0f2]: (no justification provided) (duration: 01m 34s)
  • 17:38 START: - Start MediaWiki jobrunners, videoscalers and maintenance in codfw - t09_start_maintenance (switchdc/volans@neodymium)
  • 17:37 mobrovac@naos: Started deploy [restbase/deploy@6adb0f2]: (no justification provided)
  • 17:37 END: (PASS) - Restore the TTL of all the MediaWiki read-write discovery records and cleanup confd stale files - t09_restore_ttl (switchdc/volans@neodymium)
  • 17:37 START: - Restore the TTL of all the MediaWiki read-write discovery records and cleanup confd stale files - t09_restore_ttl (switchdc/volans@neodymium)
  • 17:35 END: (PASS) - Set MediaWiki in read-write mode in codfw - t08_stop_mediawiki_readonly (switchdc/volans@neodymium)
  • 17:35 MediaWiki: read-only period ends at: 2017-05-02 17:35:48.111079 (switchdc/volans@neodymium)
  • 17:35 START: - Set MediaWiki in read-write mode in codfw - t08_stop_mediawiki_readonly (switchdc/volans@neodymium)
  • 17:35 oblivian@puppetmaster1001: conftool action : set/val=test; selector: name=ReadOnly,scope=codfw
  • 17:33 END: (PASS) - Set core DB masters in read-write mode in codfw, ensure masters in eqiad are read-only - t07_coredb_masters_readwrite (switchdc/volans@neodymium)
  • 17:33 START: - Set core DB masters in read-write mode in codfw, ensure masters in eqiad are read-only - t07_coredb_masters_readwrite (switchdc/volans@neodymium)
  • 17:32 END: (PASS) - Switch the Redis masters from eqiad to codfw and invert the replication - t06_redis (switchdc/volans@neodymium)
  • 17:32 START: - Switch the Redis masters from eqiad to codfw and invert the replication - t06_redis (switchdc/volans@neodymium)
  • 17:31 END: (PASS) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)
  • 17:31 START: - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)
  • 17:23 END: (FAIL) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)
  • 17:23 START: - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)
  • 17:20 END: (PASS) - Switch traffic flow to the appservers from eqiad to codfw - t05_switch_traffic (switchdc/volans@neodymium)
  • 17:17 START: - Switch traffic flow to the appservers from eqiad to codfw - t05_switch_traffic (switchdc/volans@neodymium)
  • 17:08 END: (FAIL) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)
  • 17:08 START: - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)
  • 17:05 catrope@naos: Synchronized php-1.29.0-wmf.21/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTarget.js: T164157 (duration: 01m 00s)
  • 17:03 END: (FAIL) - Set core DB masters in read-only mode in eqiad, ensure all masters are read-only - t03_coredb_masters_readonly (switchdc/volans@neodymium)
  • 17:03 START: - Set core DB masters in read-only mode in eqiad, ensure all masters are read-only - t03_coredb_masters_readonly (switchdc/volans@neodymium)
  • 16:58 END: (FAIL) - Set MediaWiki in read-only mode in eqiad - t02_start_mediawiki_readonly (switchdc/volans@neodymium)
  • 16:57 MediaWiki: read-only period starts at: 2017-05-02 16:57:37.952132 (switchdc/volans@neodymium)
  • 16:57 START: - Set MediaWiki in read-only mode in eqiad - t02_start_mediawiki_readonly (switchdc/volans@neodymium)
  • 16:56 ppchelko@naos: Finished deploy [restbase/deploy@6adb0f2]: Summary endpoint enhancements. Restart after a check timeout (duration: 07m 56s)
  • 16:53 END: (FAIL) - Stop MediaWiki jobrunners, videoscalers and cronjobs in eqiad - t01_stop_maintenance (switchdc/volans@neodymium)
  • 16:53 START: - Stop MediaWiki jobrunners, videoscalers and cronjobs in eqiad - t01_stop_maintenance (switchdc/volans@neodymium)
  • 16:52 END: (PASS) - Disabling puppet on selected hosts in eqiad and codfw - t00_disable_puppet (switchdc/volans@neodymium)
  • 16:51 START: - Disabling puppet on selected hosts in eqiad and codfw - t00_disable_puppet (switchdc/volans@neodymium)
  • 16:51 END: (PASS) - Reduce the TTL of all the MediaWiki read-write discovery records - t00_reduce_ttl (switchdc/volans@neodymium)
  • 16:50 START: - Reduce the TTL of all the MediaWiki read-write discovery records - t00_reduce_ttl (switchdc/volans@neodymium)
  • 16:50 END: (FAIL) - Reduce the TTL of all the MediaWiki read-write discovery records - t00_reduce_ttl (switchdc/volans@neodymium)
  • 16:50 START: - Reduce the TTL of all the MediaWiki read-write discovery records - t00_reduce_ttl (switchdc/volans@neodymium)
  • 16:48 ppchelko@naos: Started deploy [restbase/deploy@6adb0f2]: Summary endpoint enhancements. Restart after a check timeout
  • 16:47 volans: testing (not dry-run) tasks for tomorrow's switchover in reverse mode eqiad->codfw
  • 16:43 ppchelko@naos: Started deploy [restbase/deploy@6adb0f2]: Summary endpoint enhancements. Restart after a check fail
  • 16:42 ppchelko@naos: Finished deploy [restbase/deploy@6adb0f2]: Summary endpoint enhancements (duration: 05m 47s)
  • 16:37 ppchelko@naos: Started deploy [restbase/deploy@6adb0f2]: Summary endpoint enhancements
  • 16:36 END: (PASS) - Wipe and warmup caches in codfw - t04_cache_wipe (switchdc/oblivian@neodymium)
  • 16:32 END: (PASS) - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium)
  • 16:32 _joe_: message about cache warmup is wrong, it is being executed in eqiad
  • 16:29 START: - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium)
  • 16:29 START: - Wipe and warmup caches in codfw - t04_cache_wipe (switchdc/oblivian@neodymium)
  • 16:29 _joe_: testing (not dry-run) cache wipe/warmup and redis resync for the switchover codfw->eqiad
  • 16:25 papaul: OS install on new db servers
  • 16:16 elukey@naos: Synchronized wmf-config/ProductionServices.php: Replace Redis lock IPs after hw refresh (duration: 01m 16s)
  • 15:53 oblivian@puppetmaster1001: conftool action : set/@read-only.yaml; selector: name=ReadOnly,scope=eqiad
  • 15:36 ema: cache_misc: upgrade varnish to 4.1.6-1wm1
  • 15:24 _joe_: restarting confd in eqiad/esams to pick up the server change
  • 15:20 godog: add 100G to graphite1003 and graphite2002
  • 15:01 elukey: stop and mask memcached on mc10[01-18].eqiad.wmnet
  • 14:35 moritzm: rebooting rdb1007 for update to latest 4.4 kernel
  • 14:22 moritzm: rebooting rdb1005 for update to latest 4.4 kernel
  • 13:52 moritzm: rebooting rdb1003 for update to latest 4.4 kernel
  • 13:39 moritzm: rebooting rdb1001 for update to latest 4.4 kernel
  • 13:26 gehel: stopping load on elastic2020 - T149006
  • 13:15 ema: cache_maps: upgrade varnish to 4.1.6-1wm1
  • 13:13 gehel: load testing elastic2020 before putting it back in the cluster - T149006
  • 13:03 godog: rebuild mismounted FSes on ms-be1036 - T163673
  • 12:22 moritzm: rebooting rdb1008 for kernel update to Linux 4.9
  • 12:19 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=pdf,name=ocg1001.eqiad.wmnet
  • 12:15 _joe_: manually set ocg1001,3 to be redis slaves of ocg1002
  • 11:47 moritzm: rebooting rdb1006 for kernel update to Linux 4.9
  • 11:37 gehel: restart of relforge cluster to activate hebrew plugin
  • 11:30 moritzm: rebooting rdb1004 for kernel update to Linux 4.9
  • 11:23 hashar: Restarting Nodepool
  • 11:23 moritzm: downgraded python-jenkins on labnodepool1001 to 0.2.1 (0.4.11 is still broken with the new Jenkins LTS)
  • 11:06 moritzm: rebooting rdb1002 for kernel update to Linux 4.9
  • 10:51 hashar: Restarting Nodepool with python-jenkins 0.4.11
  • 10:50 moritzm: upgrading python-jenkins on labnodepool1001 to 0.4.11
  • 10:44 akosiaris: create new ganeti nodegroup called row_A holding ganeti2005, ganeti2006. Renamed the default nodegroup to row_B. T164011
  • 10:20 elukey: restart ocg on ocg1002 (localhost:8000 - frontend - not reachable)
  • 10:12 hashar: Upgrading Jenkins to 2.46.1 - T144106
  • 10:11 jynus: stopping replication on db1015
  • 09:58 END: (PASS) - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium)
  • 09:56 START: - Resync the redis for jobqueues in eqiad with the masters in codfw - t04_resync_redis (switchdc/oblivian@neodymium)
  • 09:55 _joe_: testing pre-switchover the step to restart & resync redises in dc_to (eqiad)
  • 09:48 jynus@naos: Synchronized wmf-config/db-codfw.php: Add db1097 (duration: 01m 00s)
  • 09:47 jynus@naos: Synchronized wmf-config/db-eqiad.php: Depool db1015 & add db1097 (duration: 01m 17s)
  • 09:36 hashar: Jenkins/CI is back up!
  • 09:34 hashar: Nodepool cannot add instances to Jenkins any more. Rolling back Jenkins to 2.32.3
  • 09:29 akosiaris: Set description for ganeti2005, ganeti2006 on asw-a-codfw. T164011
  • 09:27 akosiaris: create interface range ganeti on asw-a-codfw. T164011
  • 09:24 akosiaris: remove configuration from ge-8/0/0, ge-8/0/3 from asw-b-codfw for ganeti2005, ganeti2006 move to row A. T164011
  • 09:21 hashar: Starting Nodepool
  • 09:16 hashar: Stopping Nodepool
  • 09:14 hashar: OpenStack / wmflabs fails to create new instances
  • 08:40 hashar: Upgrading Jenkins to 2.46.2 - T144106
  • 08:40 elukey: run puppet and restart nutcracker on eqiad hosts with profile::mediawiki::nutcracker
  • 08:33 hashar: Upgrading Jenkins to 2.32.3 - T144106
  • 08:32 elukey: stop and mask redis on mc1001-mc1018 - T137345
  • 08:26 hashar: Upgrading Jenkins to 2.19.4 - T144106
  • 08:14 hashar: Installing Jenkins Pipeline plugin
  • 08:04 hashar: Installing Jenkins plugin Pipeline: Stage View https://plugins.jenkins.io/pipeline-stage-view
  • 08:04 hashar: Upgrading Jenkins to 2.7.4 - T144106
  • 07:59 elukey: Swap mc1001->mc1012 with mc1019->mc2030 - T137345 (more informative :)
  • 07:58 elukey: Swap mc1001->mc1012 with mc1019->mc2030
  • 07:36 _joe_: starting etcd replication codfw => eqiad
  • 06:46 _joe_: disabling etcd auth on conf1*, converting to use nginx for TLS/auth T159687
  • 03:10 mattflaschen@naos: Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs/: Urgent deploy: Fix FlaggedRevs fatal, and also a filter issue: T164096 and T164049 (duration: 00m 56s)
  • 02:45 tstarling@naos: Synchronized php-1.29.0-wmf.21/includes/config/EtcdConfig.php: EtcdConfig backported bug fixes (duration: 01m 02s)
  • 02:34 tstarling@naos: Synchronized wmf-config/CommonSettings.php: siteinfo hook (duration: 02m 39s)
  • 00:33 tstarling@puppetmaster1001: conftool action : set/@read-write.yaml; selector: name=ReadOnly
  • 00:33 tstarling@puppetmaster1001: conftool action : set/@dc-codfw.yaml; selector: name=WMFMasterDatacenter
  • 00:25 TimStarling: populating production etcd with initial mediawiki config keys
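
The switchdc entries above follow a regular `END: (PASS|FAIL) - description - task (switchdc/user@host)` layout, so the outcome of each step in a test run can be tallied mechanically. The sketch below assumes that layout holds for every entry; the regular expression and the `summarize` helper are hypothetical and are not taken from switchdc itself.

```python
import re
from collections import Counter

# Assumed layout of a switchdc END entry, e.g.
# "17:31 END: (PASS) - <description> - t05_switch_datacenter (switchdc/volans@neodymium)"
END_ENTRY = re.compile(
    r"(?P<time>\d{2}:\d{2}) END: \((?P<result>PASS|FAIL)\) - .* - "
    r"(?P<task>t\d{2}_\w+) \(switchdc/(?P<user>[^@]+)@(?P<host>[^)]+)\)"
)


def summarize(lines):
    """Count results per (task, PASS/FAIL) pair across switchdc END entries."""
    counts = Counter()
    for line in lines:
        match = END_ENTRY.search(line)
        if match:  # START entries and unrelated log lines are skipped
            counts[(match["task"], match["result"])] += 1
    return counts


# Entries copied from the 2017-05-02 test run.
sample = [
    "17:31 END: (PASS) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)",
    "17:31 START: - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)",
    "17:23 END: (FAIL) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)",
    "17:08 END: (FAIL) - Switch MediaWiki master datacenter and read-write discovery records from eqiad to codfw - t05_switch_datacenter (switchdc/volans@neodymium)",
    "16:52 END: (PASS) - Disabling puppet on selected hosts in eqiad and codfw - t00_disable_puppet (switchdc/volans@neodymium)",
]

results = summarize(sample)
for (task, result), count in sorted(results.items()):
    print(task, result, count)
```

A tally like this makes repeated attempts at the same step, such as the retries of t05_switch_datacenter, easy to spot.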

2017-05-01

  • 23:41 mutante: netmon1002 - signed puppet cert, initial puppet run, accept salt-key,.. (T159756)
  • 23:15 mutante: netmon1002 - boot into PXE, initial OS install (T159756)
  • 23:06 bd808: Ran puppet cert clean striker-deploy03.striker.eqiad.wmflabs on labcontrol1001
  • 19:43 ejegg: updated payments-wiki from 4c56302 to 57451de
  • 19:10 mobrovac@naos: Finished deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version (duration: 02m 08s)
  • 19:08 mobrovac@naos: Started deploy [mobileapps/deploy@b5afcb8]: Forced deploy to bring the targets to the current version
  • 18:46 mutante: temp. re-enabling puppet on restbase1018 and running it once to fix icinga config syntax error. then disabling it again. restbase service stopped before and after. this box has a broken disk.
  • 18:35 mutante: brought mc1018 back up, ran puppet on it and then on Icinga. parent was adjusted from asw-d-eqiad to asw2-2-eqiad. reduced icinga config errors by 50% :p (1 of 2 left, restbase1018)
  • 18:28 mutante: powercycling mc1018
  • 18:19 mutante: manually removed asw-d-eqiad remnants from /etc/icinga/puppet_hosts.cfg to fix icinga config after gerrit:351167 / T148506. fixes Icinga config error. then puppet adds it back
  • 18:03 andrewbogott: restarting nova-fullstack tests but saving instance 2d60e8c5-fb2a-4681-ac0a-ae2162bb13fb for future research
  • 17:03 mutante: phab2001 - start/stop phd service - that fixed "systemd state" icinga check, even though phd does not run just like before
  • 16:53 bblack: reverting inter-caching routing from codfw-switchover period: https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchback
  • 16:52 bblack@neodymium: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=cache_upload,name=cp107[1234].eqiad.wmnet
  • 16:19 mobrovac@naos: Finished deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514 (duration: 02m 19s)
  • 16:17 mobrovac@naos: Started deploy [citoid/deploy@747777f]: Remove mwDeprecated - T93514
  • 15:46 jynus: shutting down db1063 for maintenance T164107
  • 15:13 bblack: restarting varnish backend on cp2002 (mailbox issues)
  • 12:58 Amir1: cleaning ores_classification rows, will take half an hour or so (T159753)
  • 11:31 jynus: running alter table on categorylinks on db1054, 68, 62 T164185
  • 11:25 jynus: running alter table on enwiki.categorylinks on db1052 T164185
  • 03:46 tstarling@naos: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/347537/ (duration: 01m 01s)
  • 03:44 tstarling@naos: Synchronized wmf-config/etcd.php: https://gerrit.wikimedia.org/r/#/c/347537/ (duration: 02m 39s)

2017-04-30

  • 16:35 urandom: T160759: Restoring default tombstone_threshold on restbase1009
  • 16:29 ppchelko@naos: Finished deploy [restbase/deploy@4f96ae3]: Blacklist a zhwiki page that's causing issues (duration: 07m 27s)
  • 16:21 ppchelko@naos: Started deploy [restbase/deploy@4f96ae3]: Blacklist a zhwiki page that's causing issues
  • 15:31 elukey: set tombstone_failure_threshold=1000 to restbase1009-a with P5165 on restbase1009-a - T160759
  • 15:24 elukey: set tombstone_failure_threshold=10000 to restbase1009-a with P5165 on restbase1009-a - T160759
  • 07:45 elukey: deleted /srv/cassandra-a/commitlog/CommitLog-5-1490738321543.log from restbase1009-a (empty commit log file created before OOM - backup in /home/elukey)

2017-04-29

  • 10:50 elukey: set sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 to kafka[1018,1020,1022].eqiad.wmnet (was 120 - maybe related to T136094 ?)
  • 10:39 elukey: start ferm on kafka1020/18 (nodes were previously down for maintenance, not sure why ferm wasn't started)
  • 09:59 reedy@naos: Synchronized wmf-config/CommonSettings.php: Revert pdf processor firejails T164045 (duration: 02m 41s)

2017-04-28

  • 21:24 Dereckson: End of live debug on mwdebug1001, restored previous state with a local scap pull
  • 21:00 ejegg: updated payments-wiki from 1620b82 to 4c56302
  • 20:23 Dereckson: Live debug on mwdebug1001 for T164059
  • 19:30 jynus: shutting down db1063 - I see high temperatures reported, and going up T164107
  • 19:09 urandom: T163936: reenabling puppet on restbase-dev1001
  • 18:14 urandom: T163936: disabling puppet on restbase-dev1001 (t-shooting c-m-c)
  • 17:09 jynus: restarting replication on all nodes on s7-eqiad T164092
  • 16:38 jynus: stopping replication on all nodes on s7-eqiad in case db1062 boots up in a corrupted state
  • 16:36 jynus: restarting db1062 once more T164092
  • 15:56 godog: poweroff prometheus1004 for ram upgrade - T163385
  • 15:40 jynus: deploying new events_coredb_slave.sql on codfw T160984
  • 15:21 godog: poweroff prometheus1003 for ram upgrade - T163385
  • 14:55 gehel: shutting down elastic2020 for mainboard replacement - T149006
  • 14:32 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Change db1063 IP and rack - T163895 (duration: 00m 48s)
  • 14:31 marostegui@naos: Synchronized wmf-config/db-codfw.php: Change db1063 IP and rack - T163895 (duration: 00m 50s)
  • 14:04 marostegui: Stop and shutdown db1063 - T163895
  • 14:04 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Change db1062 rack location - T163895 (duration: 00m 52s)
  • 13:59 moritzm: installing ghostscript security updates
  • 13:56 urandom: T163936: restarting cassandra-metrics-collector, restbase production
  • 13:55 urandom: $ readlink /usr/local/lib/cassandra-metrics-collector/cassandra-metrics-collector.jar
  • 13:50 ema: varnish 4.1.6-1wm1 uploaded to apt.w.o
  • 13:46 urandom: T163936: restarting cassandra-metrics-collector on restbase1007
  • 13:46 marostegui@naos: Synchronized wmf-config/db-codfw.php: Change db1061 IP - T163895 (duration: 01m 00s)
  • 13:44 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Change db1061 IP - T163895 (duration: 01m 19s)
  • 13:44 urandom: T163936: forcing puppet run on restbase1007
  • 13:30 marostegui: Stop MySQL and shutdown db1061 - T163895
  • 13:26 marostegui: Stop MySQL and shutdown db1062 - T163895
  • 10:47 akosiaris: migrate/evacuate ganeti2005, ganeti2006 for T164011
  • 10:42 akosiaris: reboot oresrdb1002 for kernel upgrade
  • 09:56 moritzm: installing libxslt security updates on trusty
  • 09:29 marostegui: upgrade mariadb db1059,db1056 from 10.0.22 to 10.0.28
  • 09:17 marostegui: upgrade mariadb db1071 from 10.0.23 to 10.0.28
  • 09:15 akosiaris: reboot oresrdb1001 for kernel upgrade
  • 09:02 marostegui: Upgrade mariadb on db1081 and db1084 from 10.0.23 to 10.0.28
  • 08:03 Amir1: cleanup done, 4M rows deleted (T159753)
  • 07:58 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1045 - T162539 T163548 (duration: 02m 38s)
  • 06:48 Amir1: cleaning around 5-10M rows in ores_classification in enwiki (half-an-hour script, T159753)
  • 01:18 ejegg: rolled payments-wiki back to 1620b82
  • 01:15 ejegg: updated payments-wiki from 1620b82 to 4c56302

2017-04-27

  • 23:36 catrope@naos: Synchronized php-1.29.0-wmf.21/extensions/SecurePoll/includes/pages/CreatePage.php: Stop gap for fix global election creation (T164043) (duration: 00m 43s)
  • 23:34 catrope@naos: Synchronized wmf-config/InitialiseSettings.php: Enable WikidataPageBanner on viwikivoyage (T163662) (duration: 00m 46s)
  • 23:29 ejegg: rolled back payments-wiki to 1620b82
  • 23:29 catrope@naos: Synchronized wmf-config/InitialiseSettings.php: Enable responsive references on elwiki (T163074) (duration: 00m 49s)
  • 23:27 ejegg: updated payments-wiki from 1620b82 to 4c56302
  • 23:22 catrope@naos: Synchronized wmf-config/InitialiseSettings.php: Set ORES thresholds in new format for all enabled wikis (T162760) (duration: 00m 53s)
  • 23:16 catrope@naos: Synchronized php-1.29.0-wmf.21/includes/deferred/LinksUpdate.php: Release prior row locks beforehand in LinksUpdate::updateCategoryCounts (T163801) (duration: 01m 01s)
  • 23:13 catrope@naos: Synchronized wmf-config/CirrusSearch-common.php: Enable sistersearch title profile for wikivoyage (duration: 01m 19s)
  • 21:57 cwd: updated process-control to 1.0.6
  • 21:56 volans: shutting down gadolinium, it came up 1h25m ago and stole the public IP from meitnerium
  • 21:08 ppchelko@naos: Finished deploy [restbase/deploy@61c1ceb]: Automatically rerender parsoid, only store summaries if they are changed, don't rerender data-parsoid (duration: 07m 16s)
  • 21:01 ppchelko@naos: Started deploy [restbase/deploy@61c1ceb]: Automatically rerender parsoid, only store summaries if they are changed, don't rerender data-parsoid
  • 20:53 ppchelko@naos: Finished deploy [restbase/deploy@fcfc537]: Automatically rerender parsoid, only store summaries if they are changed (duration: 11m 33s)
  • 20:53 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.21
  • 20:47 twentyafterfour@naos: Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs: deploy fix for T163994 (duration: 01m 17s)
  • 20:42 ppchelko@naos: Started deploy [restbase/deploy@fcfc537]: Automatically rerender parsoid, only store summaries if they are changed
  • 20:37 mutante: ocg1001 - has been reinstalled but ocg package deployment fails currently "has the minion key been accepted", should not be repooled just yet
  • 20:32 mutante: ores/cache::misc: switch ores back to codfw-only - everything is like it was before the failed deploy yesterday again
  • 20:21 andrewbogott: stripping a bunch of unneeded extensions from wikitech-static
  • 20:20 mutante: ocg1001 - re-added to puppet, initial run, reinstall ongoing (T161158)
  • 20:18 mutante: ores is active/active now, for a short time
  • 20:16 mutante: ocg1001 - revoke old puppet cert, salt key
  • 20:15 mutante: run puppet on cache::misc to push ores change - cumin -b 5 -s 10 'R:class = role::cache::misc' 'run-puppet-agent -q'
  • 20:03 twentyafterfour: 1.29.0-wmf.21 is blocked by T163994
  • 20:01 mutante: ocg1001 - reboot into PXE, re-install
  • 19:59 twentyafterfour@naos: Synchronized php-1.29.0-wmf.21/extensions/FlaggedRevs/frontend/FlaggedRevsUI.hooks.php: deploy fix for T163994 (duration: 01m 04s)
  • 19:33 twentyafterfour: start mediawiki deployment train group 2 - all wikis to 1.29.0-wmf.21
  • 19:24 reedy@naos: Synchronized wmf-config/CommonSettings.php: Run pdf processors in firejails T164000 (duration: 01m 20s)
  • 19:20 XenoRyet: Updated paymentswiki from ee7d402 to 1620b82
  • 18:47 addshore: Morning SWAT Done!
  • 18:46 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT WMDE Spring campaign - Remove logging (no longer needed) (duration: 00m 47s)
  • 18:44 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT wmgUseGettingStarted true for dewiki (duration: 00m 48s)
  • 18:41 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT Enable Cognate Logging (duration: 00m 48s)
  • 18:40 XenoRyet: Roll back paymentswiki from 030b2f9 to ee7d402
  • 18:34 addshore@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch: SWAT #1 #2 (duration: 00m 59s)
  • 18:31 addshore@naos: Synchronized wmf-config/CirrusSearch-common.php: SWAT update name of sistersearch profile for wikivoyage (duration: 00m 49s)
  • 18:24 addshore@naos: Synchronized php-1.29.0-wmf.21/extensions/WikimediaEvents/WikimediaEventsHooks.php: SWAT WMDE Spring campaign - Remove hook PT2/2 (duration: 00m 52s)
  • 18:23 urandom: T163936: restarting cassandra-metrics-collector, restbase production
  • 18:22 addshore@naos: Synchronized php-1.29.0-wmf.21/extensions/WikimediaEvents/extension.json: SWAT WMDE Spring campaign - Remove hook PT1/2 (duration: 00m 57s)
  • 18:21 urandom: T163936: restarting cassandra-metrics-collector, restbase staging
  • 18:20 addshore@naos: Synchronized php-1.29.0-wmf.21/includes/api/ApiQueryPagePropNames.php: SWAT Do not add limit to ApiQueryPagePropNames when database type is mysql (duration: 01m 04s)
  • 18:17 twentyafterfour: restarting apache on iridium to hotfix T164005
  • 18:07 addshore@naos: Synchronized wmf-config/Wikibase-production.php: SWAT Fix echoIcon for wikibase in testwikis (duration: 01m 27s)
  • 17:44 XenoRyet: Updated paymentswiki from ee7d402 to 030b2f9
  • 17:36 ladsgroup@naos: Finished deploy [ores/deploy@68cca85]: (no justification provided) (duration: 21m 50s)
  • 17:30 _joe_: started pybal on lvs1006 after network was fixed
  • 17:25 XenoRyet: reverted paymentswiki from 030b2f9 to ee7d402
  • 17:20 XenoRyet: Updated paymentswiki from ee7d402 to 030b2f9
  • 17:15 ladsgroup@naos: Started deploy [ores/deploy@68cca85]: (no justification provided)
  • 17:15 Amir1: ladsgroup@naos:/srv/deployment/ores/deploy$ scap deploy (T163950)
  • 17:12 demon@naos: Pruned MediaWiki: 1.29.0-wmf.18 [keeping static files] (duration: 00m 20s)
  • 17:08 _joe_: stop pybal on lvs1006 to stop announcing via BGP
  • 17:08 demon@naos: Pruned MediaWiki: 1.29.0-wmf.16 (duration: 00m 13s)
  • 17:04 demon@naos: Synchronized scap/plugins/clean.py: One last fix (duration: 01m 04s)
  • 16:53 gehel: unbanning all elasticsearch servers in eqiad row D - T148506
  • 16:48 demon@naos: Synchronized scap/plugins/clean.py: --keep-static is nice now. Also need a co-master sync (duration: 01m 28s)
  • 16:45 andrewbogott: re-enabling labs instance creation/deletion
  • 16:42 demon@naos: Pruned MediaWiki: 1.29.0-wmf.19 [keeping static files] (duration: 00m 15s)
  • 16:32 gehel: unbanning elasticsearch servers in eqiad row D - elastic10(17|18|19|20) - T148506
  • 15:56 elukey: restart of jmxtrans on all the hadoop worker nodes
  • 15:51 andrewbogott: disabling labs instance create/delete to avoid hilarity during network maintenance
  • 15:50 elukey: forced 'service ferm start' on the failed analytics hosts
  • 15:46 marostegui: Upgrade db1091 mariadb from 10.0.23 to 10.0.28
  • 15:39 marostegui: Upgrade db1089 mariadb from 10.0.23 to 10.0.28
  • 15:34 marostegui: Upgrade db1090 mariadb from 10.0.23 to 10.0.28
  • 15:22 jynus: stopping all replication channels on dbstore1001 for topology changes
  • 14:34 ema: upgrade upload-codfw to varnish 4.1.5-1wm4 T145661
  • 14:29 marostegui: Stop MySQL and shutdown es2019 for HW replacement - T149526
  • 14:26 ema: varnish 4.1.5-1wm4 uploaded to apt.w.o T145661
  • 14:08 marostegui: Deploy alter table labswiki.revision on labtestweb2001 - T132416
  • 14:04 marostegui: Deploy alter table labswiki.revision on silver - T132416
  • 13:57 _joe_: restarting HHVM on mw2213, stuck in HPHP::Treadmill::getAgeOldestRequest
  • 13:52 ladsgroup@naos: Synchronized wmf-config/Wikibase-production.php: SWAT: Set echoIcon for notification of wikibase in test wikis (T142102) (duration: 00m 57s)
  • 13:52 Amir1: start of scap sync-file wmf-config/Wikibase-production.php 'SWAT: Set echoIcon for notification of wikibase in test wikis (T142102)'
  • 13:45 ladsgroup@naos: Synchronized portals: (no justification provided) (duration: 01m 05s)
  • 13:44 ladsgroup@naos: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 01m 21s)
  • 13:43 Amir1: ladsgroup@naos:/srv/mediawiki-staging$ portals/sync-portals (T128546)
  • 12:53 volans: disabled puppet on rdb*
  • 12:06 marostegui: Upgrade es1011 and es1014 from mariadb 10.0.22 to mariadb 10.0.28
  • 11:50 marostegui: Upgrade mariadb from 10.0.22 to 10.0.28 on es1015
  • 09:46 moritzm: upgrading mysql on bohrium/piwik
  • 09:25 _joe_: restarting all redis instances for jobqueues on eqiad to force a full resync with masters in codfw T163337
  • 08:55 jynus: deploying alter table to all wikis on s6 T163979
  • 08:54 _joe_: restarting redis rdb1001:6380 after cleaning up the current AOF files for investigation of T163337
  • 08:50 moritzm: installing django security updates
  • 08:29 godog: ms-be1039 issue "controller slot=3 pd 1I:1:5 modify disablepd" to force failed sdc - T163690
  • 08:25 ema: restart varnish-be on cp2024 with expiry thread RT experiment enabled
  • 08:19 ema: upgrade varnish to 4.1.5-1wm3 on cp2024
  • 07:56 elukey: aqs100[69] back serving AQS traffic
  • 07:55 ema: varnish 4.1.5-1wm3 uploaded to apt.w.o T145661
  • 07:16 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool hosts that needed to be moved for the network maintenance - T162681 (duration: 02m 32s)
  • 06:53 marostegui: Reboot es1014 for kernel upgrade - T162029
  • 06:50 elukey: executed kafka preferred-replica-election to rebalance topic leaders in the analytics cluster after maintenance
  • 06:45 marostegui: Reboot es1011 for kernel upgrade - T162029
  • 06:39 marostegui: Logging for the record: drop table hashs from s2, s3 and s7 (only places where it existed) - T54927
  • 06:23 _joe_: moving orphaned objects in ms-be1039's root partition in sdc1/stale_root to save space
  • 06:17 marostegui: Deploy schema change on s7 metawiki.pagelinks to remove partitioning on db1041 - T153300
  • 06:14 marostegui: Deploy alter table on s5 (wikidatawiki) on db1049 - T163548
  • 06:14 marostegui: Deploy alter table on s5 (wikidatawiki) on db1070 (running locally instead of neodymium as this host will be affected by the network maintenance) - T163548
  • 06:11 marostegui: Deploy alter table on s5 (wikidatawiki) on db1070 (running locally instead of neodymium as this host will be affected by the network maintenance) - T130067 T162539
  • 06:09 marostegui: Deploy alter table on s5 (wikidatawiki) on db1049 - T130067 T162539
  • 05:59 marostegui: Deploy alter table labsdb1003 (wikidatawiki) https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 05:24 Amir1: cleaning some rows in ores_classification in enwiki (T159753)
  • 03:44 ottomata: starting kafka broker on kafka1020
  • 03:40 ottomata: running kafka replica election to bring kafka1018 back as preferred leader
  • 02:21 Jamesofur: running populateEditCount.php in screen on wast for T163854, counting edits for board vote eligibility
  • 02:16 RoanKattouw: Reset 2FA for T163931 on labswiki
  • 00:14 twentyafterfour: starting phabricator update
  • 00:05 ebernhardson@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch/includes/Searcher.php: cirrus: align sister search boost template config variable with documentation (duration: 00m 50s)

2017-04-26

  • 23:51 niharika29@naos: Synchronized php-1.29.0-wmf.21/includes/interwiki/ClassicInterwikiLookup.php: Interwiki: Dont override interwiki map order (T145337) (duration: 01m 00s)
  • 23:38 niharika29@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch/: Align other index template boosting config names (duration: 00m 57s)
  • 23:34 niharika29@naos: Synchronized wmf-config/InitialiseSettings.php: Increase max field count for wikidata; Enable Flow beta feature on arwiki (T155720) (duration: 00m 58s)
  • 23:31 niharika29@naos: Synchronized wmf-config/InitialiseSettings.php: Increase max field count for wikidata; Enable Flow beta feature on arwiki (T155720) (duration: 01m 04s)
  • 23:29 niharika29@naos: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] Increase max field count for wikidata (duration: 01m 23s)
  • 21:42 mutante: running puppet on all cache::misc nodes via cumin to switch ORES to eqiad
  • 21:30 mutante: restarting uwsgi-ores service on all scb2* with systemctl restart
  • 21:15 twentyafterfour: finished with mediawiki deployment train for group1. Everything appears stable, no increase in logspam.
  • 21:12 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.21
  • 21:09 halfak@naos: Started restart [ores/deploy@cc12103]: (no justification provided)
  • 21:08 twentyafterfour@naos: Synchronized php-1.29.0-wmf.21/extensions/Flow/Hooks.php: sync https://gerrit.wikimedia.org/r/#/c/350481/ refs T163896 T161733 (duration: 01m 20s)
  • 21:05 arlolra: Updated Parsoid to 4949857a (T116508, T64270, T133673)
  • 20:55 arlolra@naos: Finished deploy [parsoid/deploy@8d109eb]: Updating Parsoid to 4949857a (duration: 06m 52s)
  • 20:48 arlolra@naos: Started deploy [parsoid/deploy@8d109eb]: Updating Parsoid to 4949857a
  • 20:48 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/350481/1 to get the train back on track refs T161733
  • 20:35 bsitzmann@naos: Finished deploy [mobileapps/deploy@b5afcb8]: Update mobileapps to 14bd4a5 (duration: 15m 17s)
  • 20:34 halfak@naos: Finished deploy [ores/deploy@cc12103]: T162892 (duration: 21m 28s)
  • 20:31 elukey: restart zookeeper on conf1003 after network maintenance
  • 20:20 bsitzmann@naos: Started deploy [mobileapps/deploy@b5afcb8]: Update mobileapps to 14bd4a5
  • 20:12 halfak@naos: Started deploy [ores/deploy@cc12103]: T162892
  • 19:50 elukey: restart kafka nodes (kafka1018 and kafka1020) after network maintenance
  • 19:45 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.20
  • 19:42 twentyafterfour: rolling back group1 to wmf.20 due to T163896 refs T161733
  • 19:31 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.21
  • 19:24 twentyafterfour: begin deployment train: group1 wikis to 1.29.0-wmf.21 refs T161733
  • 19:22 bblack: initiating cumin-based restart of all varnish backends for cache_upload in codfw to downgrade from experimental package. 30 minute spacing, 10 hosts, ~5h to completion...
  • 19:17 thcipriani@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable collectionsaveascommunitypage right on es.wikipedia T163767 (duration: 00m 49s)
  • 19:05 bblack: restarting varnish frontend and backend on cp3033 to downgrade
  • 19:03 bblack: restarting varnish-frontend on cp2014 to downgrade
  • 18:58 thcipriani@naos: Synchronized wmf-config/CommonSettings.php: SWAT: Workaround issue of overriding whitelist config variable T163114 (duration: 00m 53s)
  • 18:56 bblack: downgrading varnish back to 4.1.5-wm1 on all -wm2 hosts
  • 18:50 thcipriani@naos: Synchronized php-1.29.0-wmf.21/extensions/CirrusSearch: SWAT: Provide a way to blacklist a set of wikis for crosswiki search T163546 (duration: 01m 02s)
  • 18:44 thcipriani@naos: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Adjust sistersearch against wikivoyage to require title matching T163547 (duration: 01m 11s)
  • 18:38 thcipriani@naos: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Configure multimedia search template boosting T163223 (duration: 00m 53s)
  • 18:30 thcipriani@naos: Synchronized php-1.29.0-wmf.20/extensions/SecurePoll: SWAT: Add voter scripts for board/fdc election 2017 T163854 (duration: 00m 57s)
  • 18:26 thcipriani@naos: Synchronized php-1.29.0-wmf.21/extensions/SecurePoll: SWAT: Add voter scripts for board/fdc election 2017 T163854 (duration: 01m 00s)
  • 18:23 thcipriani@naos: Synchronized dblists/commonsuploads.dblist: SWAT: Enable local uploads on knwiki T133137 (duration: 01m 06s)
  • 18:16 ema: start varnish-frontend on cp2014
  • 18:14 jynus: running alter table on all wikis of s3 T163912
  • 17:49 jynus: rebooting es1019 for upgrading and to fix race condition on services
  • 17:46 elukey: restart nutcracker on the eqiad mw hosts to pick up the new shard config (spamming elasticsearch memcached and triggering alarms)
  • 17:44 elukey: unmasking and starting daemons on restbase-dev1003
  • 17:41 reedy@naos: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 01m 23s)
  • 17:02 mobrovac@naos: Started restart [trending-edits/deploy@7112062]: Restart for ICU lib update
  • 17:01 mobrovac@naos: Started restart [mobileapps/deploy@5c2b9a9]: Restart for ICU lib update
  • 17:00 mobrovac@naos: Started restart [mathoid/deploy@7eb4092]: Restart for ICU lib update
  • 16:43 mobrovac@naos: Started restart [electron-render/deploy@9156760]: Restart for ICU lib update
  • 16:39 mobrovac@naos: Started restart [graphoid/deploy@128206b]: Restart for ICU lib update
  • 16:37 mobrovac@naos: Started restart [eventstreams/deploy@05bcc8f]: Restart for ICU lib update
  • 16:37 mobrovac@naos: Started restart [electron-render/deploy@9156760]: Restart for ICU lib update
  • 16:36 mobrovac@naos: Started restart [cxserver/deploy@6899032]: Restart for ICU lib update
  • 16:34 mobrovac@naos: Started restart [citoid/deploy@b8c4cb2]: Restart for ICU lib update
  • 16:14 elukey: stop and mask cassandra and restbase on restbase-dev1003 for row-d maintenance
  • 16:07 _joe_: disabled and masked strongswan, memcached, redis on mc1013-17 for decommissioning
  • 15:43 XioNoX: VRRP priority removed, interfaces cr2/asw2 renamed - T148506
  • 15:40 _joe_: shutting down conf1003 T148506
  • 15:33 XioNoX: "cr2-eqiad# delete interfaces ae4 disable" done, confirmed links and LACP are up - T148506
  • 15:33 XioNoX: "cr2-eqiad# delete interfaces ae4 disable" done, confirmed links and LACP are up
  • 15:24 marostegui: Shutdown es2019 for maintenance with papaul and Dell - T149526
  • 15:12 XioNoX: switch ports for rack D7 and D8 configured - T148506
  • 14:47 marostegui: Stop MySQL db1070 (just in case) to test drac cold restart
  • 14:47 bblack@neodymium: conftool action : set/pooled=no; selector: dc=eqiad,cluster=cache_upload,name=cp107[1234].eqiad.wmnet
  • 14:26 elukey: depooling aqs100[69] from AQS for network maintenance
  • 14:20 elukey: stop zookeeper on conf1003 for row-d maintenance (Hadoop, Kafka related)
  • 14:04 XioNoX: "cr2-eqiad# set interfaces ae4 disable" done, (1 ping loss) - T148506
  • 14:00 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1026, depool db1045 - T162539 T163548 (duration: 00m 53s)
  • 13:59 XioNoX: lowered VRRP priority for T148506
  • 13:58 andrewbogott: put labservices1001 into downtime to minimize (but probably not totally eliminate) alert spam
  • 13:56 andrewbogott: disabled instance creation on Horizon via https://gerrit.wikimedia.org/r/#/c/350414/ and on wikitech via a strategic edit in extensions/OpenStackManager/special/SpecialNovaInstance.php
  • 13:56 godog: downtime and poweroff ms-be 21 26 27 37 38 39 before switch relocation - T148506
  • 13:54 gehel: downtime "ElasticSearch health check for shards" checks for logstash and elasticsearch eqiad - T148506
  • 13:53 elukey: stop kafka on kafka1020 and kafka1018 for row-d extended maintenance (D2)
  • 13:44 _joe_: shutting down mc1013-18 for row D maintenance
  • 13:40 aude@naos: Synchronized wmf-config/CommonSettings-labs.php: (no justification provided) (duration: 00m 57s)
  • 13:32 aude@naos: Synchronized wmf-config/Wikibase-production.php: disable tabular-data for now on wikidata and enable echo notification on test wikis (duration: 01m 06s)
  • 13:29 marostegui: Deploy alter table on db1069 (wikidatawiki) https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 13:27 marostegui: Deploy alter table labsdb1001 https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 13:23 marostegui: Deploy alter table db1045 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 13:22 elukey: restart HDFS on analytics100[12] (Hadoop master nodes) to pick up recent topology changes for the cluster
  • 13:10 aude@naos: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 23s)
  • 13:02 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 13:00 ema: cp2017: restart varnish-be
  • 12:56 marostegui: Shutdown db1092 for maintenance - https://phabricator.wikimedia.org/T162681
  • 12:55 gehel: restart elasticsearch on relforge1001 to validate new config - T161830
  • 12:46 moritzm: installing mysql security updates (5.5 as packaged in Debian jessie)
  • 12:43 ema@neodymium: conftool action : set/pooled=no; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 11:32 jynus: applying new events_coredb_slave.sql on db2055 T160984
  • 11:31 moritzm: rebooting mwlog2001 for update to Linux 4.9
  • 10:47 ladsgroup@naos: Synchronized wmf-config/Wikibase-labs.php: T142104, part II (duration: 00m 56s)
  • 10:45 ladsgroup@naos: Synchronized static/images/wikibase/echoIcon.svg: T142104, part I (duration: 01m 04s)
  • 10:44 marostegui: Deploy alter table on s5, on db1063 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 10:39 jynus@naos: Synchronized wmf-config/db-eqiad.php: switch s5 eqiad master from db1049 to db1063 (duration: 01m 24s)
  • 09:48 jynus: migrating s5 eqiad replicas under db1063
  • 09:42 jynus: restarting mariadb at db1063
  • 09:24 marostegui: Shutdown db1094, db1093, db1091 for maintenance - T162681
  • 09:16 marostegui: Shutdown es1019 for maintenance - T162681
  • 08:32 elukey: Gracefully stopping hadoop daemons on Hadoop nodes affected by Row-D maintenance
  • 08:30 marostegui: Deploy alter table on change_tag and tag_summary on silver and labtestweb2001 - T147166
  • 08:27 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool hosts that need to be moved for the network maintenance - T162681 (duration: 02m 25s)
  • 08:22 moritzm: reimaging terbium to jessie
  • 07:59 jynus: shutting down mariadb on db1040 as a backup before decommissioning
  • 07:48 marostegui: Deploy alter table on s1, on db1052 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 07:30 marostegui: Deploy alter table on s7, on db1062 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 07:24 marostegui: Deploy alter table on s4, on db1068 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 07:09 marostegui: Deploy alter table on s6, on db1061 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 06:56 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T162539 T163548 (duration: 02m 24s)
  • 06:45 marostegui: Deploy alter table on s2, on db1054 (eqiad master) for tables: change_tag and tag_summary - https://phabricator.wikimedia.org/T147166
  • 06:10 marostegui: Deploy alter table on s3, on db1075 (eqiad master) for tables: change_tag and tag_summary - T147166
  • 05:57 marostegui: Deploy alter table enwiki.revision on labsdb1011 - T132416
  • 00:20 catrope@naos: Synchronized php-1.29.0-wmf.21/extensions/Flow/modules/flow/ui/widgets/mw.flow.ui.ReplyWidget.js: T163749 (duration: 01m 24s)

2017-04-25

  • 22:24 mutante: mediawiki maintenance servers: last log entry was _before_ merging https://gerrit.wikimedia.org/r/#/c/342777/ and making a change
  • 22:23 andrewbogott: re-enabling dns on labservices1001
  • 22:22 mutante: mediawiki maintenance servers: making wasat identical to terbium. wasat is currently the active server running crons. no change there at all. on terbium where crons are inactive, some log files were removed
  • 22:13 twentyafterfour@naos: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.21
  • 22:08 madhuvishy: Reenabled labs instance creation and deletion on horizon
  • 22:05 twentyafterfour@naos: Finished scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #5) (duration: 21m 52s)
  • 22:02 andrewbogott: causing an intentional outage of labs-ns0 and labs-recursor0 to make sure we're properly girded for tomorrow's switch replacement.
  • 21:43 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #5)
  • 21:41 twentyafterfour@naos: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_66989801"/* "/srv/mediawiki-staging/php-1.29.0-wmf.21/cache/l10n"' returned non-zero exit status 1 (duration: 03m 38s)
  • 21:38 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #4)
  • 21:33 twentyafterfour@naos: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_930292683"/* "/srv/mediawiki-staging/php-1.29.0-wmf.21/cache/l10n"' returned non-zero exit status 1 (duration: 03m 46s)
  • 21:30 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #3)
  • 21:23 twentyafterfour@naos: scap failed: CalledProcessError Command 'cp -r "/tmp/scap_l10n_2414756836"/* "/srv/mediawiki-staging/php-1.29.0-wmf.21/cache/l10n"' returned non-zero exit status 1 (duration: 00m 54s)
  • 21:23 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733 (attempt #2)
  • 21:09 twentyafterfour@naos: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_3498979833" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 01m 56s)
  • 21:07 twentyafterfour@naos: Started scap: sync 1.29.0-wmf.21 to testwikis (pre-group0) refs T161733
  • 20:00 madhuvishy: Labs instance creation and deletion on horizon temporarily disabled via https://gerrit.wikimedia.org/r/350266
  • 19:50 demon@naos: Synchronized wmf-config/CommonSettings-labs.php: no-op, beta change (duration: 01m 58s)
  • 18:55 chasemp: restart nova-fullstack on labnet1001
  • 18:50 chasemp: downtime labservices1001 as we fail away from it and puppet staleness on labservices1002
  • 18:38 andrewbogott: disabling nova-api for another try at labservices failover
  • 18:33 twentyafterfour: Deployment Train: Branching mediawiki wmf/1.29.0-wmf.21 from master refs T161733
  • 17:36 jynus: running test schema change on etwiki on eqiad (depooled) T17441
  • 17:35 RainbowSprinkles: gerrit: Quick reboot to pick up new bouncycastle library
  • 17:25 arlolra: Updated Parsoid to 55b90511 (T153885, T163330, T89262, T154709, T162919, T161306)
  • 17:20 moritzm: rebooting ruthenium for update to Linux 4.9
  • 17:19 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 07s)
  • 17:19 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:18 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 05s)
  • 17:18 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:18 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 08s)
  • 17:18 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:18 arlolra@naos: Finished deploy [parsoid/deploy@719d7bd]: Updating Parsoid to 55b90511 (duration: 08m 02s)
  • 17:17 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 07s)
  • 17:17 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 17:11 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 02m 18s)
  • 17:09 arlolra@naos: Started deploy [parsoid/deploy@719d7bd]: Updating Parsoid to 55b90511
  • 17:08 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:54 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 25s)
  • 16:53 godog: flush wikiwix cache from planet2001 and rebuild files
  • 16:53 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:53 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 07s)
  • 16:53 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:50 andrewbogott: labservices failover aborted due to cryptic routing/firewall issue
  • 16:45 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2255.codfw.wmnet,service=apache2
  • 16:44 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config (duration: 00m 20s)
  • 16:44 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config
  • 16:42 godog: flush wikiwix cache from planet1001 and rebuild files
  • 16:41 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet,service=apache2
  • 16:41 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2256.codfw.wmnet,service=apache2
  • 16:40 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=apache2
  • 16:38 andrewbogott: stopping nova-api for labservices switchover
  • 16:36 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config (duration: 00m 53s)
  • 16:35 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config
  • 16:29 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config (duration: 00m 04s)
  • 16:29 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: enable wildcard topic config
  • 16:18 otto@naos: Finished deploy [eventlogging/eventbus@e7da0cc]: (no justification provided) (duration: 00m 06s)
  • 16:17 otto@naos: Started deploy [eventlogging/eventbus@e7da0cc]: (no justification provided)
  • 16:09 thcipriani@naos: Synchronized README: test new scap version (duration: 01m 03s)
  • 15:59 akosiaris: restart pybal on lvs[2001-2002].codfw.wmnet,lvs[3001-3002].esams.wmnet,lvs[4001-4002].ulsfo.wmnet,lvs[1001-1002].wikimedia.org T159687
  • 15:50 moritzm: installing libav security updates
  • 15:48 bawolff@naos: Synchronized wmf-config/CommonSettings-labs.php: Test account creation limits on labs (duration: 01m 14s)
  • 15:47 akosiaris: restart pybal on lvs2003.codfw.wmnet,lvs3003.esams.wmnet,lvs4003.ulsfo.wmnet,lvs1003.wikimedia.org T159687
  • 15:46 marostegui: Stop replication on db1086 and db1094 in sync - https://phabricator.wikimedia.org/T130067
  • 15:36 mobrovac@naos: Finished deploy [changeprop/deploy@7521b2f]: Bring back the concurrency level - T163292 (duration: 01m 13s)
  • 15:35 mobrovac@naos: Started deploy [changeprop/deploy@7521b2f]: Bring back the concurrency level - T163292
  • 15:33 jynus: stopping replication on dbstore1001 to change its replication topology
  • 15:33 akosiaris: restart pybal on lvs[2004-2006].codfw.wmnet,lvs3004.esams.wmnet,lvs4004.ulsfo.wmnet,lvs[1004-1006].wikimedia.org T159687
  • 15:28 filippo@neodymium: conftool action : set/pooled=yes; selector: name=mw2017.codfw.wmnet
  • 15:27 mobrovac@naos: Finished deploy [changeprop/deploy@e0e3684]: Bring back the concurrency level - T163292 (duration: 00m 10s)
  • 15:26 mobrovac@naos: Started deploy [changeprop/deploy@e0e3684]: Bring back the concurrency level - T163292
  • 15:18 ema: start cache_text upgrade to linux 4.9 T162029
  • 15:14 marostegui: Deploy alter table s7 on watchlist table directly on the master (db1062) - https://phabricator.wikimedia.org/T130067
  • 15:14 filippo@neodymium: conftool action : set/pooled=no; selector: name=mw2017.codfw.wmnet
  • 14:59 jynus@naos: Synchronized wmf-config/db-eqiad.php: switch s7 eqiad master from db1041 to db1062 (duration: 00m 54s)
  • 14:54 bblack: upgrading nginx on cp1008
  • 14:30 bawolff@naos: Synchronized private/PrivateSettings.php: rv change to T163477 to see if it fixes logging (duration: 01m 14s)
  • 14:27 bawolff: Logging has seemed to stop after last deploy to private settings :(
  • 14:20 bblack: uploaded WMF nginx-1.11.10-1+wmf1 packages to jessie-wikimedia repo
  • 14:17 marostegui: Stop replication in sync on db1089 and db1083 for maintenance - https://phabricator.wikimedia.org/T130067
  • 14:08 jynus: restarting mariadb on db1062
  • 14:07 jynus: moving s7 eqiad replicas under db1062
  • 14:02 godog: poweroff ms-be1016 for controller swap - T150206
  • 14:02 bawolff@naos: Synchronized wmf-config/PrivateSettings.php: Hopefully cause previous changes to be picked up try2 (duration: 00m 44s)
  • 13:58 bawolff@naos: Synchronized wmf-config/PrivateSettings.php: Hopefully cause previous changes to be picked up (duration: 00m 44s)
  • 13:51 hashar: European SWAT complete
  • 13:49 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Re-enable ContentTranslation - T163344 (duration: 00m 44s)
  • 13:37 hashar@naos: Synchronized php-1.29.0-wmf.20/includes/media/TransformationalImageHandler.php: media: Capture stderr when running convert --version - T158649 (duration: 00m 47s)
  • 13:35 moritzm: rebooting einsteinium for update to Linux 4.9
  • 13:31 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Fix namespace Wikipedia_talk for zh_classicalwiki - T162547 (duration: 00m 48s)
  • 13:24 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Two namespace aliases for zh_classicalwiki - T162547 (duration: 00m 49s)
  • 13:22 marostegui: Deploy alter table on s3 (only etwiki) for tag_summary and change_tag tables - T147166
  • 13:20 hashar@naos: Synchronized php-1.29.0-wmf.20/includes: Fix bogus field reference in Category::getCountMessage() callback - T162941 (duration: 01m 14s)
  • 13:16 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Add NS aliases for zh_classicalwiki - T162547 (duration: 01m 00s)
  • 13:15 marostegui: Deploy alter table on silver.watchlist and labtestweb2001.labtestwiki for the watchlist table - T130067
  • 13:12 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Add Draft namespace to zh_classicalwiki - T163655 (duration: 01m 19s)
  • 13:10 hashar: zh_classicalwiki : renamed broken page via namespaceDupes.php : id=73504 ns=0 dbk=模板:Protected_logo -> 模板:Protected_logobroken
  • 12:35 marostegui: Stop replication in sync on db1092 and db1087 for maintenance - https://phabricator.wikimedia.org/T130067
  • 11:57 gehel: banning elasticsearch row D node in preparation for maintenance
  • 11:46 marostegui: Deploy alter table s5 on watchlist table directly on the master (db1049) - https://phabricator.wikimedia.org/T130067
  • 11:28 jynus@naos: Synchronized wmf-config/db-eqiad.php: Depool db1022, promote db1061 as the s6 eqiad master (duration: 01m 17s)
  • 11:27 marostegui: Deploy alter table s1 on watchlist table directly on the master (db1052) - https://phabricator.wikimedia.org/T130067
  • 11:01 jynus: switching eqiad s6 master to db1061
  • 10:45 jynus: stopping replication on db1050
  • 10:39 marostegui: Stop replication in sync on db1090 and db1076 for maintenance - https://phabricator.wikimedia.org/T130067
  • 10:15 jynus: restarting db1061's mysql process
  • 10:12 jynus: moving all slaves of s6 eqiad under db1061
  • 09:49 marostegui: Stop replication in sync on db1091 and db1084 for maintenance - T130067
  • 09:46 marostegui: Deploy alter table s2 on watchlist table directly on the master (db1054) - T130067
  • 09:10 jynus@naos: Synchronized wmf-config/db-eqiad.php: Promote db1054 as the new s2 master on eqiad (duration: 01m 19s)
  • 08:56 marostegui: Stop replication on db1088 and db1093 in sync - T130067
  • 08:53 jynus: stopping replication on s2-eqiad and restarting db1054
  • 08:52 marostegui: Deploy alter table s4 commonswiki.watchlist directly on db1068 (eqiad master) - T130067
  • 08:24 marostegui: Stop MySQL db1041 (eqiad master) to reclone db1062 from it - T163665
  • 08:03 jynus: moving all slaves of s2 eqiad under db1054
  • 07:14 ema: upgrade cp3033 varnish-be to varnish 4.1.5-1wm2, expiry thread lock/priority workaround T145661
  • 06:34 marostegui: Deploy alter table on s3, all the wikis to the watchlist table on db1075, eqiad master - T130067
  • 06:10 marostegui@naos: Synchronized wmf-config/db-codfw.php: Restore db2061 original weight (duration: 00m 57s)
  • 06:06 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1071, depool db1026 - T162539 T163548 (duration: 01m 17s)
  • 05:41 marostegui: Deploy alter table enwiki.revision on labsdb1009 and labsdb1010 - T132416
  • 02:22 bawolff: deployed patch for T163477
  • 01:42 MaxSem: Deployed security patches for T163166
  • 00:53 bawolff: unconfirming emails associated with T163477
  • 00:38 mutante: ocg1001 - powercycle into installer, was sitting at partman step with "failure to read from sda"...
  • 00:25 twentyafterfour: restarted apache2 on iridium to tune rate limiting value
  • 00:16 twentyafterfour@naos: Synchronized wmf-config/CommonSettings.php: fix "Notice: Undefined variable: wmgRelatedArticlesFooterWhitelistedSkins" (duration: 01m 11s)

2017-04-24

  • 23:41 twentyafterfour@naos: Synchronized wmf-config/: deploy https://gerrit.wikimedia.org/r/#/c/348472/ refs T163114 (duration: 01m 05s)
  • 23:22 ejegg: updated civicrm from 40d88c0 to 061cd61
  • 23:08 ejegg: updated civicrm from a11c108 to 40d88c0
  • 22:46 bawolff: deploy patch for T155277
  • 21:53 hoo: Updated the sites and site_identifiers tables on all Wikidata clients for dtywiki T161529.
  • 21:41 ejegg: updated civicrm from 51dbbad to a11c108
  • 19:52 mattflaschen@naos: Finished scap: Full scap (due to ORES i18n change earlier), plus additional $wgHiddenPrefs change (duration: 17m 06s)
  • 19:35 mattflaschen@naos: Started scap: Full scap (due to ORES i18n change earlier), plus additional $wgHiddenPrefs change
  • 19:10 bblack: cp2026: restart to wm2 varnish package
  • 18:42 thcipriani@naos: Synchronized wmf-config/throttle.php: SWAT: New throttle rule T163726 (duration: 01m 03s)
  • 18:19 thcipriani@naos: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove defunct $wgForeignUploadTestEnabled for cross-wiki upload A/B test (duration: 00m 53s)
  • 18:18 jynus: disabling mysql replication eqiad -> codfw on s[1-7] and x1 shards T155099
  • 18:10 thcipriani@naos: Synchronized wmf-config/CommonSettings-labs.php: SWAT: Full path to xvfb-run (beta only change) (duration: 01m 07s)
  • 17:53 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase db2061 weight (duration: 00m 47s)
  • 17:46 marostegui: Alter table labtestwiki.user_groups on labtestweb2001 - T155605
  • 17:43 bblack: installing varnish 4.1.5-1wm2 on all cache_upload hosts @ codfw (no restarts)
  • 17:41 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase db2043 and db2061 weight (duration: 00m 49s)
  • 17:36 demon@naos: Synchronized dblists/group0.dblist: moving labtestwiki to group0 (duration: 00m 54s)
  • 17:35 bblack: upgrade cp2024 varnish-be to varnish 4.1.5-1wm2, expiry thread lock/priority workaround T145661
  • 17:28 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase db2043 and db2061 weight - T163339 (duration: 00m 58s)
  • 17:19 gehel: restarting wdqs-updater for new configuration
  • 17:10 gehel@naos: Finished deploy [wdqs/wdqs@481346a]: (no justification provided) (duration: 01m 47s)
  • 17:08 gehel@naos: Started deploy [wdqs/wdqs@481346a]: (no justification provided)
  • 16:58 marostegui@naos: Synchronized wmf-config/db-codfw.php: Repool db2043 and db2061 with less weight - T163339 (duration: 01m 16s)
  • 16:56 godog: poweroff prometheus2004 for memory upgrade - T163386
  • 16:11 ema: upgrade cp2017 varnish-be to varnish 4.1.5-1wm2, expiry thread lock/priority workaround T145661
  • 15:44 jynus: stopping all slaves on dbstore1001 for maintenance
  • 15:44 godog: poweroff prometheus2003 for memory upgrade - T163386
  • 15:28 mattflaschen@naos: Synchronized wmf-config/CommonSettings.php: T163696: Only copy filter thresholds if they are set (duration: 01m 10s)
  • 15:10 matt_flaschen: GuidedTour/RCFilters/ORES deployment complete and tested
  • 15:09 XioNoX: disabling the bgp session between pfw-codfw and cr2 for T163447
  • 15:07 ema: varnish 4.1.5-1wm2 uploaded to apt.w.o T145661
  • 15:06 matt_flaschen: Preference updates (for ORES on enwiki) done, using naos instead of terbium
  • 14:54 mattflaschen@naos: Synchronized php-1.29.0-wmf.20/extensions/ORES: Make the preference for the "r" flag on the RC page also control highlighting (duration: 00m 48s)
  • 14:50 mattflaschen@naos: Synchronized wmf-config/: Release RC Filters on more wikis and prep changes for that (duration: 00m 53s)
  • 14:39 matt_flaschen: Deployment of T152827 ("Enable GuidedTour on all wikis") complete and tested
  • 14:38 Dereckson: Created linter table on ptwikimedia and dtywiki
  • 14:34 mattflaschen@naos: Synchronized wmf-config/InitialiseSettings.php: Enable GuidedTour on all wikis (duration: 00m 59s)
  • 14:27 marostegui: Deploy alter table on s3 etwiki on watchlist table directly on the master (db1075) - T130067
  • 14:17 marostegui: Stop MySQL db2043 and db2061 for maintenance - https://phabricator.wikimedia.org/T163339
  • 14:14 marostegui@naos: Synchronized wmf-config/db-codfw.php: Depool db2043 and db2061 - T163339 (duration: 01m 08s)
  • 14:14 moritzm: rebooting ms1001 for kernel update to Linux 4.9
  • 14:10 hashar@naos: Finished scap: Full scap for namespaces related changes (T161529 and https://gerrit.wikimedia.org/r/#/c/349864/1) (duration: 16m 06s)
  • 14:09 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 14:08 ema: re-pooling cp2002's varnish-be with increased priority for expiry thread T145661
  • 13:57 ema@neodymium: conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 13:54 hashar@naos: Started scap: Full scap for namespaces related changes (T161529 and https://gerrit.wikimedia.org/r/#/c/349864/1)
  • 13:50 addshore: Initial run of populateCognatePages.php complete. 27,595,121 rows in cognate_pages & 17,263,411 in cognate_titles
  • 13:49 godog: swift eqiad-prod: more weight on ms-be1028 -> ms-be1039 - T160640
  • 13:47 elukey: reimage analytics1003 to Jessie (Oozie/Hive/Camus not available during this timeframe in the Analytics Hadoop cluster)
  • 13:47 marostegui: Deploy unscheduled alter table on silver (labswiki.user_groups) - T159416
  • 13:26 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Enable user group expiry in production - T159416 (duration: 00m 49s)
  • 13:16 marostegui: Remove replication codfw - eqiad on s3 (db2018 codfw master will not be a slave of eqiad master) - https://phabricator.wikimedia.org/T130067 https://phabricator.wikimedia.org/T147166 T162133
  • 13:14 hashar@naos: Synchronized php-1.29.0-wmf.20/extensions/ProofreadPage/ProofreadPage.namespaces.php: Fix language code for Norwegian (duration: 00m 54s)
  • 13:12 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1082 - T162539 - T163548
  • 13:11 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1063 - T162539 https://phabricator.wikimedia.org/T163548
  • 13:10 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Make sysops able to grant/remove confirmed user group at cswiki - T163206 (duration: 00m 55s)
  • 13:09 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Raise autoconfirmed status requirements to 4 days, 10 edits at cswiki - T163207 (duration: 01m 09s)
  • 13:06 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Set timezone to Asia/Kolkata on wb.wikimedia - T163322 (duration: 00m 44s)
  • 13:05 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Remove all feeds added in T127176 from RSS whitelist for mw.org - T163217 (duration: 00m 45s)
  • 13:03 hashar@naos: Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage on zh_classicalwiki - T163043 (duration: 00m 46s)
  • 12:52 aude@naos: Synchronized wmf-config/Wikibase-production.php: Disable use of new column in wb_terms table for now (duration: 00m 48s)
  • 12:46 aude@naos: Synchronized wmf-config/Wikibase-production.php: (no justification provided) (duration: 00m 47s)
  • 12:41 Dereckson: pt.wikimedia.org and dty.wikipedia.org wikis creation done
  • 12:38 dereckson@naos: Synchronized wmf-config/interwiki.php: +dty +wmpt and other fixes (duration: 00m 48s)
  • 12:28 Dereckson: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php dtywiki --backend=local-multiwrite (T162874)
  • 12:14 dereckson@naos: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for dty.wikipedia (T161529) (duration: 00m 49s)
  • 12:13 dereckson@naos: Synchronized langlist: +dty (T161529) (duration: 00m 50s)
  • 12:09 dereckson@naos: rebuilt wikiversions.php and synchronized wikiversions files: +dtywiki
  • 12:08 Dereckson: Create dtywiki database (T161529)
  • 12:08 dereckson@naos: Synchronized dblists: +dtywiki (duration: 00m 56s)
  • 12:07 dereckson@naos: Synchronized static/images/project-logos/: Logo for dty.wikipedia (T161529) (duration: 01m 13s)
  • 11:59 Dereckson: Purged https://pt.wikimedia.org/ URL (T126832)
  • 11:55 dereckson@naos: Synchronized multiversion/MWMultiVersion.php: Entry point for pt.wikimedia.org (T126832) (duration: 00m 44s)
  • 11:50 Dereckson: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php ptwikimedia --backend=local-multiwrite (T126832)
  • 11:48 dereckson@naos: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for pt.wikimedia (T126832)
  • 11:42 dereckson@naos: rebuilt wikiversions.php and synchronized wikiversions files: +pt.wikimedia (T126832)
  • 11:42 dereckson@naos: Synchronized dblists/: Respawn pt.wikimedia configuration (duration: 00m 44s)
  • 11:41 Dereckson: Recreate database for ptwikimedia (T126832)
  • 11:28 dereckson@naos: Synchronized php-1.29.0-wmf.20/languages/messages/MessagesDty.php: Localize namespaces in Doteli (T162872) (duration: 00m 50s)
  • 11:27 dereckson@naos: Synchronized php-1.29.0-wmf.20/extensions/Gadgets/Gadgets.namespaces.php: Localize namespaces in Doteli (T162873) (duration: 00m 44s)
  • 11:26 dereckson@naos: Synchronized php-1.29.0-wmf.20/extensions/Scribunto/Scribunto.namespaces.php: Localize namespaces in Doteli (T162874) (duration: 00m 46s)
  • 11:16 addshore: addshore@wasat:~$ mwscriptwikiset extensions/Cognate/maintenance/populateCognatePages.php wiktionary.dblist --batch-size=1000
  • 11:14 addshore@naos: Synchronized wmf-config/InitialiseSettings-labs.php: Deploy Cognate to production wiktionaries T150182 PT 4/4 (duration: 00m 47s)
  • 11:12 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Deploy Cognate to production wiktionaries T150182 PT 3/4 (touched) (duration: 00m 52s)
  • 11:02 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Deploy Cognate to production wiktionaries T150182 PT 3/4 (duration: 00m 57s)
  • 11:01 addshore@naos: Synchronized wmf-config/CommonSettings-labs.php: Deploy Cognate to production wiktionaries T150182 PT 2/4 (duration: 01m 01s)
  • 10:57 addshore@naos: Synchronized wmf-config/CommonSettings.php: Deploy Cognate to production wiktionaries T150182 PT 1/4 (duration: 01m 18s)
  • 10:28 addshore: addshore@wasat:~$ mwscriptwikiset extensions/Cognate/maintenance/populateCognatePages.php wiktionary.dblist
  • 10:27 addshore: 180 rows added to cognate_titles & cognate_pages
  • 10:25 addshore: addshore@wasat:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php zawiktionary
  • 10:25 addshore: 172 sites added to cognate_sites
  • 10:24 addshore: addshore@wasat:~$ mwscript extensions/Cognate/maintenance/populateCognateSites.php enwiktionary --site-group=wiktionary
  • 10:16 addshore@naos: Finished scap: Add Cognate to extension-list T150182 (duration: 15m 26s)
  • 10:01 addshore@naos: Started scap: Add Cognate to extension-list T150182
  • 10:00 jynus: disabling puppet on app servers for apache config deploy T126832
  • 09:56 addshore@naos: Synchronized wmf-config/InitialiseSettings-labs.php: wmgUseInterwikiSorting true for wiktionaries PT 2/2 (duration: 00m 46s)
  • 09:54 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: wmgUseInterwikiSorting true for wiktionaries PT 1/2 (duration: 00m 47s)
  • 09:51 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Configure InterwikiSorting orders for Wiktionaries PT 2/2 (duration: 00m 48s)
  • 09:50 addshore@naos: Synchronized wmf-config/InterwikiSortOrders.php: Configure InterwikiSorting orders for Wiktionaries PT 1/2 (duration: 00m 53s)
  • 09:49 jynus: testing mediawiki changes on mwdebug1001
  • 09:44 addshore@naos: Synchronized docroot/noc/conf/InterwikiSortOrders.php.txt: NOOP Add InterwikiSortOrders to noc docroot (docs only) (duration: 01m 00s)
  • 09:42 addshore@naos: Synchronized wmf-config/InitialiseSettings.php: Use group0 to reduce lines for WMDE related config settings (duration: 01m 18s)
  • 09:15 marostegui: Stop MySQL on db1062 to back up its mysql - T163665
  • 09:14 jynus: dropping ptwikimedia from es1012,es1016,es1018,es2011,es2012,es2013, T126832
  • 09:11 jynus: dropping ptwikimedia from es3 T126832
  • 09:08 jynus: dropping ptwikimedia from es2 T126832
  • 09:04 jynus: dropping ptwikimedia from x1 T126832
  • 08:55 jynus: dropping ptwikimedia from s3 T126832
  • 08:03 marostegui: Deploy alter table enwiki.revision on db1095 (sanitarium2) - T132416
  • 07:34 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1080 and db1067 (duration: 01m 18s)
  • 06:23 marostegui: Deploy alter table enwiki.revision db1052 (eqiad master) - T132416
  • 06:12 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1087 - https://phabricator.wikimedia.org/T162539 https://phabricator.wikimedia.org/T163548
  • 06:12 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1092, depool db1087 - T162539 T163548 (duration: 02m 19s)

2017-04-23

  • 19:13 ema: cp2020: restart varnish-be
  • 17:49 jynus: disabling puppet on db2062 and upgrading MariaDB package to 10.1 T116557
  • 03:12 andrewbogott: removing files in /srv/deployment/ocg/postmortem on ocg1003, another case of T162780

2017-04-22

  • 13:41 ema@neodymium: conftool action : set/pooled=no; selector: name=cp2024.codfw.wmnet,service=varnish-be
  • 07:53 jynus: restarting es2019.codfw.wmnet after upgrade
  • 07:43 jynus: powercycling es2019.codfw.wmnet, unresponsive
  • 07:21 jynus@naos: Synchronized wmf-config/db-codfw.php: Depool es2019 (duration: 02m 16s)
  • 03:21 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2024.codfw.wmnet,service=varnish-be
  • 02:56 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2024.codfw.wmnet,service=varnish-be
  • 02:18 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 00:34 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be

2017-04-21

  • 23:52 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2026.codfw.wmnet,service=varnish-be
  • 22:49 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2026.codfw.wmnet,service=varnish-be
  • 15:06 marostegui@naos: Synchronized wmf-config/db-codfw.php: Increase weight db2071 (duration: 01m 17s)
  • 14:32 marostegui: Analyze revision, logging and page table on s1 db1067 - https://phabricator.wikimedia.org/T116557
  • 14:26 ema: ban objects with CT < 1024 on codfw cache_upload T145661
  • 14:00 moritzm: installing postgresql bugfix update from jessie point release on labsdb1004
  • 13:35 marostegui: Deploy alter table on wikidatawiki.wb_terms on db1092 - T162539 T163548
  • 13:20 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T162539 T163548 (duration: 01m 18s)
  • 12:51 akosiaris: reboot puppetmaster1002 for kernel upgrade
  • 12:07 marostegui: Analyze revision, logging and page table on s1 db1080 - T116557
  • 12:07 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Update db1080 depool reason (duration: 01m 18s)
  • 10:35 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Repool db1071 - T163109 (duration: 01m 20s)
  • 09:20 moritzm: rebooting etherpad1001 (running etherpad.wikimedia.org) for update to Linux 4.9
  • 09:10 jynus: stopping and upgrading/reconfiguring db2062 (depooled) T116557
  • 08:49 jynus@naos: Synchronized wmf-config/db-codfw.php: Depool db2062 (duration: 01m 20s)
  • 08:32 akosiaris: looking at tcpircbot (logmsgbot) problems at tegmen
  • 08:20 elukey: rolling restart of aqs (nodejs) on aqs* to pick up upgrades
  • 08:01 moritzm: rolling restart of hhvm on application servers in eqiad to pick up ICU security update
  • 07:47 marostegui: Stop MySQL on db1071 and db1063 to reclone db1063 - T163109
  • 07:43 moritzm: installing further icu security updates
  • 06:21 marostegui: Restart MySQL on db1065 for maintenance - T163351
  • 06:09 marostegui: Deploy alter table enwiki.revision db1067 - T132416

2017-04-20

  • 22:28 twentyafterfour: enable rate limiting in phabricator
  • 22:17 paravoid: setting tw_reuse to 1 on dbproxy1003
  • 21:47 twentyafterfour: started phd on iridium
  • 21:31 twentyafterfour: stopped phd on iridium to reduce load on the database
  • 19:26 Amir1: deploy finished
  • 19:24 Amir1: start of ladsgroup@naos:/srv/mediawiki-staging/php-1.29.0-wmf.20$ scap sync-file php-1.29.0-wmf.20/extensions/ORES/includes/Hooks.php 'Disable ORES in Recentchangeslinked (T163063)'
  • 19:15 mutante: test logging in fundraising channel
  • 19:06 mutante: fixing duplicate ircecho situation - since today it should run from tegmen, the active icinga server
  • 17:51 mutante: restarted icinga-wm (ircecho) to pick up config change
  • 17:13 jynus: stopping replication on db1040
  • 17:09 andrewbogott: disabling puppet on serpens, seaborgium, pollux, dubnium, labservices1001, labservices1002 for tentative rollout of https://gerrit.wikimedia.org/r/#/c/348920/
  • 16:58 jynus: moving GTID s4 eqiad replicas under db1068
  • 16:46 ema: repool varnish-be on cp2017
  • 16:18 ema: depool varnish-be on cp2017
  • 16:08 elukey: uploaded piwik 2.17.1-1 to jessie-wikimedia main
  • 15:17 Amir1: deleting duplicate rows in ores_classification dated after revision 775502802 (dated April 15th) (T163337)
  • 15:16 XioNoX: disabling pybal on lvs2002 for T163323
  • 14:32 moritzm: upgrading tor on radium to 0.2.9.10
  • 14:23 moritzm: rebooting radium (tor relay) for kernel update to Linux 4.9
  • 14:09 moritzm: rebooting osmium for kernel update to Linux 4.9
  • 14:06 gehel: rolling restart of kartotherian / tilerator on maps codfw cluster
  • 13:58 gehel: rolling restart of kartotherian / tilerator on maps eqiad cluster
  • 13:58 marostegui: Stop MySQL on db1068 and db1081 for maintenance - T163110
  • 13:57 jynus: running reset slave all on db2019
  • 13:53 gehel: rolling restart of kartotherian / tilerator on maps-test cluster
  • 13:18 moritzm: restarting hhvm on mw2097/2098 to pick up icu security update
  • 13:11 elukey: upgrading Piwik to 2.17.1 (brief downtime during the maintenance announced)
  • 12:12 elukey: restart Yarn Resource manager on analytics1001 (hadoop master) to pick up new JVM settings
  • 12:11 moritzm: installing icu security updates
  • 11:32 _joe_: removing hack for jobqueue's refreshlinks T163418 from the jobrunners
  • 11:23 jynus: changing db2071 to replicate from db2016
  • 10:32 moritzm: installing remaining dbus updates from jessie point update
  • 10:07 elukey: restart Yarn Resource manager on analytics1002 (hadoop master standby) to pick up new JVM settings
  • 09:47 Amir1: running the cleanup script for ores_classification in enwiki
  • 09:38 _joe_: live-hack redeployed, running scap pull on codfw jobrunners T163418
  • 09:38 _joe_: live-hack redeployed, running scap pull on codfw jobrunners
  • 09:34 hashar@naos: Synchronized rpc/RunJobs.php: Revert "rpc: raise exception instead of die" - causes monitoring spam (duration: 01m 20s)
  • 09:17 _joe_: removed the live hack, running scap pull again on mw2154
  • 09:14 _joe_: scap pull of live hack for T163418 on mw2154
  • 08:47 _joe_: live-patching ./includes/jobqueue/jobs/RefreshLinksJob.php to drop all recursive jobs, T163418
  • 07:59 jynus: shutting down db1080 for cloning and upgrade T163413
  • 07:54 jynus@naos: Synchronized wmf-config/db-codfw.php: Add db2071, depooled (duration: 00m 53s)
  • 07:53 jynus@naos: Synchronized wmf-config/db-eqiad.php: Depool db1080 (duration: 01m 02s)
  • 07:53 marostegui: Deploy alter table enwiki.revision db1065 - https://phabricator.wikimedia.org/T132416
  • 07:31 marostegui@naos: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T132416 (duration: 02m 18s)
  • 07:12 marostegui: Deploy alter table on s4.image on eqiad master db1040 (this will create lag on eqiad - all hosts have been silenced) - https://phabricator.wikimedia.org/T73563
  • 06:39 marostegui: Deploy alter table on s4.oldimage on eqiad master db1040 (this will create lag on eqiad - all hosts have been silenced) - T73563
  • 01:37 mutante: mw2150 - restarted hhvm (had 'thread leakage' alert)
  • 01:28 mutante: ran puppet on all (16) Dell R320 via cumin to add CPU frequency check
  • 00:37 ejegg: updated CiviCRM from 90d679b to 51dbbad

2017-04-19

  • 23:58 ejegg: updated payments-wiki from ccfbf98 to ee7d402
  • 22:37 papaul: OS installation on db2071
  • 21:44 ejegg: updated SmashPig from 17c56b0 to 200f63e
  • 21:37 krinkle@naos: Synchronized php-1.29.0-wmf.20/resources/src/startup.js: I34bbe8edf - Fix js fatal (duration: 01m 20s)
  • 20:08 ejegg: updated payments-wiki from 5398b23 to ccfbf98
  • 19:22 krinkle@naos: Synchronized php-1.29.0-wmf.20/resources/src/mediawiki/mediawiki.js: Ie50bdd (duration: 00m 58s)
  • 19:20 krinkle@naos: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/extension.json: T162604 (duration: 01m 20s)
  • 19:17 XenoRyet: Updated SmashPig from 3db064d to 17c56b0
  • 18:58 ejegg: rolled back payments-wiki to 5398b23
  • 18:56 ejegg: updated payments-wiki from 5398b23 to 68e3ac6
  • 18:27 ariel@naos: Finished deploy [dumps/dumps@ad621e6]: doc fixes thanks to awight (duration: 00m 04s)
  • 18:27 ariel@naos: Started deploy [dumps/dumps@ad621e6]: doc fixes thanks to awight
  • 18:25 ejegg: updated payments-wiki from 36f38f6 to 5398b23
  • 18:19 mobrovac: restbase stopping RB and disabling puppet on restbase1018 due to T163292
  • 18:18 ariel@naos: Finished deploy [dumps/dumps@101f8a4]: page range fixes and standalone scripts (duration: 00m 18s)
  • 18:18 ariel@naos: Started deploy [dumps/dumps@101f8a4]: page range fixes and standalone scripts
  • 17:27 Amir1: mwscript extensions/ORES/maintenance/CleanDuplicateScores.php on all wikis with ORES review tool enabled (T163337)
  • 17:26 thcipriani@naos: Synchronized docroot/noc/index.html: test scap on naos.codfw.wmnetdocroot/noc/index.html: trailing whitespace (duration: 02m 02s)
  • 17:25 mobrovac@naos: Started restart [restbase/deploy@1bfada4]: Restart to stop trying to connect to dead restbase1018 Cassandra instances - T163292
  • 17:08 thcipriani@naos.codfw.wmnet: test
  • 17:03 filippo@naos: Finished deploy [prometheus/jmx_exporter@7327459]: test deploy from naos (duration: 00m 03s)
  • 17:03 filippo@naos: Started deploy [prometheus/jmx_exporter@7327459]: test deploy from naos
  • 17:02 godog: bounce tcpircbot on einsteinium to pick up changes
  • 17:02 _joe_: manually running enwiki refreshLinks jobs to catch up a bit
  • 16:59 papaul: power balancing on mw2215
  • 16:58 Amir1: ladsgroup@naos:~$ mwscript extensions/ORES/maintenance/CleanDuplicateScores.php --wiki=enwiki froze
  • 16:49 Amir1: ladsgroup@naos:~$ mwscript extensions/ORES/maintenance/CleanDuplicateScores.php --wiki=enwiki (T163337)
  • 16:33 godog: deploy.fixurl on G@deployment_target:* after deployment server switchover
  • 16:20 gehel: disabling deprecation warning logs on elasticsearch eqiad - T163345
  • 16:19 jynus: setting db2033 as read write
  • 16:13 godog: run puppet on naos.codfw.wmnet - new deployment server
  • 16:03 gehel: disabling deprecation warning logs on elasticsearch codfw - T163345
  • 15:51 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=elasticsearch,name=elastic2020.*
  • 15:49 jynus: shutting down db2033 (x1-master)
  • 15:48 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=appserver,name=mw2256.*
  • 15:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Failing over x1-master (duration: 00m 41s)
  • 15:46 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2020.codfw.wmnet
  • 15:42 jynus@tin: Synchronized wmf-config/InitialiseSettings.php: Disable cx_translation- it is causing an outage on x1 (duration: 02m 44s)
  • 15:40 dzahn@puppetmaster2001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet
  • 15:32 mutante: mw2256 went down and showed " PANIC: double fault, error_code: 0x0"
  • 15:16 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2055 as an additional API server (duration: 01m 02s)
  • 15:11 _joe_: ran cumin 'R:class = role::mediawiki::jobrunner and *.eqiad.wmnet' 'systemctl reset-failed' manually
  • 15:07 godog: start swiftrepl on ms-fe1005 for codfw switchover
  • 15:04 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_restart_parsoid(eqiad, codfw) Successfully completed
  • 14:53 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet,service=apache2
  • 14:53 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet,service=nginx
  • 14:48 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=nginx
  • 14:48 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=apache2
  • 14:46 gehel: banning elastic2020 from codfw cluster - T149006
  • 14:46 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_restart_parsoid(eqiad, codfw) Rolling restart parsoid in eqiad and codfw
  • 14:44 oblivian@tin: Synchronized wmf-config/ProductionServices.php: Fix redis locks (duration: 02m 24s)
  • 14:41 akosiaris: powercycle mw2256
  • 14:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(eqiad, codfw) Successfully completed
  • 14:33 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(eqiad, codfw) Update Tendril configuration for the new masters
  • 14:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_start_maintenance(eqiad, codfw) Successfully completed
  • 14:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_start_maintenance(eqiad, codfw) Start MediaWiki maintenance in the new master DC
  • 14:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_restore_ttl(eqiad, codfw) Successfully completed
  • 14:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_restore_ttl(eqiad, codfw) Restore the TTL of all the MediaWiki discovery records
  • 14:30 switchdc: (volans@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(eqiad, codfw) Successfully completed
  • 14:30 switchdc: (volans@sarin) MediaWiki read-only period ends at: 2017-04-19 14:30:05.678665
  • 14:30 root@tin: Synchronized wmf-config/db-codfw.php: Set MediaWiki in read-write mode in datacenter codfw (duration: 00m 18s)
  • 14:29 switchdc: (volans@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-write mode (db_to config already merged and git pulled)
  • 14:28 switchdc: (volans@sarin) END TASK - switchdc.stages.t07_coredb_masters_readwrite(eqiad, codfw) Successfully completed
  • 14:28 switchdc: (volans@sarin) START TASK - switchdc.stages.t07_coredb_masters_readwrite(eqiad, codfw) set core DB masters in read-write mode
  • 14:25 switchdc: (volans@sarin) END TASK - switchdc.stages.t06_redis(eqiad, codfw) Successfully completed
  • 14:25 switchdc: (volans@sarin) START TASK - switchdc.stages.t06_redis(eqiad, codfw) Switch the Redis replication
  • 14:25 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Successfully completed
  • 14:22 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Switch traffic flow to the appservers in the new datacenter
  • 14:22 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_datacenter(eqiad, codfw) Successfully completed
  • 14:22 root@tin: Synchronized wmf-config/CommonSettings.php: Switch MediaWiki active datacenter to codfw (duration: 00m 19s)
  • 14:21 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_datacenter(eqiad, codfw) Switch MediaWiki configuration to the new datacenter
  • 14:21 switchdc: (volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 14:15 switchdc: (volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 14:15 switchdc: (volans@sarin) END TASK - switchdc.stages.t03_coredb_masters_readonly(eqiad, codfw) Successfully completed
  • 14:15 switchdc: (volans@sarin) START TASK - switchdc.stages.t03_coredb_masters_readonly(eqiad, codfw) set core DB masters in read-only mode
  • 14:14 switchdc: (volans@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Successfully completed
  • 14:14 root@tin: Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-only mode in datacenter eqiad (duration: 01m 29s)
  • 14:13 switchdc: (volans@sarin) MediaWiki read-only period starts at: 2017-04-19 14:12:54.007017
  • 14:12 switchdc: (volans@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-only mode (db_from config already merged and git pulled)
  • 14:09 switchdc: (volans@sarin) END TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Successfully completed
  • 14:07 switchdc: (volans@sarin) START TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Stop MediaWiki maintenance in the old master DC
  • 14:06 godog: stop swiftrepl on ms-fe1005 for codfw switchover
  • 14:06 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Successfully completed
  • 14:06 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Reduce the TTL of all the MediaWiki discovery records
  • 14:06 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 14:05 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on selected hosts
  • 14:00 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 13:42 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2014.codfw.wmnet,service=varnish-be
  • 13:28 urandom: cqlsh -f /etc/cassandra/adduser.cql, recreating user/perms (as-needed)
  • 12:38 urandom: T163292: Starting removal of Cassandra instance restbase1018-c.eqiad.wmnet
  • 11:36 oblivian:: Setting swift-rw in eqiad DOWN
  • 11:36 oblivian:: Setting swift-rw in codfw UP
  • 11:36 ema: repool varnish-be on cp3044
  • 11:23 godog: add naos to git-deploy term on common-infrastructure4 - T162900
  • 11:03 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 10:57 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 10:56 _joe_: running the warmup stage in codfw for final testing
  • 10:41 ema: depool varnish-be on cp3044 because of mailbox lag issues
  • 09:34 moritzm: installing dbus security updates
  • 09:11 elukey: cleaning up ocg1003's /srv/deployment/ocg/postmortem dir (root partition filled up)
  • 07:26 hoo: Updated the sites and site_identifiers tables on all Wikidata clients for T149522.
  • 06:57 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed
  • 06:56 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication
  • 06:52 _joe_: artificially stopping slave replication on rdb2001 for a final test of the switchover redis stage
  • 03:53 urandom: T163292: Starting removal of Cassandra instance restbase1018-b.eqiad.wmnet
  • 03:49 mobrovac@tin: Started restart [restbase/deploy@1bfada4]: (no justification provided)
  • 03:40 mobrovac@tin: Started restart [restbase/deploy@1bfada4]: Kick RB to pick up restbase1018 instances are gone
  • 03:32 mobrovac@tin: Finished deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292 (duration: 00m 53s)
  • 03:31 mobrovac@tin: Started deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292
  • 01:58 mutante: naos: rsyncd is of course legitimately running on a deployment server separate from this (unlike in other cases where we used it for syncing during migration), so this was just the one config fragment for /home and not removing the service or anything
  • 01:56 mutante: naos: manually deleting rsyncd config remnants (puppet wouldn't know to clean up after itself)
  • 01:47 mutante: rsyncing /home from mira to naos (T162900)
  • 01:21 urandom: T163292: Starting removal of Cassandra instance restbase1018-a.eqiad.wmnet

2017-04-18

  • 23:04 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1018.eqiad.wmnet
  • 23:02 mutante: ms1001 - deleting old GlobalCert SSL cert for dumps.wm that was about to expire and is replaced by Letsencrypt,
  • 22:30 mutante: ocg1003 gzipping ocg.log for disk space
  • 21:12 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 20:36 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be
  • 17:26 mobrovac@tin: Finished deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons (duration: 07m 12s)
  • 17:26 ssastry@tin: Finished deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m) (duration: 06m 25s)
  • 17:19 ssastry@tin: Started deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m)
  • 17:19 mobrovac@tin: Started deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons
  • 17:12 XenoRyet: updated tools from a8b8d72 to a1e9342
  • 17:09 elukey: restart nutcracker in codfw (profile::mediawiki::nutcracker) to make sure that all the daemons are running with the latest config
  • 16:26 bblack: completed Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 )
  • 16:21 bblack: starting Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 )
  • 16:15 jynus: reimporting some rows to dbstore1002 on jawiki and ruwiki T160509
  • 16:12 godog: reboot tin to fix cpu mhz issue and check bios settings - T163158
  • 16:09 mobrovac@tin: Finished deploy [restbase/deploy@960b468]: Blacklist an enwiki and a commons page (duration: 08m 16s)
  • 16:01 mobrovac@tin: Started deploy [restbase/deploy@960b468]: Blacklist an enwiki and a commons page
  • 16:00 mobrovac@tin: Finished deploy [restbase/deploy@960b468]: Dev Cluster: Blacklist an enwiki and a commons page (duration: 01m 42s)
  • 15:58 mobrovac@tin: Started deploy [restbase/deploy@960b468]: Dev Cluster: Blacklist an enwiki and a commons page
  • 15:20 elukey: restored default output-buffer config for rdb2005:6479
  • 15:08 godog: puppet-run on cache_upload in codfw/eqiad to pick up swift a/p changes
  • 15:02 godog: puppet-run on cache_upload in codfw/eqiad to pick up switch a/a changes
  • 15:02 gehel: upgrading elastic2020 to elasticsearch 5.1.2
  • 14:55 _joe_: switchover of services, misc things done
  • 14:54 oblivian:: Setting restbase-async in codfw DOWN
  • 14:54 oblivian:: Setting restbase-async in eqiad UP
  • 14:43 _joe_: switching traffic for all a/a services plus maps and restbase to codfw-only
  • 14:38 _joe_: forcing puppet run on caches for catching up with the a/a setting of maps and restbase
  • 14:33 oblivian:: Setting restbase in eqiad DOWN
  • 14:33 _joe_: starting switchover of services eqiad => codfw; external traffic will be switched over, as well as internal traffic to restbase
  • 14:25 gehel: un-ban elastic2020 to get ready for real-life test during switchover - T149006
  • 14:22 elukey: executed config set client-output-buffer-limit "normal 0 0 0 slave 2147483648 2147483648 300 pubsub 33554432 8388608 60" on rdb2005:6749 as an attempt to solve slave lagging - T159850
  • 14:21 oblivian:: Setting mobileapps in eqiad UP
  • 14:14 oblivian:: Setting mobileapps in eqiad DOWN
  • 14:11 elukey: executed CONFIG SET appendfsync everysec (default) to restore defaults on rdb2005:6479- T159850
  • 14:08 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Successfully completed
  • 14:04 elukey: executed CONFIG SET appendfsync no on rdb2005:6479 to test if fsync stalls affect replication - T159850
  • 13:50 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Rolling restart parsoid in eqiad and codfw
  • 13:35 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute
  • 13:35 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Stop MediaWiki maintenance in the old master DC
  • 12:32 moritzm: upgrading labnodepool1001 to Linux 4.9
  • 12:13 moritzm: upgraded mw1261 to HHVM 3.18.2+wmf2
  • 11:39 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Successfully completed
  • 11:38 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 11:37 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(codfw, eqiad) Successfully completed
  • 11:37 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(codfw, eqiad) Update Tendril configuration for the new masters
  • 11:35 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(eqiad, codfw) Successfully completed
  • 11:35 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(eqiad, codfw) Update Tendril configuration for the new masters
  • 11:34 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_tendril(codfw, eqiad) Successfully completed
  • 11:34 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_tendril(codfw, eqiad) Update Tendril configuration for the new masters
  • 11:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Successfully completed
  • 11:33 switchdc: (volans@sarin) START TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Restore the TTL of all the MediaWiki discovery records
  • 11:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Successfully completed
  • 11:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-write mode (db_to config already merged and git pulled)
  • 11:30 switchdc: (volans@sarin) END TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) Successfully completed
  • 11:30 switchdc: (volans@sarin) START TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) set core DB masters in read-write mode
  • 11:18 switchdc: (volans@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed
  • 11:18 switchdc: (volans@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication
  • 11:14 moritzm: upgrading logstash* to Linux 4.9
  • 10:58 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Successfully completed
  • 10:56 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Switch traffic flow to the appservers in the new datacenter
  • 10:56 switchdc: (volans@sarin) END TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Successfully completed
  • 10:55 switchdc: (volans@sarin) START TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Switch MediaWiki configuration to the new datacenter
  • 10:48 switchdc: (volans@sarin) END TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) Failed to execute
  • 10:48 switchdc: (volans@sarin) START TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) set core DB masters in read-only mode
  • 10:43 switchdc: (volans@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Successfully completed
  • 10:43 switchdc: (volans@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-only mode (db_from config already merged and git pulled)
  • 10:33 switchdc: (volans@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute
  • 10:33 switchdc: (volans@sarin) START TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Stop MediaWiki maintenance in the old master DC
  • 10:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Successfully completed
  • 10:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Reduce the TTL of all the MediaWiki discovery records
  • 10:31 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Successfully completed
  • 10:31 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Disabling puppet on selected hosts
  • 10:28 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Failed to execute
  • 10:28 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(codfw, eqiad) Reduce the TTL of all the MediaWiki discovery records
  • 10:26 switchdc: (volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Successfully completed
  • 10:26 switchdc: (volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(codfw, eqiad) Disabling puppet on selected hosts
  • 10:25 volans: Final test of switchdc steps in the codfw->eqiad configuration, only idempotent changes, T160178
  • 10:25 moritzm: installing wireshark security updates
  • 10:20 moritzm: uploaded HHVM 3.18.2+wmf2 for jessie-wikimedia/experimental (includes fix for T162354)
  • 09:52 oblivian:: Setting zotero in codfw UP
  • 09:50 _joe_: testing switchover script for services, will act on zotero in codfw
  • 09:45 _joe_: adding 60G to the ocg output partition on ocg1003
  • 09:17 oblivian@neodymium: conftool action : set/pooled=true; selector: dnsdisc=zotero,name=codfw
  • 09:03 volans: upgrading conftool to v0.4.1 on neodymium/sarin
  • 07:48 _joe_: uploaded python-conftool 0.4.1 to jessie-wikimedia
  • 07:42 _joe_: cleaning up orphaned COW images in /var/cache/pbuilder/build/ on copper
  • 06:16 marostegui: For the record: restarted s7 instance on db1069 - T163183
  • 00:36 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend/resources/mobile.mainMenu/mainmenu.less: T163059 (duration: 03m 07s)
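
The switchdc entries above follow a fixed START TASK / END TASK pattern, so the stages that did not complete can be pulled out mechanically. A minimal Python sketch, assuming the line layout seen in this log; the regular expression and the helper name `failed_stages` are illustrative and not part of switchdc itself:

```python
import re

# Matches only END TASK lines in the layout seen in this log (assumed, not a
# documented switchdc format).
END_TASK = re.compile(
    r"(?P<time>\d{2}:\d{2}) switchdc: \((?P<who>[^)]+)\) END TASK - "
    r"switchdc\.stages\.(?P<stage>\w+)\((?P<dcs>[^)]*)\) (?P<outcome>.+)"
)


def failed_stages(lines):
    """Return (time, stage) for every END TASK line that did not succeed."""
    failures = []
    for line in lines:
        match = END_TASK.search(line)
        if match and match.group("outcome").strip() != "Successfully completed":
            failures.append((match.group("time"), match.group("stage")))
    return failures


# Three entries copied from the 2017-04-18 log (bullets omitted).
sample = [
    "10:48 switchdc: (volans@sarin) END TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) Failed to execute",
    "10:48 switchdc: (volans@sarin) START TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) set core DB masters in read-only mode",
    "10:43 switchdc: (volans@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Successfully completed",
]

print(failed_stages(sample))
```

A "Failed to execute" outcome is not necessarily an incident: the 10:25 entry describes this run as a final test with only idempotent changes.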

2017-04-17

  • 23:37 mutante: running rmmod acpi_pad on the 16 R320 via cumin, since blacklisting in puppet does not actively remove, confirmed unloaded. (16/16) success ratio (>= 100.0% threshold) for command: 'lsmod|grep -c acpi_pad ||:' (T162850)
  • 23:33 mutante: running puppet via cumin on all 16 Dell PowerEdge R320, adding blacklist file for acpi_pad kernel module. 15/16 success, all but tin (T162850)
  • 22:46 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/modules/ext.wikimediaEvents.recentChangesClicks.js: T158458 T163152 (duration: 03m 01s)
  • 22:42 mutante: tin - load average going down, acpi_pad processes gone, cpu usage low again (T163158)
  • 22:40 mutante: tin - rmmod acpi_pad (T163158)
  • 22:08 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/modules/ext.wikimediaEvents.recentChangesClicks.js: T158458 T163152 (duration: 16m 23s)
  • 19:16 mutante: tegmen test ircecho stop/start service to confirm it's fine on jessie/prod icinga role (that's the passive server)
  • 19:02 demon@tin: Synchronized wmf-config/: Pruning some old extension message files, co-master sync (duration: 01m 52s)
  • 18:58 demon@tin: Pruned MediaWiki: 1.29.0-wmf.15 (duration: 00m 14s)
  • 18:46 maxsem@tin: Finished deploy [tilerator/deploy@001811e]: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only (duration: 00m 19s)
  • 18:46 maxsem@tin: Started deploy [tilerator/deploy@001811e]: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only
  • 18:45 maxsem@tin: scap aborted: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only (duration: 00m 19s)
  • 18:45 maxsem@tin: Started scap: https://gerrit.wikimedia.org/r/#/c/348224/ to test hosts only
  • 15:48 mobrovac@tin: Finished deploy [restbase/deploy@6595298]: Update client caching headers for T161284 (duration: 08m 15s)
  • 15:40 mobrovac@tin: Started deploy [restbase/deploy@6595298]: Update client caching headers for T161284
  • 15:34 mobrovac@tin: Finished deploy [restbase/deploy@6595298]: (no justification provided) (duration: 01m 29s)
  • 15:33 mobrovac@tin: Started deploy [restbase/deploy@6595298]: (no justification provided)
  • 15:32 mobrovac@tin: Finished deploy [restbase/deploy@6595298]: (no justification provided) (duration: 01m 42s)
  • 15:31 mobrovac@tin: Started deploy [restbase/deploy@6595298]: (no justification provided)
  • 09:33 marostegui: Silence alerts for restbase2004 and restbase2009 T160759

2017-04-16

  • 15:44 elukey: restart ocg on ocg1003 to clean up deleted files in lsof
  • 15:35 elukey: executing sudo find -name *.pdf -mtime +3 -exec rm {} \; on ocg1003's /srv/deployment/ocg/output to clean up some disk space - T162780

2017-04-14

  • 23:14 jynus: skipping CREATE DATABASE wbwikimedia on dbstore2001- duplicate declaration due to multi-source
  • 22:58 jynus: skipping CREATE DATABASE pawikisource on dbstore2001- duplicate declaration due to multi-source
  • 22:49 volans: restarting parsoid to get the disable linter change T148609
  • 22:17 Reedy: created linter tables on wbwikimedia T148609
  • 22:16 Reedy: created linter tables on pawikisource T148609
  • 20:53 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Disable Linter on larger wikis T148609 (duration: 00m 41s)
  • 20:26 reedy@tin: Synchronized wmf-config/abusefilter.php: abusefilter-modify-restricted for trwiki T161960 (duration: 01m 38s)
  • 17:48 mutante: mw1297 - restarted hhvm and apache
  • 17:07 twentyafterfour: deployed phabricator hotfix for T162943
  • 10:29 elukey: rollback sysctl settings on mw1306 after experiment (stop jobchron/runner, stop hhvm, restore sysctl settings, restart hhvm and job* daemons)
  • 09:50 elukey: temporarily set sysctl -w net.netfilter.nf_conntrack_max=524288 on mw1306 (jobrunner) as test - (rollback: sysctl -w net.netfilter.nf_conntrack_max=262144)
  • 09:43 elukey: temporarily set sysctl -w net.ipv4.ip_local_port_range="15000 64000" on mw1306 (jobrunner) as test - (rollback: sysctl -w net.ipv4.ip_local_port_range="32768 60999") - T157968
  • 08:32 elukey: restored appendfsync to 'everysec' on Redis rdb2005:6380 (end of performance experiment)
  • 07:23 elukey: executed CONFIG SET appendfsync no on redis2005:6780 as performance test
  • 00:39 niharika29@tin: Synchronized wmf-config/abusefilter.php: Fix Abuse Filter configuration for tr.wikipedia (T161960) (duration: 00m 42s)
  • 00:30 niharika29@tin: Finished scap: Reword ORES preferences (T162831), Put ORES r behind a preference (T162831), Deploy Special:Autoblocklist (T146414) (duration: 24m 44s)
  • 00:05 niharika29@tin: Started scap: Reword ORES preferences (T162831), Put ORES r behind a preference (T162831), Deploy Special:Autoblocklist (T146414)
  • 00:03 mutante: mw1297 - restart hhvm/apache
  • 00:03 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 41s)
  • 00:02 niharika29@tin: Synchronized wmf-config/CommonSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 41s)
  • 00:00 mutante: mw1293 - restart hhvm

2017-04-13

  • 23:56 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Retry sync Revert Remove use of blacklist for related pages feature (T162201) (duration: 00m 40s)
  • 23:51 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Revert Remove use of blacklist for related pages feature (T162201) (duration: 00m 41s)
  • 23:43 niharika29@tin: Synchronized wmf-config/CommonSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 40s)
  • 23:41 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Remove use of blacklist for related pages feature (T162201) (duration: 00m 40s)
  • 23:39 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable related pages on Vector for htwiki (T126826) (duration: 00m 41s)
  • 23:26 niharika29@tin: Synchronized php-1.29.0-wmf.20/extensions/CirrusSearch/: Revert Workaround OOM issue on ngrams field (duration: 00m 54s)
  • 23:19 Dereckson: Create account for Jayantanth on wb.wikimedia (bureaucrat)
  • 23:09 dereckson@tin: Synchronized wmf-config/interwiki.php: DMOZ, pa.wikisource and wb.wikimedia interwiki map update (duration: 00m 41s)
  • 23:01 Dereckson: Create local-multiwrite stores for wb.wikimedia (T162510)
  • 23:01 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for wb.wikimedia.org (T162510) (duration: 00m 40s)
  • 23:00 Dereckson: Create Translate extension tables for wb.wikimedia (T162510)
  • 22:59 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: Add wb.wikimedia.org to wikimedia.org domains to serve as wikis (T162510) (duration: 00m 40s)
  • 22:59 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: Create wb.wikimedia.org (T162510)
  • 22:58 dereckson@tin: Synchronized dblists: Create wb.wikimedia.org (T162510) (duration: 00m 41s)
  • 22:47 dereckson@tin: Synchronized static/images/project-logos/: Logos for wb.wikimedia (T162510) (duration: 00m 41s)
  • 22:32 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: pa.wikisource creation (take two) (duration: 00m 41s)
  • 22:31 dereckson@tin: Synchronized w/static/images/project-logos/: pa.wikisource creation (take two) (duration: 00m 40s)
  • 22:30 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: pa.wikisource creation (take two)
  • 22:30 dereckson@tin: Synchronized dblists: pa.wikisource creation (take two) (duration: 00m 41s)
  • 22:15 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for pa.wikisource (T149522) (duration: 00m 41s)
  • 22:14 dereckson@tin: Synchronized static/images/project-logos/: Logos for pa.wikisource (T149522) (duration: 00m 41s)
  • 22:12 dereckson@tin: rebuilt wikiversions.php and synchronized wikiversions files: (no justification provided)
  • 22:12 dereckson@tin: Synchronized dblists: pa.wikisource creation (T149522) (duration: 00m 41s)
  • 21:56 demon@tin: Finished scap: pruned cdb files from wmf.18 (duration: 07m 55s)
  • 21:48 demon@tin: Started scap: pruned cdb files from wmf.18
  • 20:07 urandom: T161243: Clearing all snapshots
  • 19:45 ejegg: updated civicrm from 908b9c1 to 90d679b
  • 19:43 ejegg: updated SmashPig from ab52dbe to 3db064d
  • 19:16 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.20
  • 18:57 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Clean Wikisource namespaces T46320 (duration: 00m 43s)
  • 18:42 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Education Program on it.wikiversity T162692 (duration: 00m 43s)
  • 18:38 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/LiquidThreads: Remove extra parameter from hook (duration: 00m 45s)
  • 18:35 reedy@tin: Synchronized wmf-config/abusefilter.php: Enable AbuseFilter blocks on tr.wikipedia T161960 (duration: 00m 43s)
  • 18:30 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable NewUserMessage on tr.wikiquote T161962 (duration: 00m 43s)
  • 18:30 urandom: T161243: Truncating parsoid tables (wikimedia storage group)
  • 18:29 mutante: restarting jenkins service to apply logging change gerrit:347877. it was already tested on jenkinstest.integration.eqiad.wmflabs
  • 18:25 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/Wikidata: Stop some logspam for deprecated hooks (duration: 02m 06s)
  • 18:23 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents: Stop some logspam for deprecated hooks (duration: 00m 43s)
  • 18:21 reedy@tin: Synchronized php-1.29.0-wmf.20/extensions/LiquidThreads: Stop some logspam for deprecated hooks (duration: 00m 45s)
  • 18:19 reedy@tin: Synchronized php-1.29.0-wmf.19/extensions/Wikidata: Stop some logspam for deprecated hook usage (duration: 02m 14s)
  • 18:16 urandom: T161243: Truncating parsoid tables (default storage group)
  • 18:16 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Document EducationProgram config (duration: 00m 43s)
  • 18:12 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgUsejQueryThree to false everywhere ahead of further testing (duration: 00m 43s)
  • 18:09 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: Run 3d2png with xvfb-run on beta (duration: 00m 43s)
  • 16:55 elukey: restored default value of client-output-buffer-limit on rdb1007:6379 - T159850
  • 16:23 mobrovac@tin: Finished deploy [citoid/deploy@b8c4cb2]: Test deploy for T162814 (duration: 02m 24s)
  • 16:21 mobrovac@tin: Started deploy [citoid/deploy@b8c4cb2]: Test deploy for T162814
  • 16:15 thcipriani@tin: Synchronized README: scap.cfg change test (duration: 00m 44s)
  • 15:49 mobrovac@tin: Finished deploy [citoid/deploy@212800d]: Enable multiple results for T115248 and remove b/c for T114515 (duration: 03m 10s)
  • 15:46 mobrovac@tin: Started deploy [citoid/deploy@212800d]: Enable multiple results for T115248 and remove b/c for T114515
  • 15:02 andrewbogott: disabling puppet on dubnium and pollux for a cautious merge of https://gerrit.wikimedia.org/r/#/c/348071
  • 15:01 andrewbogott: disabling puppet on seaborgium and serpens for a cautious merge of https://gerrit.wikimedia.org/r/#/c/348071
  • 14:56 ppchelko@tin: Finished deploy [changeprop/deploy@e47afea]: Provide separate rules for ORES precaching in both DCs (duration: 00m 58s)
  • 14:55 ppchelko@tin: Started deploy [changeprop/deploy@e47afea]: Provide separate rules for ORES precaching in both DCs
  • 14:50 moritzm: installing bouncycastle security updates
  • 14:27 bblack: disabling puppet on recnds/ntp boxes to control patch rollout
  • 13:28 moritzm: powercycling thumbor1001, stuck in reboot
  • 13:18 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 43s)
  • 13:16 hashar@tin: Synchronized dblists/closed.dblist: Close wikimania2016 - T161183 (duration: 00m 43s)
  • 13:14 hashar@tin: Synchronized static/images/project-logos: (no justification provided) (duration: 00m 46s)
  • 13:00 moritzm: Upgrading thumbor* to Linux 4.9
  • 12:52 elukey: temporarily set config set client-output-buffer-limit "slave 5368709120 5368709120 180" on rdb1007:6379
  • 12:34 volans@tin: Synchronized wmf-config/db-eqiad.php: Use a generic retry for the read only message T160178 (duration: 00m 44s)
  • 12:34 elukey: temporarily set config set client-output-buffer-limit "slave 3221225472 3221225472 180" on rdb1007:6379
  • 12:22 volans@tin: Synchronized wmf-config/db-codfw.php: Use a generic retry for the read only message T160178 (duration: 01m 54s)
  • 12:16 moritzm: restarting ntp on achernar
  • 11:59 elukey: temporarily set config set client-output-buffer-limit "slave 2536870912 2536870912 60" on rdb1007:6379
  • 11:37 elukey: temporarily set config set client-output-buffer-limit "slave 2147483648 2147483648 60" on rdb1007:6379 to give time to rdb2005's replication to catch up - T159850
  • 10:58 moritzm: rebooting alsafi to Linux 4.9
  • 10:58 moritzm: rebooting alfafi to Linux 4.9
  • 10:47 elukey: reverted previous config for Redis rdb2005
  • 10:47 XioNoX: Confirmed we can still reach cr2-knams:lo0 via v6 (from esams), disabling IPv4 transit sessions for T162601
  • 10:42 XioNoX: disable V6 transit BGP session on cr2-knams for T162601
  • 10:22 elukey: executed CONFIG SET appendfsync no (prev value: "everysec") to Redis instance 6380 on rdb2005 - T125735
  • 10:13 godog: upgrade thumbor to 0.1.38
  • 10:08 moritzm: rebooting restbase1016 to Linux 4.9
  • 09:39 moritzm: rebooting restbase1011 to Linux 4.9
  • 09:12 moritzm: rebooting restbase1010 to Linux 4.9
  • 06:29 elukey: re-arm keyholder on mira after reboot
  • 06:14 elukey: powercycle mira - eth0 errors in the dmesg, CPU system utilization skyrocketed
  • 04:14 mutante: ms-be2023 is rebooting
  • 04:12 mutante: ms-be2023 icinga alerts, no more swift processes. can't ssh to it. attempt to power cycle. mgmt console enormous spam of "rejecting I/O to offline device"
  • 01:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=achernar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 00:36 catrope@tin: Finished scap: Split RCFilters GuidedTour messages for ORES vs non-ORES (T162693) (duration: 53m 47s)
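
The client-output-buffer-limit values in the rdb1007 entries above are raw byte counts. Redis reads each group as a client class, a hard limit, a soft limit, and the number of seconds the soft limit may be exceeded. A minimal Python sketch for converting the logged values to GiB; the helper name `describe_limit` is hypothetical and not part of any Redis tooling:

```python
def describe_limit(spec: str) -> dict:
    """Split '<class> <hard bytes> <soft bytes> <soft seconds>' into fields."""
    client_class, hard, soft, seconds = spec.split()
    gib = 1024 ** 3
    return {
        "class": client_class,
        "hard_gib": round(int(hard) / gib, 2),
        "soft_gib": round(int(soft) / gib, 2),
        "soft_seconds": int(seconds),
    }


# The four settings tried on rdb1007:6379, in chronological order.
for spec in (
    "slave 2147483648 2147483648 60",
    "slave 2536870912 2536870912 60",
    "slave 3221225472 3221225472 180",
    "slave 5368709120 5368709120 180",
):
    print(describe_limit(spec))
```

Read this way, the hard limit went from 2 GiB to roughly 2.36 GiB, then 3 GiB, then 5 GiB over the course of the day, before the default was restored at 16:55.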

2017-04-12

  • 23:42 catrope@tin: Started scap: Split RCFilters GuidedTour messages for ORES vs non-ORES (T162693)
  • 23:37 catrope@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend/: Log only infoboxes which are not direct children of lead section (T149884) (duration: 01m 05s)
  • 23:35 catrope@tin: Synchronized php-1.29.0-wmf.20/resources/src/mediawiki.widgets: Fix setDisabled in mw.widgets.Complex* (T162667) (duration: 00m 42s)
  • 23:32 catrope@tin: Synchronized php-1.29.0-wmf.19/resources/src/mediawiki.widgets: Fix setDisabled in mw.widgets.Complex* (T162667) (duration: 00m 44s)
  • 23:25 awight: rebuilt and reenabled process-control jobs
  • 23:20 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable cross-wiki uploads to Commons (T162374) (duration: 00m 43s)
  • 23:19 cwd: removed p-c crontab to stop all jobs
  • 23:15 bblack@neodymium: conftool action : set/pooled=no; selector: name=achernar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 23:13 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgCiteResponsiveReferences on cawiki (T161307) and bgwiki (T162145) (duration: 00m 44s)
  • 23:02 bblack@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 22:50 bblack: acamar fixed up BIOS: HT disabled and power mgmt was set to PPW (DAPC) instead of PPW (OS)
  • 22:45 bblack: downtiming acamar again to fixup bios stuff (HT at least)
  • 21:31 Dereckson: Create Education Program tables on it.wikiversity (T162692)
  • 20:44 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to all wikis - T148609 (duration: 00m 44s)
  • 20:42 bblack@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org,dc=codfw,cluster=dns,service=pdns_recursor
  • 20:25 mutante: planet2001 - manually updating all feeds to make it active (or would have to wait for crons)
  • 20:12 ssastry@tin: Finished deploy [parsoid/deploy@323cebb]: Updating Parsoid to 75debae3 (duration: 09m 16s)
  • 20:07 mutante: planet2001 - activating all the crons, making planet active/active eqiad/codfw
  • 20:03 ssastry@tin: Started deploy [parsoid/deploy@323cebb]: Updating Parsoid to 75debae3
  • 19:42 bd808@tin: Synchronized wmf-config/mc.php: Revert "wikitech: Enable binary memcached protocol" (duration: 00m 43s)
  • 19:05 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.20
  • 19:05 XenoRyet: reverted SmashPig from aede277 to ab52dbe
  • 19:05 demon@tin: Synchronized php: symlink bump (duration: 00m 42s)
  • 19:04 ejegg: updated payments-wiki from 0b396a3 to 36f38f6
  • 18:52 XenoRyet: updated SmashPig from ab52dbe to aede277
  • 18:45 thcipriani@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend: SWAT: formatter: Increase log level of infobox message T149884 (duration: 00m 46s)
  • 18:44 ppchelko@tin: Finished deploy [changeprop/deploy@e403f56]: Config: Send ORES precache requests to both DCs. Attempt #2. T159615 (duration: 01m 15s)
  • 18:43 ppchelko@tin: Started deploy [changeprop/deploy@e403f56]: Config: Send ORES precache requests to both DCs. Attempt #2. T159615
  • 18:38 thcipriani@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend: SWAT: formatter: Change log channel of infobox message T149884 (duration: 00m 46s)
  • 18:37 ppchelko@tin: Finished deploy [changeprop/deploy@0a9a008]: Config: Send ORES precache requests to both DCs. T159615 (duration: 06m 53s)
  • 18:30 ppchelko@tin: Started deploy [changeprop/deploy@0a9a008]: Config: Send ORES precache requests to both DCs. T159615
  • 18:26 thcipriani@tin: Synchronized php-1.29.0-wmf.20/extensions/MobileFrontend: SWAT: setMobileOptions at time of skin creation T125588 (duration: 00m 46s)
  • 18:18 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tweak Russian logo wordmark T162036 PART II (duration: 00m 43s)
  • 18:16 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ru.svg: SWAT: Tweak Russian logo wordmark T162036 PART I (duration: 00m 43s)
  • 16:46 awight@tin: rebuilt wikiversions.php and synchronized wikiversions files: (no justification provided)
  • 16:37 awight@tin: Synchronized php-1.29.0-wmf.20/extensions/FundraiserLandingPage: Fix for donatewiki T162716 (duration: 00m 45s)
  • 16:35 awight@tin: Synchronized php-1.29.0-wmf.19/extensions/FundraiserLandingPage: Fix for donatewiki T162716 (duration: 00m 48s)
  • 15:53 chasemp: remove 2fa for Freddy2001 on wikitech per T162772
  • 14:31 andrewbogott: running maintain-meta_p on labsdb1001/1003/1009/1010/1011
  • 14:23 hashar: Restarting Jenkins for git/scm plugins updates
  • 14:06 hashar: European SWAT complete
  • 13:51 switchdc: (volans@neodymium) END TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Successfully completed
  • 13:48 switchdc: (volans@neodymium) START TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Switch traffic flow to the appservers in the new datacenter
  • 13:42 volans: testing t05_switch_traffic of the switchdc
  • 13:41 elukey: apply SLOWLOG RESET and CONFIG SET slowlog-max-len 100000 (prev value 10000, 10ms) to rdb1005:6380 to track down slow reqs - T125735
  • 13:37 hoo@tin: Synchronized php-1.29.0-wmf.20/extensions/Wikidata: Update Wikibase/ ArticlePlaceholder (duration: 02m 19s)
  • 13:33 hoo@tin: Synchronized php-1.29.0-wmf.19/extensions/Wikidata: Update Wikibase/ ArticlePlaceholder (duration: 02m 16s)
  • 13:33 elukey: restored slowlog-log-slower-than 10000 on rdb2005
  • 13:25 elukey: applied CONFIG SET slowlog-log-slower-than 300000 to Redis 6379 on rdb2005 and reset slowlog history to play with the stats
  • 13:10 addshore@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/extension.json: WMDE Spring campaign PT2/2 (duration: 00m 45s)
  • 13:09 addshore@tin: Synchronized php-1.29.0-wmf.20/extensions/WikimediaEvents/WikimediaEventsHooks.php: WMDE Spring campaign PT1/2 (duration: 00m 45s)
  • 13:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Revert "Temporarily enable change dispatch logging on testwikidata" - T159828 (duration: 00m 47s)
  • 12:23 elukey: restart HDFS datanode daemons on all the Hadoop worker nodes to pick up the new JVM settings
  • 12:18 kartik@tin: Finished deploy [cxserver/deploy@2842efa]: Update cxserver to 56a012d (duration: 03m 58s)
  • 12:14 kartik@tin: Started deploy [cxserver/deploy@2842efa]: Update cxserver to 56a012d
  • 11:57 elukey: restart Yarn nodemanager daemons on all the Hadoop worker nodes to pick up the new JVM settings
  • 11:05 _joe_: downgrading python-urllib3 on puppetmaster1001
  • 11:02 akosiaris: upgrade puppet across the trusty fleet to 3.8. T162462
  • 10:34 hashar: Upgrading Jenkins "Email Extension" plugin 2.57.1..2.57.2 and restarting Jenkins
  • 10:07 hashar: Upgrading Jenkins "Git client" plugin 2.3.0..2.4.1 and restarting Jenkins
  • 09:58 switchdc: (volans@neodymium) END TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) Successfully completed
  • 09:58 switchdc: (volans@neodymium) START TASK - switchdc.stages.t07_coredb_masters_readwrite(codfw, eqiad) set core DB masters in read-write mode
  • 09:56 switchdc: (volans@neodymium) END TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) Failed to execute
  • 09:56 switchdc: (volans@neodymium) START TASK - switchdc.stages.t03_coredb_masters_readonly(codfw, eqiad) set core DB masters in read-only mode
  • 09:53 _joe_: removing the old directory of data from ocg1003
  • 09:52 volans: testing t03 and t07 DB-RO/RW stages of switchdc (codfw->eqiad), we are already in that situation, t03 will fail the verification, which is expected
  • 09:52 godog: swift codfw-prod: ms-be2001 - ms-be2012 initial decom - T162785
  • 09:47 _joe_: remounting the new partition under /srv/deployment/ocg/output, cleaning out the old dir. Will cause a service interruption for requests to ocg1003 for a few minutes. T162780
  • 09:42 gehel: starting load on elastic2020 - T149006
  • 09:41 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: wmgUseGettingStarted false for dewiki (duration: 00m 45s)
  • 09:26 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: WMDE Spring campaign - Add logging from WikimediaEvent (duration: 00m 46s)
  • 09:22 hashar: Restarting Jenkins for Matrix related plugins updates (3)
  • 09:12 _joe_: copying data from / to the new partition on ocg1003 T162462
  • 09:10 hashar: Restarting Jenkins for plugins update (2)
  • 09:06 _joe_: creating a LVM volume on ocg1003
  • 09:05 hashar: Restarting Jenkins for plugins update
  • 08:59 addshore@tin: Synchronized php-1.29.0-wmf.19/extensions/WikimediaEvents/extension.json: patch1 & patch2 WMDE Spring campaign PT2/2 (duration: 00m 45s)
  • 08:58 addshore@tin: Synchronized php-1.29.0-wmf.19/extensions/WikimediaEvents/WikimediaEventsHooks.php: patch1 & patch2 WMDE Spring campaign PT1/2 (duration: 00m 47s)
  • 08:52 ema: upgrade cache_upload to linux 4.9 T162029
  • 08:44 gehel: reimaging elastic2020 for testing - T149006
  • 08:24 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Successfully completed
  • 08:22 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 08:14 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Failed to execute
  • 08:14 root@tin: Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-write mode in datacenter eqiad (duration: 00m 35s)
  • 08:13 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-write mode (db_to config already merged and git pulled)
  • 08:09 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed
  • 08:09 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication
  • 08:02 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Successfully completed
  • 08:02 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t05_switch_datacenter(codfw, eqiad) Switch MediaWiki configuration to the new datacenter
  • 08:00 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Successfully completed
  • 07:59 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Restore the TTL of all the MediaWiki discovery records
  • 07:58 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Successfully completed
  • 07:55 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t05_switch_traffic(codfw, eqiad) Switch traffic flow to the appservers in the new datacenter
  • 07:55 _joe_: resuming non-dry run tests of switchdc, all logs from switchdc by me are just tests
  • 06:57 _joe_: the last messages are just a test and nothing was really done, as codfw is already in read-only mode right now
  • 06:57 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Failed to execute
  • 06:57 root@tin: Synchronized wmf-config/db-codfw.php: Set MediaWiki in read-only mode in datacenter codfw (duration: 00m 23s)
  • 06:57 switchdc: (oblivian@sarin) MediaWiki read-only period starts at: 2017-04-12 06:56:53.822926
  • 06:56 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(codfw, eqiad) Set MediaWiki in read-only mode (db_from config already merged and git pulled)
  • 06:53 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute
  • 06:53 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Stop MediaWiki maintenance in the old master DC
  • 06:50 _joe_: testing switchover codfw => eqiad, no destructive actions will be taken
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T17441 (duration: 00m 46s)
  • 06:37 elukey: reimage mw2246.codfw.wmnet mw2152.codfw.wmnet to remove the /tmp partition (codfw videoscalers, switchover prep)
  • 06:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T132416 (duration: 00m 46s)
  • 06:28 _joe_: killing long-running puppet-agent on db2058 too
  • 06:20 _joe_: killing badly-started puppet agents on mc1010, tempdb2001,db1090, db2058, hydrogen, possibly others later
  • 06:13 marostegui: Deploy alter table on db1075 eqiad master (s3, image table) - T160415
  • 06:04 marostegui: Deploy schema change on s6 - db1093 - T17441
  • 06:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 02m 00s)
  • 05:56 marostegui: Deploy alter table on db2108 codfw master (s3, image table) - T160415
  • 04:53 legoktm: started `mwscriptwikiset refreshLinks.php small.dblist` on terbium

2017-04-11

  • 23:58 thcipriani@tin: Synchronized wmf-config/CirrusSearch-production.php: SWAT: Enable deleted archive indexing & searching T109561 PART II (duration: 00m 45s)
  • 23:56 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable deleted archive indexing & searching T109561 PART I (duration: 00m 45s)
  • 23:29 ejegg: updated fundraising-tools from 0a42db3 to a8b8d72
  • 23:27 thcipriani@tin: Synchronized portals: SWAT: Bumping portals to master T128546 (duration: 00m 46s)
  • 23:26 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portals to master T128546 (duration: 00m 46s)
  • 23:23 mutante: ocg: clearing host cache for ocg1001 which is shutdown for hardware repair. (on ocg1003: sudo -u ocg -g ocg nodejs-ocg /srv/deployment/ocg/ocg/mw-ocg-service/scripts/clear-host-cache.js -c /etc/ocg/mw-ocg-service.js ocg1001) T161158
  • 23:15 thcipriani@tin: Synchronized docroot/noc/conf/pageassessments.dblist: SWAT: Adding pageassessments.dblist for maintenance script T159438 PART II (duration: 00m 45s)
  • 23:14 thcipriani@tin: Synchronized dblists/pageassessments.dblist: SWAT: Adding pageassessments.dblist for maintenance script T159438 PART I (duration: 00m 45s)
  • 23:11 mutante: ocg1001 - scheduled downtime in icinga for host and all services, confirmed it's not actively doing things anymore, shutting down for hardware replacement (T161158)
  • 23:10 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Flow beta feature on frwikiversity T162022 (duration: 00m 46s)
  • 23:04 mutante: ocg1001 - apt-get clean for disk space
  • 22:36 mutante: ocg1003 started picking up jobs (mw-ocg-latexer) after it was enabled with gerrit:347781, ocg1001 was disabled in the same change. Also ganglia graphs confirm it. T84723 T161158
  • 22:22 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable alternate RevSlider slider on group0 T160410 (duration: 00m 45s)
  • 22:19 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=ocg1001.eqiad.wmnet
  • 22:17 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict BetaFeature on fiwiki (duration: 00m 46s)
  • 21:23 mobrovac@tin: Finished deploy [restbase/deploy@a4042a6]: Update the legal text in the API docs (duration: 06m 49s)
  • 21:17 mobrovac@tin: Started deploy [restbase/deploy@a4042a6]: Update the legal text in the API docs
  • 21:16 mobrovac@tin: Finished deploy [restbase/deploy@a4042a6]: Staging: Update the legal text in the API docs (duration: 03m 55s)
  • 21:12 mobrovac@tin: Started deploy [restbase/deploy@a4042a6]: Staging: Update the legal text in the API docs
  • 21:12 mobrovac@tin: Finished deploy [restbase/deploy@a4042a6]: Dev cluster: Update the legal text in the API docs (duration: 01m 37s)
  • 21:11 mobrovac@tin: Started deploy [restbase/deploy@a4042a6]: Dev cluster: Update the legal text in the API docs
  • 20:51 _joe_: killed running 'puppet agent t-v' on ruthenium
  • 19:20 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, attempt#2 T160764 (duration: 01m 25s)
  • 19:18 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, attempt#2 T160764
  • 19:11 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, T160764 (duration: 03m 38s)
  • 19:08 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, full deploy, T160764
  • 19:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.20
  • 19:01 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#3 T160764 (duration: 00m 52s)
  • 19:00 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#3 T160764
  • 18:34 elukey: restart hhvm on mw1165 (debug in /tmp/hhvm.5384.bt.)
  • 18:25 demon@tin: Finished scap: testwiki to wmf.20 to bootstrap (duration: 35m 27s)
  • 17:49 demon@tin: Started scap: testwiki to wmf.20 to bootstrap
  • 17:49 demon@tin: Pruned MediaWiki: 1.29.0-wmf.17 [keeping static files] (duration: 00m 16s)
  • 17:41 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Initial Scap3 config deploy - T116335 (duration: 10m 39s)
  • 17:30 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Initial Scap3 config deploy - T116335
  • 17:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 (duration: 00m 57s)
  • 17:14 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 17:08 mobrovac: restbase enabling back puppet for T116335
  • 17:07 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy, take 2 - T116335 (duration: 02m 12s)
  • 17:06 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 17:06 marostegui: Deploy unscheduled alter table on db1044 (s3, image table) - T160415
  • 17:05 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy, take 2 - T116335
  • 17:05 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#2 T160764 (duration: 03m 22s)
  • 17:04 marostegui: Deploy unscheduled alter table on db1015 (s3, image table) - T160415
  • 17:02 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy, take 2 - T116335 (duration: 00m 58s)
  • 17:02 marostegui: Deploy unscheduled alter table on db1038 (s3, image table) - T160415
  • 17:02 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: nope, no wmf.19 for donatewiki. life is hard
  • 17:02 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy, take 2 - T116335
  • 17:01 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001, attempt#2 T160764
  • 17:00 marostegui: Deploy unscheduled alter table on db1035 (s3, image table) - T160415
  • 16:58 marostegui: Deploy unscheduled alter table on db1077 (s3, image table) - T160415
  • 16:56 marostegui: Deploy unscheduled alter table on db1078 (s3, image table) - T160415
  • 16:54 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: donatewiki back to wmf.19. you put your left foot in, you put your left foot out...
  • 16:48 marostegui: Deploy unscheduled alter table on db1093 (adding pl_from index)
  • 16:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 00m 42s)
  • 16:43 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Successfully completed
  • 16:43 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy - T116335 (duration: 01m 33s)
  • 16:42 ppchelko@tin: Finished deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001 T160764 (duration: 04m 28s)
  • 16:41 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Staging: Initial Scap3 config deploy - T116335
  • 16:40 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Switch traffic flow to the appservers in the new datacenter
  • 16:37 ppchelko@tin: Started deploy [electron-render/deploy@5492cdb]: Update to latest upstream, canary on scb2001 T160764
  • 16:37 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: donatewiki still busted
  • 16:35 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: donatewiki back to wmf.19
  • 16:33 mobrovac@tin: Finished deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy - T116335 (duration: 01m 04s)
  • 16:32 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 16:32 mobrovac@tin: Started deploy [restbase/deploy@e470b9f]: Dev Cluster: Initial Scap3 config deploy - T116335
  • 16:28 mobrovac: restbase disabling puppet for T116335
  • 16:27 demon@tin: Synchronized README: no-op, co-master sync (duration: 00m 43s)
  • 16:24 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 16:11 switchdc: (volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed
  • 16:08 switchdc: (volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 16:08 volans: testing the codfw caches wipe+warm, take 2
  • 16:04 demon@tin: Synchronized scap/plugins/clean.py: syncing to both masters (duration: 00m 44s)
  • 15:56 switchdc: (volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Failed to execute
  • 15:54 switchdc: (volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches
  • 15:53 volans: testing the codfw caches wipe+warm: https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Phase_4.1_-_Wipe_caches T160178
  • 15:25 thcipriani@tin: Synchronized README: test sync for new scap version 3.5.5 (duration: 00m 59s)
  • 15:19 godog: upload scap 3.5.5-1 - T127762
  • 15:05 ema: upgrade cp4005 (cache_upload) to linux 4.9 T162029
  • 14:31 moritzm: powercycled restbase1007, stuck during reboot
  • 14:18 moritzm: upgrading restbase1007 to Linux 4.9
  • 13:55 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RCFilters beta feature on fawiki, ruwiki, trwiki, and frwiki (T144458) (duration: 00m 39s)
  • 13:54 ottomata: reimaging stat1004 as jessie
  • 13:53 akosiaris: upgrade puppet agent to 3.8 across the jessie fleet. Do that in stages, starting with parsoid hosts. move on to mw fleet next. T162462
  • 13:51 akosiaris: upgrade puppet agent to 3.8 across the jessie fleet. Do that in stages, starting with parsoid hosts
  • 13:49 godog: roll-upgrade swift to 2.2.0 across eqiad machines - T162609
  • 13:45 hashar: Updating all Jenkins jobs using the git plugin due to JJB change cdfeb7b - https://phabricator.wikimedia.org/T162674
  • 13:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add autopatrolled group to svwiktionary (T161919) (duration: 00m 39s)
  • 13:34 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase default image thumbnail size on Finnish Wikipedia to 250px (T162376) (duration: 00m 39s)
  • 13:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Give sysops ability to promote users to eliminator at fawiki (T162396) (duration: 00m 39s)
  • 13:01 godog: roll-upgrade swift to 2.2.0 across codfw machines - T162609
  • 12:55 moritzm: powercycling wtp2013, stuck during reboot
  • 12:47 elukey: reimage mw2246 (Debian codfw videoscaler) to Trusty
  • 12:46 marostegui: Deploy schema change on db1069 (s7 instance) - T160390
  • 11:42 ema: upgrade cache_misc to linux 4.9 T162029
  • 11:33 elukey: resume reboot of analytics1040->1050 for kernel upgrades
  • 11:27 moritzm: wtp2* to Linux 4.9
  • 11:27 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: NOOP (Beta file only) - Remove redundant wmgUseRevisionSlider in InitialiseSettings-labs (duration: 00m 38s)
  • 11:09 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: NOOP - Remove redundant testwiki from wmgUseLinter (already has group0) (duration: 00m 39s)
  • 11:02 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: NOOP (Beta file only) - Fix some tabs (duration: 00m 39s)
  • 10:46 moritzm: upgrading wtp1020-wtp1024 to Linux 4.9
  • 10:13 moritzm: upgrading wtp1010-wtp1019 to Linux 4.9
  • 09:17 moritzm: install remaining pam updates from jessie point update
  • 09:11 godog: upgrade swift to 2.2.0 on ms-be2001 - T162609
  • 06:58 moritzm: restarted cassandra-a on restbase2004, crashed with "out of heap memory"
  • 06:50 marostegui: Deploy alter table enwiki.revision dbstore1002 - T132416
  • 06:48 moritzm: installing jasper security updates
  • 06:30 elukey: restart hhvm on mw1299 - dump debug in /tmp/hhvm.84379.bt
  • 06:28 marostegui: Deploy alter table enwiki.revision db1072 - T132416
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T132416 (duration: 00m 43s)
  • 06:07 marostegui: Deploy schema change on db1041 (eqiad master) (s7) - T160390
  • 06:02 marostegui: Deploy schema change labsdb1003 (s7) - T160390
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T132416 (duration: 00m 39s)
  • 02:59 bblack: jessie recdns software upgrades complete
  • 02:52 bblack@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 02:51 bblack: upgrading maerlant to pdns-recursor 4.x
  • 02:50 bblack@neodymium: conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 02:48 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Apr 11 02:48:56 UTC 2017 (duration 5m 43s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 16s)
  • 02:37 bblack@neodymium: conftool action : set/pooled=yes; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 02:32 bblack: upgrading chromium to pdns-recursor 4.x
  • 02:31 bblack@neodymium: conftool action : set/pooled=no; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 07m 47s)
  • 02:16 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: service=pdns_recursor,name=nescio.wikimedia.org
  • 02:13 bblack: upgrading nescio to pdns-recursor 4.x
  • 02:06 bblack: jessie-recdns: unpausing upgrade process...

2017-04-10

  • 23:43 bblack: jessie-recdns: upgrade to pdns-recursor 4.x paused - hydrogen updated and in-service; chromium/nescio/maerlant still puppet-disabled. Going to leave things in this state for a while. If something seems amiss, hydrogen can be re-depooled via confctl: confctl select name=hydrogen.wikimedia.org,service=pdns_recursor set/pooled=no
  • 23:34 bblack@neodymium: conftool action : set/pooled=yes; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 23:33 bblack: upgrading hydrogen to pdns-recursor 4.x
  • 23:25 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Set ORES thresholds for fawiki, ruwiki, trwiki (duration: 00m 39s)
  • 23:18 bblack@neodymium: conftool action : set/pooled=no; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 23:04 bblack: puppet disabled on jessie recdns (maerlant, nescio, hydrogen, chromium) for complex upgrade process ( https://gerrit.wikimedia.org/r/#/c/346937/ )
  • 22:45 dapatrick: Deployed patch for T162621 to wmf18 and wmf19
  • 22:04 ejegg: updated CiviCRM from b6c8f3e to 908b9c1
  • 21:37 ejegg: updated payments-wiki from b5bcfa1 to 0b396a3
  • 21:33 gehel: logstash upgrade on all logstash1* nodes completed - T161908
  • 21:31 gehel: upgrading logstash on logstash1003 - T161908
  • 21:22 gehel: upgrading logstash on logstash1002 - T161908
  • 21:17 gehel: logstash upgrade on logstash1001 completed - T161908
  • 21:13 gehel: running puppet on logstash1001 to deploy new logstash plugins - T161908
  • 20:45 ejegg: updated payments-wiki from 9622a4b to b5bcfa1
  • 20:29 gehel: upgrading logstash on logstash1001 - T161908
  • 20:27 ebernhardson: deployed new logstash plugins to logstash100[123]
  • 20:16 bsitzmann@tin: Finished deploy [mobileapps/deploy@9bc8c07]: Update mobileapps to 1695900 (duration: 05m 27s)
  • 20:10 bsitzmann@tin: Started deploy [mobileapps/deploy@9bc8c07]: Update mobileapps to 1695900
  • 19:51 andrewbogott: upgrading qemu and oslo packages on labvirt1002
  • 19:38 gehel: disabling puppet on logstash1* - T161908
  • 19:38 gehel: starting logstash upgrade - some log messages will be lost! - T161908
  • 18:22 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES review tool in hewiki T161621 (duration: 00m 39s)
  • 18:12 thcipriani: mwscript extensions/ORES/maintenance/CheckModelVersions.php hewiki && mwscript extensions/ORES/maintenance/PopulateDatabase.php hewiki
  • 18:06 thcipriani: create ores tables on hewiki
  • 17:51 elukey: restore Hadoop masters to analytics1001
  • 17:16 papaul: testing lvs2002 after mainboard replacement
  • 17:06 gehel@tin: Finished deploy [wdqs/wdqs@1cfbd8d]: (no justification provided) (duration: 01m 22s)
  • 17:04 gehel@tin: Started deploy [wdqs/wdqs@1cfbd8d]: (no justification provided)
  • 16:48 _joe_: not really restarting parsoid, still testing switchdc
  • 16:45 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Rolling restart parsoid in eqiad and codfw
  • 16:02 mobrovac@tin: Finished deploy [restbase/deploy@2c70843]: Initial deployment with Scap3 (duration: 07m 52s)
  • 15:58 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 15:58 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on MediaWiki jobrunners and videoscalers
  • 15:55 mobrovac@tin: Started deploy [restbase/deploy@2c70843]: Initial deployment with Scap3
  • 15:47 cmjohnson1: troubleshooting link cr2-eqiad:xe-3/0/1 {#2014 to asw-b-eqiad:xe-1/1/2 per T162199
  • 15:35 mobrovac@tin: Finished deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3 (duration: 00m 10s)
  • 15:35 mobrovac@tin: Started deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3
  • 15:33 mobrovac: restbase enabling back puppet in prod
  • 15:31 mobrovac@tin: Finished deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3 on staging (duration: 03m 31s)
  • 15:28 mobrovac@tin: Started deploy [restbase/deploy@a8d4d02]: Initial deployment with Scap3 on staging
  • 15:19 mobrovac@tin: Finished deploy [restbase/deploy@a8d4d02]: (no justification provided) (duration: 01m 22s)
  • 15:18 mobrovac@tin: Started deploy [restbase/deploy@a8d4d02]: (no justification provided)
  • 15:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance with full weight (duration: 00m 39s)
  • 15:05 mobrovac: restbase disabling puppet for upgrade to scap3 deploys
  • 15:01 andrewbogott: disabling puppet on labcontrol1001 to raise log levels
  • 14:58 moritzm: upgrading wtp1006-wtp1009 to Linux 4.9
  • 14:52 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 14:52 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on MediaWiki jobrunners and videoscalers
  • 14:48 marostegui: Deploy alter table enwiki.revision db1073 - T132416
  • 14:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T132416 (duration: 00m 39s)
  • 14:47 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Failed to execute
  • 14:46 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 14:45 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Successfully completed
  • 14:45 ema: upgrade cache_maps to linux 4.9 T162029
  • 14:45 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_restore_ttl(codfw, eqiad) Restore the TTL of all the MediaWiki discovery records
  • 14:45 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Failed to execute
  • 14:45 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t09_start_maintenance(codfw, eqiad) Start MediaWiki maintenance in the new master DC
  • 14:39 switchdc: (oblivian@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed
  • 14:39 switchdc: (oblivian@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on MediaWiki jobrunners and videoscalers
  • 14:31 gehel: deploying new postgresql replication check, might generate a few icinga alerts - T162345
  • 14:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1028 - T160390 (duration: 00m 38s)
  • 14:05 elukey: reimage analytics1001 to Debian Jessie
  • 13:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance with low weight (duration: 00m 38s)
  • 13:41 moritzm: upgrading wtp1002-wtp1005 to Linux 4.9
  • 13:30 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgTranslateNumerals false on bhwiki - T160098 (duration: 00m 40s)
  • 13:26 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Create editprotected right on ptwikinews - T162577 (duration: 00m 40s)
  • 13:19 elukey: reboot analytics1040->1050 to pick up the new kernel
  • 13:17 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Increase default thumb size to 250px at nowiki - T155892 (duration: 00m 45s)
  • 13:16 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: pagePreviews: Enable NavPopups gadget detection - T160081 (duration: 00m 40s)
  • 13:00 twentyafterfour: stopped search indexer on iridium to lighten load on m3 databases.
  • 12:55 marostegui: Run pt-table-checksum on s4 - T162593
  • 12:40 akosiaris: upload apertium-spa-cat_2.0.0~r77288-2+wmf1 on apt.wikimedia.org jessie-wikimedia/main
  • 11:11 akosiaris: upload puppet_3.8.5-2~bpo8+1 on apt.wikimedia.org jessie-wikimedia/main
  • 11:00 akosiaris: upload apertium-cat_2.0.0~r77286-1+wmf1, apertium-spa_1.0.0~r77293-1+wmf1 on apt.wikimedia.org/jessie-wikimedia
  • 10:58 gehel: starting load test on elastic2020 - T149006
  • 10:48 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1004.eqiad.wmnet
  • 10:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1004.eqiad.wmnet
  • 10:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1003.eqiad.wmnet
  • 10:23 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1003.eqiad.wmnet
  • 10:23 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 10:12 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 10:11 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1001.eqiad.wmnet
  • 10:03 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 10:02 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 10:01 gehel: rolling restart of maps1* (eqiad) cluster
  • 09:52 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 09:52 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 09:44 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2003.codfw.wmnet
  • 09:44 XioNoX: all interfaces back up on cr2-esams, BGP sessions up as well T162239
  • 09:44 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2002.codfw.wmnet
  • 09:33 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2002.codfw.wmnet
  • 09:29 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 09:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 09:17 XioNoX: remote hands work started to replace the FPC on cr2-esams T162239
  • 09:16 gehel: rolling restart of maps2* cluster
  • 08:52 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=wdqs
  • 08:51 godog: swift codfw-prod: bump ms-be2028 ms-be2039 object weight to 4000 - T158337
  • 08:48 gehel: reimage elastic2020 - T149006
  • 08:43 gehel: rolling restart of maps-test cluster
  • 08:39 elukey: manual failover of Hadoop master daemons from analytics1001 to analytics1002 (T160333)
  • 07:48 _joe_: testing a dry-run of the switchdc software on sarin
  • 07:02 moritzm: installing pam updates from jessie point update
  • 06:26 marostegui: Deploy schema change labsdb1001 (s7) - T160390
  • 06:24 marostegui: Deploy schema change db1028 (s7) - T160390
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1028 - T160390 (duration: 00m 39s)
  • 06:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 - T160390 (duration: 00m 38s)
  • 06:07 marostegui: Deploy schema change db1034 (s7) - T160390
  • 06:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add tempdb2001 to x1 as a slave - T162290 (duration: 00m 38s)
  • 06:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 - T160390 (duration: 00m 39s)
  • 02:49 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Apr 10 02:49:06 UTC 2017 (duration 5m 40s)
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 32s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 08m 17s)

2017-04-09

  • 02:59 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Apr 9 02:59:29 UTC 2017 (duration 5m 35s)
  • 02:53 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 36s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 07m 56s)

2017-04-08

  • 20:56 bblack: removed varnishkafka logs and daemon.log.1 on cp1052 to free disk space and clear alert
  • 17:43 chasemp: service nova-compute restart labvirt1002
  • 17:36 chasemp: nova reset-state on 15 nodepool stuck in deletion nodes, and force-delete
  • 17:29 chasemp: manually delete on labcontrol all instances in delete state on nodepool
  • 17:25 chasemp: openstack server delete 970a86ce-2549-4cf3-be91-1f8558ab1b32 (admin-monitoring stuck in build)
  • 17:21 chasemp: restart rabbitmq on labcontrol1001
  • 17:20 chasemp: restart nova-api on labnet
  • 16:00 bblack: banning obj.http.Content-Type ~ text/html on cache_upload
  • 15:46 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in ulsfo
  • 14:56 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in esams
  • 13:54 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in codfw
  • 13:27 bblack: banning obj.http.X-Orig-Content-Type !~ . on cache_upload in eqiad
  • 11:55 bblack: banning obj.http.Content-Type ~ text/html on cache_upload
  • 10:55 jynus: setting labsdb1001 and labsdb1003 in read only mode
  • 09:55 reedy@tin: Finished scap: Rebuild EP l10n cache for namespace aliases T162481 (duration: 79m 11s)
  • 08:36 reedy@tin: Started scap: Rebuild EP l10n cache for namespace aliases T162481
  • 08:34 reedy@tin: Synchronized wmf-config/CommonSettings.php: T162481 (duration: 00m 39s)
  • 08:33 reedy@tin: Synchronized wmf-config/extension-list: T162481 (duration: 00m 40s)
  • 02:56 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Apr 8 02:56:37 UTC 2017 (duration 5m 33s)
  • 02:51 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 08m 03s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 08m 13s)

2017-04-07

  • 23:16 mutante: gerrit2001 - deleting netmon1001 backup (/srv/netmon1001), stop rsyncd, remove rsyncd config (T125020)
  • 23:06 ejegg: updated DjangoBannerStats from 220f80e to 9e6b117
  • 22:18 reedy@tin: Synchronized php-1.29.0-wmf.19/extensions/EducationProgram/EducationProgram.php: Load wgExtensionMessagesFiles in PHP entry point for mergeMessageLists T162481 (duration: 00m 49s)
  • 20:07 demon@tin: Synchronized README: no-op, testing master sync speed now (duration: 00m 38s)
  • 20:05 demon@tin: Synchronized README: no-op, co-master sync (duration: 00m 39s)
  • 19:41 demon@tin: Finished scap: no-op, final history sync (duration: 23m 05s)
  • 19:18 demon@tin: Started scap: no-op, final history sync
  • 18:40 demon@tin: Synchronized php-1.29.0-wmf.19/includes/specials/: no-op, cleaning up history (duration: 01m 00s)
  • 18:16 demon@tin: Synchronized php-1.29.0-wmf.19/includes/api/: No-op, cleaning up git history (duration: 00m 54s)
  • 17:17 demon@tin: Finished scap: no-op, cleaning up wmf.19 history (duration: 25m 07s)
  • 16:51 demon@tin: Started scap: no-op, cleaning up wmf.19 history
  • 16:29 demon@tin: Synchronized php-1.29.0-wmf.19/extensions/SyntaxHighlight_GeSHi/: no-op, cleaning up history (duration: 00m 44s)
  • 15:32 gehel: reimaging elastic2020 - T149006
  • 14:58 marostegui: Deploy schema change dbstore1001 (s7 wikis) - T160390
  • 14:40 marostegui: Deploy schema change db1033 (already depooled) (s7) - T160390
  • 14:13 elukey: restart hadoop-hdfs-namenode on an1002 (Hadoop Master standby) to pick up new jvm settings
  • 14:07 elukey: restart hadoop-mapreduce-historyserver on an1001 to pick up the new jvm settings
  • 14:02 switchdc: (oblivian@sarin) Executing task switchdc.stages.t00_reduce_ttl(eqiad, codfw): Reduce the TTL of all the MediaWiki discovery records
  • 14:01 _joe_: running tests of the switchdc automation in dry-run mode
  • 14:01 switchdc: (oblivian@sarin) Executing task switchdc.stages.t00_disable_puppet(eqiad, codfw): Stop puppet execution on maintenance, jobqueues
  • 12:52 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable alternate RevisionSlider slider on beta BETA ONLY (duration: 00m 51s)
  • 12:48 bblack: banning cache_upload obj.http.Content-type ~ text/html
  • 12:46 bblack: banning cache_upload obj.http.Content-type == text/html
  • 12:45 bblack: banning cache_upload obj.http.Content-type ~ text
  • 10:53 elukey: increase Redis connection timeout manually (.3s -> .5s) on mw1306 as performance test - T125735
  • 09:22 marostegui: Deploy schema change db1062 (already depooled) (s7) - T160390
  • 08:15 moritzm: upgrade mw1262-mw1265 to HHVM 3.18.2
  • 07:58 elukey: added "notifempty" to /etc/logrotate.d/nginx on cp1008, it should remove cronspam for access_pipe.log.1.gz
  • 07:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=wdqs
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T160390 (duration: 00m 50s)
  • 07:50 marostegui: Deploy schema change db1039 (already depooled) (s7) - T160390
  • 07:21 jynus: reimporting several damaged db tables on s2 T154485
  • 07:17 ariel@tin: Finished deploy [dumps/dumps@af61d8d]: I mean: handle page range generation for wikis with PAGES with hundreds of thousands of revisions (duration: 00m 02s)
  • 07:17 ariel@tin: Started deploy [dumps/dumps@af61d8d]: I mean: handle page range generation for wikis with PAGES with hundreds of thousands of revisions
  • 07:16 ariel@tin: Finished deploy [dumps/dumps@af61d8d]: handle page range generation for wikis with hundreds of thousands of revisions (duration: 00m 03s)
  • 07:16 ariel@tin: Started deploy [dumps/dumps@af61d8d]: handle page range generation for wikis with hundreds of thousands of revisions
  • 06:06 marostegui: Deploy schema change db1094 (s7) - T160390
  • 06:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T160390 (duration: 00m 49s)
  • 03:04 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Apr 7 03:04:52 UTC 2017 (duration 5m 13s)
  • 02:59 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 14m 11s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 09m 54s)

2017-04-06

  • 23:14 dereckson@tin: Synchronized php-1.29.0-wmf.19/extensions/Popups: actions: Correctly delay FETCH_COMPLETE (Gerrit:346832) (duration: 00m 41s)
  • 22:23 maxsem@tin: Finished deploy [tilerator/deploy@9cf2338]: https://gerrit.wikimedia.org/r/#/c/346913/ to test hosts only (duration: 00m 18s)
  • 22:22 maxsem@tin: Started deploy [tilerator/deploy@9cf2338]: https://gerrit.wikimedia.org/r/#/c/346913/ to test hosts only
  • 22:15 ejegg: re-enabled adyen and paypal SmashPig job runners
  • 22:07 ejegg: re-enabled two main dedupe jobs and orphan rectifier
  • afk: set thank-you batch size back to 400
  • 20:52 awight: change thank_you_batch from 400->1
  • 19:42 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Fix ORES threshold settings again (duration: 00m 40s)
  • 19:10 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.19
  • 18:48 legoktm@tin: Synchronized php-1.29.0-wmf.19/extensions/Linter/includes/RecordLintJob.php: Split statsd metrics by wiki - https://gerrit.wikimedia.org/r/#/c/346807 (duration: 00m 42s)
  • 18:17 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgLinterStatsdSampleFactor (duration: 00m 45s)
  • 18:10 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Adjust plwiki, ptwiki ORES thresholds for new model deployment (duration: 00m 40s)
  • 18:00 switchdc: (volans@neodymium) Test switchdc IRC/SAL announcement (2)
  • 17:57 switchdc: (volans@neodymium) Test switchdc IRC/SAL announcement
  • 17:46 maxsem@tin: Finished deploy [tilerator/deploy@71aed11]: https://gerrit.wikimedia.org/r/#/c/346782/ to test hosts (duration: 00m 19s)
  • 17:45 maxsem@tin: Started deploy [tilerator/deploy@71aed11]: https://gerrit.wikimedia.org/r/#/c/346782/ to test hosts
  • 17:41 halfak@tin: Finished deploy [ores/deploy@3396b64]: T161748 (duration: 21m 08s)
  • 17:20 halfak@tin: Started deploy [ores/deploy@3396b64]: T161748
  • 17:19 arlolra@tin: Finished deploy [parsoid/deploy@b5c2a2b]: Updating Parsoid to 56ae82bb (duration: 08m 29s)
  • 17:13 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to medium wikis too - T148609 (duration: 00m 40s)
  • 17:11 arlolra@tin: Started deploy [parsoid/deploy@b5c2a2b]: Updating Parsoid to 56ae82bb
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T160390 (duration: 00m 43s)
  • 16:18 elukey: restart hhvm on mw1227 - debug in /tmp/hhvm.30097.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 16:17 hoo@tin: Synchronized wmf-config/Wikibase-production.php: Try using redisLockManager for test.wikidata.org (T159828) (duration: 00m 39s)
  • 16:11 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Temporarily enable change dispatch logging on testwikidata (duration: 00m 45s)
  • 15:48 hoo@tin: Synchronized wmf-config/Wikibase.php: Fix Wikibase site groups for testwiki and test2wiki (duration: 00m 40s)
  • 15:36 hoo@tin: Synchronized wmf-config/Wikibase.php: Don't set removed Wikibase client settings (duration: 00m 40s)
  • 15:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add tempdb2001 to config files - T162290 (duration: 00m 40s)
  • 15:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add tempdb2001 to config files - T162290 (duration: 00m 39s)
  • 14:55 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 42s)
  • 14:51 hoo: Restarted apache on mwdebug1001 in order to test a potential CACHE_ACCEL issue
  • 14:46 hoo@tin: Synchronized wmf-config/: Don't use "enwiki" as Wikibase site id on testwiki and test2wiki (T94416) (duration: 01m 08s)
  • 14:12 hoo@tin: Synchronized wmf-config/Wikibase.php: Add testwiki and test2wiki to "specialSiteLinkGroups" on testwikidata (T94416) (duration: 00m 40s)
  • 14:04 elukey: reimage analytics1002 to Debian Jessie (Hadoop Master Node standby)
  • 13:44 gehel: re-generating tiles for tasmania on maps codfw cluster
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T160390 (duration: 00m 39s)
  • 13:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1079 - T160390 (duration: 00m 43s)
  • 13:39 marostegui: Deploy schema change db1079 (s7 wikis) - T160390
  • 13:34 hashar: European SWAT completed
  • 13:30 marostegui: Deploy schema change dbstore1002 (s7 wikis) - T160390
  • 13:20 hashar@tin: Synchronized php-1.29.0-wmf.19/extensions/Popups: renderer: Pass event to behavior for processing - T162324 (duration: 00m 51s)
  • 12:51 ema: upgrade cp3007 to linux 4.9 T162029
  • 12:50 moritzm: upgraded mw1261 to HHVM 3.18.2 with cherrypicked fix for stat_cache deadlock, now running with stat_cache enabled again
  • 12:39 ema: rebooting cp2006 again to check for potential issues bringing up network ifaces / loading intel_uncore T162029
  • 12:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1081 original weight - T161088 (duration: 00m 40s)
  • 12:28 ema: cp2009 stuck rebooting, powercycled
  • 12:21 ema: upgrade cp2009 to linux 4.9 T162029
  • 12:16 moritzm: uploaded HHVM 3.18.2 to jessie-wikimedia/experimental
  • 11:51 ema: upgrade cp2006 to linux 4.9 T162029
  • 11:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1081 weight - T161088 (duration: 00m 40s)
  • 10:59 _joe_: running some tests for the switchdc automation
  • 09:33 moritzm: installing freetype security updates on trusty
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1081 weight - T161088 (duration: 00m 39s)
  • 08:41 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,cluster=wdqs
  • 08:40 moritzm: installing glibc updates on trusty
  • 08:37 gehel: shutting down wdqs codfw for data reimport - T162111
  • 08:34 hashar: starting Jenkins on contint1001
  • 08:27 moritzm: rebooting contint1001 to Linux 4.9
  • 08:02 elukey: restart hhvm on mw1194 - dump debug in /tmp/hhvm.1692.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 07:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1081 weight - T161088 (duration: 00m 39s)
  • 07:32 ema: cache_upload: ban all objects with content-type ~ "^text" T162035
  • 07:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 with low weight - T161088 (duration: 00m 48s)
  • 06:29 elukey: restart hhvm on mw1165 (jobrunner) - dump debug in /tmp/hhvm.19449.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 06:09 marostegui: Deploy schema change db2029 (s7 codfw master) - T160390
  • 06:02 marostegui: Configure and start replication on db1081 after the defragment - T161088
  • 05:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 - T160390 (duration: 00m 40s)
  • 05:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 after compression - T153743 (duration: 00m 51s)
  • 04:06 twentyafterfour: restarting apache2 on iridium to apply a minor hotfix
  • 03:06 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Apr 6 03:06:35 UTC 2017 (duration 5m 59s)
  • 03:00 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 15m 46s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 10m 28s)
  • 01:45 mutante: restarting gerrit to pick up config change gerrit:346180 (disable MD5)
  • 00:39 mutante: install1002/2002: deleting /srv/autoinstall/precise.cfg
  • 00:37 mutante: install1002/2002: deleting /srv/tftboot/precise-installer | puppetmaster1002/2001: deleting /var/lib/puppet/volatile/tftpboot/precise-installer (clean up after gerrit:345549)
  • 00:25 twentyafterfour: Phabricator upgrade completed uneventfully, other than the undisputable fact that the new search functionality is awesome.
  • 00:21 mutante: added #wikimedia-traffic channel to stashbot config, test
  • 00:19 mutante: stopping and starting stashbot for config change - added #wikimedia-traffic channel
  • 00:19 twentyafterfour: updating phabricator, the service will be offline for just a few moments.
  • 00:08 twentyafterfour: preparing to update Phabricator to tag release/2017-04-05/1 #phab-2017-04-05

2017-04-05

  • 23:29 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ru.svg: SWAT: Update Russian Wikipedia logo T162036 (duration: 00m 40s)
  • 23:18 demon@tin: Synchronized wmf-config/CommonSettings.php: unbreak dashiki again (duration: 00m 40s)
  • 23:13 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Page previews to stable on Hungarian and Hebrew Wikipedias T162162 (duration: 00m 40s)
  • 23:12 demon@tin: Synchronized php-1.29.0-wmf.19/extensions/Dashiki/: swattttttt (duration: 00m 41s)
  • 22:37 mobrovac: restbase deploying a8d4d027
  • 22:12 ppchelko@tin: Finished deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging, attempt 2 (duration: 07m 06s)
  • 22:05 ppchelko@tin: Started deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging, attempt 2
  • 22:04 ppchelko@tin: Finished deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging (duration: 02m 29s)
  • 22:02 ppchelko@tin: Started deploy [trending-edits/deploy@46544de]: Correctly calculate since parameter and allow to change decay for debugging
  • 21:58 demon@tin: Synchronized wmf-config/CommonSettings.php: bump video transcode timeouts, brion made me do it (duration: 00m 40s)
  • 20:53 ppchelko@tin: Finished deploy [trending-edits/deploy@475a5c0]: Fix edit scorer (duration: 05m 34s)
  • 20:47 ppchelko@tin: Started deploy [trending-edits/deploy@475a5c0]: Fix edit scorer
  • 20:44 ppchelko@tin: Finished deploy [trending-edits/deploy@475a5c0]: Fix edit scorer (duration: 02m 51s)
  • 20:41 ppchelko@tin: Started deploy [trending-edits/deploy@475a5c0]: Fix edit scorer
  • 20:27 arlolra: Updated Parsoid to 32b7c677 (T112043, T161936)
  • 20:18 arlolra@tin: Finished deploy [parsoid/deploy@f2d4eee]: Updating Parsoid to 32b7c677 (duration: 11m 26s)
  • 20:07 arlolra@tin: Started deploy [parsoid/deploy@f2d4eee]: Updating Parsoid to 32b7c677
  • 19:55 ppchelko@tin: Finished deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint (duration: 04m 59s)
  • 19:50 ppchelko@tin: Started deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint
  • 19:50 ppchelko@tin: Finished deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint (duration: 07m 56s)
  • 19:44 XioNoX: pushing https://www.irccloud.com/pastebin/Kecy61aZ/ to cr1/2.codfw for T162099
  • 19:43 awight: reenabled NL Fundraising campaigns
  • 19:42 ppchelko@tin: Started deploy [trending-edits/deploy@d8ca758]: Providing a debug endpoint
  • 19:38 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: roll back donatewiki to wmf.18
  • 19:37 awight: disabled NL campaigns per T162300
  • 19:12 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.19
  • 18:04 mutante: lvs2002 - power off via mgmt (it was down but still showed power as on)
  • 18:02 awight: rerunning paypal_audit
  • 16:57 moritzm: rearmed keyholder on mira after reboot
  • 15:20 elukey: playing with hhvm settings on mwdebug1002
  • 13:05 hashar@tin: Synchronized wmf-config/throttle.php: Add new throttle rule - T162089 (duration: 00m 40s)
  • 12:57 elukey: reimage analytics1035 (journal node) to Debian Jessie
  • 12:44 marostegui: Deploy schema change db2047 (s7) - T160390
  • 12:44 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 - T160390 (duration: 00m 41s)
  • 12:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2054 - T160390 (duration: 00m 44s)
  • 12:04 moritzm: upgrade remaining ca-certificates from jessie point update
  • 12:00 volans: re-enabled puppet on nitrogen/nihal/einsteinium, restarted ircecho
  • 11:42 volans: disabling ircecho for the merge of gerrit/346110 ( T159163 ) and postgres upgrade
  • 11:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 for maintenance (duration: 00m 40s)
  • 09:48 volans: deleted a third swift thumb that was making swiftrepl stuck in a loop: T162122
  • 09:11 elukey: reimage analytics1057 to Debian Jessie
  • 09:04 volans: deleted the 2 swift thumbs that were making swiftrepl stuck in a loop: T162122
  • 08:43 hoo: Ran scap pull on mwdebug1001 to revert local changes to Wikibase maintenance scripts
  • 08:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 after maintenance (duration: 00m 40s)
  • 07:44 marostegui: Migrate dbstore1002 enwiki.page and enwiki.categorylinks from TokuDB to InnoDB+compression - T159430
  • 06:56 marostegui: Stop replication on db1081 for maintenance - T161088
  • 06:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T161088 (duration: 00m 39s)
  • 06:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1081 - T161088 (duration: 00m 39s)
  • 06:36 elukey: restart hhvm on mw1288 (hhvm-dump-debug in /tmp/hhvm.92520.bt.)
  • 06:33 elukey: restart hhvm on mw1223 (hhvm-dump-debug in /tmp/hhvm.2164.bt.)
  • 06:22 marostegui: Deploy schema change db2054 (s7) - https://phabricator.wikimedia.org/T160390
  • 06:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2054 - T160390 (duration: 00m 43s)
  • 06:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2061 - T160390 (duration: 00m 40s)
  • 03:03 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Apr 5 03:03:18 UTC 2017 (duration 5m 53s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.19) (duration: 07m 22s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 08m 47s)
  • 01:27 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes: scap test only, no code changes (duration: 00m 39s)
  • 01:26 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes: scap test only, no code changes (duration: 00m 40s)
  • 01:17 demon@tin: Synchronized scap/plugins/clean.py: fixes (duration: 00m 41s)
  • 00:57 demon@tin: Finished scap: wmf.14 again, testing testing (duration: 26m 48s)
  • 00:30 demon@tin: Started scap: wmf.14 again, testing testing
  • 00:29 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes: scap test only, no code changes (duration: 01m 21s)
  • 00:08 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration/includes/MigrationEditPage.php: for bug fix gerrit 346478 (duration: 00m 56s)

2017-04-04

  • 23:55 tstarling@tin: Synchronized php-1.29.0-wmf.18/extensions/ParserMigration: (no justification provided) (duration: 00m 39s)
  • 23:50 reedy@tin: Synchronized php-1.29.0-wmf.18/extensions/Quiz: Revert "Start implementing Quiz generation using TemplateParser" (duration: 00m 42s)
  • 23:31 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Prepare for related pages config change (T160076) and set $wgOresFiltersThresholds on plwiki and ptwiki (duration: 00m 41s)
  • 23:29 jynus: unscheduled restart of dbstore1002 T162212
  • 23:19 demon@tin: Finished scap: re-syncing old wmf.14-16 branches...cleaned up a little too much (duration: 44m 32s)
  • 22:34 demon@tin: Started scap: re-syncing old wmf.14-16 branches...cleaned up a little too much
  • 22:01 mobrovac: SCB all services updated to use the new service-runner DNS caching
  • 22:00 mobrovac@tin: Finished deploy [trending-edits/deploy@5cc3969]: Bump service-runner to pick up new DNS caching (duration: 06m 40s)
  • 21:55 mobrovac@tin: Finished deploy [graphoid/deploy@5fc26cb]: Bump service-runner to pick up new DNS caching (duration: 02m 15s)
  • 21:54 mobrovac@tin: Started deploy [trending-edits/deploy@5cc3969]: Bump service-runner to pick up new DNS caching
  • 21:53 mobrovac@tin: Started deploy [graphoid/deploy@5fc26cb]: Bump service-runner to pick up new DNS caching
  • 21:52 mobrovac@tin: Finished deploy [mobileapps/deploy@b93488f]: Bump service-runner to pick up new DNS caching (duration: 02m 43s)
  • 21:49 mobrovac@tin: Started deploy [mobileapps/deploy@b93488f]: Bump service-runner to pick up new DNS caching
  • 21:48 mobrovac@tin: Finished deploy [cxserver/deploy@b4184d3]: Bump service-runner to pick up new DNS caching (duration: 03m 37s)
  • 21:45 mobrovac@tin: Started deploy [cxserver/deploy@b4184d3]: Bump service-runner to pick up new DNS caching
  • 21:44 mobrovac@tin: Finished deploy [mathoid/deploy@4eb6d9d]: Bump service-runner to pick up new DNS caching (duration: 03m 27s)
  • 21:40 mobrovac@tin: Started deploy [mathoid/deploy@4eb6d9d]: Bump service-runner to pick up new DNS caching
  • 21:36 mobrovac@tin: Finished deploy [eventstreams/deploy@cf892f4]: Bump service-runner to pick up new DNS caching (duration: 02m 04s)
  • 21:33 mobrovac@tin: Started deploy [eventstreams/deploy@cf892f4]: Bump service-runner to pick up new DNS caching
  • 21:29 mobrovac@tin: Finished deploy [citoid/deploy@7dbbac8]: Bump service-runner to pick up new DNS caching (duration: 03m 13s)
  • 21:27 awight: Finished migrating Fundraising jobs to process-controlb
  • 21:26 mobrovac@tin: Started deploy [citoid/deploy@7dbbac8]: Bump service-runner to pick up new DNS caching
  • 21:20 jynus: applying mariadb MDEV#7383 patch on db1034 T159319
  • 21:18 mutante: running puppet across labvirt10* to replace cert
  • 21:12 mutante: revoked old labvirt-star.eqiad.wmnet cert - created new csr, signed it (CA: wmf_ca_2014_2017). deploying new labvirt-star.eqiad valid for 720 days (T162085)
  • 20:48 catrope@tin: Synchronized php-1.29.0-wmf.19/extensions/Echo/: T162173 (duration: 00m 43s)
  • 20:00 paravoid: rolling out a border-in4 ACL update across core routers (T160055)
  • 19:17 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.19
  • 18:57 awight: enabled pilot process-control job: banner history queue consumer
  • 18:55 demon@tin: Synchronized php: symlink repoint (duration: 00m 39s)
  • 18:55 awight: disabled banner history queue consumer
  • 18:51 demon@tin: Finished scap: wmf.19 bootstrap (duration: 35m 16s)
  • 18:16 demon@tin: Started scap: wmf.19 bootstrap
  • 17:53 andrewbogott: disabling puppet on labvirts to roll out a nova config change
  • 17:40 volans: stopped ircecho to avoid IRC spam
  • 16:03 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 15:59 elukey: reimage analytics1052 (Hadoop Journal node) to Debian Jessie
  • 15:59 jynus: running ANALYZE on revision table on eswiki, cawiki on db1034
  • 15:56 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 for maintenance (duration: 00m 44s)
  • 14:39 moritzm: rebooting praseodymium to Linux 4.9
  • 14:34 moritzm: rebooting xenon to Linux 4.9
  • 14:27 moritzm: rebooting cerium to Linux 4.9
  • 14:06 elukey: reimage analytics1039 and 1051 to Debian Jessie
  • 13:11 akosiaris: add LVS IPs to the url-downloader blacklist now that all nodejs services no longer require it. See https://gerrit.wikimedia.org/r/207490
  • 13:09 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY Enable interwikisorting on BETA wiktionaries (duration: 00m 44s)
  • 13:05 moritzm: installing ca-certificates updates from jessie point update
  • 13:00 ema: cache_upload: ban all objects with content-type ~ "^text" T162035
  • 12:19 ema: upgrade cp2003 to linux 4.9 T162029
  • 11:58 moritzm: installing e2fsprogs update from jessie point update
  • 11:53 elukey: reimage analytics10[36,37,38] to Debian Jessie
  • 11:46 marostegui: Deploy schema change db2061 (s7) - T160390
  • 11:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2061 - T160390 (duration: 00m 44s)
  • 11:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2068 - T160390 (duration: 00m 58s)
  • 09:40 moritzm: rebooting wtp1001 to Linux 4.9
  • 09:10 volans: restarted swiftrepl (repl_all.sh loop) on ms-fe1005
  • 08:47 moritzm: rebooting mw1265 to Linux 4.9
  • 08:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1015 - T159319 (duration: 00m 45s)
  • 07:54 moritzm: rebooting bast2001 to Linux 4.9
  • 07:35 elukey: reimage analytics103[234] to Debian Jessie
  • 06:43 marostegui: Deploy alter table on db2019 (codfw s4 master) - this will generate lag on codfw for s4 - T161683
  • 06:35 marostegui: Deploy schema change db2068 (s7) - T160390
  • 06:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2068 - T160390 (duration: 00m 44s)
  • 06:27 marostegui: Deploy schema change db1015 (s3) - https://phabricator.wikimedia.org/T159319
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Apr 4 02:39:47 UTC 2017 (duration 5m 28s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 14m 27s)
  • 01:31 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Disable LoginNotify on wikis that have no Echo T158878 (duration: 00m 44s)
  • 00:45 mutante: install1002/2002: sudo -i reprepro --delete clearvanished to remove precise distro after merging gerrit:345550

2017-04-03

  • 23:54 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Deploy ParserMigration extension T141586 (for real) (duration: 00m 44s)
  • 23:41 thcipriani@tin: Finished scap: SWAT: Deploy ParserMigration extension T141586 (l10nupdate only) (duration: 22m 24s)
  • 23:19 thcipriani@tin: Started scap: SWAT: Deploy ParserMigration extension T141586 (l10nupdate only)
  • 23:10 thcipriani@tin: Synchronized wmf-config: SWAT: Test LoginNotify on Beta cluster T158878 (duration: 00m 46s)
  • 22:39 volans: completed restart of swift-proxies in eqiad, ms-fe1005 was missing due to swiftrepl stuck/running
  • 22:37 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 22:35 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 22:06 mutante: power cycling lvs2002, it was down and console showed nothing
  • 20:47 bsitzmann@tin: Finished deploy [mobileapps/deploy@20ab197]: Update mobileapps to fdd4e31 (duration: 03m 05s)
  • 20:44 bsitzmann@tin: Started deploy [mobileapps/deploy@20ab197]: Update mobileapps to fdd4e31
  • 19:21 hashar: Finished deployment of project-logos optimization for T161999 / https://gerrit.wikimedia.org/r/#/c/346057/ . And purged the related logos
  • 19:18 hashar@tin: Synchronized static/images/project-logos: Optimize a few project logos - T161999 (duration: 00m 44s)
  • 19:16 andrewbogott: in testlabs, deleted ou=projects,dc=wikimedia,dc=org and ou=roles,dc=wikimedia,dc=org as per T126758
  • 19:15 mutante: phabricator/ops: adding ayounsi to WMF-NDA (project 61) and acl*operations-team (project 29) (T162073)
  • 18:37 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Configure Babel for elwikisource (T161593) (duration: 00m 44s)
  • 18:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Convert reference lists to 'responsive' on hewiki (T161804) (duration: 00m 52s)
  • 17:02 gehel@tin: Finished deploy [wdqs/wdqs@d7c367a]: (no justification provided) (duration: 01m 29s)
  • 17:01 gehel@tin: Started deploy [wdqs/wdqs@d7c367a]: (no justification provided)
  • 15:43 hoo: Updated email for "Lucie Kaffee" on wikitech from work address (wikimedia.de) to known volunteer address (upon request)
  • 14:54 marostegui: Deploy alter table to unify revision table across all the s3 wikis on db1015 - T159319
  • 14:49 ema: cache_upload: ban all objects with content-type: text/html T162035
  • 14:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1015 - T159319 (duration: 00m 44s)
  • 14:26 ariel@tin: Finished deploy [dumps/dumps@905a845]: fix stub recombines, broken by too aggressive 'cleanup' of local vars (duration: 00m 02s)
  • 14:26 ariel@tin: Started deploy [dumps/dumps@905a845]: fix stub recombines, broken by too aggressive 'cleanup' of local vars
  • 14:23 cwd: restarted jenkins to stop ArrayIndexOutOfBoundsException error
  • 14:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T160390 (duration: 00m 51s)
  • 13:38 zfilipin@tin: Synchronized php-1.29.0-wmf.18/extensions/cldr/: SWAT: Translate Atikamekw language name in French (duration: 00m 51s)
  • 13:29 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add NS100 (Portal) to ladwiki, Add rollback user group in fawikisource (duration: 00m 47s)
  • 13:27 hashar: terbium: scap pull for ladwiki namespace additions
  • 13:15 moritzm: upgrading restbase-dev* to Linux 4.9
  • 13:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: enwiki: Temporarily disable Wikidata descriptions (T161805) (duration: 00m 45s)
  • 12:37 elukey: reimage analytics10[29,30,31] to Debian Jessie
  • 12:28 ema: banning 200px-Status_iucn3.1_LC_cs.svg.png from esams frontends T162035
  • 11:49 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:45 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:35 joal@tin: Finished deploy [analytics/refinery@cc73c40]: (no justification provided) (duration: 07m 23s)
  • 11:31 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1008.eqiad.wmnet
  • 11:28 joal@tin: Started deploy [analytics/refinery@cc73c40]: (no justification provided)
  • 11:28 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1008.eqiad.wmnet
  • 11:21 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1007.eqiad.wmnet
  • 11:18 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1007.eqiad.wmnet
  • 11:08 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:05 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1006.eqiad.wmnet
  • 11:04 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 10:43 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,name=ms-fe1005.eqiad.wmnet
  • 10:38 volans: upgrading swift-proxy in eqiad to use discovery URLs
  • 08:46 marostegui: Deploy alter table db1086 (s7) on revision table to unify PK and indexes - T160390
  • 08:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T160390 (duration: 00m 44s)
  • 07:39 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 07:25 marostegui: Deploy alter table dbstore2001 (s7) on revision table to unify PK and indexes - T160390
  • 07:25 _joe_: rebooting copper to clean up at least partially the docker mess
  • 07:14 moritzm: switched default kernel for jessie installations to Linux 4.9
  • 07:06 _joe_: removing stale files on copper for docker, all local images will be wiped away
  • 07:03 moritzm: installing gnutls security updates on trusty
  • 06:51 marostegui: Deploy InnoDB compression on dewiki - db1070 - T150438
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 to compress it - T153743 (duration: 00m 44s)
  • 06:40 _joe_: manually restarted replication for etcd
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1057 entry - T160435 (duration: 00m 44s)
  • 06:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1057 entry - T160435 (duration: 00m 54s)
  • 06:12 marostegui: Remove partitions from metawiki.pagelinks (s7) on codfw master (db2029) this will generate lag on codfw - T153300
  • 05:59 marostegui: Resume pt-table-checksum on wikidata - T161294
  • 05:53 _joe_: powercycling mw2256, unresponsive to ping, blank console
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 09m 39s)

2017-04-02

  • 08:18 elukey: powercycle ms-be1016 (stuck in console, answers pings but not ssh)
  • 07:25 ariel@tin: Finished deploy [dumps/dumps@1ac3fb3]: var/method name cleanups, refactor, pregenerate page ranges for page content jobs, auto retry of failed page ranges (duration: 00m 03s)
  • 07:25 ariel@tin: Started deploy [dumps/dumps@1ac3fb3]: var/method name cleanups, refactor, pregenerate page ranges for page content jobs, auto retry of failed page ranges
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 09m 44s)

2017-04-01

  • 19:01 elukey: restart hhvm on mw1191 (dump debug in /tmp/hhvm.16619.bt.) - threads stuck in HPHP::Treadmill::getAgeOldestRequest
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Apr 1 02:37:30 UTC 2017 (duration 5m 20s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 13m 27s)

2017-03-31

  • 23:11 mutante: ruthenium: logrotate --force /etc/logrotate.d/parsoid (note this is existing file "parsoid" not new file "parsoid_testing") (T161920)
  • 20:18 elukey: stopping jobrunners on mw116[89] and restarting hhvm after https://gerrit.wikimedia.org/r/345881
  • 19:44 Reedy: Stop badge hacks from messing up the entire page on IE 11 on MonoBook T161869
  • 19:42 reedy@tin: Synchronized php-1.29.0-wmf.18/extensions/Echo: Stop badge hacks from messing up the entire page on IE 11 on MonoBook T161689 (duration: 00m 50s)
  • 19:16 mutante: ruthenium also deleting ancient "htmldumper" data, gwicke confirmed it's not needed anymore
  • 18:27 mutante: ruthenium mounting /dev/mapper/ruthenium--vg-tank into /srv/visualdiff/pngs | deleted "mysql" and "dumps" data that was on previously unmounted partition , subbu checked that wasn't needed anymore, we still need logrotate (T161920)
  • 18:14 mutante: ruthenium mounting /dev/mapper/ruthenium--vg-tank which wasnt used at all.. bam.. over 477GB of free space
  • 16:02 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2008.codfw.wmnet
  • 16:00 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2008.codfw.wmnet
  • 15:59 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2007.codfw.wmnet
  • 15:57 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2007.codfw.wmnet
  • 15:55 mobrovac@tin: Finished deploy [trending-edits/deploy@26b5eb4]: Config change: lower min_edits to 15 T160127 (duration: 06m 37s)
  • 15:55 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2006.codfw.wmnet
  • 15:52 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2006.codfw.wmnet
  • 15:49 mobrovac@tin: Started deploy [trending-edits/deploy@26b5eb4]: Config change: lower min_edits to 15 T160127
  • 15:44 volans@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=ms-fe2005.codfw.wmnet
  • 15:22 volans@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=imagescaler-rw,name=eqiad
  • 15:17 volans@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=imagescaler-rw,name=eqiad
  • 15:02 volans@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=ms-fe2005.codfw.wmnet
  • 15:01 oblivian@puppetmaster1001: conftool action : set/ttl=300; selector: dnsdisc=restbase-async
  • 15:01 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=restbase-async,name=eqiad
  • 14:58 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=eqiad
  • 14:56 oblivian@puppetmaster1001: conftool action : set/ttl=10; selector: dnsdisc=restbase-async
  • 14:55 _joe_: reducing ttl on the restbase-async discovery record, then flipping eqiad to active
  • 14:55 volans: deploying the use of discovery URL to swift-proxy hosts in codfw T160178#3136906
  • 14:09 _joe_: performing a rolling restart of changeprop after puppet runs on scb*
  • 14:00 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=restbase-async,name=codfw
  • 13:23 elukey: restart hhvm on mw116[89] after https://gerrit.wikimedia.org/r/345829
  • 13:19 gehel: rolling restart of maps-test cluster for kernel upgrade
  • 13:09 moritzm: rebooting bromine to Linux 4.9
  • 12:10 moritzm: rebooting mwdebug* to Linux 4.9
  • 12:05 moritzm: rebooting pybal-test* to Linux 4.9
  • 11:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after maintenance (duration: 00m 49s)
  • 10:47 akosiaris: uploaded jessie-wikimedia kubernetes_1.4.6-4 on apt.wikimedia.org/jessie-wikimedia
  • 09:59 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2244.codfw.wmnet
  • 09:56 elukey: set pooled=yes mw210[56789], mw2260 and mw2213 (and cleaned up old /srv/mediawiki dirs that were causing rsync spam in scap pull)
  • 09:52 marostegui: Adding rev_timestamp index to revision page db1066 (s1) - T132416
  • 09:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for maintenance (duration: 00m 44s)
  • 09:47 elukey: restart hhvm on mw1197 - hhvm dump debug in /tmp/hhvm.14540.bt. - threads stuck in Treadmill::getAgeOldestRequest (HHVM 3.12)
  • 09:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1062 (duration: 00m 45s)
  • 09:35 godog: fix long-standing swift-account-server REPLICATE backtrace error on ms-be1022 - https://bugs.launchpad.net/swift/+bug/1424108
  • 09:21 godog: delete stray nginx error log with debug logging on thumbor1002
  • 08:28 moritzm: repooled mw1261 for more HHVM 3.18 debugging
  • 07:29 marostegui: Start pt-table-checksum on s5 wikidatawiki - T161294
  • 02:44 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 31 02:44:11 UTC 2017 (duration 5m 30s)
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 15m 54s)

2017-03-30

  • 23:58 aude@tin: Synchronized php-1.29.0-wmf.18/extensions/Wikidata: Fixes for special pages (duration: 02m 15s)
  • 23:07 catrope@tin: Synchronized php-1.29.0-wmf.18/extensions/ORES/modules/: T161706 (duration: 00m 51s)
  • 21:34 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 -> wmf.18
  • 21:16 awight: reenabled ingenico orphan rectifier (jenkins)
  • 21:08 awight: disable ingenico orphan rectifier (jenkins)
  • 21:07 demon@tin: Synchronized php-1.29.0-wmf.18/extensions/Echo/includes/model/Event.php: fix logging class reference (duration: 00m 47s)
  • 19:25 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.18
  • 18:59 MaxSem: Portals were not deployed: https://phabricator.wikimedia.org/T161832
  • 18:54 maxsem@tin: Synchronized portals/: (no justification provided) (duration: 00m 48s)
  • 18:45 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:45 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 43s)
  • 18:38 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:29 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 45s)
  • 18:28 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 44s)
  • 18:24 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:24 godog: swift eqiad-prod add ms-be1028 -> ms-be1039 - T160640
  • 18:23 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 44s)
  • 18:18 maxsem@tin: Synchronized portals: (no justification provided) (duration: 00m 44s)
  • 18:17 maxsem@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 45s)
  • 17:32 elukey: shutdown analytics1039 to apply new thermal paste - T132256
  • 16:17 godog: upgrade thumbor to 0.1.37 on thumbor100[12]
  • 16:03 _joe_: restarting hhvm on mw1191, stuck in HPHP::Treadmill::getAgeOldestRequest
  • 15:59 twentyafterfour@tin: Synchronized php-1.29.0-wmf.17/includes/: sync I7c5c0a refs T159319 (duration: 01m 41s)
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1057 entry from s1 shard - T160435 (duration: 00m 44s)
  • 14:38 godog: run stress test (w/ bonnie) on new swift hw - T160640
  • 14:33 andrewbogott: upgrading nova-compute to 12.0.6 on all labvirts
  • 14:33 moritzm: rebooting restbase2001 to Linux 4.9
  • 14:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 - T17441 (duration: 00m 45s)
  • 14:22 kaldari@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 48s)
  • 14:21 kaldari: sync InitialiseSettings.php to enable cookie blocking on English Wikipedia
  • 14:06 oblivian@tin: Synchronized wmf-config/ProductionServices.php: switch to discovery for cxserver,eventbus (duration: 00m 43s)
  • 14:01 oblivian@tin: Synchronized wmf-config/ProductionServices.php: switch to discovery for some records (duration: 00m 47s)
  • 13:46 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow eliminators and autoreviewers to move a file on ptwiki (T161532) Assign move-categorypages to sysops&bots only on nlwiki (T161551) Enable Multimedia Viewer at officewiki (T160420) (duration: 00m 44s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [cleanup] Remove expired rules (T161530) (duration: 00m 45s)
  • 12:49 moritzm: rebooting bast4001 for kernel update to 4.9
  • 12:18 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=eventbus,name=codfw
  • 12:15 hoo: Updated the Constraints table on Wikidata, per T160506.
  • 12:02 moritzm: installing glibc security updates on trusty
  • 11:55 moritzm: installing jbig2dec security updates
  • 10:03 moritzm: repooling mw1261 for additional test
  • 09:48 root@tin: Synchronized wmf-config/db-eqiad.php: Uniform maintenance message and indentation (duration: 00m 47s)
  • 09:34 root@tin: Synchronized wmf-config/db-codfw.php: Uniform maintenance message and indentation (duration: 00m 44s)
  • 09:06 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet
  • 09:05 elukey: depooling mw1261 (hhvm-dump-debug in /tmp/hhvm.98736.bt.)
  • 08:38 moritzm: repooling mw1261 to reproduce hhvm deadlock with higher debug level
  • 08:13 marostegui: Convert UNIQUE keys to PK on db1090 (s2) - T17441
  • 08:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 - T17441 (duration: 00m 44s)
  • 07:43 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2021.codfw.wmnet
  • 07:41 gehel: pull elastic2021 back into active duty - T149006
  • 07:05 ema: upgrading twisted to 16.2.0 on lvs100[123] (eqiad primaries) T160433
  • 06:45 moritzm: installing apparmor security updates on trusty
  • 06:25 marostegui: Logging backwards for the record: restart mysql on db1047 for maintenance - T160454
  • 05:56 marostegui: Deploy schema change on db2014 - codfw master (this will generate lag on codfw) - T73563
  • 05:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T17441 (duration: 00m 45s)
  • 05:51 ema: upgrading twisted to 16.2.0 on lvs100[456] (eqiad secondaries) T160433
  • 03:03 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 30 03:03:31 UTC 2017 (duration 5m 49s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 13m 43s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 07m 20s)
  • 01:52 twentyafterfour: phd fixed on iridium. libphutil was out of sync with phd source
  • 01:11 twentyafterfour: running `puppet agent --test` on iridium
  • 00:10 twentyafterfour: Phabricator update completed.
  • 00:10 mutante: ruthenium low on disk space, because /srv/visualdiff/pngs (parsoid-vd-tests) is pretty large and /srv isn't a separate mount
  • 00:06 twentyafterfour: updating phabricator on iridium
  • 00:04 mutante: ruthenium - apt-get clean gets a little more disk space

2017-03-29

  • 23:46 reedy@tin: Synchronized php-1.29.0-wmf.18/extensions/Quiz: Fix undefined variable stateObject T161735 (duration: 00m 49s)
  • 23:43 reedy@tin: Synchronized wmf-config/CommonSettings.php: Dont use EP_NS in CommonSettings (duration: 00m 44s)
  • 21:23 krinkle@tin: Synchronized errorpages/: I15295835a1a (duration: 00m 44s)
  • 20:56 thcipriani@tin: Synchronized php-1.29.0-wmf.18/extensions/ProofreadPage/includes/page/ProofreadPagePage.php: Makes sure to always return a Title in ProofreadPagePage::findIndexTitle T161734 (duration: 00m 46s)
  • 20:35 halfak@tin: Finished deploy [ores/deploy@554ea12]: T160638 (duration: 18m 40s)
  • 20:31 arlolra: Updated Parsoid to b1b27146 (T161558, T160207, T153798)
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@bc798dc]: Updating Parsoid to b1b27146 (duration: 07m 26s)
  • 20:16 halfak@tin: Started deploy [ores/deploy@554ea12]: T160638
  • 20:13 arlolra@tin: Started deploy [parsoid/deploy@bc798dc]: Updating Parsoid to b1b27146
  • 20:09 ppchelko@tin: Finished deploy [changeprop/deploy@ef62908]: Fix metrics for regex topics (duration: 00m 56s)
  • 20:08 ppchelko@tin: Started deploy [changeprop/deploy@ef62908]: Fix metrics for regex topics
  • 19:46 ppchelko@tin: Finished deploy [changeprop/deploy@1150cf5]: Config: Enabling regex-based topic subscription (duration: 01m 45s)
  • 19:44 ppchelko@tin: Started deploy [changeprop/deploy@1150cf5]: Config: Enabling regex-based topic subscription
  • 19:16 awight: re-run today's ingenico audit job
  • 19:15 awight: pick at paypal scab: re-run audit parser
  • 19:10 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to 1.29.0-wmf.17
  • 19:05 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.18
  • 18:45 bblack: varnish active/active deploy done ( https://gerrit.wikimedia.org/r/#/c/339667/ ) - all caches running the new code, puppet re-enabled, etc.
  • 18:43 hoo: Started a Wikidata TTL dump run on snapshot1007 using Zend (due to T161695).
  • 18:22 catrope@tin: Synchronized php-1.29.0-wmf.18/includes/page/WikiPage.php: T159319 (duration: 00m 44s)
  • 18:22 catrope@tin: Synchronized php-1.29.0-wmf.18/includes/Title.php: T159319 (duration: 00m 46s)
  • 18:10 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Save->Publish on Wikipedias except dewiki and enwiki (T131132); set wgOOUIEditPage false everywhere (duration: 00m 57s)
  • 17:58 awight: disabling PayPal audit parser
  • 17:56 ppchelko@tin: Finished deploy [changeprop/deploy@e4547cd]: Support regexed topics (duration: 00m 55s)
  • 17:55 ppchelko@tin: Started deploy [changeprop/deploy@e4547cd]: Support regexed topics
  • 17:31 godog: remove ge-3/0/27 from interface-range labs-instance-ports (now for ms-be1031)
  • 17:25 bblack: puppet disabled on all cp* ahead of careful deploy for https://gerrit.wikimedia.org/r/#/c/339667/
  • 17:12 mutante: removing parsoid-tests.wikimedia.org from DNS - replaced by more specific parsoid-rt-tests and parsoid-vd-tests
  • 17:12 nuria@tin: Finished deploy [eventlogging/analytics@2874077]: (no justification provided) (duration: 00m 03s)
  • 17:12 nuria@tin: Started deploy [eventlogging/analytics@2874077]: (no justification provided)
  • 17:11 elukey: restarting nginx on eqiad appservers to pick up the new certs
  • 16:55 marostegui: Stop eventlog syncs to db1047 and dbstore1002 for maintenance - T160454
  • 16:53 marostegui: Disable puppet on db1047 and dbstore1002 for maintenance - T160454
  • 16:51 elukey: upgrading ssl cert appservers.svc.eqiad.wmnet to include the new discovery endpoints
  • 16:51 _joe_: actually performing the parsoid rolling restart in codfw
  • 16:31 _joe_: rolling restart of parsoid in codfw
  • 14:32 moritzm: installing apparmor security updates on trusty
  • 14:31 elukey: upgrading ssl cert api.svc.eqiad.wmnet to include the new discovery endpoints
  • 14:14 andrewbogott: disabling puppet on labs hosts for a staged rollout of https://gerrit.wikimedia.org/r/#/c/345275/
  • 14:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 - T17441 (duration: 00m 44s)
  • 13:49 elukey: upgrading ssl cert rendering.svc.eqiad.wmnet to include the new discovery endpoints
  • 13:08 reedy@tin: Synchronized wmf-config/CommonSettings.php: use wfLoadExtension for VisualEditor (duration: 00m 44s)
  • 12:53 elukey: reimage analytics1045 to Debian Jessie
  • 12:52 _joe_: depooling wtp1001 to test puppet/confd transfer of responsibilities
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T17441 (duration: 00m 44s)
  • 11:30 hoo: Started a Wikidata JSON dump run on snapshot1007 using Zend (due to T161695).
  • 11:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T17441 (duration: 00m 44s)
  • 11:03 elukey: upgrading ssl cert appservers.svc.codfw.wmnet to include the new discovery endpoints
  • 11:01 moritzm: Linux 4.9 uploaded for jessie-wikimedia (along with new meta package linux-meta-4.9 and updated firmware)
  • 11:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T17441 (duration: 00m 44s)
  • 10:27 godog: reimage netmon1001 with jessie
  • 10:12 ema: emptying /srv/log/parsoid/main.log.1 (3.2G!) on ruthenium to reclaim some disk space
  • 10:11 elukey: upgrading ssl cert api.svc.codfw.wmnet to include the new discovery endpoints
  • 09:39 ema: upgrading twisted to 16.2.0 on lvs200[123] (codfw primaries) T160433
  • 08:54 ema: upgrading twisted to 16.2.0 on lvs200[456] (codfw secondaries) T160433
  • 08:39 ema: apt.w.o: set digest-algo to sha256 in gpg.conf T132325
  • 08:29 elukey: upgrading ssl cert rendering.svc.codfw.wmnet to include the new discovery endpoints
  • 07:57 marostegui: Convert s6 UNIQUE keys into PK on db1093 - T17441
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T17441 (duration: 00m 54s)
  • 06:01 marostegui: Keep converting UNIQUE keys to PK on s4 - db1091 - T17441
  • 03:15 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 29 03:15:16 UTC 2017 (duration 5m 53s)
  • 03:09 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.18) (duration: 14m 55s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 13m 41s)
  • 01:41 krinkle@tin: Synchronized errorpages/404.php: Match 404.html and default.html - Id58e25afbe (duration: 00m 44s)
  • 01:16 mutante: rsyncing librenms/torrus/smokeping app data from netmon1001 to gerrit2001. adding alias "syncit" to do it all at once (T125020)
  • 00:57 paravoid: Removing upload.wikimedia.org/index.html ("swift delete root index.html") from both eqiad/codfw

2017-03-28

  • 23:22 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/NavigationTiming/modules/ext.navigationTiming.js: SWAT: ext.NavigationTiming: Restore unsampled Save Timing T161368 (duration: 00m 45s)
  • 23:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable header version 2 on all wikis T160471 (duration: 00m 45s)
  • 22:45 urandom: T111113: Restarting Cassandra instances, eqiad row 'd' Yes Done
  • 22:21 mutante: DNS - creating new language "dty" (T161529) - running "authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones" to trigger re-creation of zone files after change in langs.tmpl. (gerrit:345077) | https://www.ethnologue.com/language/dty
  • 22:19 mutante: DNS - creating new language "dty" (T160865) - running "authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones" to trigger re-creation of zone files after change in langs.tmpl. (gerrit:345077) | https://www.ethnologue.com/language/dty
  • 21:55 urandom: T111113: Restarting Cassandra instances, eqiad row 'd'
  • 21:55 urandom: T111113: Restarting Cassandra instances, eqiad row 'b' Yes Done
  • 21:18 andrewbogott: upgraded nova-compute on labvirt1014 because it contains a long-awaited bugfix
  • 21:08 urandom: T111113: Restarting Cassandra instances, eqiad row 'b'
  • 21:08 urandom: T111113: Restarting Cassandra instances, eqiad row 'a' Yes Done
  • 20:24 mutante: ms-fe1001 thru ms-fe1004 - scheduled last downtime for host and services in icinga - shutdown -h now, turn them off, revoke puppet certs, salt-keys... (T160986)
  • 20:22 mutante: mc1019 - puppet fail due to Failed resource /etc/redis/replica since 4 days
  • 20:21 urandom: T111113: Restarting Cassandra instances, eqiad row 'a'
  • 20:21 mutante: copper - puppet errors due to Failed resource /var/lib/docker/devicemapper ??
  • 20:19 mutante: mwdebug1002 - same, was low on disk space, 'apt-get clean' freed > 3GB
  • 20:18 mutante: mwdebug1001 - was low on disk space, 'apt-get clean' - freed about 4GB
  • 20:15 mutante: mw1261 - depooled
  • 20:14 mutante: mw1261 runs with HHVM 3.18 - which seems to have a bug leading to a deadlock every 4-5 hours
  • 20:14 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.18
  • 20:13 mutante: mw1261 HHVM crash as predicted by Moritz - ran sudo hhvm-dump-debug. Backtrace saved as /tmp/hhvm.79460.bt.
  • 20:06 mutante: ms-fe100[1-4] - disable/stop puppet, stop salt minion, decom (T160986)
  • 19:57 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.18 and rebuild l10n cache (duration: 40m 19s)
  • 19:37 mobrovac: restbase deploying d477f495
  • 19:33 urandom: T111113: Restarting Cassandra instances, codfw row 'd' Yes Done
  • 19:17 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.18 and rebuild l10n cache
  • 18:45 urandom: T111113: Restarting Cassandra instances, codfw row 'd'
  • 18:44 urandom: T111113: Restarting Cassandra instances, codfw row 'c' Yes Done
  • 18:18 ppchelko@tin: Finished deploy [changeprop/deploy@1689d86]: Rename event field in logs (duration: 00m 52s)
  • 18:18 ppchelko@tin: Started deploy [changeprop/deploy@1689d86]: Rename event field in logs
  • 17:53 urandom: T111113: Restarting Cassandra instances, codfw row 'c'
  • 17:22 thcipriani: starting branch cut for 1.29.0-wmf.18
  • 17:07 godog: swift codfw-prod: bump ms-be2028 ms-be2039 object weight to 3000 - T158337
  • 17:06 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2021.codfw.wmnet
  • 16:39 urandom: T111113: Restarting remaining Cassandra instances, rack 'b', codfw (restbase20{02,07,10})
  • 16:19 urandom: T111113: Restarting Cassandra on restbase2001 to apply mandatory client encryption (canary)
  • 15:56 gehel: banning elastic2021 to run same tests as elastic2020 - T149006
  • 14:41 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet
  • 14:40 marostegui: Convert UNIQUE keys into PK on db1091 (commonswiki) - T17441
  • 14:38 ppchelko@tin: Finished deploy [changeprop/deploy@bfbaa17]: Increase log level for processing failures (duration: 01m 07s)
  • 14:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T17441 (duration: 00m 43s)
  • 14:38 elukey: ran restart-hhvm on mw1242, hhvm threads stuck (dump debug in /tmp/hhvm.9008.bt.) - HHVM 3.12
  • 14:37 ppchelko@tin: Started deploy [changeprop/deploy@bfbaa17]: Increase log level for processing failures
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 - T17441 (duration: 00m 43s)
  • 13:44 elukey: started hhvm on mw1261 (still depooled) - no hhvm process running
  • 13:29 RoanKattouw: Ran initUserPreference.php -s ores-enabled -t rcenhancedfilters and -s ores-enabled -t oresHighlight on plwiki and ptwiki
  • 13:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RCFilters beta feature on plwiki and ptwiki T158336 (duration: 00m 43s)
  • 12:58 moritzm: depooled mw1261
  • 10:39 ema: upgrading twisted to 16.2.0 on lvs3001 and lvs3002 (esams primaries) T160433
  • 10:36 ema: upgrading twisted to 16.2.0 on lvs3003 and lvs3004 (esams secondaries) T160433
  • 10:27 marostegui: Convert dewiki UNIQUE keys into PK on db1092 - https://phabricator.wikimedia.org/T17441
  • 10:15 elukey: Switching hue.w.o's backend (cache misc) from analytics1027 to thorium - T159527
  • 10:10 moritzm: upgraded mw1262 to HHVM 3.18
  • 08:48 marostegui: Convert wikidatawiki UNIQUE keys into PK on db1092 - T17441
  • 08:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T17441 (duration: 00m 44s)
  • 08:29 akosiaris: enable IGMP snooping on all VLANs on asw2-d-eqiad. T133387
  • 07:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T17441 (duration: 00m 43s)
  • 07:18 moritzm: installing eject security updates on trusty hosts
  • 06:11 marostegui: Keep converting unique keys into PK on db1089 - T17441
  • 06:01 marostegui: Deploy schema change on s2.enwiktionary.templatelinks - on codfw master, this will generate lag on codfw slaves (which have been silenced) - T154097
  • 05:52 marostegui: Run pt-table-checksum on es2 - T161510
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Mar 28 02:39:53 UTC 2017 (duration 5m 28s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 12m 37s)
  • 00:36 reedy@tin: Synchronized private: Remove mwblocker.log (duration: 00m 44s)
  • 00:34 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove $wgProxyList (duration: 00m 43s)

2017-03-27

  • 23:06 ebernhardson@tin: Synchronized php-1.29.0-wmf.17/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT T160006, turning off cirrussearch AB test for sistersearch (duration: 00m 44s)
  • 22:00 bawolff: deployed patch T151735
  • 21:27 andrewbogott: disabling puppet on labvirt* and labcontrol* to stagger roll out of https://gerrit.wikimedia.org/r/#/c/344689/
  • 20:29 arlolra: Updated Parsoid to 6eaad376 (T160599, T161178, T133267)
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@371ba4f]: Updating Parsoid to 6eaad376 (duration: 07m 06s)
  • 20:14 arlolra@tin: Started deploy [parsoid/deploy@371ba4f]: Updating Parsoid to 6eaad376
  • 19:57 mutante: ruthenium/varnish misc - remove parsoid-tests.wikimedia.org server_name / backend - replaced by parsoid-rt-test and parsoid-vd-tests
  • 19:56 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Linter: whitelist parsoid canaries too - https://gerrit.wikimedia.org/r/#/c/344998/ - T160573 (duration: 00m 44s)
  • 18:08 mobrovac@tin: Finished deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config, deploying to scb2004 (duration: 00m 43s)
  • 18:07 mobrovac@tin: Started deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config, deploying to scb2004
  • 18:05 mobrovac@tin: Finished deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config (duration: 03m 29s)
  • 18:02 mobrovac@tin: Started deploy [mobileapps/deploy@92f693c]: Remove the proxy from the config
  • 17:19 mutante: tin/mira: welcome new mediawiki deployer 'musikanimal' (T161181)
  • 17:03 gehel@tin: Finished deploy [wdqs/wdqs@d07586c]: (no justification provided) (duration: 01m 26s)
  • 17:02 gehel@tin: Started deploy [wdqs/wdqs@d07586c]: (no justification provided)
  • 16:40 mobrovac: restbase deploying f53bec41
  • 16:34 _joe_: cleaned the bc cache on mw1261, restarted hhvm and repooled
  • 15:46 mobrovac@tin: Finished deploy [mobileapps/deploy@aed916b]: Add discovery.wmnet to no_proxy_list (duration: 04m 05s)
  • 15:42 mobrovac@tin: Started deploy [mobileapps/deploy@aed916b]: Add discovery.wmnet to no_proxy_list
  • 15:38 mobrovac@tin: Finished deploy [cxserver/deploy@40e86ad]: Add discovery.wmnet to no_proxy_list (duration: 02m 39s)
  • 15:35 mobrovac@tin: Started deploy [cxserver/deploy@40e86ad]: Add discovery.wmnet to no_proxy_list
  • 14:33 dcausse: rebuilding ttmserver index in elastic@codfw from wasat
  • 14:14 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add autopatrolled group to svwiki (T161210) (duration: 00m 50s)
  • 13:40 dereckson@tin: Synchronized wmf-config/CommonSettings.php: no-op, to force resync (duration: 00m 43s)
  • 13:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Fix wgLogoHD 2.5x key (T161416) (duration: 00m 43s)
  • 13:26 dereckson@tin: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] enable more accurate regex timeout (T161095) (duration: 00m 44s)
  • 13:22 moritzm: repooled mw1261 (now that fix for lcfirst() issue from T161095 is deployed)
  • 13:20 dereckson@tin: Synchronized static/images/project-logos/: Add khw.wikipedia logos to static resources (T160865) (duration: 00m 43s)
  • 13:19 dereckson@tin: Synchronized wmf-config/: [es5 upgrade] step 5: restore normal operations (T157479, 2/2) (duration: 00m 49s)
  • 13:18 dereckson@tin: Synchronized tests/cirrusTest.php: [es5 upgrade] step 5: restore normal operations (T157479, 1/2) (duration: 00m 48s)
  • 13:08 dereckson@tin: Synchronized wmf-config/CirrusSearch-common.php: Updates and typo fixes to CirrusSearch-common.php (gerrit:344933) (duration: 00m 43s)
  • 12:47 marostegui: Run pt-table-checksum for a couple of hundred small wikis in es2 - T161510
  • 12:44 jynus: deploying semi-sync replication to all hosts on codfw T161007
  • 12:21 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata/vendor/composer/installed.json: Third try for Update Wikidata - fix term validation (T161263) Part III (duration: 00m 43s)
  • 12:19 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata/extensions/Wikibase/: Third try for Update Wikidata - fix term validation (T161263) Part II (duration: 01m 32s)
  • 12:19 godog: upgrade grafana to 4.2.0 on krypton T161193
  • 12:17 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata/composer.lock: Third try for Update Wikidata - fix term validation (T161263) Part I (duration: 00m 44s)
  • 12:16 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging$ scap sync-file php-1.29.0-wmf.17/extensions/Wikidata/composer.lock 'Third try for Update Wikidata - fix term validation (T161263) Part I'
  • 12:02 _joe_: experimenting with cxserver config on scb2004
  • 12:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1053 T160415 - T73563 (duration: 01m 07s)
  • 11:51 marostegui: Deploy new index on db1040, s4 primary master table: commonswiki.image - T160415
  • 11:14 akosiaris: upgraded bacula-sd to 7.4.3+dfsg-1+sid1~bpo8+1 on heze as well
  • 11:03 akosiaris: performed bacula schema change on db1016 for database bacula
  • 11:00 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=ores
  • 10:59 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=.*-ro
  • 10:54 akosiaris: upgrade bacula director and storage daemon to 7.4.3
  • 10:47 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=(kartotherian|search)
  • 10:20 hashar: Restarting Jenkins to drop the Throttle Concurrent Builds plugin - T158596
  • 10:16 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata: Second try for Update Wikidata - fix term validation (T161263) (duration: 02m 05s)
  • 10:15 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging/php-1.29.0-wmf.17$ scap sync-dir php-1.29.0-wmf.17/extensions/Wikidata "Second try for Update Wikidata - fix term validation (T161263)"
  • 09:56 _joe_: rolling restart of restbase in codfw to pick up the new parsoid config
  • 09:54 ladsgroup@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata: Update Wikidata - fix term validation (T161263) (duration: 02m 22s)
  • 09:53 mforns@tin: Finished deploy [analytics/aqs/deploy@a5e1775]: (no justification provided) (duration: 01m 41s)
  • 09:52 Amir1: start of ladsgroup@tin:/srv/mediawiki-staging/php-1.29.0-wmf.17$ scap sync-dir php-1.29.0-wmf.17/extensions/Wikidata "Update Wikidata - fix term validation (T161263)"
  • 09:52 mforns@tin: Started deploy [analytics/aqs/deploy@a5e1775]: (no justification provided)
  • 09:35 mforns@tin: Finished deploy [analytics/aqs/deploy@80a9de4]: (no justification provided) (duration: 01m 49s)
  • 09:33 mforns@tin: Started deploy [analytics/aqs/deploy@80a9de4]: (no justification provided)
  • 08:42 jynus: deploying semisync replication to all hosts (eqiad and codfw) on s6 T161007
  • 08:38 hashar@tin: Synchronized php-1.29.0-wmf.17/languages/classes/LanguageKk.php: Check for string initialization in lcfirst() for HHVM 3.18 - T161095 (duration: 00m 52s)
  • 08:16 marostegui: Deploy alter tables on db1089 (depooled) for a bunch of tables to convert UNIQUE keys into PK for testing - T17441
  • 08:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2014 after maintenance (duration: 00m 43s)
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T17441 (duration: 00m 45s)
  • 07:17 elukey@puppetmaster1001: conftool action : set/pooled=active; selector: name=mw2256.codfw.wmnet
  • 06:26 marostegui: Deploy alter table s4 (commonswiki) db1053 - https://phabricator.wikimedia.org/T73563 https://phabricator.wikimedia.org/T160415
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 T160415 - T73563 (duration: 00m 56s)
  • 06:18 _joe_: disabling puppet on authdns while merging a dns change
  • 06:06 marostegui: Resume pt-table-checksum on dewiki - T161294
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 27 02:26:53 UTC 2017 (duration 5m 25s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 08m 22s)

2017-03-26

  • 10:06 _joe_: restarting apache2 on puppetmaster2002, passenger probably stuck
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 26 02:26:05 UTC 2017 (duration 5m 25s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 07m 20s)

2017-03-25

  • 22:28 legoktm@tin: Synchronized wmf-config/: No-op labs only changes https://gerrit.wikimedia.org/r/#/c/344788/ (duration: 00m 52s)
  • 20:08 Krinkle: Ran mwscript deleteEqualMessages.php on public wikis (T45917) - deleted 5 pages across 5 wikis
  • 13:43 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rule for BordeauxJS (T161402) (duration: 00m 50s)
  • 03:15 Krinkle: Re-create optimised indexes for xhgui in mongodb on tungsten per https://github.com/perftools/xhgui/tree/v0.7.0#installation (lost after T161196)
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Mar 25 02:36:46 UTC 2017 (duration 5m 27s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 12m 22s)

2017-03-24

  • 20:03 krinkle@tin: Synchronized php-1.29.0-wmf.17/StartProfiler.php: touch - T161286 - (symlink) (duration: 00m 42s)
  • 19:55 krinkle@tin: Synchronized wmf-config/StartProfiler.php: T161286 - include hostname (duration: 00m 49s)
  • 19:33 krinkle@tin: Synchronized wmf-config/StartProfiler.php: touch - T161286 - hhvm cache maybe? (duration: 00m 43s)
  • 18:10 ejegg: updated CiviCRM from d3c439f to b6c8f3e
  • 17:50 ebernhardson: restart elasticsearch on relforge100[12] to test reindex api over https
  • 15:27 jynus: running unscheduled ALTER TABLE on arbcom_cswiki.archive T104756
  • 13:47 moritzm: installing freetype security updates on trusty
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 T160415 - T73563 (duration: 00m 44s)
  • 12:13 marostegui: Start first run of pt-table-checksum on s5 (dewiki) - T161294
  • 11:18 godog: upgrade grafana to 4.2.0 on labmon1001 - T161193
  • 09:39 godog: pool prometheus100[34] - T148408
  • 08:23 marostegui: Deploy schema change s4 db2019 (codfw master) - T160415
  • 08:01 ema: upgrading twisted to 16.2.0 on lvs4001 and lvs4002 (ulsfo primaries) T160433
  • 07:49 marostegui: Deploy schema change s4 on db1069 and db1056 - T160415 - T73563
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056, repool db1059 T160415 - T73563 (duration: 00m 43s)
  • 07:42 moritzm: installing git updates on trusty
  • 07:35 dcausse: cirrus: refresh comp suggest indices in elastic@codfw
  • 07:26 ema: upgrading twisted to 16.2.0 on lvs4003 and lvs4004 (ulsfo secondaries) T160433
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1070, db1071 and db1082 - T137191 (duration: 00m 43s)
  • 06:10 Krinkle: Removing xhgui.results entries before 1-Dec-2016 finished. Running xhgui->command(compact=>results) now. T161196
  • 02:31 Krinkle: Reverted patch - https://gerrit.wikimedia.org/r/#/c/344569/
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 08m 35s)
  • 02:28 Krinkle: Reminder to incident doc writer: It was difficult to figure out what the last "real" patch was; the scap message for SAL is manually written (it does not say which commit in which repo), and git log contains noise from security patches. We need simple revert options from the flat git tree at /srv/mediawiki
  • 02:26 Krinkle: Reminder to incident doc writer: Logstash was (and is) not responsive, serving Kibana-rendered errors about logstash "Service unavailable"
  • 02:25 Krinkle: All apaches are back up
  • 02:24 krinkle@tin: Synchronized php-1.29.0-wmf.17/extensions/Wikidata: revert (duration: 02m 34s)
  • 02:24 MaxSem: Killed l10nupdate on tin, was blocking emergency pushes
  • 02:22 Krinkle: Hard-killed all l10nupdate processes and rm'ed scap lock
  • 02:11 Krinkle: Removing xhgui.results entries from before 1 December 2016 in MongoDB on tungsten (T161196)
  • 01:45 mutante: bacula - on helium, attempt to start bacula-director process, attempt to fix permissions on key files as codified in director.pp
  • 01:40 catrope@tin: Finished scap: Wikidata cherry-picks (with i18n) (duration: 25m 03s)
  • 01:15 catrope@tin: Started scap: Wikidata cherry-picks (with i18n)

2017-03-23

  • 23:26 Krinkle: Removing xhgui.results entries from before 1 June 2016 (T161196)
  • 23:12 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/ORES: SWAT: Stats: Invert "false" thresholds so they are correct T161250 (duration: 00m 52s)
  • 23:05 Pchelolo: update RESTBase to 2536b25c7 - eqiad
  • 22:56 Pchelolo: update RESTBase to 2536b25c7 - staging
  • 22:39 Pchelolo: update RESTBase to 2536b25c7 - codfw
  • 21:36 krinkle@tin: Synchronized wmf-config/StartProfiler.php: (no justification provided) (duration: 00m 53s)
  • 21:05 ejegg: rolled back payments-wiki to 9622a4b
  • 21:00 ejegg: updated payments from 9622a4b to bb956bf
  • 20:16 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.17
  • 19:34 thcipriani@tin: Synchronized php: Swap symlink for 1.29.0-wmf.17 (duration: 00m 43s)
  • 19:11 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.17
  • 18:51 bblack: systemctl enable+start of lldpd on cp2009, cp1051, cp1061 (mysteriously dead and disabled)
  • 18:16 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Remove exception on Other Projects sidebar for Dutch Wikipedia (T159634) (duration: 00m 47s)
  • 18:04 Pchelolo: update RESTBase to 9d2b393fb - production
  • 17:52 Pchelolo: update RESTBase to 9d2b393fb - staging
  • 16:46 mobrovac@tin: Started restart [parsoid/deploy@0c22f72]: (no justification provided)
  • 16:45 _joe_: reenabling puppet on all jobqueue redises
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1082 weight - T137191 (duration: 00m 43s)
  • 16:22 hashar: Merged operations/puppet.git Jenkins job in a single one that runs tox then rake - T160923
  • 16:10 urandom: T111113: Live-hacking client encryption to be non-optional, to verify cqlsh encryption, restbase1007-a.eqiad.wmnet
  • 16:07 mobrovac: restbase deploy 752ca4b7
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1082 weight - T137191 (duration: 00m 43s)
  • 15:32 moritzm: upgrading restbase-test* to Linux 4.9
  • 14:59 akosiaris: enabling and running puppet on rdb200X fleet in a rolling restart scheme
  • 14:59 akosiaris: disabled puppet on rdb* fleet
  • 14:56 andrewbogott: dist-upgrading labvirt1001 and rebooting it a few times
  • 14:22 moritzm: installing exim4 updates from jessie point release
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 with low weight - T137191 (duration: 00m 48s)
  • 13:53 dcausse: cirrus: refreshing comp suggest indices in elastic@eqiad to measure times
  • 12:59 marostegui: Deploy schema change s4 on db1064 https://phabricator.wikimedia.org/T160415 - https://phabricator.wikimedia.org/T73563
  • 12:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1059, depool db1064 T160415 - T73563 (duration: 00m 43s)
  • 12:27 moritzm: installing libxml2 security updates
  • 12:21 marostegui: Deploy schema change s4 on labsdb1003 https://phabricator.wikimedia.org/T160415 - https://phabricator.wikimedia.org/T73563
  • 12:05 jynus: converting es2014 tables back to uncompressed InnoDB T129350
  • 11:08 godog: codfw-prod: bump ms-be2028 ms-be2039 object weight to 2000 T158337
  • 11:01 godog: pool prometheus200[34] / depool prometheus200[12] - T148408
  • 11:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool all es2XXX servers, depool es2014 for maintenance (duration: 00m 43s)
  • 10:59 hashar: Actually restarting Jenkins for email plugins upgrades
  • 10:20 hashar: Jenkins jobs got slightly blocked because I forgot to cancel the shutdown when jobs had to run.
  • 09:58 hashar: Jenkins: upgrading plugins email-ext and mailer
  • 09:14 hashar: Jenkins upgrading SSH Slaves plugin. Might cause disruption in CI
  • 08:47 moritzm: repooled mw1261 now that T161095 is deployed
  • 08:29 marostegui: Stop MySQL db1070 for maintenance - T137191
  • 08:06 moritzm: installing audiofile security updates
  • 07:37 marostegui: Deploy schema change s4 on db1059 and labsdb1001 T160415 - T73563
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1068, depool db1059 T160415 - T73563 (duration: 00m 43s)
  • 07:08 marostegui: Stop MySQL db1082 for maintenance - https://phabricator.wikimedia.org/T137191
  • 06:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T137191 (duration: 00m 44s)
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 23 03:14:05 UTC 2017 (duration 5m 47s)
  • 03:08 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.17) (duration: 14m 39s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 26s)
  • 01:46 mutante: added ottomata and milimetric to "wmf-deployers" in Gerrit web ui, both have existing (deployment resp. root) shell already (T161157)

2017-03-22

  • 22:59 RainbowSprinkles: gerrit: Quick service restart, picking up new config
  • 21:25 awight: reenabling Jenkins orphan rectifier job
  • 21:18 andrewbogott: rebooting labvirt1001 because it is being terrible. https://phabricator.wikimedia.org/T159835
  • 21:05 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: No-op, beta (duration: 00m 47s)
  • 21:00 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: No-op, beta (duration: 00m 43s)
  • 20:52 awight: disabling Ingenico orphan rectifier
  • 20:43 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to php-1.29.0-wmf.17
  • 20:05 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/ZeroPortal/includes/ApiZeroPortal.php: Failure to parse json config should result in a usable error T161036 (duration: 00m 42s)
  • 20:04 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Remove redundant whitelist read list for grantswiki (duration: 00m 44s)
  • 19:55 thcipriani@tin: Synchronized php-1.29.0-wmf.17/extensions/Flow: Make sure topiclist queries always join against workflow table T121644 (duration: 00m 59s)
  • 19:45 thcipriani@tin: Synchronized php-1.29.0-wmf.17/includes/Revision.php: Make Revision::getRevisionText() cache the converted text (duration: 00m 44s)
  • 19:44 mutante: rsyncing /srv of netmon1001 to /srv/netmon1001 on gerrit2001 (T125020)
  • 19:37 jynus: deploying m2 dns additions on codfw
  • 19:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1094 with full weight (duration: 00m 43s)
  • 17:10 _joe_: restarted ocg on ocg1001, not serving http queries
  • 16:55 jynus: shutting down es2016's mariadb to clone to es2015
  • 15:41 hashar@tin: Synchronized php-1.29.0-wmf.16/languages/classes/LanguageKk.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 44s)
  • 15:40 hashar@tin: Synchronized php-1.29.0-wmf.16/languages/classes/LanguageAz.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 48s)
  • 15:36 hashar: Deploying LanguageAz.php and LanguageKk.php hotfix for HHVM 3.18 on mwdebug* and mw1261 - T161095
  • 15:34 hashar@tin: Synchronized php-1.29.0-wmf.17/languages/classes/LanguageKk.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 54s)
  • 15:33 hashar@tin: Synchronized php-1.29.0-wmf.17/languages/classes/LanguageAz.php: Check for string initialization in ucfirst() to make HHVM 3.18 happy - T161095 (duration: 00m 59s)
  • 15:25 ema: cp*: removed linux-image-amd64, linux-image-3.16.0-4-amd64 and linux-image-4.4.0-1-amd64 to reduce churn
  • 14:54 moritzm: rebooting elastic2001 to Linux 4.9
  • 14:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1087 original weight - T137191 (duration: 00m 44s)
  • 14:24 marostegui: Deploy schema change s4 to db1068 - https://phabricator.wikimedia.org/T160415 https://phabricator.wikimedia.org/T73563
  • 14:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081, depool db1068 T160415 - T73563 (duration: 00m 43s)
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1087 weight - T137191 (duration: 00m 47s)
  • 13:39 volans: stopped ircecho to avoid the message spam
  • 13:15 dcausse: eu swat done
  • 13:09 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [cirrus] Enable the completion suggester (duration: 00m 43s)
  • 13:07 bblack@puppetmaster1001: conftool action : set/ttl=275; selector: dnsdisc=appservers-rw
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Enable db1087 for API - T137191 (duration: 00m 42s)
  • 12:29 dcausse: cirrus: reindexing lost writes (2017-03-21T13:30:00Z to 2017-03-21T17:50:00Z) during es5 upgrade in elastic@eqiad (T157479)
  • 12:26 marostegui: Deploy schema change on s4 to db1081 and labsdb1011 - T160415 T73563
  • 12:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084, depool db1081 T160415 - T73563 (duration: 00m 43s)
  • 12:20 gehel: maps restarting kartotherian - T150354
  • 12:18 gehel: installing latest mapnik version on maps servers
  • 12:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 with low weight - T137191 (duration: 00m 43s)
  • 12:09 gehel: maps upgrade to nodejs 6 completed - T150354
  • 12:09 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1004.eqiad.wmnet
  • 12:05 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1004.eqiad.wmnet
  • 12:04 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1003.eqiad.wmnet
  • 12:02 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1003.eqiad.wmnet
  • 12:01 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1002.eqiad.wmnet
  • 11:58 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1002.eqiad.wmnet
  • 11:57 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1001.eqiad.wmnet
  • 11:54 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 11:53 gehel: maps codfw fully upgraded to nodejs 6, starting upgrade on maps eqiad - T150354
  • 11:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 11:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 11:46 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 11:41 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2002.codfw.wmnet
  • 11:34 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2002.codfw.wmnet
  • 11:33 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 11:27 gehel: maps2001.codfw.wmnet upgraded to nodejs6
  • 11:19 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 11:15 akosiaris: Enable IGMP snooping for private1-d-eqiad on asw2-d. T133387
  • 11:15 akosiaris: Enable IGMP snooping for private1-d-eqiad. T133387
  • 11:05 gehel: disabling puppet on all maps servers - T150354
  • 11:04 gehel: upgrade maps to nodejs 6 - T150354
  • 10:53 akosiaris: cr1-eqiad: set ae4 and members to enable again. T133387
  • 10:41 akosiaris: reboot asw2-d T133387
  • 10:31 dcausse: cirrus: rebuilding comp suggest indices in elastic@eqiad
  • 10:15 akosiaris: Upgrading asw2-d-eqiad to JunOS 14.1X53 (T133387)
  • 10:09 akosiaris: cr1-eqiad: set ae4 and members to disable. T133387
  • 09:55 moritzm: upgrading mw1261 to HHVM 3.18.1
  • 09:50 moritzm: upgrading mwdebug* to HHVM 3.18.1
  • 09:40 marostegui: Deploy alter table s4 (commonswiki) db1084 - T73563 T160415
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 T160415 - T73563 (duration: 00m 43s)
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 T160415 - T73563 (duration: 00m 43s)
  • 09:20 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2002.codfw.wmnet
  • 08:46 marostegui: Stop MySQL db1070 to clone db1087 from it - T137191
  • 07:53 dcausse: rebuilding ttmserver index in elastic@eqiad to catchup lost writes during es5 upgrade
  • 07:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 T160415 - T73563 (duration: 00m 43s)
  • 07:10 oblivian@puppetmaster1001: conftool action : set/ttl=300; selector: dnsdisc=.*
  • 07:05 marostegui: Stop MySQL db1087 - T137191
  • 06:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T137191 (duration: 00m 43s)
  • 06:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2037 T160415 - T73563 (duration: 00m 43s)
  • 06:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1092 weight - T137191 (duration: 00m 49s)
  • 05:44 _joe_: finished tests on citoid/dns discovery; restbase successfully detects the change
  • 05:18 _joe_: depooling temporarily citoid in eqiad from dns discovery
  • 02:40 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 22 02:40:42 UTC 2017 (duration 5m 29s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 20s)
  • 02:02 krinkle@tin: Synchronized errorpages/: minor tweaks - I60344bd519d (duration: 00m 54s)
  • 00:15 Dereckson: SWAT done.
  • 00:15 dereckson@tin: Synchronized php-1.29.0-wmf.16/extensions/CirrusSearch/: CompSuggest: Increase default limit from 50 to 255 + speed optimization (Gerrit:343962 + Gerrit:343966) (duration: 00m 55s)
  • 00:05 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow translationadmin self-add for beta.wikiversity admins (T160120) (duration: 00m 43s)

2017-03-21

  • 23:57 eileen: update civicrm from 92e3b85 to d3c439f
  • 23:44 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Mapframe on sv.wikipedia (T161032) (duration: 00m 43s)
  • 23:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Translate on beta.wikiversity (T160120) (duration: 00m 45s)
  • 23:19 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Make rcenhancedfilters available as beta feature, enable on test wikis (Gerrit:343435 + Gerrit:343436) (duration: 00m 51s)
  • 22:45 mutante: lists: deactivate arbcom-ko per T160892 and Google translation of Korean talk pages
  • 22:44 Dereckson: Run namespaceDupes on pnbwiki (T159976)
  • 22:28 Dereckson: Create Translate tables on betawikiversity (T160120)
  • 20:59 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Revert Group0 to 1.29.0-wmf.17
  • 20:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: logging for bad header stuff (duration: 00m 52s)
  • 20:33 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T161001 Turn off completion suggester until length error is fixed (duration: 00m 44s)
  • 20:29 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to 1.29.0-wmf.17
  • 19:54 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.17 and rebuild l10n cache (duration: 51m 16s)
  • 19:35 mutante: phab2001 - same as iridium, phab search config change
  • 19:33 mutante: iridium - ran puppet after gerrit:343936 - phabricator config change to use cluster search applied
  • 19:22 chasemp: clean out admin-monitoring for nova-fullstack T160908
  • 19:10 mutante: ruthenium - dev API enabled in parsoid config for parsoid rt tests
  • 19:03 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.17 and rebuild l10n cache
  • 18:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase weight of db1092 and db1094 (duration: 00m 42s)
  • 18:02 twentyafterfour: refreshing phabricator's elasticsearch index in eqiad
  • 17:56 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [es5 upgrade] step 4: repool eqiad for writes (3/3) (duration: 00m 42s)
  • 17:54 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 4: repool eqiad for writes (2/3) (duration: 00m 42s)
  • 17:53 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 4: repool eqiad for writes (1/3) (duration: 00m 42s)
  • 17:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1092 weight - T137191 (duration: 00m 42s)
  • 17:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1092 weight - T137191 (duration: 00m 45s)
  • 17:13 thcipriani: starting branch cut for 1.29.0-wmf.17
  • 17:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 with low weight - T137191 (duration: 00m 42s)
  • 17:02 urandom: T111113: Rolling restart of RESTBase, eqiad, complete
  • 16:52 urandom: T111113: Rolling restart of RESTBase, eqiad
  • 16:41 urandom: T111113: Rolling restart of RESTBase, codfw, complete
  • 16:17 urandom: T111113: Enabling RESTBase client encryption on (remaining) codfw nodes
  • 16:11 urandom: T111113: Enabling RESTBase client encryption on restbase2001.codfw.wmnet (canary)
  • 15:56 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2020.codfw.wmnet
  • 15:27 moritzm: removed "Directory Managers" group from LDAP (Bug T157131)
  • 15:01 bd808@tin: Synchronized php-1.29.0-wmf.16/extensions/OpenStackManager/special/SpecialNovaInstance.php: SpecialNovaInstance: Remove some totally useless domain code. (T160995) (duration: 00m 43s)
  • 14:58 gehel: elasticsearch upgrade on eqiad is completed - T157479
  • 14:50 moritzm: installing gnutls security updates on trusty (jessie already fixed)
  • 14:44 gehel: elasticsearch eqiad, full cluster restart after cleanup of known old indices - T157479
  • 14:39 gehel: deleting old v2 indices from each elasticsearch server - T157479
  • 14:34 gehel: deleting old v2 indices from elastic1030: azbwiki_general_first, vewikimedia_content_1415331110, vewikimedia_general_1415331150 - T157479
  • 14:07 gehel: upgrading elasticsearch eqiad to v5.x - T157479
  • 14:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2044, depool db2037 T160415 - T73563 (duration: 00m 42s)
  • 13:44 dcausse: eu SWAT done
  • 13:39 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] Enable completion suggester (duration: 00m 42s)
  • 13:30 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [es5 upgrade] step 3: depool eqiad for writes (take 2) (3/3) (duration: 00m 41s)
  • 13:29 gehel: rolling restart of wdqs to load new configuration options
  • 13:29 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 3: depool eqiad for writes (take 2) (2/3) (duration: 00m 43s)
  • 13:27 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 3: depool eqiad for writes (take 2) (1/3) (duration: 00m 42s)
  • 13:15 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T157111 pagePreviews: Increase perf instrumentation sample (duration: 00m 58s)
  • 13:14 Reedy: Make that clear 2FA for RickinBaltimore per T160671
  • 13:12 Reedy: Clear centralauth for RickinBaltimore per T160671
  • 12:54 moritzm: installing r-base security updates
  • 12:47 gehel: running stress and bonnie on elastic2020 - T149006
  • 12:34 Dereckson: Created OATHAuth tables on projectcomwiki (T143138)
  • 12:27 Dereckson: Create account Superzerocool on projectcomwiki (bureaucrat, T143138)
  • 11:00 ema: upgrading twisted to 16.2.0 on lvs1007-12 T160433
  • 10:33 marostegui: Run pt-table-checksum on s6 (ruwiki) - https://phabricator.wikimedia.org/T160509
  • 09:42 moritzm: installing libevent security updates on remaining hosts in eqiad
  • 09:42 marostegui: Stop MySQL db1070 to clone db1092 from it - T137191
  • 09:14 akosiaris: enable bacula daemons on helium, everything looks ok
  • 09:09 moritzm: installing wireshark security updates
  • 09:06 hashar: CI deploying config hack "High priority test pipeline"  : https://gerrit.wikimedia.org/r/343318 - T160667
  • 08:43 gehel: shutting down elasticsearch on elastic2020, investigating T149006
  • 07:50 gehel: banning elastic2020 from cluster to investigate T149006
  • 07:36 marostegui: Stop mysql db1092 for maintenance - T137191
  • 07:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T137191 (duration: 00m 42s)
  • 07:22 marostegui: Run pt-table-checksum on s6 (jawiki) - T160509
  • 07:18 marostegui: Deploy schema change on db2044 and labsdb1009 (s4) - https://phabricator.wikimedia.org/T160415 - https://phabricator.wikimedia.org/T73563
  • 07:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2044 - T160415 - T73563 (duration: 00m 41s)
  • 06:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2051 - T160415 - T73563 (duration: 01m 07s)
  • 06:01 joal@tin: Finished deploy [analytics/refinery@c3a9139]: (no justification provided) (duration: 06m 39s)
  • 05:55 joal@tin: Started deploy [analytics/refinery@c3a9139]: (no justification provided)
  • 03:38 eileen: update civicrm from 21afe66 to 92e3b85
  • 03:10 eileen: update civicrm from 0ed1659 to 21afe66
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Mar 21 02:37:33 UTC 2017 (duration 5m 22s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 22s)
  • 01:59 eileen: update civicrm from f454f16 to 0ed1659
  • 00:37 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.29.0-wmf.16$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=etwiki is done now (T159609)
  • 00:30 Amir1: ladsgroup@terbium:/srv/mediawiki/php-1.29.0-wmf.16$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=etwiki (T159609)
  • 00:28 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES review tool in etwiki (T159609) (duration: 00m 42s)
  • 00:13 Amir1: mwscript maintenance/sql.php --wiki=etwiki extensions/ORES/sql/(ores_model|ores_classification).sql (T159609)
  • 00:04 Krinkle: mwscript deleteEqualMessages.php on public wikis (T45917)
  • 00:02 eileen: update civicrm from e058e8c to f454f16

2017-03-20

  • 23:59 mutante: phab2001 / iridium - running puppet after gerrit:343635 - switches phab search to codfw
  • 23:58 dereckson@tin: Synchronized php-1.29.0-wmf.16/extensions/CirrusSearch/includes/CompletionSuggester.php: Don't pass null suggest queries to elasticsearch (T160896) (duration: 00m 42s)
  • 23:54 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Restrict page images to lead section (T152115) (duration: 00m 43s)
  • 23:48 dereckson@tin: Synchronized php-1.29.0-wmf.16/extensions/CirrusSearch/includes/BuildDocument/Completion/SuggestBuilder.php: Gerrit:343754 Allow completion suggester to work with titles that look like integers (duration: 00m 45s)
  • 23:47 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable wgCiteResponsiveReferences on fr. en. it. la. no.wp + en.wikt (duration: 00m 46s)
  • 23:39 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Gerrit:343781 Test ORES migration on ruwiki beta too (labs only, no-op in prod) (duration: 00m 42s)
  • 22:57 mutante: ruthenium: running puppet after gerrit:343782 added missing diffserver unit file. puppet run looked good: Visualdiff::Server[diffserver]/Service[diffserver]/ensure: ensure changed 'stopped' to 'running', systemctl status says failed though
  • 22:54 ppchelko@tin: Finished deploy [trending-edits/deploy@e4fa9b8]: Config: Set up 'trends_at' property T160127 (duration: 06m 20s)
  • 22:47 ppchelko@tin: Started deploy [trending-edits/deploy@e4fa9b8]: Config: Set up 'trends_at' property T160127
  • 22:45 ejegg: updated payments-wiki from f991f15 to 9622a4b
  • 22:38 ppchelko@tin: Finished deploy [trending-edits/deploy@5d3eb7f]: Do not purge articles that have trended T160127 (duration: 07m 57s)
  • 22:31 mutante: ruthenium - gerrit:343682 applied - puppet: OK nginx: OK diffserver service refresh: failed @ssastry
  • 22:30 ppchelko@tin: Started deploy [trending-edits/deploy@5d3eb7f]: Do not purge articles that have trended T160127
  • 20:52 mutante: DNS - new Wikipedias "khw" (Khowar) and "kbp" (Kabiye) created (T160868) (T160865) ( on ns0/ns1: authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones to trigger template recreation after edit to langs.tmpl)
  • 20:47 mutante: DNS - ns2 - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones to create new WP languages 'khw' and 'kbp'
  • 20:19 bsitzmann@tin: Finished deploy [mobileapps/deploy@815ebb5]: Update mobileapps to c0ab01d (duration: 07m 31s)
  • 20:14 reedy@tin: Synchronized php-1.29.0-wmf.16/includes/api/ApiQueryAllPages.php: Limit query=allpages filterredir if MiserMode T160916 (duration: 00m 42s)
  • 20:12 reedy@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialAllPages.php: Re-enable Special:AllPages, disable redirect filter if MiserMode T160916 (duration: 00m 42s)
  • 20:12 bsitzmann@tin: Started deploy [mobileapps/deploy@815ebb5]: Update mobileapps to c0ab01d
  • 19:45 mutante: lists: disabled wikimediaro-l due to inactivity (disabling lists is easy nowadays and also revertable): fermium: sudo /usr/local/sbin/disable_list <list name> | (T146563)
  • 19:42 mobrovac@tin: Finished deploy [changeprop/deploy@decb6a1]: (no justification provided) (duration: 00m 56s)
  • 19:41 mobrovac@tin: Started deploy [changeprop/deploy@decb6a1]: (no justification provided)
  • 18:39 thcipriani@tin: Synchronized wmf-config: SWAT: Revert "Revert "Turn off patrolling for FlaggedRevs in bswiki"" T158662 (duration: 00m 44s)
  • 18:28 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: pagePreviews: Enable by default on "stage 0" wikis T136602 (duration: 00m 42s)
  • 18:18 ariel@tin: Finished deploy [dumps/dumps@91d3215]: more default config fixes, flagged rev table config fix (duration: 00m 02s)
  • 18:18 ariel@tin: Started deploy [dumps/dumps@91d3215]: more default config fixes, flagged rev table config fix
  • 17:35 akosiaris: slow rolling restart of redis databases in codfw T159850
  • 17:22 ariel@tin: Finished deploy [dumps/dumps@80d88cd]: fix buglet due to new default config file (duration: 00m 02s)
  • 17:22 ariel@tin: Started deploy [dumps/dumps@80d88cd]: fix buglet due to new default config file
  • 17:09 gehel@tin: Finished deploy [wdqs/wdqs@e9e7c95]: (no justification provided) (duration: 01m 41s)
  • 17:07 gehel@tin: Started deploy [wdqs/wdqs@e9e7c95]: (no justification provided)
  • 16:48 mobrovac: restbase deploying e4c327b0
  • 15:59 hashar: Special:AllPages being blank has a public task: https://phabricator.wikimedia.org/T160916
  • 15:50 dcausse@tin: Synchronized wmf-config/CommonSettings.php: Revert: T157479 [es5 upgrade] step 3: depool eqiad for writes (1/3) (duration: 00m 42s)
  • 15:49 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: Revert: T157479 [es5 upgrade] step 3: depool eqiad for writes (2/3) (duration: 00m 42s)
  • 15:40 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: Revert: T157479 [es5 upgrade] step 3: depool eqiad for writes (3/3) (duration: 00m 42s)
  • 15:39 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: T157479 [es5 upgrade] step 3: depool eqiad for writes (3/3) (duration: 00m 41s)
  • 15:37 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T157479 [es5 upgrade] step 3: depool eqiad for writes (2/3) (duration: 00m 46s)
  • 15:34 dcausse@tin: Synchronized wmf-config/CommonSettings.php: T157479 [es5 upgrade] step 3: depool eqiad for writes (1/3) (duration: 00m 45s)
  • 15:15 hashar@tin: Synchronized php-1.29.0-wmf.16/extensions/Translate: ElasticTTM: set the index when deleting docs (duration: 00m 53s)
  • 15:08 hashar@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialWatchlist.php: Restoring Watchlist: Fix form and preference overriding https://gerrit.wikimedia.org/r/#/c/343433/ (duration: 00m 51s)
  • 14:36 hashar@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialAllPages.php: Disable SpecialAllPages on all wikis. Temporary workaround (duration: 01m 08s)
  • 14:36 hashar: Disabled Special:AllPages on all wikis, making it output a blank page instead. ( https://gerrit.wikimedia.org/r/#/c/343647/ )
  • 14:32 akosiaris: disable puppet on all rdb* nodes to shepherd https://gerrit.wikimedia.org/r/343027 into production. T159850
  • 14:28 elukey: (Correct one) Temporary hack for T160888 - moved /srv/mw-log/archive/api.log-20170224.gz to /srv/mw-log/archive/api_log_backup_elukey/ to avoid rsync timeouts to stat1002 (the file is big and close to being deleted for retention)
  • 14:27 elukey: Temporary hack for T160886 - moved /srv/mw-log/archive/api.log-20170224.gz to /srv/mw-log/archive/api_log_backup_elukey/ to avoid rsync timeouts to stat1002 (the file is big and close to being deleted for retention)
  • 14:22 hashar@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialWatchlist.php: reverts commit SpecialWatchlist.php 0d675d2 (duration: 00m 43s)
  • 13:55 jynus: shutting down es2015 for maintenance T160242
  • 13:41 zfilipin@tin: Synchronized wmf-config/: SWAT: Enable CollaborationKit on beta enwiki (T138325) (duration: 00m 44s)
  • 13:35 zfilipin@tin: Synchronized php-1.29.0-wmf.16/tests/phpunit/includes/specials/SpecialWatchlistTest.php: SWAT: Watchlist: Fix form and preference overriding (T160734) (duration: 00m 48s)
  • 13:34 zfilipin@tin: Synchronized php-1.29.0-wmf.16/includes/specials/SpecialWatchlist.php: SWAT: Watchlist: Fix form and preference overriding (T160734) (duration: 01m 01s)
  • 11:39 akosiaris: return rdb1007 client-output-buffer-limit config to initially configured value T159850
  • 10:09 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after crash (duration: 00m 47s)
  • 09:42 godog: swift bump ms-be2028 -> ms-be2039 weight - T158337
  • 09:37 jynus: restarting db1094 for upgrade
  • 09:02 dcausse: refreshing ttm documents in elastic@codfw
  • 08:47 hashar: Jenkins: depooling / deleting Precise instances. T158652
  • 08:28 dcausse: cirrus: refreshing all comp suggest indices in elastic@codfw
  • 02:23 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 20 02:23:38 UTC 2017 (duration 5m 25s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 06m 44s)

2017-03-19

  • 18:40 ariel@tin: Finished deploy [dumps/dumps@8cff500]: generate json status files for use by downloaders (duration: 00m 02s)
  • 18:39 ariel@tin: Started deploy [dumps/dumps@8cff500]: generate json status files for use by downloaders
  • 10:43 ariel@tin: Finished deploy [dumps/dumps@87d748b]: dump magic words and namespace info (duration: 00m 02s)
  • 10:43 ariel@tin: Started deploy [dumps/dumps@87d748b]: dump magic words and namespace info
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 19 02:24:53 UTC 2017 (duration 5m 23s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 07m 29s)

2017-03-18

  • 20:16 chasemp: labstore1005 service nfs-exportd restart
  • 19:43 chasemp: test on labstore1004 nfs-exportd candidate /root/nfs-exportd-candidate.py --observer-pass xxxxxx --interval 0 --config-path /etc/nfs-mounts.yaml --exports-d-path /root/fake_export/ --debug
  • 18:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 after crash (duration: 01m 02s)
  • 18:20 jynus: powercycling db1094
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Mar 18 02:36:00 UTC 2017 (duration 5m 23s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 11m 29s)
  • 00:06 mutante: lists: creating new list wikimedia-nys (Noongar language) (T159499)

2017-03-17

  • 23:48 mutante: lists: creating new list wikispecies-admin (T159625)
  • 23:36 catrope@tin: Synchronized php-1.29.0-wmf.16/extensions/VisualEditor/lib/ve: Fixes for T154123 T160479 T160190 T160197 (duration: 00m 42s)
  • 23:31 mutante: lists: making Steinsplitter and Zhuyifei1999 list admins of commons-poty (T160672)
  • 16:16 elukey: reimage restbase-dev1001.eqiad.wmnet
  • 14:01 marostegui: Deploy schema change on dbstore1001 and db2051 (s4) - T160415 - T73563
  • 14:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2051 - T160415 - T73563 (duration: 00m 42s)
  • 13:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2058 - T160415 - T73563 (duration: 01m 06s)
  • 12:46 chasemp: labsdb10[01|03] maintain-views --table user_groups --all-database --replace-all --debug
  • 12:44 chasemp: labsdb10[09|10|11] maintain-views --table user_groups --all-database --replace-all --debug
  • 11:33 elukey: reimage analytics1044 (Hadoop Worker node) to Debian Jessie
  • 10:58 akosiaris: reimage helium.eqiad.wmnet to jessie
  • 09:04 jynus: killing 11h-running query on db1089 from terbium (orphan process)
  • 08:32 marostegui: Deploy schema change on dbstore2002 and db2058 (s4) - T160415 T73563
  • 08:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2058 - T160415 - T73563 (duration: 00m 43s)
  • 08:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2065 - T160415 - T73563 (duration: 00m 44s)
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1070 - T157931 (duration: 00m 45s)
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 17 02:39:10 UTC 2017 (duration 5m 22s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 12m 12s)
  • 01:54 urandom: T111113: Rolling restarts of Cassandra complete
  • 01:12 urandom: T111113: Rolling restarts of Cassandra, eqiad, rack 'd'
  • 00:41 ebernhardson@tin: Synchronized php-1.29.0-wmf.16/resources/src/mediawiki.special/: SWAT: Fix search result percentage width when no interwiki sidebar shown (duration: 00m 42s)
  • 00:40 ebernhardson@tin: Synchronized php-1.29.0-wmf.16/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: enabled sister search AB test on 8 wikis (duration: 00m 43s)
  • 00:34 urandom: T111113: Rolling restarts of Cassandra, eqiad, rack 'b'
  • 00:23 urandom: T111113: Rolling restarts of Cassandra on restbase1016
  • 00:13 urandom: T111113: Rolling restarts of Cassandra on restbase1011
  • 00:03 urandom: T111113: Rolling restarts of Cassandra on restbase1010

2017-03-16

  • 23:46 reedy@tin: Synchronized php-1.29.0-wmf.16/extensions/CodeReview: Fix preg_ error again (duration: 00m 47s)
  • 23:25 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable PageViewInfo to group2 T125917 (duration: 00m 49s)
  • 23:24 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'd' *correction*
  • 23:24 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'b'
  • 22:34 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'a'
  • 21:50 urandom: T111113: Rolling restarts of Cassandra in codfw, rack 'b'
  • 21:36 urandom: T111113: Restarting Cassandra on restbase1007-{b,c} to enable (optional) client encryption
  • 21:19 urandom: T111113: Restarting Cassandra on restbase1007-a to enable (optional) client encryption
  • 21:17 ebernhardson: reindexing group2 in cirrussearch for codfw downtime during 2.x -> 5.x upgrade
  • 21:06 ejegg: updated new CiviCRM from cca5921 to e058e8c
  • 20:08 mutante: repooled elastic2010, depooled correct host elastic2020 instead (T149006)
  • 20:08 dzahn@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic2020.codfw.wmnet
  • 20:08 dzahn@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2010.codfw.wmnet
  • 20:06 mutante: depooled elastic2010 since it is powered-off/down. (set/pooled=inactive) - (T149006)
  • 20:05 dzahn@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic2010.codfw.wmnet
  • 20:05 twentyafterfour: restarted phd on iridium to fix workers dying
  • 19:26 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.16
  • 19:07 thcipriani@tin: Synchronized php-1.29.0-wmf.16/extensions/VisualEditor/lib/ve: SWAT: Update VE core submodule to wmf/1.29.0-wmf.16 HEAD (50a6323d7) T154123 T160479 (duration: 00m 44s)
  • 19:02 gehel: restart relforge to activate new plugins - T160674
  • 16:57 ebernhardson: started cirrus completion indices rebuild for group2 on wasat.codfw.wmnet
  • 16:48 ebernhardson: manually adjusted wikiversions on wasat.codfw.wmnet to point all wikis at wmf.16 to rebuild cirrus completion search indices before group2 rolls forward
  • 16:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1070 - T157931 (duration: 00m 41s)
  • 16:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2065 - T160415 - T73563 (duration: 00m 42s)
  • 16:01 marostegui: Deploy schema change on s4 (commonswiki) https://phabricator.wikimedia.org/T73563 and https://phabricator.wikimedia.org/T160415
  • 16:00 elukey: racadm serveraction powerdown on mw2256 for hw maintenance
  • 15:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 - T160415 (duration: 00m 42s)
  • 15:44 godog: reboot ms-be1008 after disk swap to clear stuck mkfs.xfs
  • 15:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 - T160415 (duration: 00m 42s)
  • 15:27 otto@tin: Finished deploy [eventlogging/eventbus@75ab39c]: /v1/schemas/:schema_uri endpoint, T159179 (duration: 00m 14s)
  • 15:27 otto@tin: Started deploy [eventlogging/eventbus@75ab39c]: /v1/schemas/:schema_uri endpoint, T159179
  • 15:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T160415 (duration: 00m 42s)
  • 15:13 elukey: restart hhvm on mw1200, high load and queued requests - hhvm-dump-debug on /tmp/hhvm.27107.bt.
  • 15:09 elukey: restart hhvm on mw1207, high load and queued requests - hhvm-dump-debug on /tmp/hhvm.27441.bt.
  • 15:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 - T160415 (duration: 00m 42s)
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T160415 (duration: 00m 41s)
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T160415 (duration: 00m 42s)
  • 14:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 with low weight - T157931 (duration: 00m 45s)
  • 14:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 - T160415 (duration: 00m 43s)
  • 14:12 Dereckson: EU SWAT, round 2, done
  • 14:11 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Create Wikichanzo namespace for swwiki (T158041) (duration: 00m 42s)
  • 14:07 dereckson@tin: Synchronized wmf-config/throttle.php: Add Odia Wikipedia's 100 Women Editathon throttle rule (T160619) (duration: 00m 57s)
  • 13:52 Dereckson: Resume EU SWAT for two new changes
  • 13:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 - T160415 (duration: 00m 58s)
  • 13:38 marostegui: Shutdown es2015 for maintenance - T160242
  • 13:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 - T160415 (duration: 00m 42s)
  • 13:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 - T160415, Repool db1067 - T160435 (duration: 00m 42s)
  • 13:01 addshore: EU SWAT done
  • 12:59 addshore@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule T160427 (lift of IP cap for RIT - March 25, 2017) (duration: 00m 43s)
  • 12:49 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: wmgUseInterwikiSorting true for wikidataclients T160465 T150183 (duration: 00m 42s)
  • 12:39 marostegui: Deploy schema change on s5 - T160415
  • 12:38 addshore@tin: Synchronized php-1.29.0-wmf.16/extensions/InterwikiSorting: Use ExtensionFunctions instead of BeforeInitialize hook T160465 (duration: 00m 44s)
  • 12:31 addshore@tin: Synchronized php-1.29.0-wmf.16/extensions/InterwikiSorting: Use ExtensionFunctions instead of BeforeInitialize hook T160465 (duration: 00m 43s)
  • 12:17 addshore@tin: Synchronized php-1.29.0-wmf.15/extensions/InterwikiSorting: Use ExtensionFunctions instead of BeforeInitialize hook T160465 (duration: 00m 43s)
  • 11:48 godog: repair prometheus' leveldb database archived_fingerprint_to_metric on bast3002, upgrade prometheus to latest version from jessie-backports
  • 11:26 moritzm: enabled BBR as TCP congestion control algorithm on cp1008
  • 11:04 joal@tin: Finished deploy [analytics/aqs/deploy@006bf8c]: (no justification provided) (duration: 03m 30s)
  • 11:01 joal@tin: Started deploy [analytics/aqs/deploy@006bf8c]: (no justification provided)
  • 10:59 joal@tin: Finished deploy [analytics/aqs/deploy@006bf8c]: (no justification provided) (duration: 02m 13s)
  • 10:56 joal@tin: Started deploy [analytics/aqs/deploy@006bf8c]: (no justification provided)
  • 10:12 volans: upgraded cumin to version 0.0.2 in the repository and on neodymium/sarin
  • 10:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T160415 (duration: 00m 41s)
  • 09:56 moritzm: installing libevent security updates
  • 09:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T160415 (duration: 00m 42s)
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 - T160415 (duration: 00m 42s)
  • 09:46 moritzm: upgrading apache on cobalt/gerrit
  • 09:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 - T160415 (duration: 00m 47s)
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T160415 (duration: 00m 42s)
  • 09:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T160415 (duration: 00m 42s)
  • 09:11 moritzm: upgrading apache on fermium/lists.wikimedia.org
  • 09:10 moritzm: upgrading apache on mendelevium/OTRS
  • 09:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T160415 (duration: 00m 42s)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T160415 (duration: 00m 41s)
  • 08:57 godog: codfw-prod: add ms-be203[1-9] - T158337
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T160415 (duration: 00m 41s)
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T160415 (duration: 00m 41s)
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T160415 (duration: 00m 43s)
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T160415 (duration: 00m 41s)
  • 08:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T160415 (duration: 00m 46s)
  • 08:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T160415 (duration: 00m 47s)
  • 08:12 moritzm: upgrading apache on einsteinium/icinga.wikimedia.org
  • 07:51 marostegui: Deploy schema change on s1 - T160415
  • 07:36 marostegui: Deploy schema change on s7 - T160415
  • 07:08 marostegui: Starting pt-table-checksum on s6 (frwiki) - T160509
  • 03:01 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 16 03:01:37 UTC 2017 (duration 5m 50s)
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.16) (duration: 13m 39s)
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 08m 51s)
  • 00:14 eileen: updated civicrm from f1a3d64 to cca5921

2017-03-15

  • 23:36 twentyafterfour: train unblocked and wmf.16 is deployed to group1 wikis.
  • 23:32 twentyafterfour@tin: Synchronized php-1.29.0-wmf.16/extensions/ApiFeatureUsage/ApiFeatureUsageQueryEngineElastica.php: deploy I2d8603 refs T160578 T158997 (duration: 00m 42s)
  • 23:25 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Restrict page images to lead section on cawiki T152115 (duration: 00m 42s)
  • 23:17 thcipriani@tin: Synchronized wmf-config: SWAT: Set $wgOresExtension for I63b11eff3a4 T159763 (duration: 00m 44s)
  • 23:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy PageViewInfo to group1 T125917 (duration: 00m 43s)
  • 22:51 twentyafterfour@tin: Synchronized wmf-config/CirrusSearch-common.php: Deploy I4980da refs T160569 and T158997 (duration: 00m 42s)
  • 22:34 mutante: Cassandra test hosts: deploy break-fix gerrit:342912 , run puppet on cerium and praseodymium. on xenon puppet is disabled.
  • 21:54 twentyafterfour@tin: Synchronized wmf-config/CirrusSearch-common.php: Deploy I67d712 refs T160569 and T158997 (duration: 00m 42s)
  • 21:52 eileen: civicrm update from 639eb68 to f1a3d64
  • 21:29 twentyafterfour@tin: Synchronized wmf-config: deploy Iad9849 to fix T160569 and unblock the train refs T158997 (duration: 00m 49s)
  • 21:01 twentyafterfour@tin: Synchronized wmf-config: deploy I489c4a to fix T160569 and unblock the train refs T158997 (duration: 00m 45s)
  • 20:54 ladsgroup@tin: Finished deploy [ores/deploy@bc0bc74]: Mid-March deploy of ORES (T160279) (duration: 26m 46s)
  • 20:44 gehel: restarting postgresql on maps clusters - T160209
  • 20:38 urandom: T111113: Restarting xenon (RESTBase Staging) to enable client encryption (canary)
  • 20:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Move db1067 from s2 to s1 as a db1057 replacement (duration: 00m 42s)
  • 20:30 twentyafterfour: T160569 blocks the train until I can figure out what is causing it. The frequency is low so I haven't reverted to wmf.15, group 1 remains on wmf.16 refs T158997
  • 20:27 ladsgroup@tin: Started deploy [ores/deploy@bc0bc74]: Mid-March deploy of ORES (T160279)
  • 20:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@fa43048]: Update mobileapps to bb8fcf2 (duration: 03m 51s)
  • 20:09 bsitzmann@tin: Started deploy [mobileapps/deploy@fa43048]: Update mobileapps to bb8fcf2
  • 20:05 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.16
  • 19:56 twentyafterfour@tin: Synchronized php-1.29.0-wmf.16/includes/specialpage/: deploy revert of 5b15728 (duration: 00m 44s)
  • 19:46 jynus: shutting down db1067 for maintenance (as a db1057 replacement) T160435
  • 19:16 mobrovac@tin: Finished deploy [changeprop/deploy@b68bf51]: Deploy producer fix for T159200 (duration: 00m 51s)
  • 19:15 mobrovac@tin: Started deploy [changeprop/deploy@b68bf51]: Deploy producer fix for T159200
  • 18:35 legoktm@tin: Synchronized php-1.29.0-wmf.16/resources/src/mediawiki.widgets/mw.widgets.SearchInputWidget.js: mw.widgets.SearchInputWidget: Do not pass to TextInputWidget - T148471 (2/2) (duration: 00m 42s)
  • 18:34 legoktm@tin: Synchronized php-1.29.0-wmf.16/includes/widget/SearchInputWidget.php: mw.widgets.SearchInputWidget: Do not pass to TextInputWidget - T148471 (1/2) (duration: 00m 41s)
  • 18:32 legoktm@tin: Synchronized php-1.29.0-wmf.16/includes/libs/filebackend/SwiftFileBackend.php: Make sure Swift store operations close the source file handle - T159607 (duration: 00m 44s)
  • 18:25 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Linter to group0 and small wikis - T148609 (duration: 00m 42s)
  • 18:21 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy PageViewInfo to group0 - T125917 (duration: 00m 42s)
  • 18:20 otto@tin: Finished deploy [eventstreams/deploy@eb8698e]: T159200 (duration: 06m 18s)
  • 18:19 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. T159200 (duration: 06m 11s)
  • 18:19 legoktm@tin: Synchronized wmf-config/logging.php: Use custom LogstashFormatter - T145133, T151290 (duration: 00m 42s)
  • 18:15 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Deploy for switching to librdkafka 0.9.4 T159200 (duration: 00m 33s)
  • 18:15 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Deploy for switching to librdkafka 0.9.4 T159200
  • 18:14 mobrovac: restbase deploying f047dabb
  • 18:13 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. T159200
  • 18:13 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 18:12 ottomata: upgrading librdkafka on scb eqiad nodes T159200
  • 18:12 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Show 'Publish' not 'Save' on most public wikis - T131132 (duration: 00m 42s)
  • 18:08 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Deploy to EQIAD canary for switching to librdkafka 0.9.4 T159200 (duration: 00m 20s)
  • 18:07 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Deploy to EQIAD canary for switching to librdkafka 0.9.4 T159200
  • 18:07 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. Canary on scb1001.eqiad.wmnet. T159200 (duration: 01m 07s)
  • 18:06 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0. Canary on scb1001.eqiad.wmnet. T159200
  • 18:06 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 17:55 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0 in codfw. T159200 (duration: 03m 51s)
  • 17:53 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Deploy to CODFW for switching to librdkafka 0.9.4 T159200 (duration: 01m 44s)
  • 17:52 otto@tin: Finished deploy [eventstreams/deploy@eb8698e]: T159200 (duration: 01m 35s)
  • 17:51 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Deploy to CODFW for switching to librdkafka 0.9.4 T159200
  • 17:51 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Update to node-rdkafka 0.8.0 in codfw. T159200
  • 17:50 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 17:50 ottomata: upgrading librdkafka on scb in codfw T159200
  • 17:46 otto@tin: Finished deploy [eventstreams/deploy@eb8698e]: T159200 (duration: 00m 17s)
  • 17:46 otto@tin: Started deploy [eventstreams/deploy@eb8698e]: T159200
  • 17:43 mobrovac@tin: Finished deploy [changeprop/deploy@614cb4b]: Canary deploy for switching to librdkafka 0.9.4 T159200 (duration: 00m 53s)
  • 17:43 ppchelko@tin: Finished deploy [trending-edits/deploy@85be190]: Trending: Update to node-rdkafka 0.8.0. Canary on scb2001. T159200 (duration: 01m 21s)
  • 17:42 mobrovac@tin: Started deploy [changeprop/deploy@614cb4b]: Canary deploy for switching to librdkafka 0.9.4 T159200
  • 17:41 ppchelko@tin: Started deploy [trending-edits/deploy@85be190]: Trending: Update to node-rdkafka 0.8.0. Canary on scb2001. T159200
  • 17:21 demon@tin: Synchronized wmf-config/CommonSettings.php: Stop calling an idiot user an idiot (duration: 00m 42s)
  • 17:03 demon@tin: Synchronized wmf-config/: pruning old extensionmessages files (duration: 00m 49s)
  • 15:58 moritzm: upgraded jessie systems running HHVM in deployment-prep to 3.18.1+dfsg-1+wmf1
  • 15:47 moritzm: uploaded new HHVM 3.18 package with backported patch for stat_cache regression (T158176)
  • 15:45 marostegui: For the record: deployed schema change on s2 and s6 for image table (add an index) - T160415
  • 14:22 moritzm: installing chromium security update on osmium
  • 14:05 moritzm: uploaded python-phabricator 0.6.1-1~bpo8~trusty1 for trusty-wikimedia to apt.wikimedia.org (required for Phabricator support in offboarding script running on terbium (trusty))
  • 13:48 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: T160403: Add d to enwikisource's import list (duration: 00m 42s)
  • 13:37 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: T157111: pagePreviews: Enable perf instrumentation (duration: 00m 42s)
  • 13:18 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: 342456: Remove "editusercssjs". (duration: 02m 50s)
  • 13:14 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable Cognate on beta wiktionary sites T156241 Beta Only (again) (duration: 02m 45s)
  • 13:04 gehel: syncing puppet git repo on wdqs-puppet.wikidata-query.eqiad.wmflabs
  • 12:13 godog: deploy thumbor 0.1.36-1 on thumbor100*
  • 10:41 Dereckson: Run namespaceDupes.php for pnb.wiktionary (T159976): all looks good for this one
  • 10:37 Dereckson: Run namespaceDupes.php for pnb.wikipedia (T159976)
  • 10:34 ema: upgrade cp4001 (misc) and cp4011 (maps) to linux 4.9 T154934
  • 09:11 marostegui: Disable parallel replication on dbstore2002, dbstore2001, dbstore1002, dbstore1001 - T160407
  • 09:02 marostegui: Disable parallel replication on x1 slaves (db1029, db2033) - T160407
  • 08:27 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Enable Cognate on beta wiktionary sites T156241 Beta Only (duration: 02m 48s)
  • 08:26 moritzm: removed imagemagick 6.8.9.9-5+deb8u7+wmf1 from apt.wikimedia.org (the sharpen patch is folded into the new 6.8.9.9-5+deb8u8 security update)
  • 08:22 marostegui: Deploy alter table x1 testing parallel replication - T160407
  • 08:11 moritzm: installing imagemagick security updates
  • 07:26 marostegui: Enable parallel replication on x1 slaves - T160407
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 13m 33s)
  • 00:55 eileen: update civicrm from 31f19d6 to 639eb68
  • 00:41 maxsem@tin: Synchronized wmf-config/logging.php: https://gerrit.wikimedia.org/r/342778 (duration: 02m 46s)
  • 00:32 maxsem@tin: Synchronized php-1.29.0-wmf.16/extensions/RelatedSites/: Hide DMOZ links with https://gerrit.wikimedia.org/r/#/c/342753/ + https://gerrit.wikimedia.org/r/#/c/342768/ (duration: 02m 48s)
  • 00:27 maxsem@tin: Synchronized php-1.29.0-wmf.15/extensions/RelatedSites/: Hide DMOZ links with https://gerrit.wikimedia.org/r/#/c/342753/ + https://gerrit.wikimedia.org/r/#/c/342768/ (duration: 02m 48s)
  • 00:19 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/340697/2 (duration: 02m 53s)
  • 00:08 mutante: depooled mw2256 because it's down again (T155180)
  • 00:08 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet
  • 00:05 dzahn@puppetmaster1001: conftool action : get/pooled; selector: dc=eqiad,name=mw2256.codfw.wmnet

2017-03-14

  • 23:59 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/342148/ (duration: 02m 47s)
  • 23:55 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/342148/ (duration: 02m 47s)
  • 23:42 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: - (duration: 02m 50s)
  • 23:31 tgr@tin: Finished scap: T125917: Deploy PageViewInfo to testwiki (duration: 48m 58s)
  • 22:42 tgr@tin: Started scap: T125917: Deploy PageViewInfo to testwiki
  • 21:29 ebernhardson: reindexed search in group0 for Monday's codfw search downtime/upgrade
  • 20:45 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 02m 50s)
  • 20:17 twentyafterfour: scap was unable to connect to mw2256.codfw.wmnet
  • 20:14 twentyafterfour@tin: Finished scap: full scap of new branch, move test wikis to 1.29.0-wmf.16 refs T158997 (duration: 56m 05s)
  • 19:18 twentyafterfour@tin: Started scap: full scap of new branch, move test wikis to 1.29.0-wmf.16 refs T158997
  • 19:14 ema: restarting pybal on lvs1010-11 T160405
  • 19:13 Reedy: Delete 2FA for User:Conny per request on IRC. Identity verified via Lydia_WMDE
  • 18:42 nuria@tin: Finished deploy [eventlogging/analytics@417c40f]: (no justification provided) (duration: 00m 02s)
  • 18:42 nuria@tin: Started deploy [eventlogging/analytics@417c40f]: (no justification provided)
  • 18:39 gehel: removing swap from elasticsearch servers - T158884
  • 18:37 ottomata: upgrading librdkafka to 0.9.4 and restarting varnishkafka on cache text hosts
  • 18:19 ottomata: upgrading librdkafka to 0.9.4 and restarting varnishkafka on cache upload hosts
  • 18:13 ottomata: upgrading librdkafka to 0.9.4 and restarting varnishkafka on cache misc hosts
  • 18:11 nuria@tin: Finished deploy [eventlogging/analytics@c3ccb4a]: (no justification provided) (duration: 00m 03s)
  • 18:11 nuria@tin: Started deploy [eventlogging/analytics@c3ccb4a]: (no justification provided)
  • 17:07 ottomata: upgrading librdkafka to 0.9.4 on cache misc and restarting varnishkafka
  • 16:29 jynus: no response from db1057 after powercycle - trying to hard reset it
  • 16:10 urandom: T111113: Restart Cassandra in RESTBase Staging to enable optional client encryption
  • 15:39 godog: shut ms-be2002 for idrac / bios troubleshooting T155689
  • 15:24 chasemp: silence toolschecker precise job start check in anticipation of removal
  • 15:18 twentyafterfour: preparing to branch 1.29.0-wmf.16 refs T158997
  • 14:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T132416 (duration: 00m 40s)
  • 14:36 marostegui: Enabled parallel replication (5 threads) on db2033 (x1) - T160407
  • 14:20 chasemp: labsdb100[9|10|11] 'maintain-views --all-databases --table page --replace-all --debug'
  • 14:18 chasemp: labsdb1003 time maintain-views --all-databases --table page --replace-all --debug
  • 14:01 Dereckson: Purged portals URL
  • 13:56 dereckson@tin: Synchronized portals: Resync portals/ directory after touch (duration: 00m 42s)
  • 13:56 chasemp: labsdb1001 maintain-views --all-databases --table page --replace-all --debug
  • 13:46 dereckson@tin: Synchronized portals: Bump to e576c18522ff (duration: 00m 41s)
  • 13:45 dereckson@tin: Synchronized portals/prod/wikipedia.org/assets: Bump to e576c18522ff (duration: 00m 41s)
  • 13:18 elukey: started redis-cli --bigkeys -i 0.1 on rdb1008 (eqiad jobqueue slave)
  • 13:15 dereckson@tin: Synchronized portals: (no justification provided) (duration: 00m 41s)
  • 13:14 dereckson@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 41s)
  • 13:00 gehel: restarting elasticsearch on relforge1001 to test gelf appender
  • 12:41 elukey: reimage analytics1043 to Debian Jessie
  • 12:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1054 with full weight after warmup (duration: 00m 40s)
  • 12:28 jynus: stopping mariadb on db1057, preparing to backup and reimage
  • 12:24 addshore@tin: Synchronized dblists/: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 4/4 (duration: 00m 41s)
  • 12:23 addshore@tin: Synchronized docroot/: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 3/4 NOOP (duration: 00m 44s)
  • 12:19 addshore@tin: Synchronized wmf-config/CommonSettings.php: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 2/4 (duration: 00m 41s)
  • 12:18 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T150183 wmgUseInterwikiSorting true for all wikidata clients #1 #2 PT 1/4 (duration: 00m 52s)
  • 09:15 addshore@tin: Synchronized dblists/interwikisorting.dblist: wmgUseInterwikiSorting true for wikidata clients, excluding wikipedias T150183 (duration: 00m 42s)
  • 08:38 elukey: moved some log files from /var/log/upstart/$logname.log.1 to /var/log/upstart/$logname.log.1.bis on labvirt1014, labtestvirt2001, labtestnet2001, labnet1001 to reduce cronspam
  • 08:15 moritzm: installing icu security updates on trusty (jessie already fixed)
  • 08:07 moritzm: installing icoutils security update on trusty (jessie already fixed)
  • 07:26 moritzm: installing python-imaging/pillow security updates on trusty (jessie already fixed)
  • 07:07 marostegui: Deploy alter table enwiki.revision db1080 - T132416
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T132416 (duration: 00m 41s)
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 - T132416 (duration: 00m 41s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 12m 36s)

2017-03-13

  • 23:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1054 after upgrade with low weight (duration: 00m 41s)
  • 22:29 bawolff@tin: Synchronized php-1.29.0-wmf.15/extensions/SemanticForms/includes/SF_ValuesUtils.php: Backport bb42c6f401b9 (duration: 00m 48s)
  • 21:40 bawolff: Deployed fix for T160266
  • 20:45 addshore: InterwikiSorting deploy (to group0) done
  • 20:43 addshore@tin: Synchronized wmf-config/CommonSettings.php: T150183 Enable InterwikiSorting on group0 #1 #2 PT 4/4 (duration: 00m 40s)
  • 20:42 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T150183 Enable InterwikiSorting on group0 #1 #2 PT 3/4 (duration: 00m 41s)
  • 20:41 addshore@tin: Synchronized docroot/noc/conf/interwikisorting.dblist: T150183 Enable InterwikiSorting on group0 #1 #2 PT 2/4 NOOP (duration: 00m 42s)
  • 20:39 addshore@tin: Synchronized dblists/interwikisorting.dblist: T150183 Enable InterwikiSorting on group0 #1 #2 PT 1/4 (duration: 00m 51s)
  • 18:37 dcausse@tin: Synchronized php-1.29.0-wmf.15/extensions/CirrusSearch/: Make incoming link counting compatible with 5.x (duration: 00m 53s)
  • 18:06 jynus: chowning /var/lib/git/operations/puppet to gitpuppet on labscontrol1002
  • 18:03 jynus: chowning /var/lib/git/operations/puppet to gitpuppet on labscontrol1001
  • 17:46 reedy@tin: Synchronized wmf-config/throttle.php: Throttle rule for event currently ongoing (duration: 00m 43s)
  • 17:29 gehel: re-configuring cluster settings after elasticsearch upgrade - T158680
  • 17:29 dcausse: done re-enabling writes to elastic@codfw (elastic5 upgrade)
  • 17:28 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 2: repool codfw and send wmf16 to codfw 3/3 (duration: 00m 41s)
  • 17:26 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [es5 upgrade] step 2: repool codfw and send wmf16 to codfw 2/3 (duration: 00m 44s)
  • 17:24 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 2: repool codfw and send wmf16 to codfw 1/3 (duration: 00m 46s)
  • 17:23 gehel@tin: Finished deploy [wdqs/wdqs@202a106]: (no justification provided) (duration: 01m 46s)
  • 17:22 gehel@tin: Started deploy [wdqs/wdqs@202a106]: (no justification provided)
  • 17:19 jynus: stopping mariadb at db1054 and preparing for backup and reimage
  • 17:18 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1003.eqiad.wmnet
  • 16:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1054 for upgrade (duration: 00m 53s)
  • 16:55 godog: outdated swift rings pushed in eqiad-prod, pushed again updated rings from git repo - T158337
  • 16:35 godog: add ms-be2028/29/30 to swift codfw-prod, initial add - T158337
  • 16:25 gehel: restarting elasticsearch on all codfw cluster after upgrade - T158680
  • 16:23 gehel: restarting elasticsearch on elastic2001 after upgrade - T158680
  • 16:06 gehel: upgrading plugins to 5.1.2 on elasticsearch codfw - T158680
  • 15:41 gehel: shutting down elasticsearch on codfw for v5.1.2 upgrade - T158680
  • 15:21 dcausse: elastic@codfw stopped receiving writes
  • 15:21 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: [es5 upgrade] step 1: depool codfw for writes 2/2 (duration: 00m 44s)
  • 15:19 dcausse@tin: Synchronized wmf-config/CommonSettings.php: [es5 upgrade] step 1: depool codfw for writes 1/2 (duration: 00m 45s)
  • 14:33 marostegui: Deploy alter table enwiki.revision db1083 - T132416
  • 14:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T132416 (duration: 00m 41s)
  • 14:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T132416 (duration: 00m 41s)
  • 13:01 hashar@tin: Synchronized wmf-config/CommonSettings.php: +$wgAvailableRights[] = autoreviewrestore; (duration: 00m 41s)
  • 12:08 ema: restart pybal on lvs1003 to add swift-https_443
  • 12:05 moritzm: install libevent security updates
  • 11:56 elukey: reimage analytics1042 (Hadoop worker node) to Debian Jessie
  • 11:15 godog: bounce pybal on lvs1006 to try picking up swift https changes
  • 11:06 zeljkof: purge bswiki logo - T158815
  • 10:44 Dereckson: Update site statistics on gu.wikipedia (T160328)
  • 09:23 gehel: downgrading elasticsearch to v5.1.2 on relforge, a full reindex will be needed - T156150
  • 08:40 marostegui: Compress dewiki - db1070 - T153743
  • 08:31 marostegui: Stop replication on labsdb1009,10 and 11 - T153743
  • 08:30 marostegui: Stop MySQL on db1095 (sanitarium2) to take a backup - T153743
  • 08:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T153743 (duration: 00m 41s)
  • 08:08 marostegui: Deploy alter table s6 - db1050 (master) - T159414
  • 08:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1030 - T159414 (duration: 00m 41s)
  • 07:46 moritzm: upgrading apache on remaining mediawiki servers in eqiad
  • 07:24 marostegui: Deploy alter table enwiki.revision db1089 - T132416
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T132416 (duration: 00m 41s)
  • 07:13 marostegui: Deploy alter table s6 revision table on db1030 - T159414
  • 07:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1030 - T159414 (duration: 00m 52s)
  • 06:52 elukey: powercycle mw2256, stuck in boot (looked in the console)
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 13 02:28:18 UTC 2017 (duration 5m 21s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 08m 57s)

2017-03-12

  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 12 02:25:37 UTC 2017 (duration 5m 32s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 07m 25s)

2017-03-11

  • 08:39 jynus: powercycle es2015 - unresponsive
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 07m 41s)
  • 00:19 smalyshev@tin: Finished deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix (duration: 02m 15s)
  • 00:16 smalyshev@tin: Started deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix
  • 00:08 smalyshev@tin: Finished deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix (duration: 00m 16s)
  • 00:08 smalyshev@tin: Started deploy [wdqs/wdqs@UNKNOWN]: Deploy new updater on 1003 for potential connection drop fix
  • 00:07 SMalyshev: going to deploy updater patch on wdqs1003. The host is in maintenance, not a production deployment.

2017-03-10

  • 21:42 hashar: restarted Zuul
  • 20:06 gehel: restart kartotherian / tilerator(ui) on maps-test*
  • 20:06 gehel@tin: Finished deploy [kartotherian/deploy@76adf21]: (no justification provided) (duration: 00m 54s)
  • 20:05 gehel@tin: Started deploy [kartotherian/deploy@76adf21]: (no justification provided)
  • 20:03 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 00m 16s)
  • 20:03 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:57 gehel: restarting tilerator(ui) on maps-test2004
  • 19:57 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 00m 04s)
  • 19:57 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:47 gehel: restarting tilerator(ui) on maps-test2004
  • 19:47 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 00m 03s)
  • 19:47 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:45 gehel: failed tilerator deploy on maps-test2004
  • 19:45 gehel@tin: Finished deploy [tilerator/deploy@b501046]: (no justification provided) (duration: 01m 20s)
  • 19:44 gehel@tin: Started deploy [tilerator/deploy@b501046]: (no justification provided)
  • 19:36 ejegg: ran wmf_civicrm db updates through 7500 - Add benevity as a financial type for benevity imports.
  • 19:34 gehel: restart kartotherian on maps-test2004
  • 19:28 gehel@tin: Finished deploy [kartotherian/deploy@76adf21]: (no justification provided) (duration: 00m 23s)
  • 19:27 gehel@tin: Started deploy [kartotherian/deploy@76adf21]: (no justification provided)
  • 19:19 gehel: upgrading kartotherian on maps-test2004 - T150354
  • 19:07 MaxSem: Unmasked kartotherian on maps-test2004
  • 18:28 smalyshev@tin: Finished deploy [wdqs/wdqs@1f2973c]: Deploy new updater on 1003 for potential connection drop fix (duration: 00m 03s)
  • 18:28 smalyshev@tin: Started deploy [wdqs/wdqs@1f2973c]: Deploy new updater on 1003 for potential connection drop fix
  • 17:28 ottomata: installed librdkafka 0.9.4 via dpkg -i on cp1052 (cache text) and restarted varnishkafka in preparation for fleet upgrade next week
  • 17:24 ottomata: installed librdkafka 0.9.4 via dpkg -i on cp1058 (cache misc) and restarted varnishkafka in preparation for fleet upgrade next week
  • 16:44 papaul: oresrdb2002 - signing puppet certs, salt-key, initial run
  • 16:25 elukey: reboot mw22(5[1-9]|60) to enable mw-cgroup mountpoint
  • 15:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1030 - T159414 (duration: 02m 42s)
  • 15:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1022 - T159414 (duration: 00m 45s)
  • 15:03 marostegui: Stop slave db2033 for maintenance - T159707
  • 14:05 hashar: contint1001 and contint2001 : Migrating git-daemon to systemd . Would stop zuul merger briefly
  • 13:58 elukey: added 3 new MW api-appservers (mw2251-53) and 7 new appservers (mw2254-60) to codfw
  • 13:35 hashar: Restarting Jenkins. Deadlocks in ssh connections. T160168
  • 07:28 moritzm: upgrading libarchive on trusty systems (jessie already fixed)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Added weight 1 for db1061 - T159414 (duration: 00m 40s)
  • 07:13 marostegui: Deploy alter table s6 revision table on db1022 - T159414
  • 07:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1022 - T159414 (duration: 00m 41s)
  • 04:29 mutante: codfw mw jobrunner: they start but then fail again shortly after: mw2248 jobrunner[67314]: [Fri Mar 10 04:23:07 2017] [hphp] [67314:7f6a34b746c0:0:000024] [] LightProcess::closeShadow failed due to exception: Failed in afdt::sendRaw: Broken pipe
  • 04:12 mutante: more codfw appservers ... - systemctl start jobchron, systemctl start jobrunner (both had failed but are now active (running))
  • 04:09 mutante: mw2155 - systemctl start jobchron, systemctl start jobrunner (both had failed but are now active (running))
  • 04:02 mutante: mw2249 systemctl start jobrunner - now Active: active (running)
  • 03:56 mutante: codfw appserver jobrunner service fail related to https://gerrit.wikimedia.org/r/#/c/259660/ ?
  • 03:54 mutante: codfw appservers showing "systemd degraded" alerts are failed jobrunner service unit. after puppet-agent "Mediawiki::Jobrunner/Package[jobrunner]/ensure) ensure changed..." ..then jobrunner.service: main process exited, code=exited, status=143/n/a
  • 02:51 AaronSchulz: Restarted job services for 5101424 (statsd batching) after monitoring mw1161
  • 02:39 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 10 02:39:25 UTC 2017 (duration 5m 28s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 12m 17s)
  • 00:54 ppchelko@tin: Finished deploy [trending-edits/deploy@1673068]: Replayed events are purged based on current timestamp T160136 (duration: 06m 24s)
  • 00:48 ppchelko@tin: Started deploy [trending-edits/deploy@1673068]: Replayed events are purged based on current timestamp T160136
  • 00:39 ppchelko@tin: Finished deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136 (duration: 02m 23s)
  • 00:38 dereckson@tin: Synchronized php-1.29.0-wmf.15/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTargetLoader.js: ArticleTargetLoader: wikitext switch shouldn't require FullRestbaseURL (T158692) (duration: 00m 41s)
  • 00:37 ppchelko@tin: Started deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136
  • 00:31 ppchelko@tin: Finished deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136 (duration: 07m 17s)
  • 00:30 eileen: update CiviCRM from d20ed40 to 31f19d6
  • 00:24 ppchelko@tin: Started deploy [trending-edits/deploy@a5716b9]: Replayed events are purged based on current timestamp T160136
  • 00:22 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Move NavigationTiming config to EventLogging section + Remove setting of unused $wgPercentHHVM (Gerrit:342147 and Gerrit:342149, no-op) (duration: 00m 40s)
  • 00:19 maxsem@tin: Finished deploy [tilerator/deploy@160f314]: https://gerrit.wikimedia.org/r/#/c/342153/ - revert submodule updates due to broken manik->libc dependency (duration: 00m 16s)
  • 00:19 maxsem@tin: Started deploy [tilerator/deploy@160f314]: https://gerrit.wikimedia.org/r/#/c/342153/ - revert submodule updates due to broken manik->libc dependency

2017-03-09

  • 22:50 mutante: prometheus1003/1004 - systemctl stop prometheus (as opposed to /etc/init.d/prometheus), as they are low on disk but are not in production yet
  • 22:49 maxsem@tin: Finished deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/ (duration: 00m 05s)
  • 22:48 maxsem@tin: Started deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/
  • 22:46 mutante: prometheus1003 - stopping service: [....] Stopping monitoring system and time series database: prometheus: Invalid --pidfile argument: '/var/run/prometheus/prometheus.pid' (Parent directory does not exist)
  • 22:46 maxsem@tin: Finished deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/ (duration: 00m 21s)
  • 22:45 maxsem@tin: Started deploy [tilerator/deploy@fb06c99]: https://gerrit.wikimedia.org/r/#/c/342140/
  • 22:18 maxsem@tin: Finished deploy [tilerator/deploy@367df80]: no-op (duration: 00m 22s)
  • 22:18 maxsem@tin: Started deploy [tilerator/deploy@367df80]: no-op
  • 22:00 mobrovac@tin: Finished deploy [trending-edits/deploy@57a654e]: Bump max_pages for T156411 (duration: 06m 07s)
  • 21:54 mobrovac@tin: Started deploy [trending-edits/deploy@57a654e]: Bump max_pages for T156411
  • 21:37 mutante: fluorine - puppet node clean, puppet node deactivate, salt-key -d, remove from Icinga.. (T159996)
  • 21:35 mutante: fluorine - shutdown -h now (decom) T159996
  • 20:09 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.15
  • 20:02 mutante: cobalt: remove crontab entry of user gerrit2 that created reviewer counts, gzip /var/www/reviewer-counts.json and moved to /root/ for backup (re: gerrit:341592) T54329
  • 19:53 reedy@tin: Synchronized php-1.29.0-wmf.15/extensions/ConfirmEdit: Fixup maintenance script (duration: 00m 43s)
  • 19:22 legoktm: foreachwiki extensions/WikimediaMaintenance/createExtensionTables.php linter
  • 18:21 moritzm: rebooting cp1008 for upgrade to Linux 4.9
  • 17:50 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet
  • 17:45 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1004.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1003.eqiad.wmnet
  • 17:11 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1002.eqiad.wmnet
  • 17:11 bblack: reboot lvs1001 (post-incident cleanup reboot)
  • 17:02 bblack: reboot lvs1004 (post-incident cleanup reboot)
  • 16:58 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1001.eqiad.wmnet
  • 16:10 elukey: remove Piwik/bohrium health check from Varnish cache misc (https://gerrit.wikimedia.org/r/#/c/342007/)
  • 15:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 - T159414 (duration: 00m 41s)
  • 15:07 reedy@tin: Synchronized php-1.29.0-wmf.15/extensions/ConfirmEdit: Fixup maintenance script (duration: 00m 43s)
  • 15:02 moritzm: installing nettle security updates
  • 14:42 zeljkof: EU SWAT finished
  • 14:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add HD logos for several projects (T150618) (duration: 00m 41s)
  • 14:38 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add HD logos for several projects (T150618) (duration: 00m 42s)
  • 14:35 moritzm: removed cn=svn group from LDAP directory (Bug: T129788)
  • 14:25 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [throttle] Add new throttle rule+remove expired rules (T159957) (duration: 00m 45s)
  • 14:15 addshore@tin: Synchronized wmf-config/CommonSettings.php: Don't show rdf2latex table hint with ElectronPdfService enabled T157432 (duration: 00m 49s)
  • 13:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 with normal weight after warmup (duration: 00m 40s)
  • 13:52 moritzm: removed cn=svnadm group from LDAP directory (Bug: T129788)
  • 13:46 moritzm: removed cn=trebuchet group from LDAP directory (Bug: T129788)
  • 13:43 gehel: invalidating Tasmania zoom level 10 tiles in varnish - T159631
  • 13:21 marostegui: Deploy alter table s6 revision table on db1085 - T159414
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 - T159414 (duration: 00m 41s)
  • 13:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 - T159414 (duration: 00m 43s)
  • 12:34 moritzm: rebooting multatuli to Linux 4.9
  • 12:23 jynus: purging old rc rows from non-production database replicas
  • 11:24 marostegui: Stop replication db2033 - T159707
  • 10:49 marostegui: Deploy alter table s6 revision table on db1088 - T159414
  • 10:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 - T159414 (duration: 00m 41s)
  • 10:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 - T159414 (duration: 00m 42s)
  • 10:25 ema: service systemd-sysctl restart on lvs hosts
  • 08:21 marostegui: Deploy alter table s6 revision table on db1093 - T159414
  • 08:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T159414 (duration: 00m 49s)
  • 08:10 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after maintenance with low weight (duration: 00m 43s)
  • 05:24 bblack: poweroff lvs1001 from idrac
  • 03:15 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 9 03:15:39 UTC 2017 (duration 5m 53s)
  • 03:09 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 14m 35s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 14m 34s)
  • 01:08 twentyafterfour: phabricator update complete.
  • 01:06 twentyafterfour: updating phabricator to tag release/2017-03-08/1
  • 00:54 mutante: iridium - tested stop/start of phd service with upstart, unlink /etc/init.d/phd which was the formerly used symlink to a phab php script
  • 00:41 mutante: iridium - re-enable puppet, convert to base::service unit, phd restarting
  • 00:36 mutante: iridium - temp. disable puppet | phab1001 - converting service to base::service_unit (T137928)
  • 00:18 catrope@tin: Synchronized php-1.29.0-wmf.15/extensions/Echo/modules/styles/mw.echo.ui.NotificationBadgeWidget.less: Fix RTL popup alignment (T159999) (duration: 00m 42s)

2017-03-08

  • 22:10 legoktm: resuming running refreshLinks.php on small wikis
  • 21:43 arlolra@tin: Started restart [parsoid/deploy@0c22f72]: (no justification provided)
  • 21:41 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Enable Linter on testwiki - T148609 (2/2) (duration: 00m 41s)
  • 21:39 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Linter on testwiki - T148609 (1/2) (duration: 00m 44s)
  • 21:38 legoktm: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=testwiki linter
  • 21:31 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.15
  • 21:31 twentyafterfour@tin: Synchronized php-1.29.0-wmf.15/extensions/CodeReview/backend/CodeCommentLinker.php: deploy https://gerrit.wikimedia.org/r/#/c/341857/ (duration: 00m 46s)
  • 21:27 arlolra: Updated Parsoid to dec47257 (T59603)
  • 21:19 arlolra@tin: Finished deploy [parsoid/deploy@0c22f72]: Updating Parsoid to dec47257 (duration: 08m 19s)
  • 21:11 arlolra@tin: Started deploy [parsoid/deploy@0c22f72]: Updating Parsoid to dec47257
  • 19:54 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Reenable Collection on srn.wikipedia (T158467) (duration: 00m 46s)
  • 19:43 madhuvishy: Upgraded nslcd and libnss-ldapd in labstore100[1,2,4,5]
  • 19:36 reedy@tin: Synchronized php-1.29.0-wmf.14/extensions/ConfirmEdit: Maintenance script updates (duration: 00m 50s)
  • 17:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1070 ROW based replication comments - T153743 (duration: 00m 41s)
  • 17:28 Pchelolo: update RESTBase to 20e2c44c
  • 17:25 Pchelolo: update RESTBase to 20e2c44c: canary on restbase1007
  • 17:23 Pchelolo: update RESTBase to 20e2c44c: staging
  • 17:21 moritzm: installing Ubuntu imagemagick security updates (jessie already fixed)
  • 16:13 marostegui: Deploy alter table s6 revision table on dbstore1002 - T159414
  • 16:06 mobrovac@tin: Finished deploy [eventstreams/deploy@78e248c]: Deploy for T159486 (duration: 01m 48s)
  • 16:04 mobrovac@tin: Started deploy [eventstreams/deploy@78e248c]: Deploy for T159486
  • 15:37 moritzm: uploaded firmware-nonfree 20161130 for jessie-wikimedia/experimental to apt.wikimedia.org
  • 15:33 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove EducationProgram config back compat (duration: 00m 41s)
  • 15:32 reedy@tin: Synchronized wmf-config/flaggedrevs.php: Whitespace (duration: 00m 41s)
  • 15:29 moritzm: uploaded linux 4.9.13 for jessie-wikimedia/experimental to apt.wikimedia.org
  • 15:19 elukey: rebooting mw22(5[4-9]|60) as part of sanity check for T155180
  • 15:08 elukey: rebooting mw225[123] as part of sanity check for T155180
  • 14:42 zeljkof: EU SWAT finished
  • 14:42 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add HD logos for several projects (T150618) (duration: 00m 41s)
  • 14:41 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add HD logos for several projects (T150618) (duration: 00m 44s)
  • 14:27 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo for bswiki (Bosnian Wikipedia) (T158815) (duration: 00m 41s)
  • 14:26 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update logo for bswiki (Bosnian Wikipedia) (T158815) (duration: 00m 41s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T159803) (duration: 00m 41s)
  • 13:42 marostegui: Deploy alter table s6 revision table on db1023 - T159414
  • 13:11 godog: make mwlog1001 the primary logging host, deprecate fluorine
  • 12:35 godog: add mwlog[12]001 to analytics-in4 term rsync-http-https - T123728
  • 11:35 moritzm: installing texlive-base security updates
  • 10:34 jynus: restarting labsdb1004's mariadb T159572
  • 10:31 marostegui: Shutdown postgresql on labsdb1007 for maintenance - T157359
  • 10:12 elukey: reimage analytics1041 to Debian Jessie
  • 09:51 gehel: re-enabled waterline import on maps[12]001 - T159631
  • 09:39 marostegui: Stop replication on db2033 - T159707
  • 09:07 ariel@tin: Finished deploy [dumps/dumps@e30fbd0]: run monitor.py relative to cwd, to pick up default config files (duration: 00m 02s)
  • 09:07 ariel@tin: Started deploy [dumps/dumps@e30fbd0]: run monitor.py relative to cwd, to pick up default config files
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1070 - T153743 (duration: 00m 41s)
  • 08:36 moritzm: upgrading apache on mw1161-mw1208
  • 08:36 marostegui: Restart mysql on db1070 to change binlog to ROW - T153743
  • 08:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1070 - T153743 (duration: 00m 41s)
  • 07:27 marostegui: Start pt-table-checksum on plwiki (s2) - T154485
  • 07:19 marostegui: Deploy alter table s6 revision table on db1061 - T159414
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1060 original weight - T158193 (duration: 00m 47s)
  • 03:40 krinkle@tin: Synchronized docroot/noc/: Fix conftool link (I2f34be0a5), Remove IE6 css (Iae8a356e2), add db-codfw.php (I9f02dee3c) (duration: 00m 42s)
  • 03:17 bblack: authdns back to normal (puppet enabled, do normal things!)
  • 03:09 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 8 03:09:21 UTC 2017 (duration 5m 49s)
  • 03:03 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.15) (duration: 15m 08s)
  • 02:46 bblack: disabling puppet on production authdns caches (testing dns lint related bits)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 53s)
  • 01:33 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 41s)
  • 00:35 mobrovac@tin: Finished deploy [electron-render/deploy@5ec5614]: (no justification provided) (duration: 00m 59s)
  • 00:34 mobrovac@tin: Started deploy [electron-render/deploy@5ec5614]: (no justification provided)
  • 00:33 mobrovac@tin: Finished deploy [electron-render/deploy@5ec5614]: (no justification provided) (duration: 04m 08s)
  • 00:29 mobrovac@tin: Started deploy [electron-render/deploy@5ec5614]: (no justification provided)
  • 00:27 mobrovac@tin: Finished deploy [electron-render/deploy@5ec5614]: Deploy for T159486 (duration: 04m 46s)
  • 00:27 mobrovac@tin: Finished deploy [mobileapps/deploy@d6202e4]: Deploy for T159486 (duration: 03m 52s)
  • 00:26 catrope@tin: Synchronized php-1.29.0-wmf.15/extensions/Echo/modules/ui/: Fix regression in Echo popup (duration: 00m 42s)
  • 00:23 mobrovac@tin: Started deploy [mobileapps/deploy@d6202e4]: Deploy for T159486
  • 00:23 mobrovac@tin: Started deploy [electron-render/deploy@5ec5614]: Deploy for T159486
  • 00:22 mobrovac@tin: Finished deploy [mathoid/deploy@83f80ee]: Deploy for T159486 (duration: 04m 53s)
  • 00:22 mobrovac@tin: Finished deploy [graphoid/deploy@485ca11]: Deploy for T159486 (duration: 04m 45s)
  • 00:20 mobrovac@tin: Finished deploy [electron-render/deploy@51cff8a]: Deploy for T159486 (duration: 03m 29s)
  • 00:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Modify add/remove groups for flood group on wikitech (duration: 00m 42s)
  • 00:18 mobrovac@tin: Started deploy [mathoid/deploy@83f80ee]: Deploy for T159486
  • 00:17 mobrovac@tin: Finished deploy [cxserver/deploy@7e22281]: Deploy for T159486 (duration: 02m 24s)
  • 00:17 mobrovac@tin: Started deploy [graphoid/deploy@485ca11]: Deploy for T159486
  • 00:17 mobrovac@tin: Started deploy [electron-render/deploy@51cff8a]: Deploy for T159486
  • 00:16 mobrovac@tin: Finished deploy [changeprop/deploy@99280e3]: Deploy for T159486 (duration: 01m 09s)
  • 00:16 mobrovac@tin: Finished deploy [trending-edits/deploy@88e2f74]: Deploy changes for T156666 T156680 T159486 T156411 (duration: 06m 58s)
  • 00:15 mobrovac@tin: Started deploy [cxserver/deploy@7e22281]: Deploy for T159486
  • 00:15 mobrovac@tin: Started deploy [changeprop/deploy@99280e3]: Deploy for T159486
  • 00:13 Reedy: Clear 2FA for "User:Steven Walling"; identity confirmed via facebook
  • 00:09 mobrovac@tin: Started deploy [trending-edits/deploy@88e2f74]: Deploy changes for T156666 T156680 T159486 T156411
  • 00:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable wgCiteResponsiveReferences by default for back-compat (T33597) (duration: 00m 41s)

2017-03-07

  • 23:38 mutante: gerrit restarting for config changes 341701, 341587
  • 22:45 papaul: ms-be2028-ms-be2039 - signing puppet certs, salt-key, initial run
  • 22:11 mobrovac@tin: Finished deploy [citoid/deploy@5a7e053]: Deploy for T158675 T103478 T159486 (duration: 02m 36s)
  • 22:08 mobrovac@tin: Started deploy [citoid/deploy@5a7e053]: Deploy for T158675 T103478 T159486
  • 22:02 mobrovac@tin: Finished deploy [zotero/translators@35da336]: Update translators for T158675 (duration: 00m 06s)
  • 22:01 mobrovac@tin: Started deploy [zotero/translators@35da336]: Update translators for T158675
  • 21:59 mobrovac@tin: Finished deploy [trending-edits/deploy@f855460]: (no justification provided) (duration: 04m 48s)
  • 21:54 mobrovac@tin: Started deploy [trending-edits/deploy@f855460]: (no justification provided)
  • 21:40 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.15 refs T158996
  • 21:30 twentyafterfour@tin: Finished scap: bump test wikis to 1.29.0-wmf.5 refs T158996 (duration: 53m 17s)
  • 21:23 mutante: mw1177 - service hhvm restart
  • 20:37 twentyafterfour@tin: Started scap: bump test wikis to 1.29.0-wmf.5 refs T158996
  • 20:29 mutante: iridium - re-enabling puppet, ssh-phab service converted to base::service_unit, upstart template moved but unchanged, service restarted just fine.
  • 20:27 mutante: phab2001 - phab-ssh service converted to base::service_unit and with working systemd unit file. 'systemctl ssh-phab status' is active (running) (T158434)
  • 20:26 ottomata: installing librdkafka 0.9.4 on cp1045 (cache misc host) via .deb package to try it with varnishkafka in prod (ping bblack, ema, just in case)
  • 20:23 mutante: iridium - temp disabled puppet - converting phab-ssh service to base::service_unit, systemd on phab2001, upstart on iridium
  • 19:23 twentyafterfour: branching 1.29.0-wmf.15 refs T158996
  • 19:20 bblack: rebooting baham (ns1) AGAIN - low cpu frequencies issues like T147905 - checking bios/idrac stuff
  • 19:08 bblack: rebooting baham (ns1) - low cpu frequencies issues like T147905
  • 18:52 volans: rmmod acpi_pad on baham, was using 100% CPU T137647
  • 18:37 mobrovac: restbase deploy start of cd53670b
  • 16:58 akosiaris: temporarily re-increase the client-output-buffer-limit for rdb1007, phab task filing to follow
  • 16:40 akosiaris: decrease client-output-buffer-limit soft-limit back to normal values
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1008.eqiad.wmnet
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1007.eqiad.wmnet
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1006.eqiad.wmnet
  • 16:22 filippo@puppetmaster1001: conftool action : set/weight=40; selector: name=ms-fe1005.eqiad.wmnet
  • 15:28 joal@tin: Finished deploy [analytics/aqs/deploy@e0da1bd]: (no justification provided) (duration: 06m 08s)
  • 15:22 joal@tin: Started deploy [analytics/aqs/deploy@e0da1bd]: (no justification provided)
  • 15:15 akosiaris: increase client-output-buffer-limit soft-limit to 500MB temporarily on rdb1007
  • 14:46 jynus: restart labsdb1004 for config and data check
  • 14:32 moritzm: uploaded HHVM 3.18 builds of hhvm-tidy, hhvm-luasandbox and hhvm-wikidiff2 to the experimental section of apt.wikimedia.org (Bug: T158176)
  • 14:03 reedy@tin: Synchronized docroot/: Fixup filebackend symlinks (duration: 00m 41s)
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1060 weight - T158193 (duration: 00m 58s)
  • 12:53 marostegui: Just for the sake of having it logged: gtid_domain_id has been deployed in all the database servers - T149418
  • 12:53 elukey: analytics1040 back in service - testing the new Debian configuration
  • 12:39 marostegui: Deploy ALTER table on db2028 (codfw s6 master) on the revision table - T159414
  • 12:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 with less weight - T158193 (duration: 00m 40s)
  • 12:19 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2053 - T159414 (duration: 00m 43s)
  • 12:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2034 - T132416 (duration: 00m 50s)
  • 11:41 gehel: cleaning empty log file on elastic2001 (cronspam)
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=trendingedits'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=pdfrender'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=eventstreams'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=cxserver'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=citoid'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=graphoid'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2006.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mobileapps'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=trendingedits'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=pdfrender'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=eventstreams'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=ores'])
  • 11:33 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=cxserver'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=apertium'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=citoid'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=graphoid'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 11:32 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: scb2005.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mobileapps'])
  • 11:27 elukey: end of hacking on install1002 (puppet re-enabled)
  • 09:23 ema: cache_text, cache_upload: upgrading to varnish 4.1.5 T159424
  • 09:10 elukey: temporary live hacking analytics-flex.cfg partman config on install1002
  • 08:25 moritzm: installing systemd bugfix updates from jessie point release
  • 07:39 marostegui: Stop MySQL db1067 to clone db1060 from it - T158193
  • 07:16 marostegui: Deploy ALTER table on db2053 (s6) for the revision table - T159414
  • 07:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2053 - T159414 (duration: 00m 41s)
  • 05:22 Krinkle: foreachwikiindblist 'all - closed - private' deleteEqualMessages.php (T45917) - purge upstreamed translations from remaining wikis
  • 03:28 Krinkle: foreachwikiindblist closed deleteEqualMessages.php (T45917) - purge upstreamed translations from closed wikis
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Mar 7 02:28:59 UTC 2017 (duration 5m 32s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 08m 19s)
  • 00:49 RainbowSprinkles: gerrit: coming back online now
  • 00:43 RainbowSprinkles: gerrit: taking offline for a minute or two for case-insensitive login conversion
  • 00:39 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: In CSP policy for foundationwiki, wikidata.org -> www.wikidata.org (duration: 00m 40s)
  • 00:19 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add other WMF domains to foundationwiki CSP policy for Special:HideBanners (duration: 00m 40s)
  • 00:00 mobrovac: restbase restarting in labs for T158628

2017-03-06

  • 22:14 awight: update payments-wiki config to a591e4c
  • 21:51 mutante: bast3001 - powerdown (T159480), decom in progress
  • 21:48 mutante: bast3001 - schedule downtime for host and all services in Icinga, remove from puppet, salt .. (T159480)
  • 21:36 hashar@tin: Synchronized static/images/project-logos: [fixup] Fix up wrongly updated sr.wikibooks and bs.wiktionary logos - T159542 T159534 (duration: 00m 42s)
  • 21:02 matt_flaschen: populateContentModel.php --wiki=cawiki --ns=103 run for revision, archive, page . T159047 complete
  • 21:00 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Flow for Viquiprojecte Discussió on cawiki (duration: 00m 40s)
  • 20:46 ottomata: removing old cdh packages from thirdparty component in apt
  • 20:34 gehel: reimport waterlines data on maps1001.eqiad.wmnet - T159631
  • 20:34 matt_flaschen: For T159047
  • 20:34 matt_flaschen: Ran (time mwscript extensions/Flow/maintenance/convertNamespaceFromWikitext.php --wiki=cawiki 'Viquiprojecte_Discussió') 2>&1|tee --append ~/2017-03-02_cawiki_convertNamespacesFromWikitext_Viquiprojecte_Discussió.log
  • 20:26 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Disable Cognate on beta wiktionary sites T156241 Beta Only (duration: 00m 46s)
  • 20:11 thcipriani@tin: Synchronized wmf-config: SWAT: Enable Cognate for beta wiktionaries T156241 beta-only change (duration: 00m 43s)
  • 20:05 ejegg: updated payments-wiki from 66d8125 to f991f15
  • 20:05 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create "flood" flag for labswiki (duration: 00m 40s)
  • 19:53 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add "flow-create-board" to CommonSettings.php for global groups (duration: 00m 40s)
  • 19:52 gehel: restarting wdqs-updater on wdqs* servers to activate GC logs - T159248
  • 19:43 thcipriani: mwscript migrateUserGroup.php --wiki=trwiki 'technician' 'interface-editor' on terbium for T159636
  • 19:43 thcipriani@tin: Synchronized wmf-config: SWAT: Rename "technician" to "interface-editor" on trwiki T144638 (duration: 00m 46s)
  • 19:41 gehel@tin: Finished deploy [wdqs/wdqs@1f2973c]: (no justification provided) (duration: 01m 25s)
  • 19:39 gehel@tin: Started deploy [wdqs/wdqs@1f2973c]: (no justification provided)
  • 19:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 00m 40s)
  • 18:22 elukey: analytics1040 has been silenced and it is not ready to work, need to fix its partman recipe
  • 18:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2060 - T159414 (duration: 00m 44s)
  • 18:04 gehel@tin: Finished deploy [wdqs/wdqs@7b77735]: (no justification provided) (duration: 01m 46s)
  • 18:03 demon@tin: Synchronized wmf-config/interwiki.php: Sync interwiki list, T159680 (duration: 00m 41s)
  • 18:02 gehel@tin: Started deploy [wdqs/wdqs@7b77735]: (no justification provided)
  • 15:01 hashar: restarting Jenkins
  • 14:59 addshore: EU SWAT done
  • 14:50 chasemp: labnet1001 'service nova-fullstack restart'
  • 14:44 addshore@tin: Synchronized wmf-config/extension-list-labs: Remove InterwikiSorting and add Cognate to extension-list-labs T150183 T156241 BETA ONLY (duration: 00m 39s)
  • 14:42 addshore@tin: Synchronized wmf-config/extension-list: Add InterwikiSorting extension to prod extension-list T150183 NOOP (duration: 00m 38s)
  • 14:39 addshore@tin: Synchronized wmf-config/db-labs.php: SWAT: Create extension1 db cluster for beta T156241 BETA ONLY (duration: 00m 39s)
  • 14:37 addshore@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Add a CSP policy to foundationwiki to prevent privacy breach T159386 (duration: 00m 39s)
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change account creation throttle for idwiki to default (6) (duration: 00m 39s)
  • 14:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Translation memories multi-DC support T132076 2/2 (NOOP) (duration: 00m 42s)
  • 14:13 addshore@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Enable Translation memories multi-DC support T132076 1/2 (duration: 00m 50s)
  • 14:05 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Bs.wiktionary namespace changes T159538 (duration: 00m 40s)
  • 14:00 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: srwikibooks & bswiktionary logos T159534 T159542 2/2 (duration: 00m 39s)
  • 13:58 addshore@tin: Synchronized static/images/project-logos/: SWAT: srwikibooks & bswiktionary logos T159534 T159542 1/2 (duration: 00m 39s)
  • 13:23 godog: reenable puppet on graphite2001
  • 13:07 marostegui: Deploy ALTER table on db2060 (s6) for the revision table - T159414
  • 13:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2060 - T159414 (duration: 00m 39s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046 - T159414 (duration: 00m 50s)
  • 12:45 moritzm: upgrading apache on mw1209-mw1235
  • 12:44 moritzm: upgrading apache on graphite*
  • 11:49 moritzm: installing imagemagick security updates
  • 11:36 moritzm: upgrading apache on krypton
  • 11:30 moritzm: upgrading apache on planet.wikimedia.org
  • 11:05 elukey: reimage the first Hadoop worker node (an1040) to Debian Jessie
  • 10:46 moritzm: upgrading apache on mediawiki servers in codfw
  • 10:36 gehel: upgrade to elasticsearch 5.2.2 on relforge cluster - T156150
  • 10:24 elukey: (shamefully) replaced /etc/init.d/hadoop-hdfs-datanode script with "exit 0" to prevent the HDFS datanode daemon from starting on analytics1028 (broken disk) and leave the rest running (puppet included) - T159632
  • 10:12 gehel: postgresql upgrade on maps* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4)
  • 10:06 ariel@tin: Finished deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var (duration: 00m 01s)
  • 10:06 ariel@tin: Started deploy [dumps/dumps@8521be0]: fix: retries of broken runs could except on uninited var
  • 09:46 gehel: postgresql upgrade on maps-test* (postgresql-9.4 postgresql-9.4-postgis-2.3 postgresql-9.4-postgis-2.3-scripts postgresql-client-9.4 postgresql-client-common postgresql-common postgresql-contrib-9.4)
  • 09:14 ariel@tin: Finished deploy [dumps/dumps@04794df]: move default config into a file and clean up (duration: 00m 02s)
  • 09:14 ariel@tin: Started deploy [dumps/dumps@04794df]: move default config into a file and clean up
  • 09:09 gehel: killing stuck tilerator notification on maps-test2001 - T145534
  • 07:22 marostegui: Resume pt-table-checksum on plwiki (s2) - T154485
  • 06:59 marostegui: Deploy ALTER table on db2046 (s6) for the revision table - T159414
  • 06:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2046 - T159414 (duration: 00m 51s)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Mar 6 02:24:24 UTC 2017 (duration 5m 19s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 15s)
  • 01:29 cwd: updated staging civicrm database and triggers

2017-03-05

  • 22:23 Reedy: Generating some more captchas again T159581
  • 10:19 elukey: disabled puppet on analytics1028 to prevent puppet from starting the HDFS daemon (T159632)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Mar 5 02:24:02 UTC 2017 (duration 5m 20s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 07m 07s)

2017-03-04

  • 16:43 Reedy: Manually generating even more captchas (going up to 10k total) in screen as reedy on terbium T159581
  • 16:35 Reedy: Manually generating some more captchas T159581
  • 03:28 legoktm: pausing refreshLinks.php run due to increase in job queue
  • 03:05 mutante: planet2001 - and this time it just worked and i can't reproduce the issue. install finished. re-adding to puppet, signing certs...
  • 03:00 mutante: planet2001 - reinstalling once more (T159432)
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Mar 4 02:36:25 UTC 2017 (duration 5m 19s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 12m 10s)
  • 00:52 mutante: conf2002 - ran "systemctl reset-failed" to fix Icinga alert about broken systemd state due to formerly existing but failed service etcdmirror-eqiad-wmnet. turns out you need this to remove missing units. found on http://serverfault.com/questions/606520/how-to-remove-missing-systemd-units (T131959)

2017-03-03

  • 23:23 RainbowSprinkles: phabricator: restarted apache 1 last time, removed hack
  • 23:19 mutante: icinga: for special external hosts benefactorevents and eventdonations, "submit passive check result for this host" -> "check_tcp -p 80" to avoid "crit hosts" that just don't respond to ICMP (http://www.htmlgraphic.com/nagios-check-host-without-ping/)
  • 23:12 RainbowSprinkles: phabricator: restarting apache real quick
  • 22:03 hashar: rebooting contint2001
  • 21:54 hashar: restarting Jenkins
  • 21:51 hashar: enabling puppet on contint1001 and puppet-run
  • 21:05 hashar: disabled puppet on contint1001
  • 20:26 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster only (duration: 00m 40s)
  • 19:35 ebernhardson: restart elasticsearch on relforge1002 to update remote reindex whitelist
  • 19:33 ebernhardson: restart elasticsearch on relforge1001 to update remote reindex whitelist
  • 19:11 legoktm: running refreshLinks.php across small wikis
  • 18:43 addshore@tin: Synchronized php-1.29.0-wmf.14/extensions/RevisionSlider/modules/ext.RevisionSlider.css: T159428 Quick fix for misplaced tooltips on RTL wikis (duration: 00m 42s)
  • 17:35 hashar: CI is mostly recovered. It could not spawn instances anymore. The queue is being processed and will take a while to be completed. Check status on https://integration.wikimedia.org/zuul/ | T159543
  • 16:17 hashar: Stopped Jenkins from processing builds while instances are being recycled
  • 13:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2067 - T159414 (duration: 00m 50s)
  • 13:12 elukey: removed apache2 (rc state) and apache2-utils from analytics1027
  • 11:11 elukey@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 14s)
  • 11:11 elukey@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 11:09 elukey@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 00m 02s)
  • 11:09 elukey@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 11:05 jynus: stopping mariadb and restarting db1051 for maintenance
  • 11:03 joal@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 01m 23s)
  • 11:02 joal@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 10:53 marostegui: Start pt-table-checksum on plwiki (s2) - T154485
  • 10:48 joal@tin: Finished deploy [analytics/refinery@1440646]: (no justification provided) (duration: 15m 33s)
  • 10:33 joal@tin: Started deploy [analytics/refinery@1440646]: (no justification provided)
  • 09:28 hashar: Restarting Jenkins (2)
  • 09:03 hashar: Restarting Jenkins
  • 08:27 moritzm: upgrading apache on bromine
  • 08:22 marostegui: Run pt-table-checksum on s2 (nowiki) - T154485
  • 08:20 marostegui: Deploy alter table s6 on db2067 - T159414
  • 08:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2067 - T159414 (duration: 00m 40s)
  • 07:30 moritzm: installing w3m security updates on trusty (jessie already fixed)
  • 04:39 mutante: planet2001 last log message was for T159432
  • 04:38 mutante: planet2001 - reinstall, boot into installer, scheduled downtime (T15943)
  • 04:16 legoktm: running refreshLinks.php on aawiki
  • 04:13 legoktm@tin: Synchronized php-1.29.0-wmf.14/maintenance/refreshLinks.php: Queue non-recursive updates - https://gerrit.wikimedia.org/r/340920 (duration: 00m 40s)
  • 03:27 awight: rerunning schema_update wmf_civicrm:7480
  • 03:26 awight: update civicrm from 133bde2 to d20ed40
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Mar 3 02:38:40 UTC 2017 (duration 5m 19s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 13m 28s)
  • 01:45 awight: rerun schema change wmf_civicrm:7480
  • 01:34 Krinkle: terbium$ foreachwiki purgeModuleDeps.php (T158105)
  • 01:34 Krinkle: terbium$ foreachwikiindblist group0 purgeModuleDeps.php (T158105)
  • 01:33 Krinkle: terbium$ mwscript purgeModuleDeps.php --wiki test2wiki (T158105)
  • 01:28 awight: update civicrm from 0cab193 to 133bde2
  • 01:12 MaxSem: Restarted tilerator on codfw tileservers to catch latest code changes
  • 01:11 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/autoload.php: resourceloader: Add purgeModuleDeps.php maintenance script (duration: 00m 39s)
  • 01:10 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/maintenance/cleanupRemovedModules.php: resourceloader: Add purgeModuleDeps.php maintenance script (duration: 00m 40s)
  • 01:09 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/maintenance/purgeModuleDeps.php: resourceloader: Add purgeModuleDeps.php maintenance script (duration: 00m 40s)
  • 01:02 ejegg: re-running fix for missing names
  • 00:42 ejegg: re-enabled CiviCRM de-dupe jobs
  • 00:41 ejegg: CiviCRM geocoding update finished, name fix failed on badly formatted comment
  • 00:35 mattflaschen@tin: Synchronized wmf-config/CirrusSearch-common.php: CirrusSearch: Enable super_detect_noop (duration: 00m 39s)
  • 00:16 mattflaschen@tin: Synchronized php-1.29.0-wmf.14/extensions/Flow/: Fix autoload data and script (duration: 00m 59s)

2017-03-02

  • 23:49 ejegg: running batched geocoding update and donor name fixes
  • 23:43 ejegg: updated civicrm from d012767 to 2d1de87
  • 23:42 ejegg: disabled dedupe jobs for civi update
  • 23:07 bblack: all authdns servers puppet re-enabled
  • 23:05 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-rw,name=eqiad
  • 23:05 bblack@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=appservers-rw
  • 22:55 Krinkle: Stopped statsd-mw-js-deprecate service on hafnium per https://gerrit.wikimedia.org/r/338929
  • 22:46 catrope@tin: Synchronized dblists/: T63729: disable Flow on metawiki (duration: 00m 58s)
  • 22:36 MaxSem: killed stuck updates on maps-test2001
  • 22:09 mutante: bast3002 - stop rsyncd, remove rsyncd config snippets (T156506)
  • 20:05 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.14
  • 19:58 demon@tin: Synchronized wmf-config/CommonSettings.php: Stacktraces are useful when cli scripts fail (duration: 00m 56s)
  • 19:58 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-rw,named=eqiad
  • 19:57 bblack@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=appservers-rw
  • 19:53 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: Trying https://gerrit.wikimedia.org/r/#/c/340607/ once again (duration: 00m 04s)
  • 19:53 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: Trying https://gerrit.wikimedia.org/r/#/c/340607/ once again
  • 19:49 maxsem@tin: Finished deploy [tilerator/deploy@0fe5a1d]: Reverting to previous version
  • 19:49 maxsem@tin: Started deploy [tilerator/deploy@0fe5a1d]: Reverting to previous version
  • 19:46 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 05s)
  • 19:46 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:43 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 03s)
  • 19:43 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:42 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 03s)
  • 19:42 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:42 maxsem@tin: Finished deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/ (duration: 00m 23s)
  • 19:42 maxsem@tin: Started deploy [tilerator/deploy@edb97c5]: https://gerrit.wikimedia.org/r/#/c/340607/
  • 19:16 addshore@tin: Synchronized dblists/all-labs.dblist: Add beta hewiktionary T158628 2/2 NOOP (duration: 00m 39s)
  • 19:15 addshore@tin: Synchronized wikiversions-labs.json: Add beta hewiktionary T158628 1/2 NOOP (duration: 00m 42s)
  • 19:06 awight: reenabling donation and recurring queue consumers
  • 19:05 addshore@tin: Synchronized wmf-config/throttle.php: Add new rules for WMUK T159454 T159461 (duration: 00m 43s)
  • 19:04 awight: update civicrm from fb91fa8 to d012767
  • 18:22 demon@tin: Synchronized php-1.29.0-wmf.14/includes/changes/EnhancedChangesList.php: T159466 (duration: 00m 40s)
  • 17:51 bblack: disabling puppet on authdns prod machines for hacky discovery testing
  • 17:44 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-ro,name=codfw
  • 17:44 bblack@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=appservers-rw,name=eqiad
  • 17:38 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=eqiad
  • 16:52 bblack: puppet re-enabled on authdns production boxes
  • 16:27 bblack: puppet disabled on authdns production boxes, for hacky testing of discovery-related commits
  • 16:00 jynus: restarting db1001 for kernel and mariadb upgrade
  • 15:49 moritzm: uploaded 6.8.9.9-5+deb8u7+wmf1 to apt.wikimedia.org (CMYK sharpen bugfix rebased on latest Debian update)
  • 15:42 moritzm: installing libfcgi-perl security updates
  • 14:47 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: T157700: Re-enable Page Previews instrumentation (duration: 00m 40s)
  • 14:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1008.eqiad.wmnet
  • 14:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1007.eqiad.wmnet
  • 14:37 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 14:32 phuedx@tin: Synchronized portals: (no justification provided) (duration: 00m 41s)
  • 14:32 phuedx@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 14:26 jynus: running alter table on db2040 T147747
  • 14:22 elukey@tin: Finished deploy [analytics/refinery@c3dd129]: (no justification provided) (duration: 02m 18s)
  • 14:20 elukey@tin: Started deploy [analytics/refinery@c3dd129]: (no justification provided)
  • 14:12 phuedx@tin: Synchronized wmf-config/InitialiseSettings.php: Remove Page Previews experiment config (duration: 00m 40s)
  • 14:10 phuedx@tin: Synchronized wmf-config/CommonSettings.php: Remove Page Previews experiment config (duration: 01m 06s)
  • 13:47 moritzm: removed obsolete kernels on ocg1002
  • 13:46 moritzm: removed obsolete kernels on eventlog1001
  • 13:03 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1005.eqiad.wmnet
  • 12:52 moritzm: installing shadow security updates on jessie hosts
  • 12:43 jynus: running ANALYZE table on revision at db1051 (depooled) T159319
  • 12:36 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 for maintenance (duration: 00m 42s)
  • 11:58 hashar: CI composer based builds are now ok. Only operations/mediawiki-config was impacted as far as I can tell.
  • 11:10 kartik@tin: Finished deploy [cxserver/deploy@5101090]: (no justification provided) (duration: 02m 24s)
  • 11:07 kartik@tin: Started deploy [cxserver/deploy@5101090]: (no justification provided)
  • 10:51 hashar: CI composer based builds are sometimes broken since composer got upgraded to 1.1.0. See https://phabricator.wikimedia.org/T159431
  • 10:23 moritzm: installing bind updates (we're using client-side libs/tools)
  • 10:04 moritzm: installing tiff security updates on trusty hosts (jessie already fixed)
  • 09:55 elukey: increased PHP memory_limit on bohrium for Piwik (T154558)
  • 09:26 moritzm: installing glibc updates from jessie point release
  • 09:24 hashar: Upgrading composer to 1.1.0 on CI instances
  • 09:08 moritzm: installing apache2 security updates on mw1262-mw1265
  • 08:51 jynus: running alter table on db2039 T147747
  • 08:45 jynus: running alter table on db2035 T147747
  • 08:27 marostegui: Start pt-table-checksum on itwiki (s2) - T154485
  • 07:20 marostegui: Deploy alter table enwiki.revision db2016 (codfw master) - T132416
  • 07:09 marostegui: Resume pt-table-checksum on idwiki (s2) - T154485
  • 03:04 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Mar 2 03:04:16 UTC 2017 (duration 5m 49s)
  • 02:58 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 14m 52s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 09m 30s)
  • 01:52 eileen1: civicrm changed...
  • 00:48 mutante: tin/mira - in the output of keyholder status you will no longer see the paths in the "comment" column. This is due to newer versions of openssh-client, which caused our problem last time I attempted this. Thanks to thcipriani's fix https://gerrit.wikimedia.org/r/#/c/312947/ we don't rely on this anymore and all is good; keyholder stays armed even after re-encrypting the
  • 00:44 mutante: tin - disarm/rearm keyholder after changing passphrases of all deployment keys to new passphrase (T154943)
  • 00:41 mutante: mira - disarm/rearm keyholder after changing passphrases of all other deployment keys (T154943)
  • 00:37 dereckson@tin: Synchronized wmf-config/interwiki.php: Update interwiki map (ref T159103) (duration: 00m 41s)
  • 00:23 mutante: mira - disarming keyholder, changed password of analytics deploy key - rearming to test changes for T154943

2017-03-01

  • 23:28 mutante: contint1002, contint2001: rm /usr/lib/ganglia/python_modules/diskstat.py*; rm /etc/ganglia/conf.d/diskstat.pyconf (re: gerrit 340657)
  • 21:44 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: (no justification provided) (duration: 00m 15s)
  • 21:44 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: (no justification provided)
  • 21:44 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: (no justification provided) (duration: 00m 15s)
  • 21:44 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: (no justification provided)
  • 21:43 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 02m 00s)
  • 21:41 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:41 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 03m 50s)
  • 21:40 demon@tin: Synchronized php-1.29.0-wmf.14/extensions/Echo/includes/model/Event.php: better logging and such (duration: 00m 40s)
  • 21:37 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:37 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 05m 14s)
  • 21:32 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:32 arlolra@tin: Finished deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0 (duration: 07m 39s)
  • 21:28 demon@tin: Synchronized php-1.29.0-wmf.14/extensions/CentralAuth/: Unbreak pending real fix (duration: 00m 49s)
  • 21:24 arlolra@tin: Started deploy [parsoid/deploy@32ca3fb]: Updating parsoid to 9f96b2a0
  • 21:04 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to wmf.14
  • 21:03 demon@tin: Synchronized php: Symlink swap (duration: 00m 39s)
  • 20:41 mutante: netmon1001, labsdb1006, labsdb1007, fluorine, helium - same fix as above; they were not covered by salt targeting as they are precise. That is all of them now: ubuntu.wikimedia.org does not appear in sources when checking *
  • 20:35 mutante: [neodymium:~] $ sudo salt --out=txt -b 10 -C 'G@lsb_distrib_codename:trusty' cmd.run "sed -i 's/ubuntu.wikimedia/mirrors.wikimedia/g' /etc/apt/sources.list && apt-get update" (https://phabricator.wikimedia.org/rOPUPe9da17d739233a4db197e947e627cf2a47ce6e6f#2080366)
  • 20:27 mutante: all trusty hosts via salt - fix APT sources list. replace ubuntu.wikimedia (deleted) with mirrors.wikimedia, apt-get update (re: https://phabricator.wikimedia.org/rOPUPe9da17d739233a4db197e947e627cf2a47ce6e6f)
  • 20:02 smalyshev@tin: Finished deploy [wdqs/wdqs@2b8ffef]: Bump memory limit for Java to 16g (duration: 03m 36s)
  • 19:59 smalyshev@tin: Started deploy [wdqs/wdqs@2b8ffef]: Bump memory limit for Java to 16g
  • 19:40 mutante: ocg1001, db1047, californium, db1051, rcs1002, db1041, iridium - fix APT sources list. replace ubuntu.wikimedia (deleted) with mirrors.wikimedia, apt-get update
  • 19:30 mutante: labsdb1001, labtestcontrol2001, labtestvirt2001 - fix APT sources list. replace ubuntu.wikimedia (deleted) with mirrors.wikimedia
  • 19:19 awight: applying civicrm db migration wmf_civicrm:7465
  • 19:18 awight: update civicrm from b3f6eef to 58c8c06
  • 19:07 mutante: terbium - install multiple pending package upgrades
  • 19:04 mutante: terbium - uses ubuntu.wikimedia.org in APT sources but that does not exist anymore. replaced 'ubuntu' with 'mirrors' globally, apt-get update
  • 18:35 thcipriani@tin: Synchronized README: test sync for scap 3.5.3-1 (duration: 00m 46s)
  • 17:54 jynus: autoremoving old kernels on terbium to make room on /boot
  • 17:52 jynus: running alter table on db2044 T147747
  • 15:47 joal@tin: Finished deploy [analytics/refinery@f4a5020]: (no justification provided) (duration: 02m 33s)
  • 15:45 marostegui: Resume pt-table-checksum on idwiki (s2) - T154485
  • 15:45 joal@tin: Started deploy [analytics/refinery@f4a5020]: (no justification provided)
  • 15:44 joal@tin: Finished deploy [analytics/refinery@b4a8fcc]: (no justification provided) (duration: 00m 13s)
  • 15:44 joal@tin: Started deploy [analytics/refinery@b4a8fcc]: (no justification provided)
  • 15:35 jynus: running alter table on db1034 T147747
  • 15:28 gehel: deploying on eqiad completed - T158782
  • 15:26 elukey@tin: Finished deploy [analytics/refinery@b4a8fcc]: (no justification provided) (duration: 02m 15s)
  • 15:23 elukey@tin: Started deploy [analytics/refinery@b4a8fcc]: (no justification provided)
  • 15:18 gehel: testing a few hosts on codfw looks good, deploying on eqiad - T158782
  • 15:10 gehel: mw1209 looks good, deploying on codfw - T158782
  • 15:05 gehel: mwdebug1001 looks good, deploying on mw1209 - T158782
  • 14:54 gehel: starting deployment of mediawiki apache config - T158782
  • 14:31 elukey@tin: Finished deploy [analytics/refinery@33db287]: (no justification provided) (duration: 01m 13s)
  • 14:30 elukey@tin: Started deploy [analytics/refinery@33db287]: (no justification provided)
  • 14:29 dcausse: EU SWAT Done
  • 14:27 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] cleanup old A/B test (duration: 00m 40s)
  • 14:27 elukey@tin: Finished deploy [analytics/refinery@33db287]: (no justification provided) (duration: 01m 24s)
  • 14:26 elukey@tin: Started deploy [analytics/refinery@33db287]: (no justification provided)
  • 14:12 dcausse@tin: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] Test disable super_detect_noop script (duration: 00m 47s)
  • 13:16 marostegui: run pt-table-checksum on idwiki - T154485
  • 12:43 moritzm: installing apache2 security updates on mw1261
  • 12:22 godog: upgrade thumbor to 0.1.13 on thumbor100[12]
  • 11:32 jynus: running alter table on db2037 T147747
  • 11:27 moritzm: upgrading nginx on meitnerium/archiva.wikimedia.org to 1.11.4 (using openssl 1.1)
  • 11:02 moritzm: uploaded lz4 0.0~r131 for jessie-wikimedia to apt.wikimedia.org (required by HHVM 3.18)
  • 09:33 jynus: running alter table on db1037 T147747
  • 09:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 after maintenance (duration: 00m 41s)
  • 09:14 marostegui: Deploy alter table s3 (all wikis) user_groups table - T155605
  • 08:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 after maintenance (duration: 00m 40s)
  • 08:18 moritzm: installing libgd2 security updates on trusty (jessie already fixed)
  • 07:05 marostegui: Deploy alter table enwiki.revision - dbstore2002 - T132416
  • 03:06 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Mar 1 03:06:24 UTC 2017 (duration 5m 46s)
  • 03:00 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.14) (duration: 13m 51s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 08m 03s)
  • 01:00 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Remove DonationInterface loading as gone from master (primarily to unbreak beta) (duration: 00m 42s)
  • 00:59 eileen1: Update CiviCRM from 04b49b0 to b3f6eef
  • 00:59 reedy@tin: Synchronized wmf-config/CommonSettings.php: Remove DonationInterface loading as gone from master (primarily to unbreak beta) (duration: 00m 40s)

2017-02-28

  • 22:29 mutante: (T157675) - delete salt keys - [neodymium:~] $ for mcnode in $(seq 2001 2016); do sudo salt-key -d mc${mcnode}.codfw.wmnet; done
  • 22:26 mutante: (T157675) - revoke puppet certs, deactivate nodes, rm from icinga. [puppetmaster1001:~] $ for mcnode in $(seq 2001 2016); do puppet node clean mc${mcnode}.codfw.wmnet && puppet node deactivate mc${mcnode}.codfw.wmnet ; done
  • 21:58 awight: update payments from 2a0c3b2 to 66d8125
  • 21:51 eileen1: update CiviCRM from a2875c5 to 04b49b0
  • 21:44 urandom: Updating RESTBase mobileapps tables (all remaining) to use time-windowed compaction
  • 21:40 maxsem@tin: Finished deploy [kartotherian/deploy@81db48c]: Second attempt at 81db48c (duration: 06m 39s)
  • 21:34 maxsem@tin: Started deploy [kartotherian/deploy@81db48c]: Second attempt at 81db48c
  • 21:23 MaxSem: Completely disabled kartotherian on maps-test2004, it just logs errors
  • 21:05 _joe_: manually installing nodejs on wasat T156922
  • 20:50 maxsem@tin: Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/#/c/340357/2 (duration: 00m 40s)
  • 20:33 urandom: Updating RESTBase mobileapps tables (phase0) to use time-windowed compaction
  • 20:30 demon@tin: Synchronized wmf-config/wikitech.php: no moar forms on wikitech (duration: 00m 39s)
  • 20:03 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.14
  • 19:34 demon@tin: Synchronized php-1.29.0-wmf.14/extensions/WikimediaEvents/modules/ext.wikimediaEvents.geoFeatures.js: Roan made me do it (duration: 00m 39s)
  • 19:26 demon@tin: Finished scap: testwiki to wmf.14 + l10n bootstrap (duration: 55m 14s)
  • 19:04 urandom: Updating RESTBase mobileapps tables (wikimedia) to use time-windowed compaction
  • 18:31 demon@tin: Started scap: testwiki to wmf.14 + l10n bootstrap
  • 17:43 urandom: Updating RESTBase mobileapps tables (wikipedia) to use time-windowed compaction
  • 17:11 elukey: Analytics Hadoop cluster upgraded to CDH 5.10
  • 17:09 jynus: disabling replication lag alerts on db1026 (depooled)
  • 17:05 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 17:04 gehel: restarting blazegraph on wdqs1001 - T159245
  • 15:47 jynus: running alter table on db1056 T147747
  • 15:30 gehel: depooling wdqs1001 due to instability
  • 15:29 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 15:23 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 for maintenance (duration: 00m 40s)
  • 14:51 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1003.eqiad.wmnet
  • 14:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1053 after maintenance (duration: 00m 39s)
  • 14:35 elukey: start the Analytics Hadoop cluster upgrade (https://etherpad.wikimedia.org/p/analytics-cdh5.10)
  • 14:32 marostegui: run pt-table-checksum on eowiki (s2) - T154485
  • 14:08 phuedx@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Make Page Previews use RESTBase on Beta Cluster (duration: 00m 42s)
  • 14:02 reedy@tin: Synchronized php-1.29.0-wmf.13/extensions/Dashiki/extension.json: Register JsonConfigModels (duration: 00m 42s)
  • 13:57 jynus: running alter table on db1036 T147747
  • 13:23 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055 after maintenance (duration: 00m 39s)
  • 13:03 Reedy: ran namespaceDupes on meta to fix some Config pages
  • 12:48 _joe_: flushed memcached in codfw, restarting hhvm on appserver to flush APC in order to test warmup script
  • 11:47 gehel: restarting wdqs-blazegraph on wdqs1003
  • 11:40 gehel: depooling wdqs1003 for investigation (high 5xx rate)
  • 11:40 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1003.eqiad.wmnet
  • 11:22 jynus: running alter table on db1053 T147747
  • 11:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 for maintenance (duration: 00m 40s)
  • 10:56 elukey: restart zookeeper on conf1002
  • 10:53 marostegui: run pt-table-checksum on enwiktionary (s2) - T154485
  • 10:35 elukey: restart zookeeper on conf1003
  • 10:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 for maintenance (duration: 00m 39s)
  • 10:23 marostegui: run pt-table-checksum on enwikiquote (s2) - T154485
  • 10:09 marostegui: Deploy alter table s2 on all wikis for table user_groups - T155605
  • 10:00 elukey: restart zookeeper on conf1001
  • 09:47 jynus: running alter table on db1055 T147747
  • 09:43 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 for maintenance (duration: 00m 40s)
  • 09:38 marostegui: Deploy alter table s7 on all wikis for table user_groups - T155605
  • 09:06 jynus: running alter table on db2042 T147747
  • 09:03 marostegui: Deploy alter table s1 (enwiki).user_groups - T155605
  • 08:59 marostegui: run pt-table-checksum on cswiki (s2) - T154485
  • 08:43 moritzm: installing python-crypto security updates
  • 08:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 after maintenance (duration: 00m 40s)
  • 08:37 hashar: nodepool deleted alien instances 541585 541586 and 541587
  • 08:35 marostegui: Deploy alter table s6 (frwiki,jawiki,ruwiki).user_groups - T155605
  • 08:24 marostegui: run pt-table-checksum on bgwiktionary (s2) - T154485
  • 08:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after maintenance (duration: 00m 41s)
  • 08:18 marostegui: Deploy alter table s5 wikidatawiki.user_groups - T155605
  • 08:15 marostegui: Deploy alter table s5 dewiki.user_groups - T155605
  • 07:41 marostegui: Deploy alter table s4.user_groups - T155605
  • 07:12 marostegui: run pt-table-checksum on bgwiki (s2) - T154485
  • 07:00 marostegui: Deploy alter table enwiki.revision db2034 - T132416
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 28 02:35:56 UTC 2017 (duration 5m 20s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 11m 40s)
  • 02:18 mutante: rsyncing prometheus metrics data from bast3001 to bast3002 (T156506)
  • 01:42 mutante: mw1198 - restart hhvm
  • 01:01 demon@tin: Synchronized scap/plugins/clean.py: No-op, more cleanups for clean.py (duration: 00m 42s)
  • 00:33 ebernhardson: restart elasticsearch on relforge1002, putting too much load on the machine got it stuck in a GC spiral with 1minute+ collections
  • 00:29 ebernhardson: restart elasticsearch on relforge1001, putting too much load on the machine got it stuck in a GC spiral with 1minute+ collections
  • 00:15 demon@tin: Synchronized php-1.29.0-wmf.13/extensions/MobileFrontend/resources/skins.minerva.base.styles/ui.less: Fix the incorrect magnify glass icon position in lang search (duration: 00m 39s)
  • 00:13 demon@tin: Synchronized php-1.29.0-wmf.13/extensions/Nuke/Nuke_body.php: Move back to old caller names (duration: 00m 43s)
  • 00:09 demon@tin: Synchronized wmf-config/CommonSettings.php: Enable editmyoptions right for all users on loginwiki (duration: 00m 41s)

2017-02-27

  • 23:50 demon@tin: Finished scap: Enabling Dashiki on meta (duration: 20m 46s)
  • 23:29 demon@tin: Started scap: Enabling Dashiki on meta
  • 23:17 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 48s)
  • 22:37 otto@tin: Finished deploy [eventstreams/deploy@76c763e]: Deploying swagger-ui /?doc endpoint (duration: 01m 45s)
  • 22:36 otto@tin: Started deploy [eventstreams/deploy@76c763e]: Deploying swagger-ui /?doc endpoint
  • 22:34 otto@tin: Finished deploy [eventstreams/deploy@76c763e]: Deploying /?doc swagger-ui endpoint only to scb2001 (duration: 00m 17s)
  • 22:34 otto@tin: Started deploy [eventstreams/deploy@76c763e]: Deploying /?doc swagger-ui endpoint only to scb2001
  • 22:10 otto@tin: Finished deploy [eventstreams/deploy@2f73b52]: Deploying /?doc swagger-ui endpoint only to scb2001 (duration: 00m 18s)
  • 22:10 otto@tin: Started deploy [eventstreams/deploy@2f73b52]: Deploying /?doc swagger-ui endpoint only to scb2001
  • 21:42 bsitzmann@tin: Finished deploy [mobileapps/deploy@872a615]: Update mobileapps to c924126 (duration: 03m 14s)
  • 21:39 bsitzmann@tin: Started deploy [mobileapps/deploy@872a615]: Update mobileapps to c924126
  • 21:16 mutante: ganglia - switching esams aggregator to bast3002 - expect short gaps in esams graphs
  • 20:51 robh: disabled puppet on einsteinium for icinga update of config
  • 18:26 gehel: restarting wdqs-updater on all wdqs servers
  • 18:25 gehel@tin: Finished deploy [wdqs/wdqs@daca9b3]: (no justification provided) (duration: 01m 39s)
  • 18:24 gehel: redeploying wdqs (previous deploy was not latest version)
  • 18:24 gehel@tin: Started deploy [wdqs/wdqs@daca9b3]: (no justification provided)
  • 18:18 awight: update civicrm from 20660c4 to a2875c5
  • 18:14 gehel: restarting wdqs-updater on all wdqs servers
  • 18:14 gehel@tin: Finished deploy [wdqs/wdqs@62354ed]: (no justification provided) (duration: 00m 52s)
  • 18:13 gehel@tin: Started deploy [wdqs/wdqs@62354ed]: (no justification provided)
  • 18:12 gehel@tin: Finished deploy [wdqs/wdqs@62354ed]: log (duration: 00m 12s)
  • 18:12 ema: temporarily bumping timeout_idle to 120s on cache_misc T154558
  • 18:12 gehel@tin: Started deploy [wdqs/wdqs@62354ed]: log
  • 18:04 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 16:08 jynus: starting schema change on db1051 T147747
  • 16:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 for maintenance (duration: 00m 40s)
  • 15:55 jynus: starting schema change on db2038 T147747
  • 14:58 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: name=eqiad
  • 14:42 Dereckson: Fix namespace dupes pages on ext.wikipedia (T158914)
  • 14:30 hashar: European SWAT done. Pushed https://gerrit.wikimedia.org/r/#/c/339446/ and https://gerrit.wikimedia.org/r/#/c/339348/
  • 14:29 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: New namespace aliases for itwikiversity - T158775 (duration: 00m 43s)
  • 14:13 moritzm: installed apache2 security updates on mwdebug*
  • 14:10 aude@tin: Synchronized wmf-config/Wikibase-production.php: Disable geo-shape datatype on wikidata for now (duration: 00m 41s)
  • 13:58 marostegui: Manually deploy gtid_domain_id on s2 - T149418
  • 13:06 elukey: restart zookeeper on conf2003
  • 12:39 elukey: restart zookeeper on conf2002
  • 12:14 _joe_: reissuing the certificate for etcd.codfw.wmnet due to a previous error
  • 12:00 elukey: rebooting mw2092 due to puppet errors for mw-cgroup - T151427
  • 11:58 volans: re-enabled icinga-wm
  • 11:37 ema: cp1052 repooled T148891
  • 11:19 elukey: zookeeper status report - new changes rolled out to druid nodes and conf2001 - conf1* and conf200[23] still pending, waiting for more metrics before proceeding
  • 11:09 volans: temporarily stopped ircecho (icinga-wm)
  • 11:04 ema: rebooting cp1052 into kernel 4.4.2-3+wmf8 T148891
  • 10:49 moritzm: uploaded apache2 2.4.10-10+deb8u8+wmf1 to apt.wikimedia.org (rebase of local patches on top of the latest DSA)
  • 10:34 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: wme: Set ReadingDepth sampling rate to 0.1% - T155639 (duration: 00m 40s)
  • 10:31 elukey: limiting the Zookeeper Maximum heap size to 1G (https://gerrit.wikimedia.org/r/#/c/337797/) - setting applied gradually to Zookeeper on Druid and Conf* hosts
  • 10:11 _joe_: upgrading conftool to 0.4.0 across the cluster T149617
  • 10:03 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 for maintenance (duration: 00m 43s)
  • 09:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 after maintenance with full weight (duration: 00m 39s)
  • 08:42 _joe_: upload conftool 0.4.0 to trusty-wikimedia
  • 08:42 _joe_: promote conftool 0.4.0 to jessie-wikimedia main
  • 07:59 marostegui: Run pt-table-checksum on s2 (nlwiki) on revision table - T154485
  • 07:29 marostegui: Deploy alter table enwiki.revision - db2034 - T132416
  • 07:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2034 - T132416 (duration: 00m 40s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T132416 (duration: 00m 40s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 27 02:25:10 UTC 2017 (duration 5m 21s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 23s)

2017-02-26

  • 17:10 Reedy: ran namespaceDupes for extwiki
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 26 02:25:07 UTC 2017 (duration 5m 21s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 26s)

2017-02-25

  • 20:06 elukey: depooled cp2017 (via local sudo -i depool command) since the host froze (it got back after a powercycle)
  • 19:54 elukey: powercycled cp2017, mgmt console stuck
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 25 02:25:10 UTC 2017 (duration 5m 21s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 20s)
  • 01:43 mutante: bast3002 - sign puppet cert, initial run with basic "bastion" role, to replace broken bast3001, but WIP, ganglia/prometheus roles not moved yet (T156506)

2017-02-24

  • 22:46 Krinkle: (terbium) sql --write mediawikiwiki 'DELETE FROM module_deps' (in batches of 500; 42292 rows affected) - per T158105.
  • 22:28 smalyshev@tin: Finished deploy [wdqs/wdqs@62354ed]: Deploy new updater on 1001 for timeout increase (duration: 00m 16s)
  • 22:27 smalyshev@tin: Started deploy [wdqs/wdqs@62354ed]: Deploy new updater on 1001 for timeout increase
  • 22:23 smalyshev@tin: Finished deploy [wdqs/wdqs@62354ed]: Deploy new updater on 2001 for testing (duration: 00m 26s)
  • 22:23 smalyshev@tin: Started deploy [wdqs/wdqs@62354ed]: Deploy new updater on 2001 for testing
  • 20:50 ebernhardson: restart elasticsearch on logstash1002
  • 20:05 demon@tin: Synchronized wmf-config/wikitech.php: (no justification provided) (duration: 00m 48s)
  • 19:30 Pchelolo: restarting RESTBase on xenon.eqiad.wmnet in staging
  • 17:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1026 after maintenance with low load (duration: 00m 40s)
  • 16:55 volans: manually cleaning ferm leftovers on dbproxy1011 - T158798
  • 15:35 ema: temporarily bumping timeout_idle to 60s on cache_misc T154558
  • 14:27 volans: re-started and re-armed keyholder after upgrade on: mira.codfw.wmnet,neodymium.eqiad.wmnet,sarin.codfw.wmnet,tin.eqiad.wmnet T158660 T158659
  • 10:41 ema: cache_misc: upgrading to varnish 4.1.5
  • 10:30 moritzm: installing imagemagick regression update for security update on trusty (the Debian update seems unaffected)
  • 10:23 moritzm: installing spice updates on trusty
  • 09:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 - T154485 (duration: 00m 40s)
  • 09:39 elukey: stop Redis and Memcached on mc2001->mc2016 as extra precautionary step before decom - T157675
  • 08:44 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 08:16 volans: temporarily disabled puppet on neodymium and sarin to deploy Gerrit 339183 - T158753
  • 07:32 marostegui: Deploy alter table enwiki.revision on db2070 - T132416
  • 07:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 and depool db2070 - T132416 (duration: 00m 45s)
  • 02:32 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 24 02:32:21 UTC 2017 (duration 5m 22s)
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 07m 02s)
  • 00:26 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Store goodfaith scores in the ORES tables T137966 (duration: 00m 40s)
  • 00:17 mobrovac: restbase deploying b477ab46

2017-02-23

  • 21:11 dereckson@tin: Finished scap: Full scap to deploy new l10n keys on wikitech (gerrit:339456), take two (duration: 22m 55s)
  • 20:49 dereckson@tin: Started scap: Full scap to deploy new l10n keys on wikitech (gerrit:339456), take two
  • 20:48 dereckson@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap"; owner is "dereckson"; reason is "Full scap to deploy new l10n keys on wikitech (gerrit:339456)" (duration: 00m 00s)
  • 20:46 dereckson@tin: Started scap: Full scap to deploy new l10n keys on wikitech (gerrit:339456)
  • 20:04 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.13
  • 19:47 dereckson@tin: Synchronized php-1.29.0-wmf.13/extensions/WikimediaMessages/extension.json: Create user group messages for wikitech.wikimedia.org (T158417) (duration: 00m 39s)
  • 19:45 dereckson@tin: Synchronized php-1.29.0-wmf.13/extensions/WikimediaMessages/i18n/wikitech/: (no justification provided) (duration: 00m 43s)
  • 18:29 chasemp: labnodepool1001:~# service nodepool restart
  • 17:40 gehel: removing old prod indices from relforge1002 - T156150
  • 17:37 gehel: removing old prod indices from relforge1002 (jawikiprod_content, enprodwiki_content, ruwikiprod_content) - T156150
  • 16:33 paravoid: cleaning up openstack packages from einsteinium & tegmen
  • 16:19 gehel: starting upgrade relforge cluster to elasticsearch 5.2.1 - expect significant downtime - T156150
  • 16:19 gehel: unban relforge1001 - T156150
  • 15:45 gehel: banning relforge1001 from cluster to prepare for ES5 upgrade - T156150
  • 15:18 godog: roll-restart pybal in codfw to pick up swift https service
  • 15:08 marostegui: Power off dbstore1001 to change its disks and reimage - T153768
  • 14:42 addshore: addshore@tin scap clean 1.29.0-wmf.6 && scap clean 1.29.0-wmf.7 (to remove warning on scap pull on mwdebug1002, T157030)
  • 14:39 addshore: EU SWAT done
  • 14:39 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T158832 Enable TwoColConflict on hewiki (duration: 00m 40s)
  • 14:29 addshore@tin: Synchronized php-1.29.0-wmf.13/extensions/ContentTranslation/ContentTranslation.hooks.php: SWAT T158297 Really disable europeana2802016 campaign (duration: 00m 39s)
  • 14:26 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT T156794 Enable v2 of Minerva's header on cawiki and itwiki (duration: 00m 42s)
  • 14:18 paravoid: upgrading grafana to 4.1 on krypton
  • 13:52 gehel: restart logstash on relforge1001 to test logging configuration - T158664
  • 13:03 ema: cache_maps: upgrading to varnish 4.1.5
  • 12:40 moritzm: installing libssh security updates on trusty (jessie already fixed)
  • 12:40 moritzm: installing libssh security updates (jessie already fixed)
  • 12:35 moritzm: installing tomcat updates
  • 09:39 elukey: increase cassandra system_auth replication from 6 to 12 on AQS
  • 09:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T154485 (duration: 00m 40s)
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 - T154485 (duration: 00m 40s)
  • 09:06 _joe_: uploaded conftool 0.4.0 to jessie-wikimedia experimental
  • 08:54 marostegui: Stop pt-table-checksum on nlwiki.revision - T154485
  • 08:51 marostegui: Run pt-table-checksum on s2 (nlwiki) on revision table - T154485
  • 07:59 marostegui: Run pt-table-checksum on s2 (nlwiki) on logging table - T154485
  • 07:16 marostegui: Deploy alter table enwiki.revision db2069 - T132416
  • 07:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 and depool db2069 - T132416 (duration: 00m 42s)
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1060 original load - T158194 (duration: 00m 40s)
  • 03:02 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 23 03:02:10 UTC 2017 (duration 5m 47s)
  • 02:56 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 14m 38s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 08m 40s)
  • 00:19 dereckson@tin: Synchronized wmf-config/throttle.php: Throttle rule for it.wikiversity (T158767) (duration: 00m 40s)
  • 00:18 Krinkle: mwscript deleteEqualMessages.php --wiki simplewikibooks (T45917)

2017-02-22

  • 23:46 ejegg: turned off 3DS requirement for Denmark on payments-wiki
  • 23:17 matt_flaschen: Exported https://meta.wikimedia.org/wiki/Talk:Flow/Developer_test_page to https://meta.wikimedia.org/wiki/Talk:Flow/Developer_test_page/Wikitext using extensions/Flow/maintenance/convertToText.php
  • 23:17 matt_flaschen: Migrated https://meta.wikimedia.org/wiki/Research_talk:ORES_paper to https://www.mediawiki.org/wiki/Talk:ORES/Paper using extensions/Flow/maintenance/dumpBackup.php and importDump.php
  • 22:53 Pchelolo: update RESTBase to 3340714f0
  • 22:52 jynus: stopping dbstore1001 mariadb in preparation for tomorrow's reimage T153768
  • 22:50 Pchelolo: update RESTBase to 3340714f0: canary on restbase1007
  • 22:46 Pchelolo: update RESTBase to 3340714f0: staging
  • 21:57 maxsem@tin: Finished deploy [kartotherian/deploy@81db48c]: Deploying https://gerrit.wikimedia.org/r/#/c/339093/ (duration: 15m 05s)
  • 21:42 maxsem@tin: Started deploy [kartotherian/deploy@81db48c]: Deploying https://gerrit.wikimedia.org/r/#/c/339093/
  • 20:30 demon@tin: Finished scap: group1 to wmf.13 (duration: 25m 39s)
  • 20:04 demon@tin: Started scap: group1 to wmf.13
  • 20:02 gehel@tin: Finished deploy [wdqs/wdqs@7768422]: (no justification provided) (duration: 02m 04s)
  • 19:59 gehel@tin: Started deploy [wdqs/wdqs@7768422]: (no justification provided)
  • 19:56 gehel: deploying latest wdqs version
  • 19:46 godog: roll-HUP rsyslog on mw1* to pick up DNS udplog change - T123728
  • 19:45 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Finish removing "shellmanagers" on Wikitech T158482 (duration: 00m 40s)
  • 19:37 thcipriani@tin: Synchronized php-1.29.0-wmf.13/extensions/Flow: SWAT: Import dump: support importing a board that exist in the farm T154830 (duration: 00m 56s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removing the "shellmanagers" group from Wikitech T158482 (duration: 00m 49s)
  • 19:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configuration changes for wikitech.wikimedia.org T158516 T158554 T158482 (duration: 00m 40s)
  • 18:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 with less weight - T158194 (duration: 00m 39s)
  • 18:17 Dereckson: Last two deployment entries were to rollback portals/ to last known state (T158782)
  • 18:17 dereckson@tin: Synchronized portals: (no justification provided) (duration: 00m 39s)
  • 18:17 dereckson@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 16:29 gehel: reimage of relforge1001 starting
  • 16:21 marostegui: Shutdown db1060 for BBU replacement - T158194
  • 16:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 - T158194 (duration: 00m 40s)
  • 16:19 ema: cp3006 upgraded to varnish 4.1.5
  • 16:15 ema: cp4019 upgraded to varnish 4.1.5
  • 15:48 moritzm: installing tcpdump security updates on ubuntu systems (jessie already fixed for a while)
  • 15:43 jynus: stopping mariadb replication on db1026 for maintenance T147747
  • 15:21 marostegui: Restart MySQL on db1095 to apply new replication filters - https://phabricator.wikimedia.org/T154485
  • 15:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 for maintenance (duration: 00m 41s)
  • 15:11 marostegui: Restart MySQL on db1069 to apply new replication filters - https://phabricator.wikimedia.org/T154485
  • 14:50 zeljkof: finished EU SWAT
  • 14:49 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T158762) (duration: 00m 41s)
  • 14:30 gehel: resetting to usual values for low/high watermark on elasticsearch eqiad (75% / 80%)
  • 14:17 hashar: Nuked Jenkins workspaces for the job operations-puppet-typos
  • 14:17 zfilipin@tin: Synchronized dblists/compact-language-links.dblist: SWAT: Deploy Compact Language Links in Swedish Wikipedia (T157114) (duration: 00m 50s)
  • 14:17 gehel: temporarily raising high/low watermarks on elasticsearch eqiad to allow allocation of all shards
  • 14:04 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic1047.eqiad.wmnet
  • 12:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 with low load (3rd time a charm) (duration: 00m 39s)
  • 12:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 with low load (again) (duration: 02m 47s)
  • 12:18 dcausse: rebuild of translation memories index is done
  • 12:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 with low load (duration: 02m 49s)
  • 12:03 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(45|46).eqiad.wmnet
  • 11:48 paravoid: upgrading labmon1001 to grafana 4.1
  • 10:55 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(45|46|47).eqiad.wmnet
  • 10:54 moritzm: upgrading remaining mediawiki servers to HHVM 3.12.14
  • 10:54 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(35|37|39|43|44).eqiad.wmnet
  • 10:42 elukey: reinstall mw211[89] as MW videoscalers (trusty) and mw2243 as MW jobrunner
  • 10:05 filippo@tin: Synchronized wmf-config/ProductionServices.php: Move udp2log from fluorine to mwlog1001 - T123728 (duration: 00m 41s)
  • 10:01 hashar: enabling puppet on contint1001 and running it
  • 09:56 volans: restarting salt-master on neodymium after openssl upgrade
  • 09:37 ema: cache_text, cache_upload: libssl1.1 upgraded to 1.1.0e-1+wmf1, libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 09:28 hashar: disable puppet on contint1001. Will use contint2001 as a canary
  • 09:14 marostegui: Run pt-table-checksum on s2.nlwiki over some tables - T154485
  • 09:04 dcausse: rebuilding translation memories index - ETA ~4hours (from terbium, logs in ~dcausse/ttm-refresh)
  • 09:02 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(35|39|43|44).eqiad.wmnet
  • 08:07 moritzm: upgrading openssl on redis clusters / various base service restarts
  • 07:44 gehel: restart elasticsearch on elastic1035
  • 07:43 gehel: truncating logs on elastic10(35|39|44)
  • 07:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 - T132416 (duration: 00m 40s)
  • 07:23 marostegui: Deploy alter table enwiki.revision db2062 - T132416
  • 07:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 - T132416 (duration: 00m 40s)
  • 03:10 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 22 03:10:13 UTC 2017 (duration 5m 46s)
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.13) (duration: 13m 54s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 11m 48s)
  • 01:03 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ORES review tool in cswiki T151611 (duration: 00m 39s)
  • 00:48 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Send "exception" channel to logstash Do not send "exception-json" channel to logstash T136849 (duration: 00m 40s)
  • 00:34 thcipriani@tin: Synchronized wmf-config: SWAT: Set $wgSoftBlockRanges T154698 PART II (duration: 00m 42s)
  • 00:33 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgSoftBlockRanges T154698 PART I (duration: 00m 40s)
  • 00:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Fix SiteConfiguration array merge syntax T157656 Fix Sentry URL scheme on beta Fix PageViewInfo config T158698 (beta-only changes) (duration: 00m 39s)
  • 00:17 thcipriani@tin: Synchronized php-1.29.0-wmf.12/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off sister search AB test. T157942 (duration: 00m 39s)
  • 00:16 thcipriani@tin: Synchronized php-1.29.0-wmf.13/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: SWAT: Turn off sister search AB test. T157942 (duration: 00m 43s)
  • 00:13 smalyshev@tin: Finished deploy [wdqs/wdqs@7768422]: Deploy 2.1.5RC WAR on 2001 for testing (duration: 00m 25s)
  • 00:13 smalyshev@tin: Started deploy [wdqs/wdqs@7768422]: Deploy 2.1.5RC WAR on 2001 for testing
  • 00:05 demon@tin: Synchronized scap/plugins/clean.py: More code cleanup (duration: 00m 40s)

2017-02-21

  • 23:30 MaxSem: Kartotherian deploy did not happen
  • 23:22 demon@tin: Synchronized scap/plugins/clean.py: Code cleanup (duration: 00m 46s)
  • 23:21 demon@tin: scap aborted: scap/plugins/clean.py Code cleanup (duration: 00m 10s)
  • 23:21 demon@tin: Started scap: scap/plugins/clean.py Code cleanup
  • 22:01 mutante: carbon - removed from icinga, shutdown -h now (T158020)
  • 21:31 mutante: carbon - puppet node clean, node deactivate (T158020)
  • 21:10 demon@tin: Synchronized scap/plugins/prep.py: Completeness (duration: 00m 42s)
  • 20:48 Krinkle: (terbium) sql --write test2wiki 'DELETE FROM module_deps' (3687 rows affected, 0.01 sec) - per T158105.
  • 20:47 Krinkle: (terbium) sql --write testwiki 'DELETE FROM module_deps' (per T158105)
  • 20:44 mutante: carbon - backup /root data to install1002:/root/root-carbon/ before shutdown (T158020)
  • 20:36 mutante: rsyncing /home/ dirs excl. dot files, from carbon to install1002 (T158020)
  • 20:15 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(35|39|43|44).eqiad.wmnet
  • 20:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.13
  • 19:50 demon@tin: Finished scap: prime wmf.13 - testwiki plus l10n build (pt 3 because ugh) (duration: 17m 17s)
  • 19:32 demon@tin: Started scap: prime wmf.13 - testwiki plus l10n build (pt 3 because ugh)
  • 19:32 demon@tin: scap failed: RuntimeError 2 test canaries had check failures (rerun with --force to override this check) (duration: 15m 00s)
  • 19:17 demon@tin: Started scap: prime wmf.13 - testwiki plus l10n build (pt 2 because T156851)
  • 19:16 demon@tin: Finished scap: prime wmf.13 - testwiki plus l10n build (duration: 26m 15s)
  • 18:49 demon@tin: Started scap: prime wmf.13 - testwiki plus l10n build
  • 18:45 moritzm: installing PHP security updates on iridium (phabricator.wikimedia.org)
  • 18:36 ppchelko@tin: Finished deploy [changeprop/deploy@4706f9d]: Change-Prop: Make ORES return minified responses T157693 (duration: 00m 55s)
  • 18:35 ppchelko@tin: Started deploy [changeprop/deploy@4706f9d]: Change-Prop: Make ORES return minified responses T157693
  • 18:34 Pchelolo: changeprop deploy 4706f9da
  • 18:14 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(27|32|34|38|41).eqiad.wmnet
  • 18:12 godog: roll-restart nodepool on labnodepool1001 to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 18:12 godog: roll-restart zuul on cont1001 to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 18:04 godog: roll-restart eventstreams in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 18:03 godog: roll-restart trendingedits in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:58 demon@tin: Synchronized tests/multiversion/MWMultiVersionTest.php: No op in prod, completeness, etc (duration: 00m 40s)
  • 17:57 demon@tin: Synchronized multiversion/MWMultiVersion.php: Shut up dumb invalid hostname errors (duration: 00m 52s)
  • 17:50 godog: roll-restart ocg in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:47 godog: roll-restart jmxtrans in codfw/eqiad on conf* to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:35 godog: roll-restart parsoid in codfw/eqiad to pick up statsd.eqiad.wmnet DNS changes - T157022
  • 17:35 Amir1: done restarting ores services
  • 17:20 Amir1: restarting ores uwsgi and celery services in scb nodes
  • 16:59 ema: cache_misc, cache_maps: libssl1.1 upgraded to 1.1.0e-1+wmf1, libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 16:37 gehel: restarting elasticsearch on elastic1030
  • 16:34 gehel: truncating elasticsearch logs on elastic1023
  • 16:31 gehel: truncating elasticsearch logs on elastic1030
  • 16:18 dcausse: truncated main elastic log, daemon.log and syslog on elastic1023
  • 16:08 moritzm: restarting apache on uranium for openssl update
  • 16:06 dcausse: truncated main log file on elastic1030
  • 15:50 gehel: restarting wdqs-updater on wdqs1002
  • 15:40 elukey: restart eventlogging on kafka200[123] for openssl upgrades
  • 15:40 godog: restart navtiming ve asset-check statsd-mw-js-deprecate on hafnium to pick up statsd.eqiad.wmnet change - T157022
  • 15:39 elukey: restart jmxtrans on kafka[12]00[123] for T157022
  • 15:34 mobrovac@tin: Started restart [mobileapps/deploy@cd3b897]: Restarting for Graphite DNS switch T157022
  • 15:32 elukey: correction on my previous entry: restart eventlogging on kafka100[123] for openssl upgrades
  • 15:30 mobrovac@tin: Started restart [graphoid/deploy@da37386]: Restarting for Graphite DNS switch T157022
  • 15:22 elukey: restart eventlogging on kafka200[123] for openssl upgrades
  • 15:21 mobrovac@tin: Started restart [cxserver/deploy@0e4ae4f]: Restarting for Graphite DNS switch T157022
  • 15:20 moritzm: rolling restart of swift frontend servers to pick up openssl update
  • 15:19 mobrovac@tin: Started restart [citoid/deploy@95df861]: Restarting for Graphite DNS switch T157022
  • 15:18 mobrovac@tin: Started restart [mathoid/deploy@ba3217e]: Restarting for Graphite DNS switch T157022
  • 15:17 hashar: European SWAT complete
  • 15:17 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/UniversalLanguageSelector/UniversalLanguageSelector.hooks.php: Fix site picks: missing from globals (duration: 01m 00s)
  • 15:12 gehel: restarting kartotherian / tilerator(ui) on maps1*
  • 15:09 gehel: restarting kartotherian / tilerator(ui) on maps2*
  • 15:06 gehel: restarting kartotherian / tilerator(ui) on maps-test*
  • 15:06 godog: roll-restart restbase after statsd move to graphite1001 - T157022
  • 15:06 elukey: Increased manually maximum httpd keep alive requests and timeout on bohrium (piwik) - T154558
  • 14:56 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(33|34|38|42).eqiad.wmnet
  • 14:43 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php: Add a maintenance script for opt-in T133031 (duration: 00m 41s)
  • 14:35 moritzm: upgrading openssl on logstash cluster / various base service restarts
  • 14:29 dcausse: truncated main log file on elastic1030
  • 14:29 hashar@tin: Synchronized portals: (no justification provided) (duration: 00m 40s)
  • 14:29 moritzm: restarting NTP servers on dns_recursors to pick up openssl update (one by one)
  • 14:28 hashar@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 14:25 moritzm: upgrading openssl on memcached clusters / various base service restarts
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict extension on arwiki - T158493 (duration: 00m 40s)
  • 14:13 hashar@tin: Synchronized portals: (no justification provided) (duration: 00m 41s)
  • 14:12 hashar@tin: Synchronized portals/prod/wikipedia.org/assets: (no justification provided) (duration: 00m 40s)
  • 14:10 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/UniversalLanguageSelector/: Fix broken site picks feature for compact language links (duration: 01m 04s)
  • 14:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ReadingDepth logging on Wikipedias - T148262 T155639 (duration: 00m 45s)
  • 14:00 moritzm: upgrading openssl on maps clusters / various base service restarts
  • 13:41 elukey: restarting nodejs on aqs1* to pick up openssl security upgrades
  • 13:21 moritzm: upgrading openssl on aqs cluster / various base service restarts
  • 13:06 moritzm: upgrading openssl on parsoid clusters / various base service restarts
  • 12:55 moritzm: upgrading openssl on database servers / various base service restarts
  • 12:53 volans: re-enabled puppet on neodymium and puppetmaster1001 after Gerrit 330436 was merged T154588
  • 12:51 volans: re-enabled puppet on planet2001, which had been disabled for a week without reason
  • 12:39 volans: re-enabled ircecho after fixing ferm issue and ran puppet on affected hosts
  • 12:08 volans: stopped ircecho temporarily while fixing ferm
  • 12:01 volans: temporarily disabled puppet on neodymium and puppetmaster1001 to merge Gerrit 330436 T154588
  • 11:32 moritzm: upgrading openssl on kafka clusters / various base service restarts
  • 11:15 moritzm: upgrading openssl on restbase clusters / various base service restarts
  • 11:05 moritzm: upgrading openssl on hadoop cluster / various base service restarts
  • 11:02 elukey: rolling restart of cassandra-metrics-collector on aqs1* for T157022
  • 10:55 elukey: rolling restart of the analytics jmxtrans daemons for T157022
  • 10:29 moritzm: restarting base services on mw2* after openssl update
  • 10:14 godog: downgrade carbon-c-relay on graphite1001 to trusty's version and bounce daemons
  • 09:58 moritzm: upgrading mira/tin to HHVM 3.12.14
  • 09:46 godog: upgrade graphite on graphite1001 and bounce carbon daemons
  • 09:26 ema: cp3030: libssl1.1 upgraded to 1.1.0e-1+wmf1, libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 08:53 godog: switch statsd/graphite DNS to graphite1001 - T157022
  • 08:32 moritzm: upgrading mw1170-mw1208 to HHVM 3.12.14
  • 08:30 gehel: increasing concurrent recoveries / relocations to 8 on elasticsearch eqiad
  • 08:24 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(27|32|37|41).eqiad.wmnet
  • 07:31 marostegui: Deploy alter table enwiki.revision db2055 - T132416
  • 07:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2048 and depool db2055 - T132416 (duration: 00m 51s)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 21 02:24:37 UTC 2017 (duration 5m 20s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 07m 20s)
  • 01:17 tstarling@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 42s)

2017-02-20

  • 20:31 gehel: taking threaddumps and restarting elastic1017 (high load)
  • 20:20 gehel: reducing concurrent recoveries / relocations to 4 on elasticsearch eqiad
  • 19:07 ariel@tin: Finished deploy [dumps/dumps@9757356]: fix retries of page content dumps with checkpoint, no dup ranges (duration: 00m 02s)
  • 19:07 ariel@tin: Started deploy [dumps/dumps@9757356]: fix retries of page content dumps with checkpoint, no dup ranges
  • 18:30 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(27|32|37|41).eqiad.wmnet
  • 18:29 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(26|31|36|40).eqiad.wmnet
  • 17:56 ppchelko@tin: Finished deploy [changeprop/deploy@30873eb]: Update change-prop to 30873ebd5: enabling DNS caching for T158338 (duration: 01m 41s)
  • 17:54 ppchelko@tin: Started deploy [changeprop/deploy@30873eb]: Update change-prop to 30873ebd5: enabling DNS caching for T158338
  • 17:52 Pchelolo: update change-prop to 30873ebd5
  • 16:40 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(26|31|36|40).eqiad.wmnet
  • 14:55 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1002.eqiad.wmnet
  • 14:49 ema: cp2002, cp4008: libssl1.1 upgraded to 1.1.0e-1+wmf1 and libevent-2.0-5 upgraded to 2.0.21-stable-2+deb8u1
  • 14:32 ema: upgrading pinkunicorn to varnish 4.1.5-1wm1
  • 14:30 ema: varnish 4.1.5-1wm1 uploaded to apt.w.o
  • 14:10 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(25|28|29|30).eqiad.wmnet
  • 13:52 gehel: resetting ownership of new .wsp files for wdqs1002 on graphite[12]001
  • 13:49 moritzm: installing remaining lcms security updates
  • 13:41 hashar@tin: Synchronized wmf-config/throttle.php: [throttle] New rule - T158312 (duration: 00m 42s)
  • 13:35 marostegui: Transferring dbstore1001:/srv/backups (the last 2 backups) to dbstore2001:/srv/backup/dbstore1001 - T153768
  • 13:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(25|28|29|30).eqiad.wmnet
  • 13:04 moritzm: installing jasper security updates
  • 12:20 godog: remove syslog from graphite1001, bump max open files for carbon-c-relay
  • 11:00 godog: switch diamond traffic to graphite1001 - T157022
  • 10:54 moritzm: rolling restart of nginx on remaining mediawiki servers in eqiad to pick up openssl update
  • 10:26 ariel@tin: Finished deploy [dumps/dumps@dee43ca]: fix prefetch on retries of partially complete page content dumps (duration: 00m 02s)
  • 10:26 ariel@tin: Started deploy [dumps/dumps@dee43ca]: fix prefetch on retries of partially complete page content dumps
  • 10:24 hashar@tin: Synchronized wmf-config/throttle.php: Add new throttle rule - T158432 (duration: 00m 49s)
  • 09:46 moritzm: upgrading mediawiki servers in codfw to HHVM 3.12.14
  • 09:33 marostegui: Manually deploy gtid_domain_id on s6 hosts - T149418
  • 08:47 gehel: restarting diamond on wdqs1002 after initial data import
  • 08:41 marostegui: Increase 100G dbstore1002 lv /dev/mapper/tank-data
  • 08:17 ariel@tin: Finished deploy [dumps/dumps@d50e129]: cleanup tmp files before checkpoint file rerun (duration: 00m 02s)
  • 08:17 ariel@tin: Started deploy [dumps/dumps@d50e129]: cleanup tmp files before checkpoint file rerun
  • 07:39 marostegui@tin: Synchronized wmf-config/db-codfw.php: Update ticket number for db2048 depool reason (duration: 00m 44s)
  • 07:29 marostegui: Deploy alter table on db2048 enwiki.revision - T132416
  • 07:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2048 - T132416 (duration: 00m 41s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 20 02:25:07 UTC 2017 (duration 5m 19s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 07m 53s)

2017-02-19

  • 20:08 ariel@tin: Finished deploy [dumps/dumps@364470e]: fix private table dumping, report failed runs correctly (duration: 00m 03s)
  • 20:08 ariel@tin: Started deploy [dumps/dumps@364470e]: fix private table dumping, report failed runs correctly
  • 15:54 hashar: Restarted Zuul to clear out a stalled function in Gearman server.
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 19 02:37:07 UTC 2017 (duration 5m 18s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 13m 08s)

2017-02-18

  • 18:00 reedy@tin: Synchronized wmf-config/CommonSettings.php: Rv reservedusernames addition from CS (duration: 00m 42s)
  • 17:59 reedy@tin: Synchronized php-1.29.0-wmf.12/includes/DefaultSettings.php: Unknown user to reserved usernames in defaultsettings (duration: 00m 45s)
  • 17:29 reedy@tin: Synchronized wmf-config/CommonSettings.php: Add Unknown user to reservedusernames (duration: 00m 48s)
  • 02:23 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 18 02:23:52 UTC 2017 (duration 5m 20s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 06m 41s)

2017-02-17

  • 21:34 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic1023.eqiad.wmnet
  • 19:20 gehel: upgrading maps-test2004 to nodejs6 for testing - T150354
  • 18:34 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(21|22|24).eqiad.wmnet
  • 17:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(21|22|23|24).eqiad.wmnet
  • 17:17 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(17|18|19|20).eqiad.wmnet
  • 16:18 ejegg: updated 3DS rules for DK, SE, and NO
  • 16:15 bblack: restarting cp1074 varnish backend (cron due in 24h, but mb lag looks pretty bad)
  • 15:26 urandom: T155120: Restarting Cassandra on restbase1007-a.eqiad.wmnet to disable Prometheus exporter agent
  • 14:48 _joe_: uploaded clustershell 1.7.3, tqdm, pyparsing to jessie-wikimedia in preparation for cumin
  • 13:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T156478 (duration: 00m 45s)
  • 12:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)
  • 12:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)
  • 11:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clean up db1028 old comments - T153300 (duration: 00m 41s)
  • 11:11 moritzm: restarting nginx on sodium (mirrors.wikimedia.org) to pick up openssl update
  • 10:01 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic10(17|18|19|20).eqiad.wmnet
  • 09:54 moritzm: rolling restart of nginx on mediawiki servers in codfw to pick up openssl update
  • 09:33 moritzm: rolling restart of nginx on mw canaries to pick up openssl update
  • 09:32 gehel: upgrade nginx on elasticsearch codfw for ssl upgrade
  • 09:30 gehel: upgrade nginx on elastic1049-1052 for ssl upgrade
  • 09:22 moritzm: restarting nginx on install1002/2002 to pick up new openssl
  • 08:35 moritzm: upgrading mw1262-mw1265 to HHVM 3.12.14
  • 08:35 godog: restart nginx on prometheus in eqiad/codfw to pick up openssl update
  • 08:19 moritzm: restarted nginx/prometheus in esams/ulsfo to pick up openssl update
  • 08:00 moritzm: installing openssl 1.1.0e updates
  • 07:58 moritzm: upgrading mw1261 to HHVM 3.12.14
  • 07:41 moritzm: installing spice security updates
  • 06:59 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 - T156478 (duration: 00m 48s)
  • 02:33 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 17 02:33:21 UTC 2017 (duration 5m 32s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 06m 49s)

2017-02-16

  • 23:18 maxsem@tin: Finished scap: Another time, just to make sure some files synched cuz last time there were some mid-air collisions (duration: 15m 44s)
  • 23:02 maxsem@tin: Started scap: Another time, just to make sure some files synched cuz last time there were some mid-air collisions
  • 22:58 maxsem@tin: Finished scap: Update messages for https://gerrit.wikimedia.org/r/#/c/338013/ (duration: 24m 29s)
  • 22:46 mutante: tin - apt-get clean - 4.6G avail (T158359)
  • 22:33 maxsem@tin: Started scap: Update messages for https://gerrit.wikimedia.org/r/#/c/338013/
  • 22:32 maxsem@tin: Synchronized php-1.29.0-wmf.12/extensions/JsonConfig/: https://gerrit.wikimedia.org/r/#/c/338013/ (duration: 00m 42s)
  • 22:31 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/338208/ (duration: 00m 53s)
  • 22:17 XenoRyet: updated paymentswiki from 4466b9d to 2a0c3b2
  • 22:04 mutante: phab2001 - start/stop phd, testing gerrit 338163
  • 21:39 Reedy: make that 2017
  • 21:39 Reedy: Deleted around 9500 pre 2013 captchas
  • 21:08 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.12
  • 20:13 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.12
  • 19:13 chasemp: clean out /var/log/ on labnet1001 as it filled up
  • 19:12 gehel: restarting kartotherian / tilerator on maps-test*
  • 19:07 chasemp: bump up nodepool allocated fixed ips set (I think it exhausted them errantly somehow?)
  • 18:52 chasemp: clean out nodepool instances
  • 18:50 chasemp: stop nodepool to reset state on pool
  • 18:03 halfak@tin: Started deploy [ores/deploy@e9bbda3]: (no justification provided)
  • 18:03 halfak: deploying ores:e9bbda3
  • 17:08 hashar: reenable puppet on contint1001
  • 17:03 hashar: stopped puppet on contint1001 for https://gerrit.wikimedia.org/r/#/c/336978/
  • 16:30 moritzm: uploaded HHVM 3.12.14 to apt.wikimedia.org
  • 16:25 jynus: SET GLOBAL thread_pool_size=64; on db1074's mariadb
  • 16:01 moritzm: upgrading mwdebug1002 to HHVM 3.12.14
  • 15:53 moritzm: upgrading mwdebug1001 to HHVM 3.12.14
  • 14:27 moritzm: uploaded openssl 1.1.0e to apt.wikimedia.org
  • 13:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2070 IP as it goes to another rack - T156478 (duration: 00m 41s)
  • 13:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2070 IP as it goes to another rack - T156478 (duration: 00m 56s)
  • 13:19 marostegui: Shutdown db2070 for maintenance - T156478
  • 10:18 godog: roll-restart hhvm in eqiad to pick up fluorine -> mwlog1001 changes - T123728
  • 10:17 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(48|49|50|51|52).eqiad.wmnet
  • 10:15 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic10(48|49|50|51|52).codfw.wmnet
  • 09:39 moritzm: installing libgc security updates on trusty systems
  • 09:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original db1082 weight - T158188 (duration: 00m 41s)
  • 08:39 godog: roll-restart jobrunner in codfw/eqiad to pick up fluorine -> mwlog1001 redis change - T123728
  • 07:54 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2060 - T156161 (duration: 00m 44s)
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase load db1082 - T158188 (duration: 00m 42s)
  • 03:11 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 16 03:11:01 UTC 2017 (duration 5m 42s)
  • 03:05 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.12) (duration: 14m 27s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 46s)
  • 00:59 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 with low load (duration: 00m 41s)
  • 00:49 eileen1: Update CiviCRM from 1ffc090 to 20660c4
  • 00:12 maxsem@tin: Synchronized php-1.29.0-wmf.12/extensions/Gadgets: https://gerrit.wikimedia.org/r/#/c/338004/ (duration: 00m 42s)

2017-02-15

  • 22:59 eileen1: update CiviCRM from da6ba1b to 1ffc090
  • 22:55 thcipriani@tin: Synchronized php-1.29.0-wmf.12/includes/libs/rdbms/ChronologyProtector.php: Add version to ChronologyProtector key T158217 (duration: 00m 41s)
  • 22:52 demon@tin: Synchronized php-1.29.0-wmf.12/extensions/Dashiki: prep-type stuff (duration: 00m 50s)
  • 21:41 ejegg: updated civicrm from 0fb289f to da6ba1b
  • 20:06 mutante: contint1001 - logrotate --force /etc/logrotate.d/jenkins to test gerrit:337383
  • 19:50 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Popups by default on se.wikimedia (T68374) (duration: 00m 41s)
  • 18:59 demon@tin: Synchronized multiversion/submodules.json: no-op (duration: 00m 50s)
  • 18:30 marostegui: Stop MySQL and shutdown db2062 for maintenance - T156478
  • 17:58 jynus: stopping labsdb1005 mariadb + puppet in preparation for reimage
  • 17:41 thcipriani@tin: Synchronized php-1.29.0-wmf.11/includes/libs/rdbms/ChronologyProtector.php: Make ChronologyProtector::init() use instanceof instead of empty() T158127 (duration: 00m 41s)
  • 17:17 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: Group0 to 1.29.0-wmf.12 T155527
  • 17:08 thcipriani@tin: Synchronized php-1.29.0-wmf.12/includes/libs/rdbms/ChronologyProtector.php: Make ChronologyProtector::init() use instanceof instead of empty() T158127 (duration: 00m 43s)
  • 17:05 thcipriani: starting wmf.12 to group0
  • 16:39 godog: flip xenon redis and apache from fluorine to mwlog1001 - T123728
  • 16:23 Jeff_Green: authdns-update to deploy fundraising host rename db1008->frav1001
  • 16:16 urandom: T155120: restarting Cassandra on restbase1007-a to enable Prometheus exporter (canary)
  • 16:01 dcausse@tin: Synchronized php-1.29.0-wmf.11/extensions/CirrusSearch/: T156234: Fold some problematic whitespaces with completion (duration: 01m 01s)
  • 15:57 marostegui: (Old action but for the sake of getting it logged) Force RAID controller to work on WriteBack even with the broken BBU it has now on db1060 so it can keep up with the replication thread - T158194
  • 15:51 filippo@tin: Synchronized wmf-config/StartProfiler.php: Switch xenon redis to mwlog1001.eqiad.wmnet (duration: 00m 42s)
  • 15:44 hashar: Zuul reducing gate-and-submit minimum amount of changes to process from the wrong 12 down to 2. In case of repeating failures it would end up running jobs for only two jobs which would prevent cancelling jobs for up to 11 changes!
  • 15:37 jynus: stopping slave and repartitioning db1045
  • 14:45 jynus: offlined 2 disks with media + other errors on db1060
  • 14:34 marostegui: Stop MySQL and shutdown db1082 - T158188
  • 14:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 00m 44s)
  • 14:04 hashar@tin: Synchronized php-1.29.0-wmf.12/extensions/CirrusSearch/includes/Maintenance/SuggesterAnalysisConfigBuilder.php: Fold some problematic whitespaces with completion - T156234 (duration: 00m 48s)
  • 13:59 elukey: disabled mod_deflate on bohrium (piwik) and disabled puppet. Testing 503 reduction.
  • 13:16 dereckson@tin: Synchronized wmf-config/throttle.php: Throttle rule for Royal College of Nursing event (T158171) (duration: 00m 43s)
  • 12:56 elukey: restart of jmxtrans on all the analytics kafka brokers
  • 12:33 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1045 (duration: 00m 42s)
  • 11:38 marostegui: Running pt-table-checksum on db1043 (m3 - phabricator master) - T154485
  • 11:35 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp1067.eqiad.wmnet
  • 11:20 godog: upgrade git on tin/mira - T140927
  • 10:59 moritzm: installing PHP security updates on californium (running horizon)
  • 10:49 moritzm: installing PHP security updates on uranium (running ganglia)
  • 10:46 moritzm: installing PHP security updates on silver (running wikitech)
  • 08:21 moritzm: installing PHP security updates on Ubuntu systems
  • 07:33 marostegui: Deploy alter table on x1 master (db1031) for the echo_notification tables - T136428
  • 07:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T156478 (duration: 00m 42s)
  • 02:40 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 15 02:40:31 UTC 2017 (duration 5m 23s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 12m 50s)
  • 00:22 ebernhardson@tin: Synchronized php-1.29.0-wmf.11/extensions/CirrusSearch/: Provide per-index settings from configuration for elasticsearch 5 (duration: 00m 55s)
  • 00:16 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: Configure cirrus per-index settings for elasticsearch 5 (duration: 00m 43s)
  • 00:07 ebernhardson@tin: Synchronized php-1.29.0-wmf.11/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSatisfaction.js: (no justification provided) (duration: 00m 50s)

2017-02-14

  • 22:23 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Cleanup popups beta cluster config (beta-only-change) (duration: 00m 41s)
  • 22:15 chasemp: start staged nova-fullstack testing daemon on labnet1002 for metric inspection
  • 22:11 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.11 for T158127
  • 22:07 otto@tin: Finished deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script (duration: 01m 13s)
  • 22:06 otto@tin: Started deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script
  • 22:05 otto@tin: Finished deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script (duration: 01m 41s)
  • 22:03 otto@tin: Started deploy [analytics/refinery@4cd6305]: Deploying refinery with another update to drop hourly partitions script
  • 22:01 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.12
  • 21:47 thcipriani@tin: Synchronized php-1.29.0-wmf.11/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Type check the APC value in LoadBalancer::doWait() (duration: 00m 50s)
  • 21:10 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache (wikiversions.json not updated previously) (duration: 09m 25s)
  • 21:00 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache (wikiversions.json not updated previously)
  • 20:51 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache (duration: 48m 04s)
  • 20:38 eileen1: update civicrm from 8d04e75 to 0fb289f
  • 20:36 otto@tin: Finished deploy [analytics/refinery@67c3924]: Deploying refinery with update to drop hourly partitions script (duration: 02m 25s)
  • 20:33 otto@tin: Started deploy [analytics/refinery@67c3924]: Deploying refinery with update to drop hourly partitions script
  • 20:22 Dereckson: Update site statistics for pam.wikipedia (T158110, now 454 images)
  • 20:03 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.12 and rebuild l10n cache
  • 19:23 krinkle@tin: Synchronized docroot/noc/conf: I67194f (duration: 00m 48s)
  • 19:22 krinkle@tin: Synchronized dblists/: I67194f (duration: 01m 37s)
  • 18:39 arlolra: Updated Parsoid to 79ccfb93 (T58381, T108216)
  • 18:28 thcipriani: starting branch cut for 1.29.0-wmf.12
  • 18:27 arlolra@tin: Finished deploy [parsoid/deploy@1bfb86b]: Updating Parsoid to 79ccfb93 (duration: 09m 58s)
  • 18:17 arlolra@tin: Started deploy [parsoid/deploy@1bfb86b]: Updating Parsoid to 79ccfb93
  • 18:11 MaxSem: Purged https://he.wikipedia.org/static/images/mobile/copyright/wikipedia-wordmark-he.svg with purgeList.php
  • 17:34 bblack@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp1067.eqiad.wmnet
  • 16:18 andrewbogott: rebooting californium
  • 16:15 andrewbogott: dist-upgrade californium (as part of the liberty->mitaka upgrade)
  • 16:06 Niharika: Updated wikimania app to 5c44d06 Removed stale translations
  • 16:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2062 IP after its move to another rack - T156478 (duration: 00m 39s)
  • 16:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2062 IP after its move to another rack - T156478 (duration: 00m 40s)
  • 15:11 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow sysops to add/revoke account creator on it.wikiversity (T158062) (duration: 00m 41s)
  • 15:05 marostegui: Shutdown mysql (and later the whole host) on db2062 for maintenance - T156478
  • 15:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 to change its rack - T156478 (duration: 00m 41s)
  • 15:01 ema: lvs10*: upgrade to pybal 1.13.5 T147425
  • 14:48 moritzm: installing php security updates on einsteinium
  • 14:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073,89,90,91,92 (duration: 00m 41s)
  • 14:38 ema: lvs300[12]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 14:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073,89,90,91,92 (duration: 00m 40s)
  • 14:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1055,72,83,56,84,76,78,87,71 (duration: 00m 40s)
  • 14:17 hashar: European SWAT is complete
  • 14:16 hashar@tin: Synchronized php-1.29.0-wmf.11/extensions/CirrusSearch/profiles/SimilarityProfiles.php: Explicitly use BM25 as default for wmf_defaults similarity profile (duration: 00m 47s)
  • 14:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055,72,83,56,84,76,78,87,71 (duration: 00m 41s)
  • 14:11 ema: lvs300[34]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 14:09 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict on dewiki - T155721 (duration: 00m 40s)
  • 14:06 hashar@tin: Synchronized wmf-config/throttle.php: Throttle rule for cswiki - T158040 (duration: 00m 40s)
  • 13:50 ema: lvs400[12]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 13:32 ema: lvs400[34]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 13:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051,66,80,74,77,56,81,70,82 (duration: 00m 44s)
  • 13:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051,66,80,74,77,56,81,70,82 (duration: 00m 40s)
  • 12:59 jynus: reloading/restarting gerrit on cobalt, too slow
  • 11:57 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2221.codfw.wmnet
  • 11:30 moritzm: manual fix up of exim spool permissions on krypton (used to run the heavy exim variant)
  • 11:28 jynus: performing schema change on all mariadb servers T150474
  • 10:51 ema: lvs200[123]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 10:48 moritzm: installing tomcat security updates
  • 10:23 ema: lvs200[456]: upgrade to jessie 8.7, pybal 1.13.5, reboot into kernel 4.4.2-3+wmf8 T155401 T147425
  • 10:18 ema: uploading pybal 1.13.5 to apt.w.o T147425
  • 09:05 moritzm: upgrading firejail on sca cluster
  • 08:45 moritzm: installing vim security updates
  • 08:33 moritzm: restarting zookeeper on conf1003
  • 08:23 moritzm: restarting zookeeper on conf1002 to pick up OpenJDK update (restarts were stopped yesterday to further investigate gc behaviour)
  • 06:56 marostegui: Deploy alter table on x1 echo_notification tables - T136428
  • 02:40 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 14 02:40:22 UTC 2017 (duration 5m 19s)
  • 02:35 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 50s)
  • 00:40 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/337442/ (duration: 00m 40s)
  • 00:39 eileen1: update civicrm from 7b36996 to 8d04e75
  • 00:38 maxsem@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-he.svg: https://gerrit.wikimedia.org/r/#/c/337442/ (duration: 00m 40s)
  • 00:22 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/337441/ (duration: 00m 40s)
  • 00:21 maxsem@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-hi.svg: https://gerrit.wikimedia.org/r/#/c/337441/ (duration: 00m 42s)
  • 00:16 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/337064/ (duration: 00m 48s)

2017-02-13

  • 23:45 Pchelolo: update RESTBase to 0e9106ab8
  • 23:35 Pchelolo: update RESTBase to 0e9106ab8 - canary on restbase1007
  • 23:29 Pchelolo: update RESTBase to 0e9106ab8 - staging
  • 23:27 bsitzmann@tin: Finished deploy [mobileapps/deploy@cd3b897]: Update mobileapps to 776211b (duration: 03m 19s)
  • 23:24 bsitzmann@tin: Started deploy [mobileapps/deploy@cd3b897]: Update mobileapps to 776211b
  • 22:39 Pchelolo: rollback RESTBase to ea980cc5d - staging
  • 22:26 Pchelolo: update RESTBase to 0e9106ab8 - staging
  • 22:14 kaldari@tin: Synchronized dblists/: Turning on Echo for loginwiki (duration: 00m 41s)
  • 22:14 kaldari: scap sync-dir dblists/ 'Turning on Echo for loginwiki'
  • 21:43 bsitzmann@tin: Finished deploy [mobileapps/deploy@f6b4435]: Update mobileapps to 3af473f (duration: 03m 44s)
  • 21:39 bsitzmann@tin: Started deploy [mobileapps/deploy@f6b4435]: Update mobileapps to 3af473f
  • 20:51 mutante: carbon/install - adjusted Letsencrypt cert creation, deactivated reprepro to protect from accidental use, switching rsync direction from install1002->install2002, disabled cron on carbon (T132757)
  • 20:33 Reedy: dropped old out of date echo tables from extension1.loginwiki T157105
  • 19:38 mutante: carbon - synced /srv/ data to install1002/2002 for the last time, switching apt.wikimedia.org CNAME to install1002 - carbon deprecated (T132757)
  • 19:22 legoktm: running namespaceDupes.php on fiwiki (T103786)
  • 19:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: properly set wgCirrusSearchUseIcuFolding T155515 (duration: 00m 41s)
  • 19:00 moritzm: upgrading mw1236 with the security updates it missed while it was powered off
  • 18:33 chasemp: labstore1005 service maintain-dbusers restart
  • 18:18 mutante: scandium - shutdown -h now (T150936)
  • 18:05 mutante: scandium - ex-zuul merger - removing from puppet, revoking puppet cert, salt key..
  • 16:08 ema: lvs1003: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:58 ema: lvs1002: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:52 gehel: copy old blazegraph metrics to new path (wikidata.query.(triple|lag).* -> servers.<server_name>...)
  • 15:36 ema: lvs1001: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:18 moritzm: switched krypton to exim4-daemon-light (the -heavy variant was installed from an earlier role it carried)
  • 15:06 addshore: EU SWAT done
  • 15:06 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T101634 Update Wikiquote talk namespace in Sanskrit Wikisource and Support legacy Wikiquote talk namespace in Sanskrit Wikisource (duration: 00m 40s)
  • 14:37 addshore@tin: Synchronized portals: Updating portal stats Gerrit (duration: 00m 40s)
  • 14:36 addshore@tin: Synchronized portals/prod/wikipedia.org/assets: Updating portal stats Gerrit (duration: 00m 40s)
  • 14:28 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155721 Enable TwoColConflict on metawiki (duration: 00m 40s)
  • 14:23 addshore@tin: Synchronized php-1.29.0-wmf.11/extensions/TwoColConflict/includes/TwoColConflictHooks.php: gerrit:337186 Change beta feature info and talk links (duration: 00m 40s)
  • 14:19 addshore@tin: Synchronized php-1.29.0-wmf.11/extensions/RevisionSlider/modules/ext.RevisionSlider.lazy.css: T157800 Dont set min-height and min-width for oo-ui buttons 2/2 (duration: 00m 55s)
  • 14:18 addshore@tin: Synchronized php-1.29.0-wmf.11/extensions/RevisionSlider/modules/ext.RevisionSlider.css: T157800 Dont set min-height and min-width for oo-ui buttons 1/2 (duration: 01m 07s)
  • 13:57 moritzm: rolling restart of zookeeper in eqiad to pick up Java security updates
  • 13:57 marostegui: Shutdown db2060 for maintenance - T156161
  • 13:47 ema: lvs100[56]: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 13:37 ema: lvs1004: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 13:33 moritzm: rolling restart of zookeeper in codfw to pick up Java security updates
  • 12:32 elukey: updating elastic search ACLs on cr1/cr2 for the analytics-in4 filter
  • 11:59 moritzm: removing unneeded PHP packages from mw1261-mw1265 (these were installed before we changed puppet to trim most PHP packages in favour of HHVM)
  • 11:21 moritzm: installing PHP security updates
  • 11:18 elukey: stopped ircecho on einsteinium
  • 11:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1002.eqiad.wmnet
  • 10:47 godog: remove big/spammy log files from thumbor100[12] - T157949
  • 10:35 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1001.eqiad.wmnet
  • 09:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 - T136428 (duration: 00m 41s)
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T136428 (duration: 00m 40s)
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 - T136428 (duration: 00m 40s)
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 - T136428 (duration: 00m 40s)
  • 09:00 marostegui: Deploy alter table s3 officewiki and mediawikiwiki for echo_notification tables on eqiad - T136428
  • 08:10 elukey: removed empty log files from elastic1022,1024,2001,1026,1040 to fix logrotate cronspam
  • 07:55 moritzm: upgrade HHVM on remaining mw servers in eqiad
  • 07:03 marostegui: Compressing commonswiki tables on labsdb1010 and labsdb1011 - T153743
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 13 02:38:43 UTC 2017 (duration 5m 18s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 13m 18s)

2017-02-12

  • 15:18 reedy@tin: Synchronized php-1.29.0-wmf.11/extensions/ConfirmEdit/maintenance: Instrumentation to script (duration: 00m 41s)
  • 02:24 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 12 02:24:56 UTC 2017 (duration 5m 20s)
  • 02:19 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 07m 12s)

2017-02-11

  • 10:02 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(33|34|35|36).codfw.wmnet
  • 09:53 elukey: mw1236 back in production (scap pull executed before pooled=yes) - T156610
  • 09:52 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1236.eqiad.wmnet
  • 09:35 elukey: rebooting mw1236 to make sure that it comes up cleanly - T156610
  • 09:15 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(33|34|35|36).codfw.wmnet
  • 09:09 gehel: cleanup logs on elastic20(01|25) - T139043
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 11 02:37:10 UTC 2017 (duration 5m 19s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 31s)
  • 00:00 mutante: switching apt.wikimedia.org from carbon to install1002 - there might be a short time until the LE SSL cert is also adjusted

2017-02-10

  • 23:41 RainbowSprinkles: gerrit: Restarting to pick up config changes
  • 23:12 RainbowSprinkles: gerrit: restarting service
  • 22:42 godog: start rsync of whisper metrics graphite2001 -> graphite1001 - T157022
  • 22:29 mutante: carbon - stopping puppet and most services, adding deprecation warning to motd, rsyncing data one last time (T132757)
  • 22:17 mutante: install1001 - shutdown ganeti instance and deleting it and its disk (T132757)
  • 21:27 mutante: install1001, install2001 - removed from Icinga, shutting down (T84380, T132757)
  • 21:18 mutante: install1001, install2001 - revoke puppet certs, puppet node deactivate, delete salt keys (T84380, T132757)
  • 21:03 ladsgroup@tin: Synchronized php-1.29.0-wmf.11/extensions/Nuke/Nuke_body.php: gerrit:337076 Fixing Special:Nuke (T156112, T156949, T156314) (duration: 00m 58s)
  • 21:03 Amir1: ladsgroup@tin:/srv/mediawiki-staging$ scap sync-file php-1.29.0-wmf.11/extensions/Nuke/Nuke_body.php 'gerrit:337076 Fixing Special:Nuke (T156112, T156949, T156314)'
  • 20:38 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(29|30|31|32).codfw.wmnet
  • 20:11 godog: silence graphite1001 for ssd reinstall - T157022
  • 18:27 brion: brion running throttled version of requeueTranscodes.php for low-res transcodes. expect increased load on video scalers but should remain responsive.
  • 18:23 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(29|30|31|32).codfw.wmnet
  • 18:08 jynus: re-enabling delayed replication for dbstore2001 T130128
  • 17:57 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(27|28).codfw.wmnet
  • 16:15 mutante: scandium - stopping zuul-merger service (T150936)
  • 15:15 jynus: temporarily disabling mariadb replication lag checks to deploy new version of the icinga check script
  • 15:09 godog: bounce cassandra-a on xenon after https://gerrit.wikimedia.org/r/335826
  • 14:26 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(25|26).codfw.wmnet
  • 12:33 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=nginx,name=mw1227.eqiad.wmnet
  • 12:32 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=nginx,name=mw1228.eqiad.wmnet
  • 12:32 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=nginx,name=mw1229.eqiad.wmnet
  • 12:11 elukey: updating firewall rules for analytics on cr1/cr2
  • 12:00 godog: bounce mwerrors on eventlog1001 to pick up statsd cname change - T157022
  • 11:46 godog: roll-restart tileratorui in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 11:41 godog: roll-restart trendingedits on scb in eqiad/codfw to pick up new statsd.eqiad.wmnet - T157022
  • 11:36 godog: roll-restart mathoid/citoid/mobileapps/cxserver/eventstreams/graphoid on scb in eqiad/codfw to pick up new statsd.eqiad.wmnet - T157022
  • 11:30 godog: roll-restart changeprop on scb in eqiad/codfw to pick up new statsd.eqiad.wmnet - T157022
  • 11:25 godog: roll-restart nodepool on labnodepool1001 to pick up new statsd.eqiad.wmnet - T157022
  • 11:23 godog: roll-restart parsoid on ruthenium to pick up new statsd.eqiad.wmnet - T157022
  • 11:19 godog: roll-restart jmxtrans on conf* in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 11:16 godog: roll-restart tilerator in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 11:10 godog: restart navtiming ve asset-check statsd-mw-js-deprecate on hafnium to pick up statsd.eqiad.wmnet change - T157022
  • 10:54 godog: roll-restart kartotherian in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 10:39 godog: roll-restart parsoid in codfw/eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 10:37 elukey: roll-restart of aqs to pick up new statsd.eqiad.wmnet - T157022
  • 10:34 godog: roll-restart ocg to pick up new statsd.eqiad.wmnet - T157022
  • 10:30 godog: roll-restart restbase in eqiad to pick up new statsd.eqiad.wmnet - T157022
  • 10:20 godog: restart of jmxtrans on analytics by elukey - T157022
  • 10:10 ema: lvs1007-10: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 10:06 godog: roll-restart restbase in codfw to pick up new statsd.eqiad.wmnet - T157022
  • 10:03 hashar: rebooting contint2001
  • 09:51 hashar: Re-enabling puppet and zuul-merger on contint1001 and contint2001. The git-daemon is running now T140297 T150936. 'systemctl status git-daemon' reported the service as running when it was not (filed T157785)
  • 09:26 hashar: stopped zuul-merger process on contint1001 and contint2001. They lack the git-daemon service to expose the merges.
  • 08:45 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(25|26|27|28).codfw.wmnet
  • 08:44 moritzm: upgrading hhvm on mw1200-mw1229
  • 08:41 elukey: restarting kafka mirror maker and jmxtrans of kafka[12]00[123] for java security upgrades
  • 08:30 marostegui: Deploy alter table s3 officewiki.echo_notification and mediawikiwiki.echo_notification tables only on codfw - T136428
  • 04:06 mutante: ganglia - switching aggregators from 1001 to 1002 and 2001 to 2002, there might be minor gaps in the graphs, but hey, it's deprecated anyways
  • 03:48 mutante: icinga - live hack fixing config - due to partially removed decom hosts mc2001-mc2016
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 10 02:37:59 UTC 2017 (duration 5m 26s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 11m 53s)
  • 01:48 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: Setup sister search prefix display types T149806 (duration: 00m 40s)
  • 01:43 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: Setup sister search prefix display types T149806 (duration: 00m 48s)
  • 01:01 brion: transcode queue back to normal
  • 00:53 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgPageAssessmentsSubprojects to true on English Wikipedia T157654 (duration: 00m 43s)
  • 00:53 brion: transcode high-prio queue may be briefly blocked by an influx of low-res transcodes queued in bulk. should return to normal in a bit.
  • 00:32 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/WikimediaEvents: SWAT: Enable Sister project search AB test T149806 (duration: 00m 45s)

2017-02-09

  • 22:58 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 22:57 bblack: cp2006: unresponsive control, powercycled from racadm, normal boot, no evidence in logs - repooling for now
  • 22:48 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.11
  • 22:44 thcipriani@tin: Synchronized php-1.29.0-wmf.11/skins/CologneBlue/SkinCologneBlue.php: Revert "Remove warning suppression" (duration: 00m 59s)
  • 22:42 bblack: cp2006 depooled due to icinga report of host-down
  • 22:42 bblack@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 22:02 brion: brion running tests of requeueTranscodes.php on terbium to restart subsets of video scaler work
  • 21:54 mutante: cp3014,cp3020,cp3022 - puppet node deactivate - cp3020 delete salt key (T130883)
  • 20:45 thcipriani@tin: Synchronized php-1.29.0-wmf.11/skins/CologneBlue/SkinCologneBlue.php: Fix a bunch of undefined indexes T157619 (sync actual skin file) (duration: 00m 40s)
  • 20:41 thcipriani@tin: Synchronized php-1.29.0-wmf.11/skins/CologneBlue/CologneBlue.php: Fix a bunch of undefined indexes T157619 (duration: 00m 41s)
  • 20:24 otto@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 00m 20s)
  • 20:23 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:23 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:21 otto@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 00m 04s)
  • 20:21 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:21 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:20 otto@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 00m 07s)
  • 20:20 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:20 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:20 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:19 otto@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 20:05 thcipriani@tin: Synchronized php-1.29.0-wmf.11/resources/src/mediawiki.special/mediawiki.special.search.interwikiwidget.styles.less: SWAT: Temporary hax to hide cawiki hacked in search sidebar T149806 (duration: 00m 40s)
  • 20:02 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/ConfirmEdit: SWAT: Add script for counting captchas Use an accurate number of captchas (duration: 00m 43s)
  • 19:57 ottomata: restarting main kafka brokers in codfw and then eqiad to pick up jvm updates
  • 19:50 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler: SWAT: TMH job queue split into low and high priority PART III T155098 (duration: 00m 44s)
  • 19:49 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler/TimedMediaHandler.hooks.php: SWAT: TMH job queue split into low and high priority PART II T155098 (duration: 00m 40s)
  • 19:47 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler/TimedMediaHandler.php: SWAT: TMH job queue split into low and high priority PART I T155098 (duration: 00m 41s)
  • 19:37 smalyshev@tin: Finished deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 1003 (duration: 00m 16s)
  • 19:37 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 1003
  • 19:34 smalyshev@tin: Finished deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 2003 (duration: 00m 26s)
  • 19:34 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build on 2003
  • 19:29 ladsgroup@tin: Started deploy [ores/deploy@10fa16b]: (no justification provided)
  • 19:28 ladsgroup@tin: Started deploy [ores/deploy@a3a410b]: (no justification provided)
  • 19:26 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/TimedMediaHandler/SpecialTimedMediaHandler.php: SWAT: Only load necessary fields on Special:TimedMediaHandler lists (T157621) (duration: 00m 41s)
  • 19:19 ladsgroup@tin: Started deploy [ores/deploy@a3a410b]: (no justification provided)
  • 19:17 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: Deploy new WAR build
  • 19:15 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: (no justification provided)
  • 19:14 bd808: Restarted logstash on logstash1001. Dead since 2017-02-09T06:39:46 with "java.lang.UnsupportedOperationException" crash in worker thread.
  • 19:13 smalyshev@tin: Started deploy [wdqs/wdqs@1a7cd32]: (no justification provided)
  • 19:08 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Setting $wgPageAssessmentsSubprojects to true on testwiki T157654 (duration: 00m 54s)
  • 19:05 ladsgroup@tin: Started deploy [ores/deploy@4fdaf7d]: (no justification provided)
  • 18:50 godog: test bouncing jmxtrans on kafka1012 to pick up statsd changes
  • 18:35 godog: bounce zuul to pick up statsd DNS change - T157022
  • 18:34 ladsgroup@tin: Finished deploy [ores/deploy@e27e845]: (no justification provided) (duration: 04m 33s)
  • 18:30 ladsgroup@tin: Started deploy [ores/deploy@e27e845]: (no justification provided)
  • 18:29 Amir1: starting deploy of ores:e27e845 to canary node
  • 17:55 elukey: proactively restarted statsv on hafnium after the kafka broker restarts
  • 17:42 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(18|21|22|23|24).codfw.wmnet
  • 16:38 godog: flip dns records for statsd/carbon to graphite2001 - T157022
  • 16:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2040 (duration: 00m 52s)
  • 16:22 ema: lvs1011: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 16:17 marostegui: Shutdown db2060 for maintenance - T156161
  • 16:15 marostegui: Compressing commonswiki on labsdb1009 - T153743
  • 16:08 ema: lvs1012: upgrade to jessie 8.7, pybal 1.13.4, reboot into kernel 4.4.2-3+wmf8 T155401
  • 16:06 jynus: rolling restart of replication threads for dbstore1002/2001/2002 T111654
  • 15:49 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1009.eqiad.wmnet
  • 15:42 godog: roll-restart diamond to pick up graphite2001 changes
  • 15:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T136428 (duration: 00m 44s)
  • 15:23 ema: shutdown cp3020 T130883
  • 15:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T136428 (duration: 00m 40s)
  • 15:19 elukey: restarting all Analytics Kafka brokers for Java security upgrades
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T136428 (duration: 00m 40s)
  • 15:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T136428 (duration: 00m 43s)
  • 14:53 moritzm: upgrading hhvm on mw1189-mw1199 and mw1293/mw1294
  • 14:48 godog: move diamond traffic to graphite2001 - T157022
  • 14:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1062 - T136428 (duration: 00m 41s)
  • 14:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1062 - T136428 (duration: 00m 45s)
  • 14:01 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(21|22|23|24).codfw.wmnet
  • 13:45 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2020.codfw.wmnet
  • 13:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1028 - T153300 (duration: 00m 41s)
  • 13:11 moritzm: upgrading firejail on sca cluster
  • 12:52 gehel: killing salt runs stuck on failing reimage of elastic2018
  • 12:37 mforns@tin: Finished deploy [analytics/refinery@9e689f3]: (no justification provided) (duration: 03m 05s)
  • 12:34 mforns@tin: Started deploy [analytics/refinery@9e689f3]: (no justification provided)
  • 11:56 moritzm: upgrading hhvm on mw1170-mw1188 (also effecting updates of openssl, libgd, lcms, gnutls, sqlite, libxpm and glibc)
  • 11:39 gehel: failed reimage on elastic201[89], restarting
  • 10:54 moritzm: deploy exim and openssh bugfix updates from jessie point release
  • 10:51 moritzm: upgrading java on kafka clusters and druid
  • 10:49 elukey: restarting Java daemons on druid100[123] for security upgrades
  • 10:42 jynus: preparing to reimage db2040 T111654
  • 10:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2040 (duration: 00m 40s)
  • 10:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1034 - T111654 (duration: 00m 41s)
  • 10:09 hashar: Restarted Jenkins on contint1001
  • 10:04 hashar: Running package upgrades on contint2001
  • 10:03 elukey: restore Hadoop master to an1001
  • 09:57 elukey: failover Hadoop masters from an1001 to an1002 to allow Java upgrades
  • 09:52 gehel: cleaning up logs on elastic20(01|16) - T139043
  • 09:50 elukey: restarting oozie and hive on analytics1003 for java security upgrades
  • 09:39 marostegui: Deploy alter table on eqiad hosts for s7 metawiki and wiki on the echo_notification tables - T136428
  • 09:38 jynus: upgrading and restarting db1034 T111654
  • 09:34 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1034 (duration: 00m 44s)
  • 09:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(17|18|19|20).codfw.wmnet
  • 09:20 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2057 (duration: 00m 41s)
  • 09:17 gehel: restarting blazegraph on wdqs1003 to ensure proper war is loaded
  • 09:10 marostegui: Deploy alter table on codfw hosts for s7 metawiki and wiki on the echo_notification tables - T136428
  • 09:08 moritzm: restarting archiva on meitnerium for java security update
  • 09:07 elukey: Executing Cassandra nodetool cleanup on aqs1006-{a,b} (one at a time) and aqs1009-a
  • 09:01 elukey: restarting java daemons on all the Hadoop nodes for security upgrades
  • 08:59 gehel: cleaning empty logs on elastic10(22|24|40) - thanks elukey !
  • 08:51 moritzm: installing Java security updates on Hadoop cluster
  • 08:45 moritzm: installing Java security updates on stat* and contint1001
  • 08:17 marostegui: Compressing commonswiki tables on db1095 - T153743
  • 07:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Added comment for db1064 being master of db1095 - T153743 (duration: 00m 40s)
  • 07:46 elukey: Renamed some logs in /var/log (adding _renamed) on aluminum, elastic102[46]/1040 to avoid cronspam and logrotate failures
  • 07:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T153743 (duration: 00m 40s)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T156126 (duration: 00m 42s)
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 9 03:14:55 UTC 2017 (duration 5m 44s)
  • 03:09 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 15m 19s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 15m 05s)
  • 01:10 twentyafterfour: phabricator upgrade finished.
  • 01:07 dereckson@tin: Synchronized wmf-config/throttle.php: Update throttle rules (Gerrit:336552 for it.wikiversity + Gerrit:336741 for cleaning) (duration: 00m 40s)
  • 01:02 twentyafterfour: starting phabricator deployment #phab-2017-02-08
  • 00:57 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Switch to SiteMatrixInterwikiResolver for AB test (Gerrit:336738) (duration: 00m 41s)
  • 00:47 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Configuration change for RelatedArticles (labs only, Gerrit:336740) (duration: 00m 40s)
  • 00:34 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Quiz on Spanish Wikibooks (duration: 00m 41s)
  • 00:24 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Configuration changes for RelatedArticles (labs only, Gerrit:336732 and Gerrit:336733) (duration: 00m 40s)
  • 00:20 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Prune wgMinervaUseFooterV2 (T157075) (duration: 00m 41s)
  • 00:11 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Use https:// urls when communicating with PediaPress (T157398) (duration: 00m 41s)

2017-02-08

  • 23:24 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2016.codfw.wmnet
  • 23:04 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 to 1.29.0-wmf.11 -- T157621 is not code-change related
  • 22:45 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to 1.29.0-wmf.10
  • 22:43 thcipriani: rolling back wmf.11 from group1 due to T157621
  • 22:33 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2015.codfw.wmnet
  • 22:26 demon@tin: Synchronized php-1.29.0-wmf.11/includes/WebResponse.php: Debugging fun times (duration: 00m 50s)
  • 22:21 gehel: elastic2016 not coming up after reimage - powercycling
  • 22:01 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0efa7b8]: Update service-mobileapp-node to f45bfff (duration: 02m 55s)
  • 21:58 mholloway-shell@tin: Started deploy [mobileapps/deploy@0efa7b8]: Update service-mobileapp-node to f45bfff
  • 21:43 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(13|14).codfw.wmnet
  • 21:42 halfak@tin: Finished deploy [ores/deploy@7c80636]: (no justification provided) (duration: 01m 26s)
  • 21:41 halfak@tin: Started deploy [ores/deploy@7c80636]: (no justification provided)
  • 21:36 halfak@tin: Finished deploy [ores/deploy@7c80636]: (no justification provided) (duration: 03m 45s)
  • 21:32 halfak@tin: Started deploy [ores/deploy@7c80636]: (no justification provided)
  • 20:54 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.11
  • 20:52 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(13|14|15|16).codfw.wmnet
  • 20:47 thcipriani@tin: Synchronized php-1.29.0-wmf.11/extensions/MobileFrontend/includes/api/ApiMobileView.php: Pass revision id to parseSectionsData to avoid warnings T157515 (duration: 00m 42s)
  • 19:54 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable VE on fr.wiktionary Projet: namespace (T156660) (duration: 00m 44s)
  • 19:44 Dereckson: mwscript updateCollation.php --wiki=olowiki --previous-collation=uppercase (T147064, 4238 rows processed)
  • 19:39 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set category collation for olo.wikipedia (T146612, T147064) (duration: 00m 43s)
  • 19:29 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Namespace configuration for ml. projects (T56951) (duration: 00m 41s)
  • 19:17 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Create autopatrolled and rollbacker permissions for fa.wikiquote (T156163) (duration: 00m 43s)
  • 19:08 dereckson@tin: Synchronized php-1.29.0-wmf.11/extensions/UploadWizard/UploadWizard.config.php: Disable Firefogg support (T157201) (duration: 00m 44s)
  • 19:07 mutante: bastion hosts, people.wm: deluser volkere, let puppet create volker-e, move data, delete old home dir (T157591)
  • 19:05 dereckson@tin: Synchronized php-1.29.0-wmf.10/extensions/UploadWizard/UploadWizard.config.php: Disable Firefogg support (T157201) (duration: 00m 46s)
  • 19:02 mutante: temp. disabling puppet and doing some debugging on bastion hosts, renaming a user
  • 18:33 demon@tin: Synchronized multiversion/: Dropping old MWVersion shim (duration: 00m 57s)
  • 18:05 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic20(09|10|11|12).codfw.wmnet
  • 18:02 jynus: upgrading and restarting db2057 T111654
  • 17:46 jynus@tin: Synchronized wmf-config/db-codfw.php: depool db2057 (duration: 00m 41s)
  • 17:45 elukey: added some annotations to the aqs analytics ACLs on cr1/cr2
  • 17:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: repool db1030 (duration: 00m 40s)
  • 17:04 jynus: rolling restart of replication thread of 29 mysql hosts T111654
  • 17:02 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic20(09|10|11|12).codfw.wmnet
  • 16:32 ema: cp3045 stuck rebooting, power-cycled
  • 16:20 ema: cp2017 stuck rebooting, power-cycled
  • 16:19 jynus: upgrading and restarting db1030 T111654
  • 16:15 ema: pybal 1.13.4 built and uploaded to carbon
  • 16:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: depool db1030 (duration: 00m 41s)
  • 16:10 chasemp: maintain-views and maintain-meta_p full runs on labsdb1009/10/11
  • 15:46 marostegui@tin: Synchronized wmf-config/db-codfw.php: db1073 change IP - T156126 (duration: 00m 40s)
  • 15:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1073 change IP - T156126 (duration: 00m 40s)
  • 15:41 ema: cp2011 stuck rebooting, power-cycled
  • 15:40 jynus@tin: Synchronized wmf-config/db-eqiad.php: repool db1026 (duration: 00m 41s)
  • 15:28 elukey: Eqiad cr1/cr2 - Updated analytics-in4 for new aqs nodes and removed decommed ones
  • 15:20 hoo@tin: Synchronized php-1.29.0-wmf.10/extensions/Wikidata: Wikibase uses multiple EntityPrefetchers (T157380) (duration: 02m 07s)
  • 15:15 hoo@tin: Synchronized php-1.29.0-wmf.11/extensions/Wikidata: Wikibase uses multiple EntityPrefetchers (T157380) (duration: 02m 11s)
  • 14:54 marostegui: Shutdown db1073 for maintenance - https://phabricator.wikimedia.org/T156126
  • 14:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2008.codfw.wmnet
  • 14:30 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2007.codfw.wmnet
  • 14:30 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2006.codfw.wmnet
  • 14:29 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2005.codfw.wmnet
  • 14:26 elukey: restarting nutcracker in all the codfw mw servers to pick up the new shards
  • 14:23 ema: cp2022 stuck rebooting, power-cycled
  • 14:23 gehel: drain shards from elastic20(09|10|11|12) in preparation for reimage - T151326
  • 14:17 jynus: upgrading and restarting db1026 T111654
  • 13:46 elukey: replacing the codfw memcached/redis shards 12->16
  • 13:41 marostegui: Start replication on db1064 - T153743
  • 13:40 marostegui: Enable replication between db1095 and db1064 - T153743
  • 13:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1026 for maintenance (duration: 00m 41s)
  • 13:10 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2008.codfw.wmnet
  • 13:10 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2007.codfw.wmnet
  • 13:09 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2006.codfw.wmnet
  • 13:09 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2005.codfw.wmnet
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T156126 (duration: 00m 40s)
  • 12:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1037 after maintenance (duration: 00m 41s)
  • 12:17 jynus: upgrading and restarting db1037 T111654
  • 12:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1037 for maintenance (duration: 00m 40s)
  • 12:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1045 after maintenance (duration: 00m 42s)
  • 10:39 jynus: upgrading and restarting db1045 T111654
  • 10:38 moritzm: upgrading openssl, libgd, lcms, gnutls, sqlite, libxpm and glibc in codfw mediawiki cluster (so they take effect with the restart during the HHVM upgrade)
  • 10:11 moritzm: upgrading hhvm on codfw mediawiki cluster
  • 09:55 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 09:44 elukey: bootstrapping aqs1009-b (last new AQS Cassandra instance)
  • 09:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2004.codfw.wmnet
  • 09:31 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2003.codfw.wmnet
  • 08:56 ema: cache_upload: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 08:42 gehel: drain shards from elastic200[5678] in preparation for reimage - T151326
  • 08:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2004.codfw.wmnet
  • 08:18 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=elastic2003.codfw.wmnet
  • 08:16 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2002.codfw.wmnet
  • 07:59 marostegui: Adding 100G to the lv on dbstore1001
  • 07:23 marostegui: Restart MySQL db1095 and labsdb1009 for maintenance - T153743
  • 03:08 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 8 03:08:05 UTC 2017 (duration 5m 43s)
  • 03:02 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.11) (duration: 15m 32s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 07m 35s)
  • 01:02 mutante: mw1294 - run puppet because it popped up in Icinga as failed - removes a bunch of /var/tmp/core/../rsvg-convert.*, all else normal
  • 00:56 thcipriani@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT Setting $wgPageAssessmentsSubprojects to true on beta cluster (housekeeping sync) (duration: 00m 40s)
  • 00:55 mutante: mw1189 service hhvm restart
  • 00:55 mutante: iridium - apache graceful'ed
  • 00:51 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update footer logos on mobile site for various projects PART II T157476 (duration: 00m 41s)
  • 00:50 thcipriani@tin: Synchronized static/images/mobile/copyright: SWAT: Update footer logos on mobile site for various projects PART I T157476 (duration: 00m 41s)
  • 00:16 thcipriani@tin: Synchronized wmf-config: SWAT: Deploy TextCat Improvements T149324 T142140 (duration: 00m 45s)

2017-02-07

  • 23:42 mutante: carbon - stopping DHCP service (install* should be used)
  • 22:31 otto@tin: Finished deploy [eventstreams/deploy@e86077c]: (no justification provided) (duration: 02m 26s)
  • 22:28 otto@tin: Started deploy [eventstreams/deploy@e86077c]: (no justification provided)
  • 21:17 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.11
  • 21:02 eileen1: Update CiviCRM from e45da6d to 7b36996
  • 21:01 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.11 and rebuild l10n cache (duration: 51m 53s)
  • 20:10 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.11 and rebuild l10n cache
  • 19:43 gehel: deploying analysis-stempel plugin on relforge and cluster restart
  • 19:34 gehel: drain shards from elastic200[34] in preparation for reimage - T151326
  • 19:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1015 after maintenance (duration: 01m 34s)
  • 18:58 jynus: preparing db2043 for reimage T152188
  • 18:55 jynus: restarting and upgrading db1015 T152188
  • 18:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1015 for maintenance (duration: 00m 54s)
  • 18:33 thcipriani: starting branch cut for MediaWiki and extensions 1.29.0-wmf.11
  • 18:24 arlolra: Updated Parsoid to f0732260 (T109897)
  • 18:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1022 after maintenance (duration: 00m 40s)
  • 18:18 arlolra@tin: Finished deploy [parsoid/deploy@c3a5c55]: Updating Parsoid to f0732260 (duration: 09m 05s)
  • 18:09 arlolra@tin: Started deploy [parsoid/deploy@c3a5c55]: Updating Parsoid to f0732260
  • 17:50 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=elastic2001.codfw.wmnet
  • 17:07 jynus: restarting and upgrading db1022 T152188
  • 17:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1022 for maintenance (duration: 00m 40s)
  • 16:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1021 (duration: 00m 57s)
  • 16:08 jynus: restarting and upgrading db2064 T152188
  • 14:56 jynus: preparing db2036 for reimage T152188
  • 14:42 gehel: drain shards from elastic2001 / elastic2002 in preparation for reimage - T151326
  • 14:28 hashar: European SWAT completed
  • 14:27 elukey: restarting hhvm on mw1304 (load very high, no queue, threads locked - /tmp/hhvm.62070.bt.)
  • 14:22 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Namespace changes for elwikisource - T157187 (duration: 00m 40s)
  • 14:19 elukey: restarting all the Yarn Node Managers on the Hadoop worker nodes to pick up the new config - T156932
  • 14:12 hoo@tin: Synchronized wmf-config/: Search index article placeholders on cywiki up to Q2794 (T144592) (duration: 00m 42s)
  • 14:10 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Introduce $wmgArticlePlaceholderSearchEngineIndexed (duration: 00m 52s)
  • 14:05 marostegui: Importing commonswiki tables on labsdb1009 - T153743
  • 13:54 jynus: restarting and upgrading db1021
  • 13:45 marostegui: Importing commonswiki tables on db1095 - T153743
  • 12:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1021 (duration: 00m 41s)
  • 12:16 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1036 (duration: 00m 40s)
  • 11:56 moritzm: restarting hhvm on appserver canaries to pick up lcms, sqlite, libxpm, gnutls and glibc updates (from jessie 8.7 release and security updates)
  • 11:53 godog: stop puppet on ms-be1012 and change rsyslog to avoid local syslog spam - T157237
  • 11:42 moritzm: installing libxpm security updates
  • 11:39 jynus: restarting and upgrading db1036
  • 11:37 elukey: restarting hhvm on mw1226 (hhvm dump debug in /tmp/hhvm.33183.bt.)
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1036 (duration: 00m 40s)
  • 11:27 moritzm: installing libgd security updates
  • 11:13 moritzm: installing libpng security updates
  • 10:34 jynus: preparing db2046 for reimage T152188
  • 10:02 _joe_: uploaded etcd-mirror 0.0.1 to jessie-wikimedia (T156009)
  • 09:48 moritzm: installing cairo security updates
  • 09:46 elukey: stopped and masked cassandra-{a,b} - T157425
  • 08:40 marostegui: Transferring commonswiki tables from db1064 to db1095 - T153743
  • 07:31 elukey: added "> /dev/null" manually to the carbon's root crontab (rsync job) to avoid cronspam. The change was already merged in https://gerrit.wikimedia.org/r/#/c/336218 but puppet is disabled on carbon.
  • 07:08 marostegui: Transferring commonswiki tables from db1064 to labsdb1009 - T153743
  • 07:06 marostegui: Importing commonswiki tables on labsdb1010 - T153743
  • 06:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 - T156226 (duration: 00m 50s)
  • 05:41 volans: ms-be1012 running out of space on /, manually compressed /var/log/swift/server.log.1 and cleaned up apt cache T157237
  • 02:37 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Feb 7 02:37:35 UTC 2017 (duration 5m 16s)
  • 02:35 cwd: imported triggers into staging civi
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 13m 23s)
  • 01:31 mutante: prometheus1004 - installed OS, signing puppet cert, initial run.. (T152504)
  • 00:49 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Fix $wmgVisualEditorAvailableNamespaces code style (Gerrit:336346) (no-op) (duration: 00m 40s)
  • 00:31 mutante: install1001 - re-enabled puppet, start DHCP service
  • 00:26 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Show again svwiki logo between 1.5x and 2x zoom (T157387) (duration: 00m 40s)

2017-02-06

  • 23:45 mutante: prometheus1003 - installed OS, signing puppet cert, initial run (T152504)
  • 22:57 Krinkle: Purged https://en.wikipedia.org/static/apple-touch/wikipedia.png (mwscript purgeList.php) for T152538
  • 21:39 bsitzmann@tin: Finished deploy [mobileapps/deploy@9b42448]: Update mobileapps to 034a391 (duration: 03m 59s)
  • 21:37 otto@tin: Finished deploy [eventstreams/deploy@c938a57]: (no justification provided) (duration: 01m 47s)
  • 21:35 otto@tin: Started deploy [eventstreams/deploy@c938a57]: (no justification provided)
  • 21:35 bsitzmann@tin: Started deploy [mobileapps/deploy@9b42448]: Update mobileapps to 034a391
  • 20:49 mutante: cp3011 thru cp3022 - shutdown / poweroff (T130883)
  • 20:39 tgr@tin: Synchronized php-1.29.0-wmf.10/extensions/JsonConfig/includes/JCUtils.php: T155532: Update JsonConfig login API call (duration: 01m 00s)
  • 20:27 mutante: cp3011 thru cp3022 - revoke puppet certs, puppet node deactivate (T130883)
  • 19:37 jynus: restarting db2060 after kernel upgrade
  • 19:30 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Second half of changing project logo for hi.wikibooks.org (duration: 00m 39s)
  • 19:29 ebernhardson@tin: Synchronized static/images/project-logos/hiwikibooks.png: First part of Changing project logo for hi.wikibooks.org (duration: 00m 39s)
  • 19:25 ebernhardson: re-pulled 336242 to mwdebug1002
  • 19:16 ebernhardson: pulled 336242 to mwdebug1002
  • 19:13 jynus: preparing to reimage db2045 T152188
  • 19:10 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Configure A/B test for CrossProject search results sidebar (duration: 00m 49s)
  • 19:08 ebernhardson: pulled 334673 to mwdebug1002
  • 18:16 jynus: preparing to reimage db2050 T152188
  • 18:06 marostegui: Start to transfer commonswiki ibd and cfg from db1064 to labsdb1010 - https://phabricator.wikimedia.org/T153743
  • 17:41 mobrovac: restbase end deploy of ea980cc5
  • 17:10 mobrovac: restbase start deploy of ea980cc5
  • 17:03 hashar: Nodepool/CI back up
  • 16:51 marostegui: Stop MySQL and shutdown db1072 for raid and BBU replacement - T156226
  • 16:35 hashar: Nodepool Jessie images are back up. Trusty one is being rebuilt.
  • 15:55 elukey: mc2029 shutdown for DC ops
  • 15:46 hashar: Stopping Nodepool for maintenance
  • 15:42 oblivian@tin: Finished deploy [changeprop/deploy@5f932a3]: Revert ORES throttling (duration: 03m 49s)
  • 15:39 oblivian@tin: Started deploy [changeprop/deploy@5f932a3]: Revert ORES throttling
  • 15:38 moritzm: installing lcms security updates on mediawiki canaries
  • 15:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.eqiad.wmnet
  • 15:16 gehel: starting reimage of wdqs1001 - T144536
  • 15:13 ema: cache_text: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401; cache_maps: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 15:10 addshore@tin: Synchronized php-1.29.0-wmf.10/extensions/ORES/extension.json: T157206 ORES - Remove all (except meta) API functionality hooks (take2) (duration: 00m 54s)
  • 14:51 moritzm: upgrading mw1262-mw1265 to hhvm 3.12.11+dfsg-1+wmf2
  • 14:46 addshore: EU SWAT all done!
  • 14:45 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: PROD-NOOP Enable InterwikiSorting on beta (duration: 00m 39s)
  • 14:42 addshore@tin: Synchronized wmf-config/Wikibase.php: T155995 Rm InterwikiSorting settings from wmgWikibaseClientSettings PT 2/2 (duration: 00m 39s)
  • 14:40 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Rm InterwikiSorting settings from wmgWikibaseClientSettings PT 1/2 (duration: 00m 41s)
  • 14:33 gehel: resetting analytics-wmde/scripts on stat1002 to the correct "production" branch
  • 14:32 ema: cp2018 stuck rebooting, powercycled
  • 14:30 addshore@tin: Synchronized dblists/compact-language-links.dblist: T157108 & T157112 Deploy Compact Language Links out of beta in French/Dutch Wikipedia (duration: 00m 40s)
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Copy InterwikiSorting settings from wmgWikibaseClientSettings (duration: 00m 40s)
  • 14:19 marostegui: Stop MySQL and shutdown db2060 for maintenance - T156161
  • 14:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155717 Enable TwoColConflict on mw.org (duration: 00m 40s)
  • 14:09 addshore@tin: Synchronized php-1.29.0-wmf.10/extensions/ORES/extension.json: T157206 ORES - Remove all (except meta) API functionality hooks (duration: 00m 51s)
  • 14:05 volans: fixed duplicate entries in source.list on db2040 and es2002 (trusty)
  • 13:44 ema: cache_misc: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 13:30 elukey: applied https://gerrit.wikimedia.org/r/#/c/336203/ manually to analytics1028 (hadoop worker node) as live test - T156932
  • 13:23 gehel: removing stale puppet lock file on elastic10(22|26)
  • 12:03 moritzm: upgrading mwdebug* and mw1261 to hhvm 3.12.11+dfsg-1+wmf2
  • 10:21 oblivian@tin: Finished deploy [changeprop/deploy@ac11ebe]: Deploying ores concurrency/disabling (duration: 00m 38s)
  • 10:20 oblivian@tin: Started deploy [changeprop/deploy@ac11ebe]: Deploying ores concurrency/disabling
  • 10:18 oblivian@tin: Started deploy [changeprop/deploy@ac11ebe]: Deploying ores concurrency/disabling to canary
  • 10:17 gehel: data import complete for wdqs1003, repooling - T152643
  • 10:14 marostegui: Started to transfer commonswiki (ibd and cfg) from db1064 to labsdb1011 - T153743
  • 10:14 ema: cache_maps: upgrade to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 09:40 gehel: elasticsearch - reindexing from 2017-02-04T20:00:00Z to 2017-02-05T23:59:00Z - T139043
  • 09:36 hoo: Removed 2fa from an account, per T157191
  • 09:30 marostegui: Stop MySQL Replication on db1064 for maintenance - T153743
  • 09:26 marostegui: Deploy ALTER table db1028 metawiki.pagelinks - T153300
  • 09:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1028 - T153300 (duration: 00m 42s)
  • 08:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T153743 (duration: 00m 41s)
  • 08:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1008.eqiad.wmnet
  • 07:38 elukey: bootstrapping aqs1009-a (new AQS cassandra instance)
  • 07:30 marostegui: Stop MySQL on db1095 to snapshot it to es1017 - T153743
  • 07:03 marostegui: Upgrade mariadb+packages db1039 - T153300
  • 07:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2060 - T156161 (duration: 00m 40s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Feb 6 02:25:51 UTC 2017 (duration 5m 20s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 07m 30s)

2017-02-05

  • 23:42 gehel: truncating elasticsearch logs on elastic10(24|26|40) - T139043
  • 23:41 gehel: truncating elasticsearch logs on elastic1022 - T139043
  • 03:28 Amir1: ladsgroup@scb100[1-4]:~$ sudo service celery-ores-worker restart (T157206)
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Feb 5 02:26:26 UTC 2017 (duration 5m 18s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 08m 20s)

2017-02-04

  • 19:24 elukey: Started nodetool-b cleanup on aqs1005 (after 1008-{ab} bootstraps)
  • 11:44 elukey: Started nodetool-a cleanup on aqs1008 (after 1008-{ab} bootstraps)
  • 09:09 elukey: Started nodetool-a cleanup on aqs1005 (after 1008-{ab} bootstraps)
  • 02:28 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Feb 4 02:28:33 UTC 2017 (duration 5m 23s)
  • 02:23 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 07m 53s)

2017-02-03

  • 22:27 mutante: switching webproxy.*.wmnet CNAMEs from carbon to new install servers (T123733) - watching squid access logs
  • 19:02 ostriches: gerrit: flushed all web_sessions, you'll have to login again. Sorry
  • 18:07 godog: stop carbon-cache on graphite1001 to prevent useless write load
  • 17:05 godog: fail over read traffic from graphite1001 to graphite2001 https://gerrit.wikimedia.org/r/335761 - T157022
  • 16:10 godog: rsync coal data graphite1001 -> graphite2001
  • 15:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2061 (duration: 00m 40s)
  • 15:01 jynus: preparing to reimage db2039 T111654
  • 14:35 chasemp: restart apache on graphite1001 to see if it helps sqlite lock issue
  • 14:30 jynus: upgrade and restart db2061 T111654
  • 14:26 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2061 (duration: 00m 40s)
  • 13:58 jynus: restarting and upgrading db2041 T111654
  • 12:16 gehel: restarting relforge1001 to pick up new master configuration
  • 11:21 jynus: preparing to reimage db2054 T111654
  • 11:01 marostegui: Alter table metawiki.pagelinks on db1039 (depooled) - T153300
  • 10:54 jynus: preparing to reimage db2053 T111654
  • 10:50 moritzm: mwdebug* and mw1261 have been reverted to previous HHVM package
  • 10:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 - T153743 (duration: 00m 42s)
  • 10:04 marostegui: Reboot db1064 to pick up the new kernels T153743
  • 09:59 marostegui: Upgrade db1064 from MariaDB 10.0.23 to 10.0.29 - T153743
  • 09:55 gehel: restarting relforge1002 to pick up new master configuration
  • 09:54 jynus: upgrade & restart of db2063 T111654
  • 09:48 marostegui: Restart mysql on db1064 to get its binary log changed to ROW - T153743
  • 09:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 - T153743 (duration: 00m 40s)
  • 09:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1064 - T153743 (duration: 00m 40s)
  • 09:39 moritzm: upgraded mwdebug* and mw1261 to the new HHVM package
  • 09:22 moritzm: uploaded hhvm_3.12.11+dfsg-1+wmf2 to apt.wikimedia.org
  • 09:10 elukey: Replace Redis/Memcached shards mc2008->2011 with mc2026->mc2029
  • 08:50 moritzm: installing tomcat regression updates on trusty hosts (jessie update was fine)
  • 08:15 moritzm: restarting prometheus servers to pick up openssl update
  • 08:05 elukey: bootstrapping aqs1008-b (AQS Cassandra instance)
  • 07:41 moritzm: upgrading firejail on remaining wtp/Parsoid hosts
  • 07:15 marostegui: Stop MySQL db1095 to snapshot it to es1013:/srv/tmp - T153743
  • 02:38 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Feb 3 02:38:10 UTC 2017 (duration 5m 3s)
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 12m 06s)
  • 00:51 dereckson@tin: Synchronized dblists/related-articles-footer-blacklisted-skins.dblist: Adjust RelatedArticles deployment scale for Mobile English Wikipedia (T154681) (duration: 00m 39s)
  • 00:48 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set site name for ku.wiktionary (T29878) (duration: 00m 39s)
  • 00:45 dereckson@tin: Synchronized wmf-config/: Adjust RelatedArticles deployment scale for Mobile English Wikipedia (T154681) (duration: 00m 42s)
  • 00:42 dereckson@tin: Synchronized dblists/related-articles-footer-blacklisted-skins.dblist: Enable RelatedArticles on Mobile French Wikipedia (T156362) (duration: 00m 44s)
  • 00:33 dereckson@tin: Synchronized static/apple-touch/wikipedia.png: Update apple touch icon for Wikipedia (T152538) (duration: 00m 39s)
  • 00:24 dereckson@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Limit page images on beta cluster to images in the lead section (no-op in prod) (duration: 00m 41s)
  • 00:17 dereckson@tin: Synchronized wmf-config/interwiki.php: Interwiki map update (duration: 00m 40s)

2017-02-02

  • 23:51 twentyafterfour: 1.29.0-wmf.10 appears to be stable. Train deployment complete.
  • 23:38 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.10
  • 23:33 twentyafterfour@tin: Synchronized php-1.29.0-wmf.10/includes/cache/MessageCache.php: deploy I5b84b1 refs T156996 (duration: 00m 45s)
  • 23:11 greg-g: Gerrit: we'll be flushing session caches momentarily, sorry for the inconvenience
  • 21:50 gehel: reimaging relforge1002.eqiad.wmnet
  • 20:35 mutante: carbon - disabling puppet (to stop it from re-adding second IPv6 address causing issues with ferm rules)
  • 20:17 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.29.0-wmf.9
  • 20:14 twentyafterfour: rolling back to wmf.9 due to T156996
  • 20:10 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.10
  • 20:04 twentyafterfour: deploying MediaWiki 1.29.0-wmf.10 to all wikis
  • 19:29 tgr: reset wikimedia 2FA for jdlrobson
  • 19:24 tgr: reset wikitech 2FA for jdlrobson
  • 19:13 hashar: Gracefully restarting Jenkins
  • 19:09 ejegg: updated fundraising tools from 931c8cf to 8a65e38
  • 19:03 ejegg: re-enabled thank you mail job
  • 18:59 ejegg: disabled thank you mail job
  • 18:04 bsitzmann@tin: Finished deploy [mobileapps/deploy@09101f7]: try previous deploy again (at least on canary) (duration: 00m 51s)
  • 18:03 bsitzmann@tin: Started deploy [mobileapps/deploy@09101f7]: try previous deploy again (at least on canary)
  • 18:01 bearND: ^ reverted previous deploy due to incorrect links in the news endpoint
  • 18:00 bsitzmann@tin: Finished deploy [mobileapps/deploy@09101f7]: (no justification provided) (duration: 01m 56s)
  • 17:58 bsitzmann@tin: Started deploy [mobileapps/deploy@09101f7]: (no justification provided)
  • 17:56 jynus: upgrade & restart of db2059 T111654
  • 17:43 jynus: upgrade & restart of db2052 T111654
  • 17:32 mobrovac: restbase deploy end of 634faea2
  • 17:08 mobrovac: restbase deploying 634faea2
  • 16:56 reedy@tin: Synchronized php-1.29.0-wmf.10/extensions/ConfirmEdit/maintenance/GenerateFancyCaptchas.php: Fix inclusion path (duration: 00m 41s)
  • 16:24 elukey: reboot mc2019->mc2025 to see if they come up cleanly (currently codfw replicas of eqiad redis shards)
  • 16:13 elukey: rebooting mc202[6789] (not serving any traffic) to see if they come up cleanly
  • 16:00 elukey: rebooting mc203[01234] (not serving any traffic) to see if they come up cleanly
  • 15:43 moritzm: upgrading firejail on remaining app servers
  • 15:35 moritzm: upgrading firejail on wtp1001
  • 15:15 moritzm: upgrading firejail on eqiad imagescalers
  • 15:11 elukey: rebooting mc203[56] (not taking any traffic) to test if they come up cleanly
  • 15:01 elukey: Replace Redis/Memcached shards mc200[4567] with mc202[2345]
  • 14:51 godog: manually fail sdc on graphite1001 - T157022
  • 14:23 addshore: EU SWAT really finished
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Create Wikiprojekti namespace on Finnish Wikipedia T156621 (duration: 00m 41s)
  • 14:14 addshore: EU SWAT finished
  • 14:14 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Add Wikinews languages (en, pt, ca, fr, de, it) as import sources on eswikinews T156737 (duration: 00m 40s)
  • 14:10 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT Enable ElectronPdfService extension on dewiki T150942 (duration: 00m 41s)
  • 12:53 gehel: starting reimage to jessie of elasticsearch relforge - T151326
  • 11:59 joal@tin: Finished deploy [analytics/refinery@bc4b4ed]: (no justification provided) (duration: 01m 14s)
  • 11:58 joal@tin: Started deploy [analytics/refinery@bc4b4ed]: (no justification provided)
  • 11:40 elukey: Swap mc2002 with mc2020, mc2003 with mc2021 (Redis codfw replicas) - T155755
  • 11:35 joal@tin: Finished deploy [analytics/refinery@bc4b4ed]: (no justification provided) (duration: 03m 03s)
  • 11:32 joal@tin: Started deploy [analytics/refinery@bc4b4ed]: (no justification provided)
  • 10:53 elukey: Swap mc2001 with mc2019 (Redis codfw replicas) - T155755
  • 10:34 moritzm: restarted hadoop-mapreduce-historyserver on analytics1001
  • 10:23 moritzm: rolling restart of nginx tls terminators running on mw* application servers in eqiad to pick up openssl 1.1 update
  • 09:54 moritzm: rolling restart of logstash cluster to pick up openjdk/NSS security updates
  • 09:19 jynus: deploying schema change to page_assessments_projects on enwikivoyage T156305
  • 09:18 moritzm: upgrading remaining canary servers to new HHVM packages
  • 09:17 marostegui: Remove dbstore1001:/srv/tmp/db1063.tar.gz after it has been transferred to db1095:/srv/tmp/db1063.tar.gz to get more disk space
  • 08:57 jynus: deploying schema change to page_assessments_projects on enwiki T156305
  • 08:42 filippo@tin: Finished deploy [prometheus/jmx_exporter@23a8f0b]: jmx_exporter deploy (duration: 00m 04s)
  • 08:42 filippo@tin: Started deploy [prometheus/jmx_exporter@23a8f0b]: jmx_exporter deploy
  • 08:30 moritzm: installing ntfs-3g security update on labnodepool (other servers had it deinstalled)
  • 08:26 filippo@tin: Finished deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided) (duration: 00m 07s)
  • 08:26 filippo@tin: Started deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided)
  • 08:24 marostegui: Transfer /srv/tmp/db1063.tar.gz from dbstore1001 to db1095:/srv/tmp to gain disk space
  • 08:24 marostegui: Remove /srv/tmp/db1067.tar.gz from dbstore1001 to gain disk space
  • 08:23 filippo@tin: Finished deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided) (duration: 00m 12s)
  • 08:23 filippo@tin: Started deploy [prometheus/jmx_exporter@23a8f0b]: (no justification provided)
  • 08:10 legoktm@tin: Synchronized php-1.29.0-wmf.10/includes/specials/pagers/ActiveUsersPager.php: Make last remaining user_groups queries honor $wgDisableUserGroupExpiry https://gerrit.wikimedia.org/r/#/c/335587/ (T156995) (duration: 00m 51s)
  • 08:08 legoktm@tin: Synchronized php-1.29.0-wmf.10/includes/api/ApiQueryAllUsers.php: Make last remaining user_groups queries honor $wgDisableUserGroupExpiry https://gerrit.wikimedia.org/r/#/c/335587/ (T156995) (duration: 00m 58s)
  • 07:41 marostegui: Restart MySQL on db2012 to tune some innodb_ft flags - T156905
  • 03:24 eileen1: re-enable all other jenkins jobs - only some dedupe & one-off jobs disabled
  • 03:17 eileen1: re-enable dedupe jobs
  • 03:14 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Feb 2 03:14:06 UTC 2017 (duration 5m 44s)
  • 03:08 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 13m 49s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 14m 39s)
  • 02:32 eileen1: update civicrm from d06db92 to e45da6d
  • 02:23 eileen1: drush dis -y module_missing_message_fixer
  • 02:22 eileen1: drush mmmff --all
  • 02:22 eileen1: run drush en -y module_missing_message_fixer
  • 02:20 eileen1: Update civicrm from 7a86121 to d06db92
  • 02:02 mutante: carbon - remove unmapped IPv6 address making ferm rules fail, use only the _mapped_ IP (ip addr del 2620:0:861:1:7a2b:cbff:fe09:ea0/64 dev eth0) (T84380 T132757)
  • 01:50 eileen1: update CiviCRM from e17622b to 7a86121
  • 01:25 eileen1: jenkins disable Dedupe CiviCRM contacts & Dedupe major gifts 500_
  • 01:04 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/335578/ (duration: 00m 42s)
  • 00:30 maxsem@tin: Synchronized wmf-config/CirrusSearch-common.php: https://gerrit.wikimedia.org/r/#/c/335265/2 (duration: 00m 40s)
  • 00:21 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/335561/ (duration: 01m 00s)

2017-02-01

  • 22:54 mutante: carbon - rsyncing /srv/ data to install1002 (T132757)
  • 22:46 dereckson@tin: Synchronized wmf-config/: Folder sync to get around caching issue in previous deployments (T156942) (duration: 00m 45s)
  • 22:08 jynus: deploying schema change to page_assessments_projects on testwiki T156305
  • 21:49 bsitzmann@tin: Finished deploy [mobileapps/deploy@09101f7]: Update mobileapps to e48a88c (duration: 03m 04s)
  • 21:46 bsitzmann@tin: Started deploy [mobileapps/deploy@09101f7]: Update mobileapps to e48a88c
  • 21:39 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings.php: sync InitializeSettings to activate change from previous patches refs T156942 (duration: 00m 41s)
  • 21:21 reedy@tin: Synchronized php-1.29.0-wmf.10/includes/api/: Guard more ug_expiry queries (duration: 00m 48s)
  • 20:36 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.10
  • 20:29 twentyafterfour@tin: Synchronized wmf-config/abusefilter.php: deploy I0b4e02 refs T156942 (duration: 00m 39s)
  • 20:15 twentyafterfour@tin: Synchronized wmf-config/flaggedrevs.php: deploy I1683b1 refs T156942 (duration: 00m 40s)
  • 20:10 jynus: continuing mariadb rolling restart of db2044, db2051, db2058, db2065
  • 20:01 ejegg: updated payments-wiki from dd8a16d to 4466b9d
  • 19:59 twentyafterfour: scheduled downtime in icinga for phab2001's phd service
  • 19:59 twentyafterfour: Freshening phabricator's elasticsearch index, currently 50% complete
  • 19:27 twentyafterfour: disabled read-only in phabricator
  • 19:25 twentyafterfour: running puppet on iridium to activate the config change
  • 19:20 jynus: reloading haproxy on dbproxy1003
  • 19:11 jynus: 7 minutes remaining with phabricator up, but read-only
  • 19:10 ostriches: phabricator: now in read-only mode
  • 19:08 jynus: scheduling 10 minutes of emergency downtime on phabricator
  • 19:06 mobrovac: restbase deploy end of 96a641aa
  • 18:49 joal@tin: Finished deploy [analytics/refinery@2b9a70a]: (no justification provided) (duration: 02m 33s)
  • 18:46 joal@tin: Started deploy [analytics/refinery@2b9a70a]: (no justification provided)
  • 18:34 mobrovac: restbase deploy start of 96a641aa
  • 16:54 marostegui: Optimize table phabricator_search.search_documentfield on db2012 - T156905
  • 16:41 jynus: mariadb rolling restart of db2037, db2044, db2051, db2058, db2065
  • 16:20 elukey: restarting Yarn Node Manager daemons on all the Hadoop nodes to bandaid a memory leak causing OOMs
  • 16:18 marostegui: Optimizing table search_documentfield on db1048 - T156905
  • 15:50 akosiaris: stop ircecho for a while to weather out most of the puppet alert storm
  • 15:46 akosiaris: restart puppetdb on nihal (openjdk upgrade)
  • 15:43 akosiaris: restart puppetdb on nitrogen
  • 15:40 jynus: preparing db1067 for reimage to jessie
  • 15:37 moritzm: upgrading canary app servers to new HHVM package (initially mwdebug and mw1261)
  • 15:17 Dereckson: `mwscript populateCategory.php plwikisource --force` to refresh categories stats (T156670)
  • 15:17 dereckson@tin: Finished scap: Full scap to propagate a core namespace l10n change (duration: 40m 10s)
  • 14:41 godog: upgrade thumbor to 0.1.34
  • 14:37 dereckson@tin: Started scap: Full scap to propagate a core namespace l10n change
  • 14:25 jynus: dropping and replacing events on db1057 - db1052 T156008
  • 14:24 dereckson@tin: Synchronized php-1.29.0-wmf.9/languages/messages/MessagesJv.php: Update namespace localisation in Javanese (T155957) (duration: 00m 40s)
  • 14:21 dereckson@tin: Synchronized php-1.29.0-wmf.10/languages/messages/MessagesJv.php: Update namespace localisation in Javanese (T155957) (duration: 00m 45s)
  • 14:12 moritzm: uploaded hhvm 3.12.12 to carbon
  • 14:10 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ElectronPdfService on meta (T150943) (duration: 00m 48s)
  • 13:39 marostegui: Deploy alter table dbstore1002 metawiki.pagelinks - T153300
  • 13:38 akosiaris: issue sudo hdparm -Y /dev/sdb on bast3001 to force a problematic drive to sleep
  • 13:21 marostegui: Clean up db1043 replication thread (it was replicating from db1048 which looks like an old thing) - T156905
  • 12:09 elukey@tin: Finished deploy [analytics/refinery@e6254a4]: (no justification provided) (duration: 04m 41s)
  • 12:04 elukey@tin: Started deploy [analytics/refinery@e6254a4]: (no justification provided)
  • 11:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2061 - T153300 (duration: 00m 40s)
  • 11:25 moritzm: removing ntfs-3g from various trusty servers
  • 11:14 godog: bounce leaking thumbor@8813 on thumbor1001
  • 11:08 kartik@tin: Finished deploy [cxserver/deploy@0e4ae4f]: (no justification provided) (duration: 02m 04s)
  • 11:06 kartik@tin: Started deploy [cxserver/deploy@0e4ae4f]: (no justification provided)
  • 07:53 marostegui: Deploy alter table metawiki.pagelinks db2061 - T153300
  • 07:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2061 - T153300 (duration: 00m 53s)
  • 07:43 moritzm: rolling restart of cassandra in eqiad to pick up openjdk and NSS security updates
  • 07:41 elukey: bootstrapping aqs1008-a on aqs1008 (new AQS cassandra node)
  • 07:31 marostegui: Force WB policy on the raid controller db1072 - T156226
  • 07:13 akosiaris: restart thumbor process on thumbor1001, thumbor1002, apply a different LimitNOFILE on thumbor1002
  • 04:17 mutante: carbon - rsyncing entire /srv over to install2002 (T156440)
  • 03:00 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Feb 1 03:00:32 UTC 2017 (duration 5m 35s)
  • 02:54 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 03m 42s)
  • 02:48 mutante: install1002, install2002 - install jessie, sign puppet certs, initial puppet run (T132757, T156440)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 11m 52s)
  • 02:20 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 39s)
  • 01:19 mutante: ganeti: create instance install2002 with 80G disk, 2G RAM (T156440)
  • 01:15 mutante: ganeti: install1001 - remove virtual disk 1 from instance | create instance install1002 instead (T132757)
  • 00:57 mutante: Ganglia is now deprecated in favor of Grafana (https://phabricator.wikimedia.org/T145659#2925104)
  • 00:33 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/334025/ (duration: 00m 40s)
  • 00:32 maxsem@tin: Synchronized php-1.29.0-wmf.9/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/335263/ (duration: 00m 58s)

2017-01-31

  • 23:52 ppchelko@tin: Finished deploy [changeprop/deploy@e27c3a0]: Update change-prop to fix wikidata rollback rule (duration: 01m 32s)
  • 23:51 ppchelko@tin: Started deploy [changeprop/deploy@e27c3a0]: Update change-prop to fix wikidata rollback rule
  • 22:27 twentyafterfour: cleaned up old branches: wmf.3 and wmf.4
  • 21:58 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.10
  • 21:50 twentyafterfour@tin: Synchronized wmf-config/: sync ExtensionMessages-1.29.0-wmf.10.php (duration: 00m 47s)
  • 21:41 twentyafterfour@tin: scap sync-l10n completed (1.29.0-wmf.10) (duration: 19m 49s)
  • 21:11 twentyafterfour: wmf-config/ExtensionMessages-1.29.0-wmf.10.php is missing refs T155525
  • 21:05 twentyafterfour@tin: Finished scap: (no justification provided) (duration: 27m 16s)
  • 20:37 twentyafterfour@tin: Started scap: (no justification provided)
  • 20:37 twentyafterfour: syncing 1.29.0-wmf.10 to test wikis
  • 20:32 jynus: stopping db1063 mariadb before full host reimage
  • 18:58 arlolra: Updated Parsoid to version 734dc996 (T98960)
  • 18:51 arlolra@tin: Finished deploy [parsoid/deploy@dc2323d]: Updating Parsoid to 734dc996 (duration: 12m 58s)
  • 18:46 ottomata: recentchange events now flowing into Kafka via EventBus T152030
  • 18:45 otto@tin: Synchronized wmf-config/CommonSettings.php: Enabling RCFeed -> EventBus (duration: 00m 42s)
  • 18:44 otto@tin: Synchronized wmf-config/CommonSettings-labs.php: Enabling RCFeed -> EventBus (duration: 00m 43s)
  • 18:38 arlolra@tin: Started deploy [parsoid/deploy@dc2323d]: Updating Parsoid to 734dc996
  • 18:06 jynus: finished tendril and dbtree maintenance, things should be back up, report if you see degradations of service
  • 17:37 jynus: stopping mysql, upgrading and restarting db1011- temporary outage of tendril & dbtree T111654
  • 16:59 robh: disabled puppet on einsteinium while i try to figure out what i broke in my config for icinga
  • 16:11 elukey: started Cassandra nodetool cleanup for aqs1007-a
  • 16:03 elukey: started Cassandra nodetool cleanup for aqs1004-b
  • 14:57 jynus: upgrading and restarting db1095 (sanitarium2)
  • 14:12 elukey: restarting hhvm on mw1204 (dump debug in /tmp/hhvm.29120.bt)
  • 14:07 aude@tin: Synchronized wmf-config/Wikibase.php: Update property suggester config (duration: 00m 42s)
  • 13:59 elukey: rebooted analytics1039 to pick up uuids in fstab - T147879
  • 12:26 addshore: TwoColConflict deploy slot done!
  • 12:26 addshore@tin: Synchronized wmf-config/CommonSettings-labs.php: Enable TwoColConflict on test wikis (T155716) 5/5 (duration: 00m 40s)
  • 12:25 addshore@tin: Synchronized wmf-config/CommonSettings.php: Enable TwoColConflict on test wikis (T155716) 4/5 (duration: 00m 40s)
  • 12:24 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TwoColConflict on test wikis (T155716) 3/5 (duration: 00m 42s)
  • 12:23 addshore@tin: Synchronized wmf-config/extension-list-labs: Enable TwoColConflict on test wikis (T155716) 2/5 (duration: 00m 40s)
  • 12:22 addshore@tin: Synchronized wmf-config/extension-list: Enable TwoColConflict on test wikis (T155716) 1/5 (duration: 00m 40s)
  • 12:16 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Add twocolconflict to wgBetaFeaturesWhitelist (T150184) (duration: 00m 41s)
  • 11:14 elukey: updating the puppet compiler's facts
  • 10:42 gehel: starting reimage of wdqs1003
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool hosts in C2 - T155999 (duration: 00m 40s)
  • 09:38 moritzm: rolling restart of cassandra in codfw to pick up openjdk and NSS security updates
  • 09:00 gehel: aligning elasticsearch low watermark to 75% disk space on all clusters (eqiad was at 70%)
  • 08:44 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1007.eqiad.wmnet
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add a warning about a possible bad BBU on db1072 - T156226 (duration: 00m 46s)
  • 08:26 elukey: started Cassandra nodetool cleanup for aqs1004-a
  • 07:54 moritzm: installing chromium security update on osmium
  • 07:49 marostegui: Reboot db1072 to force BBU recharge - T156226
  • 07:10 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2034 - T156478 (duration: 00m 57s)
  • 03:49 andrewbogott: restarted nova-api on labnet1001 which actually fixed some things
  • 03:28 chasemp: (slightly belated) set logging level on serpens higher to see if ldap binding is an issue
  • 02:45 bd808: Setup temporary cron on silver as user bd808 until T156733 is fixed properly
  • 02:33 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 31 02:33:22 UTC 2017 (duration 5m 23s)
  • 02:31 bd808: Manually ran extensions/TorBlock/loadExitNodes.php on silver
  • 02:28 bd808@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TorBlock for Wikitech (duration: 00m 41s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 07m 09s)
  • 02:16 chasemp: restart uwsgi-keystone-admin and uwsgi-keystone-public on labcontrol1001
  • 00:25 ebernhardson: restarting elasticsearch on elastic1029, got stuck in RemoteTransportException loop again
  • 00:23 mobrovac@tin: Started restart [electron-render/deploy@f1df2d3]: Service restart for firejail upgrade
  • 00:22 mobrovac@tin: Started restart [mobileapps/deploy@7615bf9]: Service restart for firejail upgrade
  • 00:21 mobrovac@tin: Started restart [mathoid/deploy@ba3217e]: Service restart for firejail upgrade
  • 00:20 mobrovac@tin: Started restart [graphoid/deploy@da37386]: Service restart for firejail upgrade
  • 00:17 mobrovac@tin: Started restart [cxserver/deploy@5ae4f8b]: Service restart for firejail upgrade
  • 00:13 mobrovac@tin: Started restart [citoid/deploy@95df861]: Service restart for firejail upgrade
  • 00:10 mobrovac@tin: Started restart [changeprop/deploy@2b980fa]: Service restart for firejail upgrade

2017-01-30

  • 21:49 eileen1: killed long running user-initiated dedupe query
  • 21:24 eileen1: updated civicrm from 6b6f5d6 to e17622b
  • 21:13 mobrovac@tin: Finished deploy [trending-edits/deploy@5735f00]: (no justification provided) (duration: 03m 13s)
  • 21:10 mobrovac@tin: Started deploy [trending-edits/deploy@5735f00]: (no justification provided)
  • 21:09 mobrovac@tin: Finished deploy [trending-edits/deploy@5735f00]: (no justification provided) (duration: 03m 07s)
  • 21:06 mobrovac@tin: Started deploy [trending-edits/deploy@5735f00]: (no justification provided)
  • 20:57 mobrovac@tin: Finished deploy [trending-edits/deploy@5735f00]: Bump memory limit and heartbeat timeout (duration: 01m 48s)
  • 20:55 mobrovac@tin: Started deploy [trending-edits/deploy@5735f00]: Bump memory limit and heartbeat timeout
  • 20:50 godog: uploaded scap 3.5.1-1
  • 20:50 thcipriani@tin: Synchronized README: test scap (duration: 00m 43s)
  • 20:47 eileen1: updated localsettings to a346207
  • 20:45 mobrovac@tin: Finished deploy [trending-edits/deploy@9addcd0]: Bump max_age to 18h for T156411 (duration: 02m 39s)
  • 20:43 mobrovac@tin: Started deploy [trending-edits/deploy@9addcd0]: Bump max_age to 18h for T156411
  • 20:25 eileen1: disable drupal update module on prod. T155084, this should still be on on dev sites so not using update script
  • 20:12 Pchelolo: update RESTBase to 501ea47edc in staging
  • 20:09 ejegg: updated payments-wiki config to d98b30b
  • 19:47 gehel@tin: Finished deploy [wdqs/wdqs@81442a0]: (no justification provided) (duration: 01m 23s)
  • 19:46 gehel@tin: Started deploy [wdqs/wdqs@81442a0]: (no justification provided)
  • 19:44 gehel: deploying latest wdqs gui
  • 18:56 thcipriani: unlocking mediawiki deployments for test
  • 18:53 nuria@tin: Finished deploy [eventlogging/analytics@4b28b14]: (no justification provided) (duration: 00m 11s)
  • 18:53 nuria@tin: Started deploy [eventlogging/analytics@4b28b14]: (no justification provided)
  • 18:50 nuria: rollback deployment to eventlogging
  • 18:48 thcipriani: locking mediawiki deployments momentarily
  • 18:46 nuria@tin: Finished deploy [eventlogging/analytics@4b28b14]: (no justification provided) (duration: 00m 04s)
  • 18:46 nuria@tin: Started deploy [eventlogging/analytics@4b28b14]: (no justification provided)
  • 18:42 Pchelolo: update RESTBase to cd2b5e019
  • 18:38 Pchelolo: update RESTBase to cd2b5e019: canary on restbase2001
  • 18:36 godog: upload scap 3.5.0-1 - T127762
  • 18:18 gehel: nginx upgrade and wdqs restart complete - sorry for the noise
  • 18:13 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1002.codfw.wmnet
  • 18:12 Niharika: updated scholarships: fixed some bugs with the login form
  • 18:11 moritzm: upgrading firejail on scb cluster
  • 18:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1001.codfw.wmnet
  • 18:07 Pchelolo: update RESTBase to cd2b5e019: canary on restbase1007
  • 18:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2034 IP - T156478 (duration: 00m 40s)
  • 18:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2034 IP - T156478 (duration: 00m 40s)
  • 18:04 gehel: rolling restart of nginx and wdqs for updates
  • 17:41 Pchelolo: update RESTBase to cd2b5e019: staging
  • 17:09 legoktm@tin: Finished scap: Build l10n cache for linter (duration: 22m 43s)
  • 17:05 marostegui: Shutdown mysql and poweroff db2034 for maintenance - T156478
  • 16:46 legoktm@tin: Started scap: Build l10n cache for linter
  • 16:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2034 for maintenance - T156478 (duration: 00m 40s)
  • 16:34 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs2003.codfw.wmnet
  • 15:42 hashar@tin: Synchronized php-1.29.0-wmf.9/extensions/timeline/Timeline.body.php: debug log EasyTimeline error - T138036 (duration: 00m 46s)
  • 15:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with its original weight - T156226 (duration: 00m 52s)
  • 14:45 hashar@tin: Synchronized php-1.29.0-wmf.9/languages/Language.php: translateBlockExpiry: Duration is block expiry minus current time - T156453 (duration: 00m 42s)
  • 14:29 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RSS extension at metawiki, enable one feed - T155830 (duration: 00m 42s)
  • 14:15 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Create namespace alias وگ for NS_PROJECT in fawikiquote - T156451 (duration: 00m 40s)
  • 14:10 hashar@tin: Synchronized wmf-config/flaggedrevs.php: Remove flaggedrevs-protect-review page protection from enwiki - T156448 (duration: 00m 41s)
  • 14:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on tgwiki - T156473 (duration: 00m 40s)
  • 14:06 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on gdwiki - T156281 (duration: 00m 48s)
  • 11:23 gehel: upgrade and restart nginx on elasticsearch eqiad cluster
  • 10:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with less weight - T156226 (duration: 00m 49s)
  • 10:00 gehel: upgrade and restart nginx on elasticsearch codfw cluster
  • 09:58 gehel: upgrade and restart nginx on relforge cluster
  • 09:44 godog: upgrade to thumbor 0.1.33 - T151066
  • 09:37 ariel@tin: Finished deploy [dumps/dumps@4a9e952]: proper md5sum format for adds/changes dumps (duration: 00m 02s)
  • 09:37 ariel@tin: Starting deploy [dumps/dumps@4a9e952]: proper md5sum format for adds/changes dumps
  • 09:25 elukey: bootstrapping new cassandra instance (aqs1007-b) on AQS - https://gerrit.wikimedia.org/r/#/c/334753/
  • 09:19 moritzm: installing tcpdump security updates
  • 09:06 marostegui: Upgrade db2012 to 10.0.29-2 (this was done a couple of hours ago, but for the record) - T156373
  • 09:05 marostegui: Start slaves from s1 to s7 on dbstore2001 - T156373
  • 08:54 moritzm: installing NSS security updates on kafka and Hadoop clusters
  • 08:45 elukey: restarting aqs on aqs100[4567] to pick up NSS updates
  • 08:19 elukey: set mw1236.eqiad.wmnet pooled=inactive because powered off (no mentions on the SAL, still trying to find why)
  • 08:05 moritzm: switched application servers in codfw to systemd-timesyncd
  • 08:04 marostegui: Stop mysql db1073 to use it to clone db1072 - T156226
  • 08:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T156226 (duration: 02m 45s)
  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 30 02:21:21 UTC 2017 (duration 4m 22s)
  • 02:16 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 06m 11s)

2017-01-29

  • 02:21 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jan 29 02:21:37 UTC 2017 (duration 4m 23s)
  • 02:17 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 06m 06s)

2017-01-28

  • 02:22 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jan 28 02:22:49 UTC 2017 (duration 4m 47s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 05m 39s)

2017-01-27

  • 22:45 mutante: install1001 - adding a second virtual hard disk, 80G
  • 22:31 mutante: carbon: rsync entire /srv/ to install2001 (this is APT data but also misc things like junos, megacli, firmware, ipmi)
  • 21:41 volans: restored watchmouse checks for s5 (de wiki), Main_Page redirect was restored
  • 21:36 mobrovac@tin: Finished deploy [trending-edits/deploy@0e79bec]: Bump max_age to 12h T156411 (duration: 01m 58s)
  • 21:34 mobrovac@tin: Starting deploy [trending-edits/deploy@0e79bec]: Bump max_age to 12h T156411
  • 21:28 mobrovac@tin: Finished deploy [trending-edits/deploy@e0e32bb]: Restart the service to assess the load of replaying the last 6h T156411 (duration: 01m 03s)
  • 21:27 mobrovac@tin: Starting deploy [trending-edits/deploy@e0e32bb]: Restart the service to assess the load of replaying the last 6h T156411
  • 20:24 jynus: restart and upgrade mariadb on db1048
  • 20:14 mutante: db1019, db1042, analytics1015, analytics1026 - puppet node deactivate, remove from icinga, finish decom (T147313, T149793, T146265)
  • 20:07 volans: updated watchmouse checks for s5 (de wiki) because Main_Page was deleted, used the localized page instead
  • 19:52 mutante: db1019 - shutdown -h now (T146265)
  • 19:51 mutante: db1042 - i came to shut it down .. and noticed it had died (or somebody did it) about 3 hours ago .. there it goes (T149793)
  • 18:52 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.29.0-wmf.9
  • 18:52 twentyafterfour: Rolling forward with group2 to 1.29.0-wmf.9 refs T156364 T154683
  • 17:55 papaul: OS installation on mc2019-mc2036
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1072 change IP - T156226 (duration: 00m 40s)
  • 16:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: db1072 change IP - T156226 (duration: 00m 40s)
  • 16:01 jynus: submitted wmf-mariadb10_10.0.29-2 for T156373 fix
  • 15:48 marostegui: Stop mysql and shutdown db1072 for maintenance - T156226
  • 15:45 ema: cache_text: ban req.url == "/apple-app-site-association" && obj.status == 404 (T155504)
  • 14:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s7 in codfw (duration: 00m 40s)
  • 14:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s7 in eqiad (duration: 00m 40s)
  • 13:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s6 in eqiad (duration: 00m 40s)
  • 13:44 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s6 in codfw (duration: 00m 40s)
  • 13:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s5 in eqiad (duration: 00m 40s)
  • 13:30 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s5 in codfw (duration: 00m 40s)
  • 13:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s4 in codfw (duration: 00m 41s)
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s4 in eqiad (duration: 00m 43s)
  • 13:11 jynus: starting db1048 until db1043-bin.001457:753455353, expect it to stop soon
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s3 in eqiad (duration: 00m 40s)
  • 13:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s3 in codfw (duration: 00m 40s)
  • 12:24 moritzm: upgrading mediawiki canaries to new openssl 1.1 package
  • 12:13 moritzm: upgrading openjdk-7 packages (security updates) on wdqs cluster
  • 11:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add rack positions for s2 in codfw (duration: 00m 47s)
  • 11:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions for s2 in eqiad (duration: 00m 59s)
  • 11:11 moritzm: initial installation of openssl bugfix/security updates
  • 10:54 moritzm: uploaded openssl 1.0.2k for jessie-wikimedia to carbon
  • 10:35 paravoid: manually running certspotter -all_time as my user on einsteinium (will take a few days to complete)
  • 10:21 legoktm: added addshore to labs-tools-wikibugs2 gerrit group
  • 08:13 moritzm: uploaded openssl 1.1.0d packages for jessie-wikimedia to carbon
  • 02:58 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jan 27 02:58:52 UTC 2017 (duration 5m 46s)
  • 02:53 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 14m 14s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 07m 57s)
  • 01:20 twentyafterfour: deploying hotfix for phabricator refs T154479
  • 00:48 mutante: carbon - moved the 1.5TB /srv/"mirrors.off", which used to be mirrors but is now on sodium, into / so that /srv/ can be synced without this
  • 00:32 dereckson@tin: Synchronized wmf-config/throttle.php: Fix throttle rule for Her Girl Friday + Lenny Unconference (T156278) (duration: 00m 53s)
  • 00:12 volans: re-enabled puppet (with a temporary fix to keep parsoid-vd and parsoid-vd-client stopped) on ruthenium T156177

2017-01-26

  • 23:49 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 wikis to 1.29.0-wmf.8
  • 23:37 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.9
  • 23:33 robh: archiva.w.o maint done, uses new LE cert.
  • 23:14 robh: going to try to convert archiva.wikimedia.org from GS to LE cert. will require rehup of nginx
  • 22:58 mutante: db1019, db1042 - revoke puppet certs, delete salt keys, schedule icinga downtime, stop services (T149793, T146265)
  • 22:56 mutante: analytics1015, analytics1026 - puppet node clean (again?) - again having problems to remove decom'ed nodes from Icinga (T147313)
  • 22:53 mutante: analytics1015, analytics1026 - puppet node clean (again?) - again having problems to remove decom'ed nodes from Icinga
  • 22:46 mutante: mw1181, mw1272, mw1212, mw1174 - service hhvm restart
  • 22:06 mobrovac@tin: Finished deploy [trending-edits/deploy@e0e32bb]: Bump replay time to 6h for T156411 (duration: 01m 42s)
  • 22:04 mobrovac@tin: Starting deploy [trending-edits/deploy@e0e32bb]: Bump replay time to 6h for T156411
  • 20:30 andrewbogott: refreshing logins on wikitech
  • 19:13 elukey: restore analytics1001 as RM and HDFS masters
  • 18:56 otto@tin: Finished deploy [eventstreams/deploy@f1a1866]: (no message) (duration: 03m 16s)
  • 18:52 otto@tin: Starting deploy [eventstreams/deploy@f1a1866]: (no message)
  • 18:36 elukey: restarting Yarn node managers on an102[89] and an103[01], impacted by the switch restart
  • 18:32 paravoid: starting pybal on lvs1001/lvs1002/lvs1003
  • 18:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055, 56, 57, 59 (duration: 00m 54s)
  • 18:14 paravoid: rebooting newly provisioned asw-c2-eqiad to enable mixed mode
  • 17:57 elukey: bootstrapping aqs1007-a cassandra instance
  • 17:51 paravoid: replacing asw-c2-eqiad
  • 17:46 paravoid: stopping pybal on lvs1001/lvs1002/lvs1003
  • 17:34 elukey@tin: Finished deploy [analytics/aqs/deploy@5917fd4]: (no message) (duration: 02m 25s)
  • 17:31 elukey@tin: Starting deploy [analytics/aqs/deploy@5917fd4]: (no message)
  • 15:43 bblack: cache_misc puppet re-enabled and up to date
  • 15:37 godog: bounce uwsgi on graphite1003 with less workers - T155872
  • 15:35 moritzm: installing gnupg2 updates from jessie point update
  • 15:15 akosiaris: T156242 add /dev/sdb partitions to mdadm devices
  • 15:15 bblack: puppet disabled on cache_misc for merging complicated stuff
  • 15:06 moritzm: installing gnupg updates from jessie point update
  • 14:52 jynus: stopping mysql on db1048 T156373
  • 14:29 zeljkof: finished with eu swat
  • 14:28 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: IP Cap Lift for Edit-a-Thon (T156258) [throttle] Her Girl Friday + Lenny Unconference / Editathon in NYC, 2017-01-28 (T156278) (duration: 00m 41s)
  • 13:54 godog: delete labs 'instances' graphite tree for data >30d, graphite low on disk space
  • 13:53 elukey: restarting cassandra on aqs100[56] to complete the openjdk update
  • 13:32 moritzm: rolling restart of maps cluster in eqiad
  • 13:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1054 - T156225 (duration: 00m 40s)
  • 13:05 hashar@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 back to 1.29.0-wmf.9 T156310
  • 13:02 godog: reboot ms-be1013
  • 12:54 elukey: restarting the aqs1004-b cassandra instance to pick up the new openjdk (last test before complete rollout)
  • 12:28 elukey: restarting the aqs1004-a cassandra instance to pick up the new openjdk
  • 12:28 moritzm: upgrading java on maps cluster, rolling restart of maps cluster in codfw
  • 12:17 hashar@tin: Synchronized php-1.29.0-wmf.9/extensions/FlaggedRevs/backend/FlaggedRevision.php: Fix fatal in prod caused by deprecated function removal T156310 (duration: 00m 41s)
  • 12:04 moritzm: installing java security updates on aqs cluster
  • 11:12 hashar@tin: rebuilt wikiversions.php and synchronized wikiversions files: FlaggedRevs is broken in wmf.9 causing blank pages. T156356 T156310
  • 09:55 marostegui: Disable semi-sync on db1057 old s1 master - https://phabricator.wikimedia.org/T156008
  • 09:39 marostegui: Enable semi-sync replication on db1052 (s1 master) - T156008
  • 09:04 marostegui: Change dbstore1002 to replicate from the new s1 master db1052 - T156008
  • 08:57 marostegui: Change db1047 to replicate from the new s1 master db1052 - T156008
  • 08:48 marostegui: Change db1069 to replicate from the new s1 master db1052 - T156008
  • 08:48 jynus: deploying dns CNAME updates due to master switchover
  • 07:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change s1 master to db1057 - T156008 (duration: 00m 20s)
  • 07:18 jynus: last message was master of db1057
  • 07:18 jynus: change master of db1052 from db2016 to db1052
  • 06:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1052 - T156008 (duration: 00m 31s)
  • 03:50 mutante: rsyncing apt.wikimedia.org data from carbon to install2001 (T84380)
  • 02:07 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jan 26 02:07:05 UTC 2017 (duration 4m 42s)
  • 02:02 l10nupdate@tin: LocalisationUpdate failed (1.29.0-wmf.9) at 2017-01-26 02:02:23+00:00
  • 02:02 l10nupdate@tin: LocalisationUpdate failed (1.29.0-wmf.8) at 2017-01-26 02:02:23+00:00
  • 02:02 twentyafterfour: phabricator update complete
  • 01:47 twentyafterfour: upgrading phabricator, downtime should be minimal but expect the service to be offline for up to a few minutes
  • 01:11 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: No-op documentation change to InitialiseSettings.php (duration: 00m 46s)
  • 00:31 krenair@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/298397 part 2 (duration: 00m 42s)
  • 00:29 krenair@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/298397 (duration: 00m 43s)
  • 00:26 Krenair: sync 298397 to mwdebug1001
  • 00:11 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable deprecation logging with gerrit:334206 (duration: 00m 53s)
  • 00:09 ebernhardson: sync 334206 to mwdebug1002

2017-01-25

  • 23:39 mutante: mwdebug1002 - service hhvm restart
  • 23:38 mutante: mw1185, mw1268 - service hhvm restart
  • 23:37 ejegg: updated civicrm from cd058a0 to 6b6f5d6
  • 23:21 ejegg: restarted civicrm donation queue consumer and dedupe jobs
  • 22:47 demon@tin: Synchronized w/extract2.php: (no message) (duration: 00m 40s)
  • 22:35 ejegg: paused civicrm dedupe and donation import jobs
  • 22:33 ejegg: updated civicrm from af8d735 to cd058a0
  • 21:39 akosiaris: reload pfw1-codfw node 0 in an effort to debug high RTTs
  • 20:50 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.9
  • 20:44 twentyafterfour: deploying mediawiki 1.29.0-wmf.9 to group1 wikis
  • 20:36 volans: upgrading nodejs-legacy (it is just the symlink) to v6 on parsoid hosts T149331
  • 20:11 eileen: re-enabled dedupe job, disabled major gifts (one should be enough). Will investigate next error
  • 19:26 mutante: analytics1015,analytics1026 - decom: remove DNS names, delete salt keys, revoke puppet certs, puppet node clean (to remove from icinga) (T147313)
  • 18:22 bd808@tin: Finished deploy [striker/deploy@5aa3aa8]: Update Striker to 5aa3aa8 (T144710, T147024, T144712, T144711, T153935) (duration: 00m 24s)
  • 18:22 bd808@tin: Starting deploy [striker/deploy@5aa3aa8]: Update Striker to 5aa3aa8 (T144710, T147024, T144712, T144711, T153935)
  • 18:09 demon@tin: Synchronized docroot/foundation/logos: rm a junk logo (duration: 00m 50s)
  • 18:02 elukey: running authdns-update on ns0.w.o to pick up changes made in https://gerrit.wikimedia.org/r/334040
  • 17:38 jynus: restarting and upgrading db2060
  • 16:57 ostriches: gerrit: everything back up!
  • 16:56 ostriches: gerrit: quick service reboot to pick up new java version
  • 16:31 Jeff_Green: renamed fdb2001 to frdb2001
  • 16:25 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1054 IP - T156225 (duration: 00m 40s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db1054 IP - T156225 (duration: 00m 41s)
  • 16:07 marostegui: Stop mysql and power off db1054 for maintenance - T156225
  • 15:28 gehel: removing jieba / ltr / swift plugins from elasticsearch relforge - T156150
  • 15:27 gehel: deleting indices using jieba plugin from relforge - T156150
  • 15:19 chasemp: (slightly late) run of 'maintain-views --all-databases --table watchlist_count --replace-all' across labsdbs
  • 15:11 godog: graphite1003 / graphite2002 at 94% utilization, increase lv size by 300G
  • 14:42 moritzm: installing ruby2.1 updates from jessie point release
  • 14:23 moritzm: installing wget updates from jessie point release
  • 14:13 dcausse: EU SWAT done
  • 14:11 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T156234 Revert [cirrus] properly set wgCirrusSearchUseIcuFolding (duration: 00m 41s)
  • 14:09 moritzm: removed totally outdated openjdk-8 packages from trusty-wikimedia (from 2014) on carbon
  • 13:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1054 - T156225 (duration: 00m 50s)
  • 13:52 moritzm: upgrading openjdk-8 on maps-test*
  • 13:11 gehel: pooling new elasticsearch nodes on codfw - T154251
  • 12:38 moritzm: installing libxml security updates
  • 12:04 Dereckson: Refresh site statistics on simple. (T156247)
  • 11:01 moritzm: upgrading restbase staging cluster to new openjdk (also piggyback reboot to latest 4.4 kernel)
  • 10:49 moritzm: uploaded openjdk-8 u121 to apt.wikimedia.org
  • 10:28 moritzm: uploaded ca-certificates-java 20161107~bpo8+1 to apt.wikimedia.org
  • 10:11 ema: repooled codfw
  • 09:25 elukey: updating puppet-compiler facts
  • 08:53 ema: upgrade cp3040 to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 08:15 ema: upgrade cp3034 to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8 T155401
  • 08:15 ema: upgrade cp3034 to jessie 8.7 and reboot into kernel 4.4.2-3+wmf8
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2054 - T153300 (duration: 00m 51s)
  • 07:48 _joe_: restarting pybal on lvs1003
  • 07:28 elukey: upgrading aqs100[56] to node6
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T156005 (duration: 00m 42s)
  • 05:37 mobrovac: zotero restarting zotero, taking 95% of mem ...
  • 02:55 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 25 02:55:57 UTC 2017 (duration 5m 37s)
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.9) (duration: 12m 53s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 06m 25s)
  • 00:59 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1052 after maintenance (duration: 00m 40s)
  • 00:37 mutante: planet2001 - re-add new salt key, fix minion
  • 00:11 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Amend import sources for en.wikisource (T155922) (duration: 00m 47s)
  • 00:09 mutante: analytics1015,analytics1026 - revoked puppet cert, removing from puppet, shutting down (T147313)

2017-01-24

  • 23:50 mutante: carbon - stopping puppet, stopping atftpd
  • 23:49 mutante: carbon stopping DHCP
  • 23:49 mutante: analytics1015 (unused spare system) - use for test OS install
  • 23:26 jynus: restarting db1052 for kernel upgrade
  • 23:09 ebernhardson@tin: Synchronized php-1.29.0-wmf.9/includes/specials/SpecialSearch.php: Update special:search security patch to not fatal (duration: 00m 44s)
  • 23:04 jynus: reimage db1066
  • 22:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for reimage (duration: 00m 55s)
  • 22:22 Pchelolo: update RESTBase to 69065e2
  • 22:19 Pchelolo: update RESTBase to 69065e2: canary on restbase1007
  • 22:13 Pchelolo: update RESTBase to 69065e2: staging
  • 21:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: repool db1065 as dump/vslow & clean up s1 comments (duration: 00m 43s)
  • 21:43 twentyafterfour: Finished group0 to wmf/1.29.0-wmf.9 (refs T155525) Changelog: https://www.mediawiki.org/wiki/MediaWiki_1.29/wmf.9/Changelog
  • 21:34 twentyafterfour@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 wikis to 1.29.0-wmf.9 refs T155525
  • 21:31 demon@tin: Synchronized docroot: Drop labs docroot, unused in prod (duration: 00m 44s)
  • 21:22 twentyafterfour@tin: Finished scap: test wikis to 1.29.0-wmf.9 refs T155525 (duration: 32m 37s)
  • 20:49 twentyafterfour@tin: Started scap: test wikis to 1.29.0-wmf.9 refs T155525
  • 20:49 volans: disabled puppet on ruthenium to avoid the restart of parsoid-vd and parsoid-vd-client processes T156177
  • 20:09 demon@tin: Synchronized docroot: Adding new wikimediafoundation.org docroot (duration: 01m 05s)
  • 19:51 volans: ruthenium: stopped parsoid-vd and parsoid-vd-client to avoid uncontrolled spawning of phantomjs children
  • 19:39 volans: sudo service parsoid-vd stop on ruthenium
  • 19:37 twentyafterfour: branching 1.29.0-wmf.9 refs T154683
  • 19:35 volans: killed 822 "/srv/visualdiff/node_modules/phantomjs/lib/phantom/bin/phantomjs" processes on ruthenium. RAM and swap full, host unresponsive
  • 19:29 jynus: change replication master of db1095 to db1065
  • 19:07 jynus: change replication master of db1095 to db1052
  • 19:07 demon@tin: Synchronized docroot/foundation/logos: rm some old junk logos (duration: 00m 42s)
  • 18:58 arlolra: Updated Parsoid to version d000fdb4 (T58846, T154804, T152633)
  • 18:45 arlolra@tin: Finished deploy [parsoid/deploy@c1a14c0]: Retry updating Parsoid to d000fdb4 (duration: 04m 14s)
  • 18:41 arlolra@tin: Starting deploy [parsoid/deploy@c1a14c0]: Retry updating Parsoid to d000fdb4
  • 18:40 arlolra@tin: Finished deploy [parsoid/deploy@c1a14c0]: Updating Parsoid to d000fdb4 (duration: 21m 28s)
  • 18:37 demon@tin: Synchronized docroot: tidying up mobileportal docroot stuff (duration: 00m 41s)
  • 18:24 demon@tin: Synchronized docroot: Removing old wikidata docroot (duration: 00m 46s)
  • 18:19 arlolra@tin: Starting deploy [parsoid/deploy@c1a14c0]: Updating Parsoid to d000fdb4
  • 18:10 marostegui: restart mysql db1065 maintenance - https://phabricator.wikimedia.org/T155999
  • 18:07 mutante: planet2001 - re-adding to puppet, revoke old cert, sign new cert, initial run
  • 18:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T156006 (duration: 00m 49s)
  • 16:54 andrewbogott: tools deleting tools-mail-01
  • 16:52 mutante: planet2001 - reinstalling to test DHCP/TFTP from install2001
  • 16:37 elukey: upgrading aqs1004 to node6
  • 16:26 marostegui: Restart mysql db1072
  • 16:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T156006 (duration: 00m 41s)
  • 16:18 paravoid: removing lvs4002_T151273 policy from cr1/2-ulsfo
  • 16:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 - T156006 (duration: 00m 47s)
  • 16:12 godog: kill stray swift-proxy processes from ms-fe1* T156143
  • 16:07 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1001.eqiad.wmnet
  • 16:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 - T155999 (duration: 00m 48s)
  • 15:58 moritzm: upgraded nodejs on thorium to 6.9 / restarted pivot
  • 15:57 papaul: shutting down ms-be2002 for maintenance
  • 15:54 chasemp: drbdadm adjust tools for 1004/1005 w/ 192.168.0.0/30
  • 15:49 moritzm: installing tomcat7 security updates on trusty hosts (jessie already fixed a while ago)
  • 15:14 godog: bounce pybal on lvs1003 - T134893
  • 15:10 chasemp: drbdadm adjust misc for 1004/1005 w/ 192.168.0.0/30
  • 15:09 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1001.eqiad.wmnet
  • 15:08 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1001.eqiad.wmnet
  • 15:07 chasemp: drbdadm adjust test for 1004/1005 w/ 192.168.0.0/30
  • 15:04 chasemp: recabling labstore1004/1005 eth1
  • 14:55 marostegui: Stop replication on db1052 and db1073 for maintenance - T156006
  • 14:54 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1001.eqiad.wmnet
  • 14:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T155999 (duration: 00m 39s)
  • 14:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 with less weight - T156004 (duration: 00m 41s)
  • 14:26 dcausse: EU SWAT Done
  • 14:23 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T155515 [cirrus] properly set wgCirrusSearchUseIcuFolding (duration: 00m 39s)
  • 14:13 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T155142 [cirrus] Increase weights for content namespaces on mw.org (duration: 00m 39s)
  • 13:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1052 IP - T156006 (duration: 00m 39s)
  • 13:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db1052 IP - T156006 (duration: 00m 39s)
  • 13:41 marostegui: Shutdown db1052 for maintenance - T156006
  • 13:37 marostegui: Shutdown mysql on db1052 for maintenance - T156006
  • 13:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1051 IP - T156004 (duration: 00m 39s)
  • 13:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: wmf-config/db-codfw.php Change db1051 IP - T156004 (duration: 00m 39s)
  • 13:00 marostegui: Shutdown db1051 for maintenance - T156004
  • 12:56 marostegui: Shutdown mysql on db1051 for maintenance - T156004
  • 12:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T156004 (duration: 00m 39s)
  • 12:51 moritzm: installing pcsc-lite security updates on trusty hosts (jessie already fixed a while ago)
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T156004 (duration: 00m 39s)
  • 12:23 akosiaris: switch all networks to use install1001, install2001 as DHCP relay endpoint. T156109
  • 12:17 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Revert last (duration: 00m 39s)
  • 12:14 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Copy InterwikiSorting settings from wmgWikibaseClientSettings noop (duration: 00m 39s)
  • 12:07 addshore@tin: Synchronized wmf-config/CommonSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster & Populate InterwikiSortingInterwikiSortOrders with WB Client 4/4 noop (duration: 00m 39s)
  • 12:06 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T155995 Prepare to enable InterwikiSorting on beta cluster 3/4 noop (duration: 00m 39s)
  • 12:05 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster 2/4 noop (duration: 00m 39s)
  • 12:05 addshore@tin: Synchronized wmf-config/extension-list-labs: T155995 Prepare to enable InterwikiSorting on beta cluster 1/4 noop (duration: 00m 39s)
  • 09:53 akosiaris: mark /dev/sdb as faulty on md devices on bast3001 T154603
  • 09:36 akosiaris: add /dev/sdb partitions to md RAID device on mw2251
  • 09:33 addshore@tin: Synchronized wmf-config/CommonSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster 4/4 noop (duration: 00m 38s)
  • 09:33 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T155995 Prepare to enable InterwikiSorting on beta cluster 3/4 noop (duration: 00m 40s)
  • 09:32 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155995 Prepare to enable InterwikiSorting on beta cluster 2/4 noop (duration: 00m 41s)
  • 09:30 addshore@tin: Synchronized wmf-config/extension-list-labs: T155995 Prepare to enable InterwikiSorting on beta cluster 1/4 noop (duration: 00m 53s)
  • 09:21 marostegui: Alter table db2054 metawiki.pagelinks - T153300
  • 09:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2054 - T153300 (duration: 00m 39s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1065 original weight - T156005 (duration: 00m 39s)
  • 08:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add rack positions - T155999 (duration: 00m 41s)
  • 08:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: wmf-config/db-eqiad.php Add rack positions - T155999 (duration: 00m 50s)
  • 06:20 _joe_: repooling mw2098 after scap pull
  • 02:23 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 24 02:23:01 UTC 2017 (duration 4m 23s)
  • 02:18 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 06m 40s)
  • 01:37 ejegg: updated SmashPig from 03880ce to ab52dbe
  • 01:18 Krinkle: mwscript deleteEqualMessages.php --wiki gotwiki (T45917)
  • 01:16 ejegg: updated payments-wiki from c22353b to dd8a16d
  • 00:26 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 with low load after reimage (duration: 00m 45s)

2017-01-23

  • 22:38 Pchelolo: update RESTBase to 598fa56f
  • 22:37 Pchelolo: update RESTBase to 598fa56f: canary on restbase1007
  • 22:22 Pchelolo: update RESTBase to d1663345c
  • 22:21 Pchelolo: update RESTBase to d1663345c: canary on restbase1007
  • 22:18 Pchelolo: update RESTBase to d1663345c: staging
  • 21:54 demon@tin: Synchronized wmf-config: interwiki update, dropping some old ExtensionMessages files (duration: 00m 41s)
  • 21:43 mutante: sca2004 was out of memory but also fixed itself and i could run puppet again a few minutes later
  • 21:41 demon@tin: Synchronized w: Removing wiki.phtml, apache does the rewrites (duration: 00m 48s)
  • 21:39 bsitzmann@tin: Finished deploy [mobileapps/deploy@7615bf9]: Update mobileapps to 66ef3c2 (duration: 03m 16s)
  • 21:36 bsitzmann@tin: Starting deploy [mobileapps/deploy@7615bf9]: Update mobileapps to 66ef3c2
  • 20:29 nuria@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 08s)
  • 20:29 nuria@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:28 nuria@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 07s)
  • 20:28 nuria@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:18 otto@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 01s)
  • 20:18 otto@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:11 gehel@tin: Finished deploy [wdqs/wdqs@fd88fda]: (no message) (duration: 01m 56s)
  • 20:09 otto@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 01s)
  • 20:09 otto@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 20:09 gehel@tin: Starting deploy [wdqs/wdqs@fd88fda]: (no message)
  • 20:06 gehel: deploying latest wdqs version (2h behind planned schedule)
  • 20:04 otto@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 02m 16s)
  • 20:02 otto@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 19:26 reedy@tin: Synchronized wmf-config/CommonSettings.php: Make sure CommonSettings-labs is one of the last things loaded so we don't get problems from things being included after (duration: 00m 40s)
  • 18:57 nuria@tin: Finished deploy [analytics/aqs/deploy@025ef23]: (no message) (duration: 00m 35s)
  • 18:56 nuria@tin: Starting deploy [analytics/aqs/deploy@025ef23]: (no message)
  • 18:47 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 00m 01s)
  • 18:47 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:47 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 00m 01s)
  • 18:47 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:45 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 01m 08s)
  • 18:44 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:43 nuria@tin: Finished deploy [analytics/aqs/deploy@56ab863]: (no message) (duration: 00m 01s)
  • 18:43 nuria@tin: Starting deploy [analytics/aqs/deploy@56ab863]: (no message)
  • 18:33 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op, completeness (duration: 00m 40s)
  • 18:32 demon@tin: Synchronized wmf-config/extension-list-labs: no-op, completeness (duration: 00m 40s)
  • 18:30 jynus: reimaging db1065 to jessie
  • 18:18 papaul: shutting down ms-be2010 for maintenance
  • 18:12 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 (duration: 00m 39s)
  • 16:31 papaul: shutting down mw2098 for maintenance
  • 15:38 marostegui: Alter tables: flow_topic_list and flow_tree_node on db1031 (x1 master) - T149819
  • 15:19 elukey: whitelisted dbproxy1011 on cr1/cr2 for analytics-in4 input filter
  • 15:18 moritzm: installing mysql 5.5 security updates (as packaged by jessie/trusty, not the internal mariadb packages)
  • 15:17 moritzm: installing pdns-recursor security update on labservices1002
  • 15:17 Dereckson: Fixed namespaces dupes following NS_PROJECT update on sa.wikisource (T101634)
  • 15:15 gehel: reimage elastic2025 - T154251
  • 15:02 Dereckson: EU SWAT done (handled by addshore)
  • 15:00 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set site name and meta namespace for Sanskrit wikis (T101634) (duration: 00m 40s)
  • 14:34 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155844 [fix] Add finds.org.uk without wildcard too (duration: 00m 39s)
  • 14:24 addshore@tin: Synchronized wmf-config/Wikibase.php: T150183 Move InterwikiSortOrders to own file PT 2/2 (duration: 00m 39s)
  • 14:23 addshore@tin: Synchronized wmf-config/InterwikiSortOrders.php: T150183 Move InterwikiSortOrders to own file PT 1/2 (duration: 00m 40s)
  • 14:23 gehel: disabling puppet on elastic20(2[5-9]|3[0-6]) prior to reimage - T154251
  • 14:20 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155916 Amend category collation for de.wikisource to uca-de-u-kn (duration: 00m 39s)
  • 14:20 godog: depool ms-fe200[1234] T152612
  • 14:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: T155906 Add n, n:es and n:fr as import sources in test2wiki (duration: 00m 39s)
  • 14:14 Dereckson: Fix namespaces dupes on sa.wikisource to prepare T101634 / Gerrit:333640
  • 14:09 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: gerrit:333476 (NOOP) Temporarily set $wgDisableUserGroupExpiry to true on labs (duration: 00m 40s)
  • 14:07 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: gerrit:333294 Add *.finds.org.uk to wgCopyUploadsDomains (duration: 00m 41s)
  • 11:54 elukey: whitelisted dbproxy1010 on cr1/cr2 for analytics-in4 input filter
  • 10:50 moritzm: installing pdns-recursor security updates on trusty systems
  • 10:38 moritzm: installing openjpeg security updates
  • 09:06 marostegui: Compress s2 on dbstore2001 - T151552
  • 07:47 marostegui: Enabling gtid_domain_id on db1047 (eventlogging host) - T149418
  • 07:43 marostegui: Enabling gtid_domain_id on db1046 (eventlogging master) - T149418
  • 07:32 marostegui: Deploy gtid_domain_id db1043 (passive master) - last host pending in m3 - T149418
  • 07:28 marostegui: Compressing cebwiki.templatelinks on db1015 (224G table) - T153739
  • 03:02 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 23 03:02:32 UTC 2017 (duration 4m 44s)
  • 02:57 reedy@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 13m 14s)
  • 02:26 Reedy: running l10nupdate manually
  • 02:23 Reedy: cleaned up reCaptcha extension in l10ncache dirs
  • 02:02 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2017-01-22

  • 23:26 mobrovac: restbase deploying d1663345 - blacklist of a bot log page on enwiki
  • 02:01 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 00:08 krenair@tin: Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/333468 - needed to be handled before next window (duration: 00m 42s)

2017-01-21

  • 21:57 mobrovac@tin: Finished deploy [changeprop/deploy@2b980fa]: (no message) (duration: 00m 54s)
  • 21:56 mobrovac@tin: Starting deploy [changeprop/deploy@2b980fa]: (no message)
  • 20:02 legoktm@tin: Synchronized php-1.29.0-wmf.8/RELEASE-NOTES-1.29: for completeness (duration: 00m 39s)
  • 20:01 legoktm@tin: Synchronized php-1.29.0-wmf.8/resources: Revert "Added reason suggestion in block/delete/protect forms" (1/2) - T34950 (duration: 00m 39s)
  • 20:00 legoktm@tin: Synchronized php-1.29.0-wmf.8/includes: Revert "Added reason suggestion in block/delete/protect forms" (1/2) - T34950 (duration: 01m 31s)
  • 04:03 ema: graphite1003: carbon-cache@c restarted, it's been killed by OOM killer again
  • afk: disabled civicrm dedupe high numbers
  • afk: disabled civicrm dedupe
  • 02:02 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed
  • 01:21 mobrovac@tin: Finished deploy [changeprop/deploy@eb27062]: (no message) (duration: 01m 03s)
  • 01:20 mobrovac@tin: Starting deploy [changeprop/deploy@eb27062]: (no message)
  • 00:36 volans: restarted carbon-cache@c on graphite1003 (was killed by oom-killer)
  • 00:20 mobrovac: restbase deploying 7c753fe6

2017-01-20

  • 23:37 mattflaschen@tin: Synchronized docroot: No-op file rename (duration: 00m 46s)
  • 23:36 mattflaschen@tin: Synchronized dblists: No-op file rename (duration: 00m 54s)
  • 19:55 robh: done fixing ulsfo serial in ulsfo
  • 19:45 robh: messing with ulsfo serial connections
  • 19:29 robh: cp4012 donating its redundant power supply to lvs4002 with redundant supplies
  • 19:12 ejegg: re-enabled fundraising Jenkins jobs
  • 19:01 ejegg: disabled fundraising jenkins jobs
  • 17:22 chasemp: shutdown eth1 on labstore1004 for testing
  • 17:17 andrewbogott: graceful'd apache on silver, in hopes that the wikitech instance api will update
  • 16:18 cmjohnson1: swapping cable eth0 labstore1004 (chasemp)
  • 16:03 jynus: restart and upgrade of db2066
  • 14:42 jynus: restart and upgrade of db2067
  • 13:30 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 - T153300 (duration: 00m 39s)
  • 12:32 ladsgroup@tin: Synchronized php-1.29.0-wmf.8/extensions/ORES/includes/Hooks.php: ORES database query fix (T155500) (duration: 00m 40s)
  • 12:13 Amir1: deploy wmf.8 in mwdebug1002 (T155500)
  • 10:48 godog: reload swift-proxy on ms-fe100* to pick up https://gerrit.wikimedia.org/r/333222
  • 10:39 elukey: manually forcing a /etc/init.d/apache2 reload on mw1259 (videoscaler) to replicate the effects of a logrotate run and test why alarms go off.
  • 10:15 moritzm: installing exim bugfix updates from latest jessie point release
  • 10:02 godog: reload swift-proxy on ms-fe1001 to pick up https://gerrit.wikimedia.org/r/333222
  • 09:19 jynus: rolling restart and upgrade of labsdb1009/10/11 to mariadb 10.1.21-2
  • 08:41 marostegui: Remove partitions on metawiki.pagelinks db2047 - T153300
  • 08:25 _joe_: restarting pybal on lvs1003/1006 to pick up config changes
  • 07:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 - T153300 (duration: 00m 48s)
  • 07:09 marostegui: Compress pagelinks tables on db1015 - T153739
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jan 20 02:36:44 UTC 2017 (duration 5m 34s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 11m 23s)
  • 00:52 mutante: tin - keyholder disarm and arm again using new passphrase
  • 00:49 mutante: mira - arming keyholder after setting service/dumps/eventlogging/phabricator key passphrases to the same one (T154943)
  • 00:46 mutante: setting all deployment key passphrases to the one used for mw deploy - update key files in private repo (T154943)
  • 00:40 mobrovac@tin: Finished deploy [parsoid/deploy@465f9c4]: Restarting Parsoid everywhere for Node v6 switch T149331 (duration: 04m 21s)
  • 00:39 thcipriani@tin: Synchronized php-1.29.0-wmf.8/includes/specials/SpecialContributions.php: SWAT: SpecialContributions: Username input is not really required T155780 (duration: 00m 39s)
  • 00:35 mobrovac@tin: Starting deploy [parsoid/deploy@465f9c4]: Restarting Parsoid everywhere for Node v6 switch T149331
  • 00:34 thcipriani@tin: Synchronized php-1.29.0-wmf.8/resources/lib/oojs-ui: SWAT: resources: Update OOjs UI with fixes on top of v0.18.3 T155728 (duration: 00m 41s)
  • 00:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Wikidata description taglines shown on English Wikipedia T152743 (duration: 00m 39s)
  • 00:22 volans: apt-upgrading nodejs to v6 on the rest of parsoid hosts (a deploy with restart will follow) T149331

2017-01-19

  • 23:34 mutante: force puppet run on restbase1*
  • 23:32 volans: upgrading node to v6 on wtp1003 T149331
  • 23:31 mutante: force puppet run on restbase2*
  • 23:31 mutante: icinga - replace check command names in puppet_services.cfg for change 333010
  • 23:26 mobrovac@tin: Finished deploy [eventstreams/deploy@0d1d9c6]: Bump preq to 0.5.2 for Node v6 (duration: 02m 11s)
  • 23:24 mobrovac@tin: Starting deploy [eventstreams/deploy@0d1d9c6]: Bump preq to 0.5.2 for Node v6
  • 23:16 mobrovac@tin: Finished deploy [cxserver/deploy@5ae4f8b]: Bump preq to 0.5.2 for Node v6 (duration: 01m 56s)
  • 23:14 mobrovac@tin: Starting deploy [cxserver/deploy@5ae4f8b]: Bump preq to 0.5.2 for Node v6
  • 23:11 mobrovac@tin: Finished deploy [graphoid/deploy@da37386]: Bump preq to 0.5.2 for Node v6 (duration: 02m 21s)
  • 23:10 ejegg: updated SmashPig from f05c9a3 to 03880ce
  • 23:08 mobrovac@tin: Starting deploy [graphoid/deploy@da37386]: Bump preq to 0.5.2 for Node v6
  • 22:59 ppchelko@tin: Finished deploy [eventstreams/deploy@fe77f19]: Deploy for switching to node 6 T149331 (duration: 01m 30s)
  • 22:58 mobrovac@tin: Finished deploy [trending-edits/deploy@0abcf25]: Switching to node 6 T149331 (duration: 01m 59s)
  • 22:58 ppchelko@tin: Starting deploy [eventstreams/deploy@fe77f19]: Deploy for switching to node 6 T149331
  • 22:58 ppchelko@tin: Finished deploy [cxserver/deploy@ff0225e]: Deploy for switching to node 6 T149331 (duration: 02m 12s)
  • 22:56 mobrovac@tin: Finished deploy [electron-render/deploy@f1df2d3]: Switching to node 6 T149331 (duration: 01m 58s)
  • 22:56 mobrovac@tin: Starting deploy [trending-edits/deploy@0abcf25]: Switching to node 6 T149331
  • 22:56 mobrovac@tin: Finished deploy [mobileapps/deploy@cacb3c9]: Switching to node 6 T149331 (duration: 02m 41s)
  • 22:55 ppchelko@tin: Starting deploy [cxserver/deploy@ff0225e]: Deploy for switching to node 6 T149331
  • 22:55 ppchelko@tin: Finished deploy [citoid/deploy@95df861]: Deploy for switching to node 6 T149331 (duration: 02m 18s)
  • 22:54 mobrovac@tin: Starting deploy [electron-render/deploy@f1df2d3]: Switching to node 6 T149331
  • 22:54 mobrovac@tin: Finished deploy [mathoid/deploy@ba3217e]: (no message) (duration: 02m 02s)
  • 22:53 mobrovac@tin: Starting deploy [mobileapps/deploy@cacb3c9]: Switching to node 6 T149331
  • 22:53 mobrovac@tin: Finished deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331 (duration: 01m 39s)
  • 22:53 ppchelko@tin: Starting deploy [citoid/deploy@95df861]: Deploy for switching to node 6 T149331
  • 22:52 ppchelko@tin: Finished deploy [changeprop/deploy@ffd0b8b]: Deploy for switching to node 6 T149331 (duration: 00m 58s)
  • 22:52 mobrovac@tin: Starting deploy [mathoid/deploy@ba3217e]: (no message)
  • 22:51 mobrovac@tin: Starting deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331
  • 22:51 ppchelko@tin: Starting deploy [changeprop/deploy@ffd0b8b]: Deploy for switching to node 6 T149331
  • 22:51 mutante: scb1003,scb1004 - upgrade nodejs
  • 22:51 ppchelko@tin: Finished deploy [changeprop/deploy@ffd0b8b]: Canary deploy for switching to node 6 T149331 (duration: 07m 36s)
  • 22:48 mobrovac@tin: Finished deploy [mathoid/deploy@ba3217e]: (no message) (duration: 03m 24s)
  • 22:47 mobrovac@tin: Finished deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331 (duration: 01m 43s)
  • 22:45 mobrovac@tin: Starting deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331
  • 22:45 mobrovac@tin: Finished deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331 (duration: 01m 10s)
  • 22:45 mobrovac@tin: Starting deploy [mathoid/deploy@ba3217e]: (no message)
  • 22:44 mobrovac@tin: Starting deploy [graphoid/deploy@f872f94]: Switching to node 6 T149331
  • 22:44 ppchelko@tin: Starting deploy [changeprop/deploy@ffd0b8b]: Canary deploy for switching to node 6 T149331
  • 22:41 mutante: scb1001-1004 - upgraded nodejs version
  • 22:38 mutante: scb2003 - repool, scb2001,scb2002 - upgrade nodejs, libuv1 packages
  • 22:35 mutante: scb2003 - depool, upgrade nodejs, libuv1 packages
  • 22:33 mutante: scb2004 - re-pooled
  • 21:35 chasemp: rebooting labstore1004
  • 21:33 chasemp: failover secondary labstore cluster from 1004 to 1004
  • 21:17 chasemp: force non tools on NFS to go ro
  • 21:03 volans: upgrading node to v6 on wtp1002 T149331
  • 20:46 mutante: scb2004 - upgrading nodejs, libuv1
  • 20:45 volans: upgrading node to v6 on wtp2003 T149331
  • 20:44 mutante: depooling scb2004 for nodejs install
  • 20:26 volans: upgrading node to v6 on wtp2002 T149331
  • 20:09 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group2 to wmf.8
  • 20:05 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rule for BAMU event (T154312) (duration: 00m 39s)
  • 19:44 ejegg: updated SmashPig from 48675c3 to f05c9a3
  • 19:40 mutante: switching dumps.wikimedia.org to Letsencrypt SSL cert
  • 19:20 jynus: restarting db1069:3311 due to query being "stuck" on tokudb table
  • 19:13 demon@tin: Synchronized wmf-config/CommonSettings.php: ContentTranslation: Enable publishing article in testwiki (2/2) (duration: 00m 39s)
  • 19:12 demon@tin: Synchronized wmf-config/InitialiseSettings.php: ContentTranslation: Enable publishing article in testwiki (1/2) (duration: 00m 39s)
  • 19:06 demon@tin: Synchronized wmf-config/CommonSettings.php: Double $wgTranscodeBackgroundTimeLimit to compensate for threading (duration: 00m 47s)
  • 18:54 ema: libvmod-header removed from carbon, varnish-modules provides it
  • 18:51 mutante: dataset1001 - temp disabling puppet, ms1001 - switching to Letsencrypt cert
  • 18:34 jynus: aborting rolling restart on labsdb1010, labsdb1011 due to package bug to be fixed on 10.1.21-2
  • 18:19 mobrovac: restbase updating firejail in production
  • 17:52 jynus: rolling restart and upgrade of labsdb1009/10/11 to mariadb 10.1.21
  • 17:47 Dereckson: Reattach Zlazstadpieroniebomiurwieszkabelodinternetu CentralAuth account (T155184)
  • 17:26 jynus: restarting and upgrading mariadb on labsdb1004 to 10.0.29
  • 14:40 Dereckson: EU SWAT done
  • 14:40 Dereckson: `mwscript namespaceDupes.php sawiki --fix` (T101634)
  • 14:36 ottomata: restarting apache/puppetmaster on labcontrol1001 to try to fix 'invalid byte sequence in US-ASCII' puppet error
  • 14:34 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Fix Portal talk namespace name on Sanskrit Wikipedia (T101634) (duration: 00m 39s)
  • 14:25 dereckson@tin: Synchronized wmf-config: Add noratelimit user right to translation admins on Commons (T155162) (duration: 00m 42s)
  • 13:21 marostegui: Compressing revision, pagelinks and templatelinks tables on db1035 - T110504
  • 11:41 marostegui: Compressing dewiki db1045 - T155399
  • 10:14 marostegui: Compressing templatelinks tables on db1015 - T153739
  • 09:32 moritzm: upgrading firejail on image scalers
  • 08:57 godog: bounce udp2log on fluorine after https://gerrit.wikimedia.org/r/313604
  • 07:45 volans@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2098.codfw.wmnet
  • 07:42 volans@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2098.wmnet
  • 07:28 dereckson@tin: Synchronized wmf-config/throttle.php: Fix throttle rule for KCES IMR edit-a-thon (duration: 02m 42s)
  • 07:18 marostegui: Compressing enwikivoyage.text and shwiki.logging tables on db1044 - T153826
  • 07:14 marostegui: Compressing enwikivoyage.text and shwiki.logging tables on db1038 - T154465
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 28s)
  • 01:50 mutante: install - if in private1-c-eqiad, private1-b-codfw, you are using install1001 for both DHCP and TFTP, if in other networks you still use carbon as DHCP but then also install1001 as TFTP
  • 01:47 mutante: install1001/2001 - re-enabled, carbon is still DHCP for some rows
  • 01:23 mutante: install1001 - re-enable puppet - install2001 - same thing, temp disable and live-hack mw2251 to use trusty installer
  • 01:16 mutante: switching mw2251 to trusty-installer for test
  • 01:14 mutante: temp disable puppet on install1001 for papaul debugging
  • 00:24 maxsem@tin: Synchronized php-1.29.0-wmf.8/extensions/Graph: SWAT https://gerrit.wikimedia.org/r/#/c/332916/1 (duration: 00m 40s)

2017-01-18

  • 23:38 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/Collection: Unbreak (duration: 00m 40s)
  • 22:59 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/ProofreadPage/includes/index/ProofreadIndexPage.php: Unbreak, T155682 (duration: 00m 39s)
  • 22:49 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/LiquidThreads/classes/Hooks.php: Unbreak hook mess (duration: 00m 41s)
  • 22:48 demon@tin: Synchronized php-1.29.0-wmf.8/includes/widget/search/FullSearchResultWidget.php: Unbreak hook mess (duration: 00m 45s)
  • 22:46 madhuvishy: Reenabled nfs-exportd and puppet on labstore1004. All of misc being exported as rw now. T154336
  • 21:32 bd808: Updated wikimania-scholarships to 29ba0ec "Add Tulu (tcy) to Communities" (T155666)
  • 21:18 volans: restarted pybal on lvs2001 (active) T134893
  • 21:00 volans: restarted pybal on lvs2004 (passive) T134893
  • 20:48 demon@tin: Finished scap: group1 to wmf.8 (duration: 46m 33s)
  • 20:47 volans: Upgraded nodejs to v6 on wtp1001 T149331
  • 20:31 nuria@tin: Finished deploy [analytics/refinery@666d98d]: (no message) (duration: 02m 19s)
  • 20:28 nuria@tin: Starting deploy [analytics/refinery@666d98d]: (no message)
  • 20:01 demon@tin: Started scap: group1 to wmf.8
  • 19:56 volans: restarted pybal on lvs2003
  • 19:51 volans: restarted pybal on lvs2006
  • 19:26 demon@tin: Synchronized dblists: Remove old compact lang list dblist (duration: 00m 39s)
  • 19:25 volans: Upgrading nodejs to v6 on wtp2001 T149331
  • 19:25 demon@tin: Synchronized docroot/noc/conf: Using new compact lang list dblist (duration: 00m 39s)
  • 19:24 demon@tin: Synchronized tests/cirrusTest.php: Use new compact lang list dblist (duration: 00m 39s)
  • 19:22 papaul: OS installation on mw2251-mw2260
  • 19:22 demon@tin: Synchronized wmf-config: Use new compact lang links dblist (duration: 00m 41s)
  • 19:21 demon@tin: Synchronized dblists/compact-language-links.dblist: New dblist (duration: 00m 39s)
  • 19:12 volans: Upgrading nodejs to v6 on ruthenium T149331
  • 19:06 dereckson@tin: Synchronized wmf-config/throttle.php: Add throttle rule for KCES IMR edit-a-thon (T154312) (duration: 00m 39s)
  • 18:35 demon@tin: Synchronized multiversion/MWMultiVersion.php: Swapping 500 -> 400 when specifying invalid host headers (duration: 00m 39s)
  • 18:28 demon@tin: Synchronized multiversion/MWMultiVersion.php: minor cleanup (duration: 00m 48s)
  • 18:03 madhuvishy: Rolling out https://gerrit.wikimedia.org/r/#/c/332735/ across labs instances T154336
  • 17:54 madhuvishy: Disabled (systemctl disable) nfs-export on labstore1001 and 1004 to prevent auto restart from bringing them back up T154336
  • 17:42 mobrovac: restbase deploying 3027682
  • 17:40 madhuvishy: Starting final sync of latest diff from labstore1001 to labstore-secondary T154336
  • 17:38 madhuvishy: Disabling puppet on labstore1001 and 1004 to make sure nfs exports are not overridden T154336
  • 17:37 madhuvishy: Exporting all misc shares from labstore1004 as RO T154336
  • 17:30 madhuvishy: Stopping nfs-exportd on labstore1004 T154336
  • 17:29 madhuvishy: Exported all misc exports as RO on labstore1001 T154336
  • 17:26 madhuvishy: Stopping nfs-exportd on labstore1001 T154336
  • 17:13 madhuvishy: Disabling puppet across labs instances with NFS (/home and/or /data/project) mounted for T154336
  • 17:12 madhuvishy: Silenced shinken, and icinga on labstore1001 for misc nfs migration T154336
  • 15:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw122[6-9].*
  • 15:45 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw123[0-5].*
  • 15:41 oblivian@puppetmaster1001: conftool action : set/weight=15; selector: service=apache2,cluster=api_appserver,dc=eqiad,name=mw1(1[8-9]|2[0-1]|22[0-5]).*
  • 15:24 zeljkof: finished EU SWAT
  • 15:23 zfilipin@tin: Synchronized php-1.29.0-wmf.7/maintenance/importImages.php: SWAT: maintenance/importImages: Dont sleep after the last upload (duration: 00m 41s)
  • 15:07 hashar: tin.eqiad.wmnet : committed an uncommitted live hack for php-1.29.0-wmf.7/includes/AutoLoader.php by ostriches
  • 15:06 zeljkof: extending EU SWAT until 332766 is deployed
  • 14:39 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Increase $wgHTTPImportTimeout to 50 seconds (T155209) (duration: 00m 39s)
  • 14:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgDisableUserGroupExpiry to true on production, false on labs (T155605) (duration: 00m 40s)
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Set wgDisableUserGroupExpiry to true on production, false on labs (T155605) (duration: 00m 40s)
  • 14:03 moritzm: upgrading firejail on aqs cluster
  • 13:12 moritzm: uploaded firejail 0.9.44.6 for jessie-wikimedia to carbon
  • 12:21 marostegui: Enable gtid_domain_id on m3 - T149418
  • 12:06 moritzm: installing libio-socket-ssl-perl bugfix updates from jessie point release
  • 11:42 moritzm: installing sed bugfix updates from jessie point release
  • 11:17 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2064 - T154097 (duration: 00m 39s)
  • 11:09 moritzm: restarting mediawiki canary servers to pick up cairo and libpng updates
  • 10:38 moritzm: installing libxml security updates
  • 10:38 oblivian@puppetmaster1001: conftool action : set/pooled=no; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw123[0-5].*
  • 10:11 godog: pool ms-fe200[789] T152612
  • 10:10 marostegui: Restart mysql dbstore2001 to enable gtid_domain_id manually before deploying it on m3 - T149418
  • 10:05 marostegui: Remove partitions from enwiktionary.templatelinks on db2064 - T154097
  • 10:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2064 - T154097 (duration: 00m 45s)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2063 - T154097 (duration: 00m 48s)
  • 09:40 marostegui: Restart mysql dbstore2002 to enable gtid_domain_id manually before deploying it on m3 - T149418
  • 08:51 marostegui: Remove partitions from enwiktionary.templatelinks on db2063 - T154097
  • 08:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2063 - T154097 (duration: 00m 39s)
  • 08:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2060 - T154031 (duration: 00m 40s)
  • 08:18 marostegui: Compressing templatelinks tables on db1035 - T154465
  • 08:16 marostegui: Compressing templatelinks tables on db1038 - T154465
  • 08:00 _joe_: restarting pybal on lvs1003
  • 07:56 _joe_: restarting pybal on lvs1003
  • 07:34 oblivian@puppetmaster1001: conftool action : set/pooled=no; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw122[6-9].*
  • 07:32 _joe_: depooling mw1226-mw1235 from the https pool in eqiad, T152074
  • 07:30 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=nginx,cluster=api_appserver,dc=eqiad,name=mw12[7-9].*
  • 07:25 marostegui: Restart MySQL dbstore2001 to apply InnoDB defaults
  • 03:05 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 18 03:05:47 UTC 2017 (duration 5m 38s)
  • 03:00 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.8) (duration: 13m 22s)
  • 02:29 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 08m 23s)
  • 02:00 ejegg: updated SmashPig from 03ef6b1 to 48675c3
  • 01:21 mobrovac@tin: Finished deploy [citoid/deploy@9f93a00]: (no message) (duration: 04m 14s)
  • 01:17 mobrovac@tin: Starting deploy [citoid/deploy@9f93a00]: (no message)
  • 00:59 mobrovac@tin: Finished deploy [trending-edits/deploy@1d53b7c]: fixes for T153122 and T145571 (duration: 05m 06s)
  • 00:54 mobrovac@tin: Starting deploy [trending-edits/deploy@1d53b7c]: fixes for T153122 and T145571
  • 00:35 demon@tin: Synchronized w/mobilelanding.php: Last major fix for multiversion (duration: 00m 45s)
  • 00:29 ejegg: updated SmashPig from 3da597f to 03ef6b1
  • 00:00 demon@tin: Synchronized multiversion/MWVersion.php: Swap to using MWMultiVersion and make this a fallback (duration: 00m 39s)

2017-01-17

  • 23:47 demon@tin: Synchronized w: Step 4/∞ of multiversion cleanups (duration: 00m 39s)
  • 23:42 demon@tin: Synchronized multiversion/MWScript.php: Step 3/∞ of multiversion cleanups (duration: 00m 39s)
  • 23:21 demon@tin: Synchronized rpc/RunJobs.php: Step 2/∞ of multiversion cleanups (duration: 00m 39s)
  • 22:30 demon@tin: Synchronized multiversion: Step 1/∞ of multiversion cleanups (duration: 00m 55s)
  • 21:48 demon@tin: Synchronized multiversion/MWVersion.php: Removing old getMediaWikiCli() entry point, unused (duration: 00m 39s)
  • 20:41 demon@tin: Synchronized php-1.29.0-wmf.8/extensions/LiquidThreads/classes/Hooks.php: Fix warning about pass-by-ref (duration: 00m 40s)
  • 20:09 ema: restarting hhvm on mw1227 - hhvm-dump-debug in /tmp/hhvm.25127.bt
  • 20:08 demon@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to wmf.8
  • 19:52 demon@tin: Finished scap: testwiki to wmf.8 + rebuild l10n (duration: 46m 21s)
  • 19:36 ejegg: updated payments-wiki from 1f9ea80 to c22353b
  • 19:06 demon@tin: Started scap: testwiki to wmf.8 + rebuild l10n
  • 19:05 demon@tin: Synchronized wmf-config/throttle.php: throttle rule for T155493 (duration: 00m 40s)
  • 19:04 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Enable subpages in NS_MAIN in eswikiversity (duration: 00m 39s)
  • 19:02 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Add *.leventhalmap.org to the copyupload whitelist (duration: 00m 39s)
  • 19:01 demon@tin: Synchronized wmf-config/InitialiseSettings.php: avwiki namespace tweaks, T155321 (duration: 00m 39s)
  • 19:00 demon@tin: Synchronized wmf-config/throttle.php: T155510 throttle rule (duration: 01m 36s)
  • 18:03 mobrovac: restbase deploying a0e542b, switching to Node v6 T149331
  • 17:48 mobrovac: restbase installing node v6.9.1 on the cluster T149331
  • 16:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2056 - T154097 (duration: 00m 48s)
  • 16:22 marostegui: Powering off db2060 for maintenance - T154031
  • 16:16 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2005.codfw.wmnet
  • 15:01 moritzm: installing bash security updates
  • 14:10 moritzm: installing bind9 security updates
  • 13:33 moritzm: installing libpng security updates
  • 12:29 marostegui: Remove partitions from enwiktionary.templatelinks on db2056 - T154097
  • 12:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2056 - T154097 (duration: 00m 42s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T154097 (duration: 00m 47s)
  • 11:54 moritzm: installing potrace security updates
  • 11:32 moritzm: installing w3m security updates
  • 11:22 moritzm: installing tre security updates
  • 11:19 moritzm: installing python-werkzeug security updates
  • 11:15 marostegui: Remove partitions from enwiktionary.templatelinks on db2049 - T154097
  • 11:13 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2049 - T154097 (duration: 00m 38s)
  • 10:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2041 - T154097 (duration: 00m 38s)
  • 10:43 moritzm: installing file/libmagic security updates
  • 10:35 hashar: CI switched NodeJS from v4 to v6 T155443 T149331
  • 10:22 moritzm: installing jq security updates
  • 10:16 hashar: Updating CI Jessie image for NodeJs 4 -> 6 upgrade. T155443
  • 10:06 moritzm: installing libwmf security updates
  • 10:02 marostegui: Remove partitions from enwiktionary.templatelinks on db2041 - T154097
  • 09:59 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2041 - T154097 (duration: 00m 38s)
  • 09:22 moritzm: installing tomcat security updates
  • 08:42 marostegui: Compressing wikidatawiki on db1026 - https://phabricator.wikimedia.org/T154929
  • 08:33 moritzm: installing tiff security updates
  • 07:50 marostegui: Remove partitions from enwiktionary.templatelinks on dbstore2001 - T154097
  • 07:26 marostegui: Compressing revision tables db1035 (depooled)
  • 06:58 marostegui: Compressing cebwiki/templatelinks (215G) table on db1038 - T154465
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 17 02:26:07 UTC 2017 (duration 4m 22s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 20s)

2017-01-16

  • 18:42 moritzm: uploaded nodejs 6.9.1 for jessie-wikimedia to carbon
  • 15:01 elukey: restarting hhvm on mw1167 - hhvm-dump-debug in /tmp/hhvm.20360.bt
  • 14:47 hashar: European SWAT complete
  • 14:44 hashar@tin: Synchronized wmf-config/throttle.php: Add a new throttle rule - T155416 (duration: 00m 38s)
  • 14:26 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Namespace aliases on Bhojpuri Wikipedia (bhwiki) - T155278 (duration: 00m 41s)
  • 14:20 hashar@tin: Synchronized wmf-config/throttle.php: Add one throttle rule + remove obsolete ones T155345 (duration: 00m 38s)
  • 14:18 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: wgBabelMainCategory for cswikiversity to Uživatel %code% T155301 (duration: 00m 39s)
  • 13:12 moritzm: installing pysaml2 security updates
  • 12:26 moritzm: installing pdns-recursor security updates
  • 10:35 marostegui: Compressing templatelinks tables on db1044 (depooled) - T153826
  • 10:30 marostegui: Compressing pagelinks tables on db1038 - T154465
  • 09:13 marostegui: Compressing dewiki on db1026 - T154929
  • 08:56 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2034 - T149553 (duration: 00m 38s)
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 16 02:25:46 UTC 2017 (duration 4m 21s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 08m 04s)

2017-01-15

  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jan 15 02:25:27 UTC 2017 (duration 4m 23s)
  • 02:21 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 46s)

2017-01-14

  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jan 14 02:35:07 UTC 2017 (duration 4m 25s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 12m 03s)

2017-01-13

  • 22:56 godog: delete labs instance data older than 60d from graphite[21]001, low disk space
  • 02:25 l10nupdate@tin: ResourceLoader cache refresh completed at Fri Jan 13 02:25:57 UTC 2017 (duration 5m 16s)
  • 02:20 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 11s)
  • 01:32 bsitzmann@tin: Finished deploy [trending-edits/deploy@cf388a9]: Update trending-edits to 421fa63 (duration: 01m 53s)
  • 01:30 bsitzmann@tin: Starting deploy [trending-edits/deploy@cf388a9]: Update trending-edits to 421fa63
  • 00:12 demon@tin: Synchronized multiversion: Clean up cli entry point (duration: 00m 54s)

2017-01-12

  • 23:05 demon@tin: Synchronized README: no-op for force co-master sync (duration: 00m 40s)
  • 22:49 demon@tin: Synchronized docroot/foundation: Yay no more powerpoints (duration: 00m 38s)
  • 22:38 demon@tin: Synchronized docroot/foundation/presentations: removing some of these powerpoints (duration: 00m 38s)
  • 22:08 maxsem@tin: Synchronized php-1.29.0-wmf.7/extensions/Graph/includes/ApiGraph.php: Debug for T155057 (duration: 00m 38s)
  • 18:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Use HD logos for (nap|os|pl|pt)wiki (duration: 00m 41s)
  • 18:12 demon@tin: Synchronized static/images/project-logos: HD logos for (nap|os|pl|pt)wiki (duration: 00m 39s)
  • 09:16 akosiaris: T155112 upload Vagrant 1.9.1 to apt.wikimedia.org/jessie-wikimedia/thirdparty and apt.wikimedia.org/trusty-wikimedia/thirdparty
  • 08:59 hashar: disabling puppet on contint1001 to live hack apache conf ( T150727 )
  • 02:46 demon@tin: Synchronized wmf-config/interwiki.php: T154225 (duration: 00m 38s)
  • 02:37 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: no-op, completeness (duration: 00m 38s)
  • 02:36 l10nupdate@tin: ResourceLoader cache refresh completed at Thu Jan 12 02:36:23 UTC 2017 (duration 5m 15s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 11m 12s)
  • 01:33 Reedy: running scap pull on tin
  • 00:29 demon@tin: Synchronized wmf-config/InitialiseSettings.php: oathauth group for wikitech (duration: 00m 38s)
  • 00:21 demon@tin: Synchronized docroot/noc/db.php: (no message) (duration: 00m 39s)
  • 00:17 reedy@tin: Synchronized wmf-config: More consistency for various commits (duration: 00m 40s)
  • 00:12 nuria: restarted apache2 and mysql on bohrium to see if mysql no connection errors disappear
  • 00:12 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Wikidata lang config (duration: 00m 38s)

2017-01-11

  • 23:55 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Minerva hacks (duration: 00m 38s)
  • 23:47 reedy@tin: Synchronized wmf-config: consistency (duration: 00m 41s)
  • 23:16 demon@tin: Synchronized wmf-config/CommonSettings.php: video transcode jobqueue stuff for Brion (duration: 00m 38s)
  • 23:10 demon@tin: Synchronized w/static.php: For Timo <3 (duration: 00m 40s)
  • 22:48 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Use new metawiki logo, no-op in prod (duration: 00m 38s)
  • 22:47 demon@tin: Synchronized static/images/project-logos: beta logos (duration: 00m 40s)
  • 22:44 demon@tin: Synchronized wmf-config/CommonSettings.php: commentfix (duration: 00m 38s)
  • 22:39 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Simplify ores config. Enable sitenotice banners for arwiki (duration: 00m 39s)
  • 22:32 demon@tin: Synchronized wmf-config/abusefilter.php: comment fix (duration: 00m 39s)
  • 22:26 elukey: added mw1239.eqiad.wmnet back to service - T148421
  • 22:20 elukey: restarting hhvm on mw1198 (dump-debug in /tmp/hhvm.9737.bt)
  • 22:13 demon@tin: Synchronized wmf-config/abusefilter.php: Set $wgAbuseFilterNotificationsPrivate = true; for Meta-Wiki (duration: 00m 40s)
  • 22:04 demon@tin: Synchronized wmf-config/flaggedrevs.php: Deprecated variable cleanup (duration: 00m 38s)
  • 21:54 demon@tin: Synchronized scap/plugins/wmf-beta-autoupdate.py: no-op, not yet used (duration: 00m 38s)
  • 21:45 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: no-op (duration: 00m 38s)
  • 21:28 demon@tin: Synchronized wmf-config: massmessage hack cleanup + comments on kartographer wikivoyage mode (duration: 00m 41s)
  • 21:17 demon@tin: Synchronized wmf-config/InitialiseSettings.php: use new HD logos (duration: 00m 38s)
  • 21:16 demon@tin: Synchronized static/images/project-logos/notifications: New HD logos (duration: 00m 38s)
  • 21:10 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: pawiki logos. remove technician from trwikiquote (duration: 00m 38s)
  • 21:09 reedy@tin: Synchronized static/images: pawiki (duration: 00m 42s)
  • 21:03 Reedy: update collation of fiwikivoyage T151570
  • 21:02 reedy@tin: Synchronized php-1.29.0-wmf.7/extensions/CentralAuth/extension.json: fix name (duration: 00m 41s)
  • 21:01 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Translation namespace for mlwikisource. fiwikivoyage collation (duration: 00m 40s)
  • 20:53 reedy@tin: Synchronized wmf-config/CommonSettings.php: Upgrade Collections license URL to HTTPS (duration: 00m 57s)
  • 20:52 reedy@tin: Synchronized wmf-config/throttle.php: Fix throttle (duration: 00m 42s)
  • 20:44 Dereckson: Reset user e-mail for account for Panam2014
  • 20:43 demon@tin: Synchronized tests/noc-conf/NOCDblistTest.php: No-op (duration: 00m 40s)
  • 20:40 demon@tin: Synchronized wmf-config/InitialiseSettings-labs.php: no-op (duration: 00m 40s)
  • 19:36 demon@tin: Synchronized multiversion: rollback (duration: 00m 56s)
  • 19:34 demon@tin: Synchronized multiversion: MWVersion fallbacks & such (duration: 00m 56s)
  • 19:27 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/FlaggedRevs: Stupid errors (duration: 00m 46s)
  • 18:56 demon@tin: Synchronized multiversion/MWMultiVersion.php: Attempt #2 for Multiversion cleanup (duration: 00m 41s)
  • 18:08 ebernhardson: restart elasticsearch on relforge100[12] for new test version of ltr plugin
  • 14:12 hashar: European SWAT completed
  • 14:11 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Import source on bd.wikimedia.org T154990 + Turn off patrolling on ruwiki T154285 (duration: 00m 42s)
  • 14:10 hashar@tin: Synchronized composer.json: build: Update PHPUnit from 3.7 to 4.8, add phplint to composer-test - T85947 (duration: 00m 45s)
  • 14:09 hashar@tin: Synchronized composer.lock: build: Update PHPUnit from 3.7 to 4.8, add phplint to composer-test - T85947 (duration: 00m 55s)
  • 14:06 hashar: scap pull on terbium
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 11 02:35:46 UTC 2017 (duration 4m 31s)
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 10m 47s)
  • 00:39 _joe_: restart hhvm on mw1182, stuck on HPHP::Treadmill::getAgeOldestRequest

2017-01-10

  • 23:09 hoo: Ran DELETE FROM wbc_entity_usage WHERE eu_row_id IN(1714177, 1714178, 1714179, 1714180, 1714181, 1714182, 1714183, 1714184, 3914375); on s5 master (T147630)
  • 22:53 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/VisualEditor/ApiVisualEditor.php: T154962 logspam (duration: 00m 41s)
  • 22:46 mutante: gerrit restarting for config change 331553
  • 22:13 reedy@tin: Synchronized php-1.29.0-wmf.7/includes/registration/ExtensionRegistry.php: (no message) (duration: 00m 43s)
  • 22:12 reedy@tin: Synchronized php-1.29.0-wmf.7/extensions/Wikidata: (no message) (duration: 02m 21s)
  • 22:06 demon@tin: Synchronized php-1.29.0-wmf.7/includes/libs/objectcache/WANObjectCache.php: Silence obnoxious replag errors (duration: 00m 42s)
  • 21:33 eileen2: civicrm updated from b26844d to af8d735
  • 21:22 mutante: cp3048 - labservices1001 - ran puppet, in this case it wasn't about gerrit, but recovered too
  • 21:15 mutante: sca2004, labsdb1003 - ran puppet (they wanted to git clone during gerrit restart)
  • 21:04 mutante: gerrit restarting for config change 49993 (T40114)
  • 20:16 dereckson@tin: Synchronized php-1.29.0-wmf.7/extensions/SemanticMediaWiki: Remove deprecated function usages (T147924) (duration: 00m 49s)
  • 20:14 dereckson@tin: Synchronized php-1.29.0-wmf.7/extensions/UploadWizard/resources/transports/mw.FormDataTransport.js: mw.FormDataTransport: Don't remove Unicode characters from temp filename (T155039) (duration: 00m 41s)
  • 19:49 demon@tin: Synchronized scap/plugins/prep.py: another no-op (duration: 00m 41s)
  • 19:39 demon@tin: Synchronized scap/plugins/prep.py: prod no-op, for completeness (duration: 00m 40s)
  • 18:41 legoktm: re-attached User:Fuu5tgsrygr / T154983
  • 04:30 dereckson@tin: Synchronized wmf-config/throttle.php: Fix Dayanand College Solapur event throttle rule (T154312) (duration: 00m 44s)
  • 02:26 l10nupdate@tin: ResourceLoader cache refresh completed at Tue Jan 10 02:26:47 UTC 2017 (duration 4m 22s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 46s)
  • 01:34 reedy@tin: Synchronized rpc/RunJobs.php: revert (duration: 00m 40s)
  • 01:32 reedy@tin: Synchronized w: revert 0a2a096 (duration: 00m 40s)
  • 01:31 reedy@tin: Synchronized multiversion: revert 0a2a096 (duration: 00m 56s)
  • 01:01 Dereckson: Updated articles count on pl.wikisource: 491 100 (T154711)
  • 00:59 Dereckson: Fixed links with namespaceDupes on pl.wikisource
  • 00:58 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Add Collection namespace to the Polish Wikisource (T154711) (duration: 00m 41s)

2017-01-09

  • 23:38 reedy@tin: Synchronized wmf-config/CommonSettings.php: wfLoadExtension (duration: 00m 40s)
  • 23:34 reedy@tin: Synchronized wmf-config/extension-list: More to extension.json (duration: 00m 40s)
  • 23:29 demon@tin: Synchronized multiversion: Final batch of MWVersion cleanup (in song form) (duration: 00m 56s)
  • 23:28 demon@tin: Synchronized rpc/RunJobs.php: More cleanup songs (duration: 00m 40s)
  • 23:28 mutante: ganglia web - replacing SSL cert with Letsencrypt
  • 23:26 demon@tin: Synchronized w: Cleanup cleanup everybody do your share (duration: 00m 40s)
  • 23:25 demon@tin: Synchronized multiversion/MWMultiVersion.php: Cleanup cleanup everybody everywhere (duration: 00m 40s)
  • 23:09 reedy@tin: Synchronized wmf-config/CommonSettings-labs.php: T154927 (duration: 00m 41s)
  • 23:08 reedy@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T154927 (duration: 00m 42s)
  • 23:07 reedy@tin: Synchronized wmf-config/extension-list-labs: T154927 (duration: 00m 41s)
  • 22:53 robh: updating lists.w.o to use LE cert
  • 21:50 akosiaris: service restart zotero on sca1003, sca1004. Zotero OOMed again as usual
  • 19:49 robh: updating librenms.wikimedia.org cert, netmon1001 only system affected
  • 18:16 paravoid: rebooting and powercycling mira, CPU frequency throttled, suspecting firmware bug
  • 17:51 hoo: Updated the Wikidata property suggester with data from last Monday's JSON dump and applied the T132839 workarounds
  • 15:35 zeljkof: finished EU SWAT!
  • 15:33 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add en.wikinews and es.wikinews as import source in testwiki (T154879) (duration: 02m 38s)
  • 15:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable import from cswiki to arbcom_cswiki (T154799) (duration: 02m 38s)
  • 15:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add DW alias for NS_PROJECT_TALK in frwiki (T153952) (duration: 02m 36s)
  • 15:04 zeljkof: extending EU SWAT, three more patches left to deploy
  • 15:00 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: [throttle] Lift for 2017-01-10/12 + minor cleanup (T154312) (duration: 02m 36s)
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Extension:Babel's category on cswikiversity (T67211) (duration: 02m 36s)
  • 14:39 zfilipin@tin: Synchronized static/images/project-logos: SWAT: Add HD logos for multiple projects (T150618) (duration: 02m 36s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add digitalmedia.fws.gov to the whitelist (T154671) (duration: 02m 38s)
  • 10:38 akosiaris: restart nginx and rcstream on rcs1001.eqiad.wmnet to debug issue with prematurely closed connections and 502 returned to clients. No change witnessed.
  • 07:12 ebernhardson: restart elasticsearch on relforge100[12] to adjust ltr logging settings
  • 02:45 l10nupdate@tin: ResourceLoader cache refresh completed at Mon Jan 9 02:45:57 UTC 2017 (duration 4m 36s)
  • 02:41 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 27m 03s)

2017-01-08

  • 02:27 l10nupdate@tin: ResourceLoader cache refresh completed at Sun Jan 8 02:27:29 UTC 2017 (duration 4m 40s)
  • 02:22 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 07m 33s)

2017-01-07

  • 19:07 ejegg: disabled dedupe civicrm contacts
  • afk: re-enabled dedupe civicrm contacts
  • afk: disabled dedupe civicrm contacts
  • 13:31 dcausse: elastic@codfw removing/readding replicas for viwiki_general and zhwiki_content (affected by something similar to https://github.com/elastic/elasticsearch/issues/12661) - T154765
  • 11:35 _joe_: from mendelevium
  • 11:35 _joe_: restarted apache/otrs, removed a 8 gb error.log
  • 05:11 Dereckson: Update statistics count on so.wikipedia (T154833)
  • 02:35 l10nupdate@tin: ResourceLoader cache refresh completed at Sat Jan 7 02:35:56 UTC 2017 (duration 5m 20s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 11m 08s)

2017-01-06

  • 23:08 godog: force puppet run on cache_upload in eqiad to switch thumbs back from codfw
  • 22:38 demon@tin: Synchronized multiversion: updateBranchPointers consolidation (duration: 00m 56s)
  • 20:14 demon@tin: Synchronized w: Dropping old entry point (duration: 00m 41s)
  • 19:35 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/UploadWizard/resources: I32e0b8 (duration: 00m 40s)
  • 19:29 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/UploadWizard/resources/mw.UploadWizard.js: I32e0b8 (duration: 00m 59s)
  • 19:17 ebernhardson: restarting elasticsearch on relforge100[12] to test new search-ltr plugin
  • 19:07 ostriches: gerrit: Started full reindex of all changes, should be background but will be watching
  • 18:59 mutante: gerrit restarting for config change 308753 - will be back in seconds
  • 18:13 mutante: mw1205 - restarted hhvm
  • 18:08 godog: force puppet run on eqiad cache_upload to switch thumbs to codfw
  • 16:17 godog: bounce swift-proxy on ms-fe100[123] leave ms-fe1004 for investigation
  • 16:14 hashar: Restarting Nodepool
  • 16:05 ema: wiping codfw caches T154758
  • 15:29 cmjohnson1: powering off mw1239 to reseat DIMM
  • 15:22 papaul: elastic2025-elastic2036 - signing puppet certs, salt-key, initial run
  • 14:53 mark: papaul powercycled asw-a7-codfw 14:50
  • 14:16 reedy@tin: Finished scap: Rebuild message cache for Echo api messages being missing T154110 (duration: 25m 00s)
  • 13:51 reedy@tin: Started scap: Rebuild message cache for Echo api messages being missing T154110
  • 10:00 ariel@tin: Synchronized wmf-config/throttle.php: test, noop (duration: 02m 45s)
  • 09:24 paravoid: asw-a7-codfw is down, serial console unresponsive
  • 09:12 ariel@tin: Synchronized wmf-config/throttle.php: Adjust throttle rule for Maharashtra 'Edit Wikipedia' workshop (VNGIASS) (duration: 02m 46s)
  • 07:08 moritzm: installing crypto++ security updates on trusty hosts
  • 04:37 matt_flaschen: Finished FlowFixInconsistentBoards.php (production mode) on all wikis
  • 04:27 matt_flaschen: Started FlowFixInconsistentBoards.php (production mode) on all wikis
  • 04:10 mutante: Icinga now using Letsencrypt cert and all good
  • 03:42 mutante: icinga - debugging issue with cert change
  • 03:05 papaul: OS installation on elastic2025-elastic2036
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 13m 23s)
  • 01:27 bd808: Restarted logstash on logstash1002 (T154732)
  • 01:06 Krinkle: stream.wikimedia.org problems - nginx responds with HTTP 502 Bad Gateway to most requests
  • 01:01 ostriches: gerrit: back up from upgrade
  • 01:00 ostriches: gerrit: down for upgrade
  • 00:13 mutante: analytics1036, ms-fe1003 - ran puppet to fix Icinga
  • 00:11 mutante: carbon - stopping ganglia-monitor-aggregator for good

2017-01-05

  • 23:49 mutante: rolling out exim4 upgrades (DSA 3747-1) on all remaining eqiad (all-eqiad)
  • 23:46 godog: fix root-owned files on puppetmaster1001:/var/lib/git/operations/private/ causing /srv/private post-commit hook to fail
  • 23:18 mutante: switching eqiad ganglia aggregator - running puppet on install1001 - disabling on carbon, re-enabling puppet across eqiad
  • 23:05 mutante: temp disabling puppet on all eqiad hosts via salt - during ganglia aggregator switch
  • 22:51 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/Echo/includes/api: silence api warnings (duration: 02m 46s)
  • 22:40 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/VisualEditor: silence some api warnings (duration: 02m 48s)
  • 22:22 demon@tin: Synchronized php-1.29.0-wmf.7/extensions/CodeReview/api/ApiQueryCodeComments.php: silence some warnings (duration: 02m 46s)
  • 21:59 mutante: rolling out exim4 upgrades (DSA 3747-1) on all remaining ones in codfw (all-codfw)
  • 21:53 ejegg: updated payments wiki from 21ea9bc to 1f9ea80
  • 21:53 mutante: mx1001 - upgrading exim4 packages, exim4-daemon-heavy, forcing puppet run
  • 21:35 mutante: mx2001 - upgrading exim4 packages, exim4-daemon-heavy, forcing puppet run
  • 21:22 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.29.0-wmf.7
  • 20:58 mutante: mendelevium (OTRS) - upgrade exim4 packages, force puppet run
  • 20:45 mutante: iridium (phabricator) - upgrade exim4 packages, force puppet run
  • 20:29 demon@tin: Synchronized docroot/wikipedia.org: removing junk 15 stuff (duration: 04m 50s)
  • 20:18 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.7
  • 20:07 demon@tin: Synchronized docroot: Ok last docroot thing for today I promise (duration: 05m 00s)
  • 19:52 demon@tin: Synchronized docroot: Final bit of this round of docroot cleanup (duration: 05m 00s)
  • 19:44 thcipriani@tin: Finished scap: SemanticForms l10n cache rebuild for 1.29.0-wmf.7 (duration: 39m 21s)
  • 19:35 mutante: mira - upgrade exim, python-requests, linux-image
  • 19:29 mutante: rolled out exim4 upgrades (DSA 3747-1) on contint1001/2001, dbmonitor1001/2001, tungsten, seaborgium, hassium, labstore*
  • 19:05 thcipriani@tin: Started scap: SemanticForms l10n cache rebuild for 1.29.0-wmf.7
  • 18:34 mutante: fermium (lists server) - upgrading exim packages, exim4-daemon-heavy, forcing puppet run
  • 18:29 yurik@tin: Finished deploy [graphoid/deploy@d20b00e]: (no message) (duration: 03m 11s)
  • 18:26 yurik@tin: Starting deploy [graphoid/deploy@d20b00e]: (no message)
  • 18:25 arlolra: Updated Parsoid to 974dd5b3 (T143183, T102134, T113044)
  • 18:25 mutante: rolling out exim4 upgrades (DSA 3747-1) on prometheus, mwlog, pollux, labmon, lithium, hassaleh, dubnium, graphite1002, tin, serpens, bromine, dataset1001
  • 18:16 arlolra@tin: Finished deploy [parsoid/deploy@465f9c4]: Updating Parsoid to 974dd5b3 (duration: 10m 46s)
  • 18:15 mutante: rolling out exim4 upgrades (DSA 3747-1) on puppetmaster, yubiauth, oresrdb, (oresrdb1001 - Unknown installation error.. eh.. this is new)
  • 18:05 arlolra@tin: Starting deploy [parsoid/deploy@465f9c4]: Updating Parsoid to 974dd5b3
  • 17:57 robh: shutting down mw2075-2089 for decom per T154621
  • 17:35 robh: disabling puppet on mw2075-2089 to decommission them today.
  • 17:15 mutante: rolling out exim4 upgrades (DSA 3747-1) on ruthenium, einsteinium (icinga), etherpad1001, rhodium, | einsteinium: upgrade python packages, kernel | xenon: apt-get autoremove, upgrade python- arcconf, libs...
  • 17:07 mutante: scandium (zuul merger), upgrade exim, python-requests, kernel version
  • 17:06 mutante: rolling out exim4 upgrades (DSA 3747-1) on kraz, wdqs2003, wezen, zosma, tegmen, rutherfordium. upgrade kernel and python-requests on zosma
  • 16:59 mutante: rolling out exim4 upgrades (DSA 3747-1) on all-db-noncore, all-mw-eqiad, restbase-eqiad, kafka-main
  • 16:52 mutante: rolling out exim4 upgrades (DSA 3747-1) on notebook, lvs-canary, lvs, mw-maintenance, all-mw-codfw
  • 16:33 mutante: cobalt upgrading exim packages
  • 16:28 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-api, swift-fe, swift-be, sca, scb, misc-analytics
  • 16:21 mutante: rolling out exim4 upgrades (DSA 3747-1) on db-core-eqiad, db-misc-servers, videoscaler
  • 15:59 moritzm: upgrading firejail on thumbor servers
  • 15:44 chasemp: labstore1005 systemctl disable create-dbusers
  • 15:01 mobrovac: restbase restarting for firejail upgrade
  • 14:46 moritzm: upgrading firejail on image scalers
  • 14:19 moritzm: upgrading firejail on restbase production hosts
  • 14:15 hashar@tin: Synchronized php-1.29.0-wmf.7/extensions/ContentTranslation: Workaround to fix restoration for truncated section ids - T154279 (duration: 02m 10s)
  • 14:10 moritzm: upgrading firejail on restbase staging hosts
  • 13:46 moritzm: installing audiofile security updates
  • 13:09 mobrovac@tin: Finished deploy [trending-edits/deploy@c5d239b]: Restart for firejail upgrade (duration: 00m 46s)
  • 13:08 mobrovac@tin: Starting deploy [trending-edits/deploy@c5d239b]: Restart for firejail upgrade
  • 13:07 mobrovac@tin: Finished deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade (duration: 05m 29s)
  • 13:02 mobrovac@tin: Starting deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade
  • 13:01 mobrovac@tin: Finished deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade (duration: 03m 40s)
  • 12:57 mobrovac@tin: Starting deploy [electron-render/deploy@b2a820e]: Restart for firejail upgrade
  • 12:45 mobrovac@tin: Finished deploy [mobileapps/deploy@c39bd1f]: Restart for firejail upgrade (duration: 04m 05s)
  • 12:41 mobrovac@tin: Starting deploy [mobileapps/deploy@c39bd1f]: Restart for firejail upgrade
  • 12:41 mobrovac@tin: Finished deploy [mathoid/deploy@79fdd56]: Restart for firejail upgrade (duration: 00m 41s)
  • 12:40 mobrovac@tin: Starting deploy [mathoid/deploy@79fdd56]: Restart for firejail upgrade
  • 12:40 mobrovac@tin: Finished deploy [graphoid/deploy@151f26c]: Restart for firejail upgrade (duration: 00m 43s)
  • 12:39 mobrovac@tin: Starting deploy [graphoid/deploy@151f26c]: Restart for firejail upgrade
  • 12:38 mobrovac@tin: Finished deploy [cxserver/deploy@0279029]: Restart for firejail upgrade (duration: 00m 39s)
  • 12:37 mobrovac@tin: Starting deploy [cxserver/deploy@0279029]: Restart for firejail upgrade
  • 12:36 mobrovac@tin: Finished deploy [citoid/deploy@da96f4b]: (no message) (duration: 01m 06s)
  • 12:35 mobrovac@tin: Starting deploy [citoid/deploy@da96f4b]: (no message)
  • 12:24 moritzm: installing firejail security updates on scb
  • 11:13 akosiaris: rebooting bast3001, T154603
  • 11:11 moritzm: uploaded firejail 0.9.44+wmf2 for jessie-wikimedia to carbon
  • 07:54 elukey: chown www-data:www-data all the root:adm hhvm log files on mw eqiad hosts (T132324)
  • 07:11 marostegui: Compressing revision tables across all the wikis - db1038 - T154465
  • 07:09 marostegui: Compressing pagelinks tables across all the wikis - db1044 - T153826
  • 07:08 marostegui: Compressing revision tables across all the wikis - db1015 - T153739
  • 06:15 bd808: sudo -u l10nupdate rm /var/lock/scap on tin to clean up lock left by bad l10nupdate locking attempt
  • 05:49 mutante: rolling out exim4 upgrades (DSA 3747-1) on swift-fe-codfw, swift-be-codfw, ALL remaining mw
  • 05:44 mutante: rolling out exim4 upgrades (DSA 3747-1) on db-core-codfw, etcd, graphite, kafka-analytics-canary, kafka-analytics, logstash
  • 05:39 mutante: rolling out exim4 upgrades (DSA 3747-1) on prometheus, aqs, db-es
  • 05:35 mutante: rolling out exim4 upgrades (DSA 3747-1) on cp-eqiad, memcached-eqiad
  • 01:48 mutante: rolling out exim4 upgrades (DSA 3747-1) on redis-codfw (rdb2005 needed manual) and all of dc-ulsfo, dc-esams
  • 01:41 mutante: rolling out exim4 upgrades (DSA 3747-1) on ganeti, cp-ulsfo, wdqs, thumbor, db-es-codfw
  • 01:36 mutante: rolling out exim4 upgrades (DSA 3747-1) on parsoid, maps, cp-esams
  • 01:32 mutante: servermon - after the next update by cron - package data is back
  • 01:13 mattflaschen@tin: Synchronized wmf-config/InitialiseSettings.php: Disable NewUserMessage on gomwiki (duration: 00m 41s)
  • 01:10 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw2090.codfw.wmnet
  • 01:06 demon@tin: Synchronized scap/plugins: (no message) (duration: 00m 40s)
  • 01:04 mattflaschen@tin: Synchronized php-1.29.0-wmf.7/extensions/Flow: Flow script to add more troubleshooting information to a maintenance script (duration: 00m 56s)
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2090.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2089.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2088.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2087.codfw.wmnet
  • 00:57 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2086.codfw.wmnet
  • 00:56 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2085.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2084.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2083.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2082.codfw.wmnet
  • 00:55 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2081.codfw.wmnet
  • 00:54 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2080.codfw.wmnet
  • 00:54 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2080.codfw.wmnet
  • 00:52 mattflaschen@tin: Synchronized php-1.29.0-wmf.6/extensions/Flow: Two Flow fixes related to production database/content inconsistencies. (duration: 00m 59s)
  • 00:42 mutante: phab2001 - same chmod 751 on exim4 dirs that i manually did on krypton is done by puppet here, fully automatic, not sure why krypton was a one-off
  • 00:41 mutante: planet1001/2001, phab2001 - upgrade exim4, exim4-daemon-heavy
  • 00:37 mutante: servermon - weird behaviour in the "pending package upgrades" list? exim4 package was shown as pending on lots of hosts, after next upgrade it disappears from list, even though about half the servers should still be listed
  • 00:23 mutante: krypton - chmod 751 /var/spool/exim4/ to fix Icinga alerts about inaccessible tmpfs (nagios user could not access), it was 751 on other hosts like ununpentium
  • 00:17 mutante: krypton - stop exim, umount orphaned "scan" tmpfs (there is no clamav here)
  • 00:00 mutante: gerrit slowdown reported around 23:55 UTC, was back to normal after 2 minutes (T148478) - attaching latest jvm_gc log
  • 00:00 aaron@tin: Synchronized wmf-config/logging.php: No-op sync of 7e103f2 (duration: 00m 42s)

2017-01-04

  • 23:55 mutante: krypton - chown Debian-exim:Debian-exim /var/spool/exim4/scan/ to fix Icinga-reported DISK issue - wrong permissions - see puppet/modules/exim4/manifests/init.pp line 57 ff "catch-22 with Puppet vs. package"
  • 23:20 mutante: rolled out exim4 upgrades (DSA 3747-1) on memcached-canary, memcached-codfw, restbase-codfw, cp-codfw
  • 23:11 aaron@tin: Synchronized wmf-config/logging.php: Include DB shard as a logstash column (duration: 00m 41s)
  • 23:09 mutante: rolling out exim4 upgrades (DSA 3747-1) on memcached-canary, memcached-codfw, restbase-codfw
  • 23:06 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-eqiad
  • 22:52 robh: all my server depools and decoms for the mw range are on T154621
  • 22:51 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2078.codfw.wmnet
  • 22:51 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2077.codfw.wmnet
  • 22:51 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2076.codfw.wmnet
  • 22:50 robh@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2075.codfw.wmnet
  • 22:49 godog: rename / reimage restbase-test1* to restbase-dev1*
  • 22:46 bblack: TLS: unified certificates in esams switching to digicert
  • 22:19 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-codfw
  • 21:55 Krinkle: mwscript deleteEqualMessages.php --wiki nowikinews (T45917)
  • 21:55 Krinkle: mwscript deleteEqualMessages.php --wiki nowiki (T45917)
  • 21:48 otto@tin: Finished deploy [eventstreams/deploy@a103be2]: (no message) (duration: 01m 32s)
  • 21:46 otto@tin: Starting deploy [eventstreams/deploy@a103be2]: (no message)
  • 21:24 bsitzmann@tin: Finished deploy [mobileapps/deploy@c39bd1f]: Update mobileapps to b43c5d6 (duration: 02m 55s)
  • 21:21 bsitzmann@tin: Starting deploy [mobileapps/deploy@c39bd1f]: Update mobileapps to b43c5d6
  • 21:15 smalyshev@tin: Finished deploy [wdqs/wdqs@3762556]: (no message) (duration: 02m 40s)
  • 21:13 smalyshev@tin: Starting deploy [wdqs/wdqs@3762556]: (no message)
  • 21:01 smalyshev@tin: Finished deploy [wdqs/wdqs@3762556]: (no message) (duration: 00m 59s)
  • 21:00 smalyshev@tin: Starting deploy [wdqs/wdqs@3762556]: (no message)
  • 20:49 madhuvishy: adding temporary IP tables rule on labservices1001 to drop traffic from toolchecker for tests (T152369)
  • 20:39 mutante: rolling out exim4 upgrades (DSA 3747-1) on mw-canary hosts
  • 20:36 mutante: rolling out exim4 upgrades (DSA 3747-1) on stat* and kubernetes hosts
  • 20:05 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.29.0-wmf.7
  • 19:09 thcipriani@tin: Synchronized portals: SWAT: Bumping portal to master T128546 (duration: 00m 42s)
  • 19:08 thcipriani@tin: Synchronized portals/prod/wikipedia.org/assets: SWAT: Bumping portal to master T128546 (duration: 00m 43s)
  • 18:30 mutante: rolling out exim4 upgrades (DSA 3747-1) on misc servers
  • 18:29 ejegg: enabled payment processor audit parser jobs
  • 17:47 matt_flaschen: Ran manual DB update to officewiki for T153320.
  • 15:20 zeljkof: EU SWAT finished
  • 15:19 hashar@tin: Synchronized wmf-config/throttle.php: (no message) (duration: 00m 41s)
  • 15:03 zeljkof: extending eu swat until 330392 is merged
  • 14:56 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Throttle rules for 2017-01-06/07, tewiki (T154568) (duration: 00m 40s)
  • 14:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set valid content language for Norwegian wikis (T126146) (duration: 00m 41s)
  • 14:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert $wgMFEEditorOptions[anonymousEditing] = false for kowiki (T119823) (duration: 00m 41s)
  • 11:18 jynus: continuing maintenance on db1035 (mysql replication stopped)
  • 10:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2060 - T154031 (duration: 00m 47s)
  • 07:43 marostegui: Compressing tables on db1015 - T153739
  • 07:24 marostegui: Compressing more tables on db1044 - T153826
  • 03:10 l10nupdate@tin: ResourceLoader cache refresh completed at Wed Jan 4 03:10:08 UTC 2017 (duration 5m 33s)
  • 03:05 bd808@tin: Synchronized wmf-config/throttle.php: Add throttle rules for January 2017 events in Maharashtra (T154312) (duration: 00m 42s)
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.7) (duration: 13m 05s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.29.0-wmf.6) (duration: 11m 17s)
  • 01:24 maxsem@tin: Synchronized php-1.29.0-wmf.6/extensions/CentralAuth: https://gerrit.wikimedia.org/r/#/c/330345/ (duration: 00m 44s)
  • 01:03 maxsem@tin: Synchronized php-1.29.0-wmf.7/extensions/Flow: https://gerrit.wikimedia.org/r/#/c/330338/ (duration: 00m 58s)
  • 00:42 foks: removed 2fa for account per T154171
  • 00:33 maxsem@tin: Synchronized wmf-config/unitConversionConfig.json: https://gerrit.wikimedia.org/r/#/c/327907/5 (duration: 00m 40s)
  • 00:13 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/330264/3 (duration: 00m 41s)
  • 00:06 eileen: update civicrm from f78c894 to b26844d

2017-01-03

  • 23:57 thcipriani@tin: rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.29.0-wmf.7
  • 23:48 thcipriani@tin: Finished scap: testwiki to php-1.29.0-wmf.7 and rebuild l10n cache (duration: 49m 50s)
  • 23:19 ostriches: gerrit: quick restart of services
  • 22:58 thcipriani@tin: Started scap: testwiki to php-1.29.0-wmf.7 and rebuild l10n cache
  • 22:55 chasemp: iptables block of tools-checker-01 to debug DNS SPoF
  • 22:44 maxsem@tin: Synchronized php-1.29.0-wmf.7/extensions/Kartographer: https://gerrit.wikimedia.org/r/#/c/330321/ (duration: 00m 42s)
  • 22:38 maxsem@tin: Synchronized php-1.29.0-wmf.6/extensions/Kartographer: https://gerrit.wikimedia.org/r/#/c/330322/ (duration: 00m 42s)
  • 22:11 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: mapframe on fr: and fi: https://gerrit.wikimedia.org/r/#/c/330311/3 (duration: 00m 41s)
  • 21:00 gehel@tin: Finished deploy [wdqs/wdqs@a25d3aa]: (no message) (duration: 00m 55s)
  • 20:59 gehel@tin: Starting deploy [wdqs/wdqs@a25d3aa]: (no message)
  • 20:57 otto@tin: Finished deploy [eventstreams/deploy@9095b4e]: (no message) (duration: 11m 15s)
  • 20:45 otto@tin: Starting deploy [eventstreams/deploy@9095b4e]: (no message)
  • 20:18 gehel@tin: Finished deploy [wdqs/wdqs@cd7215c]: (no message) (duration: 04m 54s)
  • 20:13 gehel@tin: Starting deploy [wdqs/wdqs@cd7215c]: (no message)
  • 19:56 demon@tin: Synchronized multiversion/updateBranchPointers: Removing unused --dry-run option (duration: 00m 40s)
  • 19:52 thcipriani@tin: Synchronized multiversion: SWAT: Remove checkoutMediaWiki (duration: 00m 58s)
  • 19:13 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add badge for "digitaldocument" in Wikibase T153186 (duration: 01m 33s)
  • 18:01 moritzm: installing fontconfig security updates
  • 17:44 mutante: terbium - Notice: /Stage[main]/Mediawiki::Maintenance::Generatecaptcha/Cron[generatecaptcha]/ensure: created (T150029)
  • 17:24 thcipriani: starting branch cut for 1.29.0-wmf.7
  • 17:04 mutante: iridium (phab) - apt-get clean ; find /var/log/account/ -mtime +10 -delete ; find /var/log/atop/ -mtime +10 -delete (T154407)
  • 16:59 thcipriani: enable l10nupdate cron post deployment-freeze
  • 16:49 mutante: iridium (phab) - reduce process accounting from 30 days to 10 days to save disk space used by /var/log/account, run /etc/cron.daily/acct (T154407)
  • 16:37 bd808: Updated scholarships to 1690808 on krypton; needed help from _joe_ to make trebuchet work
  • 15:58 _joe_: rolling restart of restbase on the production cluster
  • 15:49 _joe_: rolling restart of restbase on the test cluster
  • 14:53 akosiaris: reenabling ntpd on the restbase in eqiad
  • 14:43 gehel: upgrade liblogstash-gelf on elastic* - T150408
  • 14:35 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: [2/2] Add HD logos for multiple wikis - T150618 (duration: 00m 40s)
  • 14:33 hashar@tin: Synchronized static/images/project-logos: [1/2] Add HD logos for multiple wikis - T150618 (duration: 00m 40s)
  • 14:29 akosiaris: reenabling ntpd on the rest of the boxes. Leaving restbase only out for last
  • 14:21 hashar@tin: Synchronized wmf-config/throttle.php: New rules + remove obsolete rules T154245 (duration: 00m 40s)
  • 14:19 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mapframe for nowiki T154021 (duration: 00m 39s)
  • 14:17 akosiaris: reenabling ntpd on aqs boxes
  • 14:17 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Set sortPrepend for gdwiki T153900 (duration: 00m 40s)
  • 14:16 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable SandboxLink on ruwiki T153855 (duration: 00m 40s)
  • 14:10 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable subpages in NS0 for arbcom_cswiki - T154247 (duration: 00m 40s)
  • 14:08 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Add new page protection level on etwiki - T153465 (duration: 00m 53s)
  • 14:08 akosiaris: reenabling ntpd on maps, maps-test boxes
  • 14:07 akosiaris: reenabling ntpd on kafka boxes
  • 13:37 akosiaris: reenabling ntpd on elastic boxes
  • 13:27 akosiaris: reenabling ntpd on rdb boxes
  • 13:22 akosiaris: reenabling ntpd on conf boxes
  • 13:22 akosiaris: reenabling ntpd on es* boxes
  • 13:15 akosiaris: reenabling ntpd on scb* boxes
  • 13:09 moritzm: uploaded Linux 4.4.39 for jessie-wikimedia to carbon
  • 13:09 akosiaris: reenabling ntpd on mc* boxes
  • 13:01 akosiaris: reenabling ntpd on ms-fe boxes
  • 13:00 moritzm: installing libgd security updates
  • 12:55 akosiaris: reenabling ntpd on ms-be boxes
  • 12:52 akosiaris: reenabling ntpd on lvs boxes
  • 12:48 akosiaris: reenabling ntpd on analytics boxes
  • 12:41 moritzm: installing python security updates
  • 12:12 moritzm: installing squid security updates
  • 12:07 akosiaris: reenabling ntpd on pc eqiad & codfw boxes
  • 12:06 akosiaris: reenabling ntpd on ganeti eqiad & codfw boxes
  • 11:54 akosiaris: reenabling ntpd on wtp eqiad boxes
  • 11:52 akosiaris: reenabling ntpd on logstash eqiad boxes
  • 11:51 akosiaris: reenabling ntpd on db* eqiad boxes
  • 11:46 akosiaris: reenabling ntpd on cobalt (gerrit)
  • 11:32 moritzm: installing tar security updates on trusty hosts
  • 11:27 gehel: upgrade liblogstash-gelf on deployment-elastic* - T150408
  • 11:16 gehel: upgrade liblogstash-gelf on relforge - T150408
  • 11:13 akosiaris: reenabling ntpd on db* codfw boxes
  • 11:07 akosiaris: reenabling ntpd on wtp codfw boxes
  • 10:59 akosiaris: reenabling ntpd on mw eqiad boxes
  • 10:53 jynus: stopping mysql replication on db1035 (depooled)
  • 10:50 akosiaris: reenabling ntpd on mw codfw boxes
  • 10:44 akosiaris: reenabling ntpd on eqiad cp boxes
  • 10:39 akosiaris: reenabling ntpd on codfw cp boxes
  • 10:14 akosiaris: start enabling ntpd again across the fleet. Starting with cp boxes on ulsfo and esams
  • 09:23 marostegui: stop MySQL dbstore2002 for maintenance - T151552
  • 09:10 marostegui: stop MySQL dbstore2001 for maintenance - T151552
  • 08:21 marostegui: Run optimize table on db1038 on all the revision,templatelinks and pagelinks tables - T154465
  • 08:00 marostegui: Run optimize table on a few large tables - db1015 - T153739
  • 07:58 elukey: chown www-data:www-data all the root:adm hhvm log files on mw codfw hosts (T132324)
  • 07:54 marostegui: Run optimize table on a few large tables - db1044 - T153826
  • 07:30 marostegui: Stop mysql db2048 and db2034 for maintenance - https://phabricator.wikimedia.org/T149553

2017-01-02

  • 23:09 hoo: Removed 2fa from an account, per T154450
  • 17:20 ema: iridium: removed /var/log/account/pacct.2[0-9].gz to free up more disk space
  • 16:05 ema: removing old kernels and kernel headers from iridium to free up some disk space
  • 13:24 elukey: powercycled mw1280, not pingable and mgmt console frozen

2017-01-01

  • 02:23 chasemp: labservices1001 'racadm serveraction hardreset'
  • 02:23 godog: reboot labservices1001, unresponsive on console and MCE/temperature alerts found on lithium
  • 00:56 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1286.eqiad.wmnet,service=apache2
  • 00:55 bd808: Restarted logstash on logstash1001 (T154388)
  • 00:46 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1286.eqiad.wmnet
  • 00:27 godog: dump core file and restart varnish-frontend on cp2026

