Server admin log/Archive 37

From Wikitech

2019-02-28

  • 23:44 XioNoX: pre-configure asw-a2 ports on asw2-a2-eqiad - T187960
  • 23:31 XioNoX: pre-configure asw-a1 ports on asw2-a1-eqiad - T187960
  • 23:27 bblack@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp107[678]\.eqiad\.wmnet
  • 23:07 robh: decom cp1045-cp1055, all are role spare but may icinga alert for ping
  • 22:39 ejegg: updated fundraising CiviCRM from c81fe7a4fd to 616c58cebe
  • 22:33 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Remove old translate config (duration: 00m 46s)
  • 22:29 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Disable some translate special pages again T217376 (duration: 00m 47s)
  • 22:29 ottomata: replaying events from mediawiki eventbus config outage - T217385
  • 22:03 hashar: MediaWiki 1.33.0-wmf.19 deployed on all wikis # T206673
  • 21:59 XioNoX: disable asw2-a5 <> asw-a link - T217383
  • 21:28 hashar@deploy1001: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 00m 47s)
  • 21:09 herron: disabling logstash persisted queue
  • 20:52 herron: cleared logstash persistent queue on logstash100[7-9]
  • 20:13 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.19
  • 20:02 thcipriani@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add throttle exception for Amnesty International Editathon Throttle Rules: remove "all" Add new throttle rules T216998 T217063 T217305 T217311 (duration: 00m 54s)
  • 19:40 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT Remove legacy eventBus config settings. (duration: 00m 53s)
  • 19:36 _joe_: upgrading scap on all servers
  • 19:30 thcipriani@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for Art+Feminism 2019 editathon T217336 (duration: 00m 54s)
  • 19:26 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Enable WikibaseCirrusSearch on Beta Cluster (beta only change/noop sync) T215684 (duration: 00m 55s)
  • 19:22 robh: mw1272 being worked on by onsite
  • 19:21 robh: mw1272 unresponsive to mgmt or production interfaces
  • 19:16 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: GrowthExperiments: Start help panel experiment on viwiki T215666 (duration: 03m 02s)
  • 18:52 moritzm: installing libgd security updates on trusty
  • 18:52 herron: migrating logstash1006 kafka to logstash1012 T213898
  • 18:43 XioNoX: start pybal on lvs1016 - T212348
  • 18:34 robh: cp1078 power down for network move
  • 18:28 XioNoX: stop pybal on lvs1016 - T212348
  • 18:28 robh: cp1077 power off for network port relocation
  • 18:21 robh: cp1076 power down for network port move
  • 17:51 herron: logstash1011 kafka now in sync. transitioning logstash1005 to spare system T213898
  • 17:24 cmjohnson1: powering down sodium to move racks T212348
  • 17:23 jynus: recreating replicas, master ops events for db1078, db1075 T213858
  • 16:43 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 16:39 elukey: clean up old/stale zookeeper znodes from conf100[4-6] - T216979
  • 16:28 herron: migrating kafka on logstash1005 to logstash1011 T213898
  • 16:27 herron: migrating kafka on logstash1005 to logstash1011 T213898
  • 16:15 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T214905 Add ReferencePreviews to allowed BetaFeatures (duration: 00m 54s)
  • 16:08 jbond42: rebooting labstore2003
  • 15:56 thcipriani@deploy1001: Synchronized README: noop sync to test opcache-manager in scap 3.9.1-1 (duration: 00m 53s)
  • 15:52 jbond42: rebooting labsdb1004
  • 15:50 thcipriani@deploy1001: Synchronized README: noop sync scap 3.9.1-1 (duration: 00m 52s)
  • 15:49 akosiaris@deploy1001: scap-helm citoid finished
  • 15:49 akosiaris@deploy1001: scap-helm citoid cluster staging completed
  • 15:48 akosiaris@deploy1001: scap-helm citoid upgrade -f citoid-staging-values.yaml staging stable/citoid [namespace: citoid, clusters: staging]
  • 15:46 _joe_: install scap 3.9.1-1 on the deployment servers
  • 15:43 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1006.eqiad.wmnet
  • 15:43 jbond42: rebooting labsdb1007
  • 15:37 jbond42: rebooting labsdb1006
  • 15:36 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1005.eqiad.wmnet
  • 15:33 jbond42: rebooting labstore2002
  • 15:29 jbond42: rebooting labstore2001
  • 15:23 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1005.eqiad.wmnet
  • 15:19 jbond42: rebooting rhodium
  • 15:15 cmjohnson1: powering off db1114 to replace motherboard T214720
  • 15:14 _joe_: uploading scap 3.9.1-1 to {stretch,jessie}-wikimedia
  • 14:50 jbond42: reboot cloudnet2001-dev.codfw.wmnet
  • 14:47 hashar: mw1272 fixed by running "scap sync-l10n" from deploy host
  • 14:46 hashar: mw1272 had /srv/mediawiki/php-1.33.0-wmf.19/includes/cache/localisation/LocalisationCache.php:475) No localisation cache found for English. Please run maintenance/rebuildLocalisationCache.php.
  • 14:46 hashar@deploy1001: scap sync-l10n completed (1.33.0-wmf.19) (duration: 03m 33s)
  • 14:42 jbond@cumin1001: conftool action : set/pooled=no; selector: name=rhodium.eqiad.wmnet
  • 14:41 hashar@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.19 (duration: 00m 53s)
  • 14:40 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.19
  • 14:34 milimetric@deploy1001: Finished deploy [analytics/refinery@f605fad]: New sqoop logic that uses the sharded replicas (duration: 10m 00s)
  • 14:30 akosiaris@deploy1001: scap-helm citoid finished
  • 14:30 akosiaris@deploy1001: scap-helm citoid cluster staging completed
  • 14:30 akosiaris@deploy1001: scap-helm citoid upgrade -f citoid-staging-values.yaml staging stable/citoid [namespace: citoid, clusters: staging]
  • 14:28 hashar@deploy1001: Synchronized php-1.33.0-wmf.19/extensions/WikibaseMediaInfo: Move up checks to test if we should construct depicts widgets - T217285 (duration: 00m 58s)
  • 14:24 milimetric@deploy1001: Started deploy [analytics/refinery@f605fad]: New sqoop logic that uses the sharded replicas
  • 13:56 elukey: re-start cleanup of 20k+ zookeeper nodes on conf100[4-6] (old Hadoop Yarn state) - T216952
  • 13:52 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1003.eqiad.wmnet
  • 13:43 godog: depool prometheus1003.eqiad.wmnet to take a data snapshot
  • 13:34 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2003.codfw.wmnet
  • 12:36 zeljkof: EU SWAT finished
  • 12:35 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for Day of Digital Service (T217155) (duration: 00m 52s)
  • 12:31 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for Czech Wikigap 2019 (T217270) (duration: 00m 53s)
  • 12:18 zfilipin@deploy1001: Synchronized wmf-config/: SWAT: Show referencePreviews on group0 wikis as beta feature (T214905) (duration: 00m 56s)
  • 11:59 jbond42: rolling openssl security updates to jessie systems
  • 11:32 akosiaris: remove sca1003, sca1004, sca2003, sca2004 from the fleet. Celebrate!!!!
  • 11:28 elukey: pause cleanup of 20k+ zookeeper nodes on conf100[4-6] (old Hadoop Yarn state) - T216952
  • 10:00 _joe_: executing a rolling puppet run (2 servers at a time per cluster, per dc) in eqiad,codfw as an HHVM restart will be triggered
  • 09:37 gilles@deploy1001: Synchronized php-1.33.0-wmf.19/extensions/NavigationTiming/modules/ext.navigationTiming.js: T217210 Don't assume PerformanceObserver entry types are supported (duration: 00m 54s)
  • 09:30 elukey: start cleanup of 20k+ zookeeper nodes on conf100[4-6] (old Hadoop Yarn state) - T216952
  • 09:26 moritzm: installed php security updates on netmon1002 and people1001
  • 09:22 marostegui: Stop MySQL on db1125 (sanitarium) to upgrade, this will generate lag on labs on: s2, s4, s6, s7
  • 09:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1121 (duration: 00m 54s)
  • 09:08 marostegui: Stop MySQL on db1121 for upgrade, this will generate lag on labsdb:s4
  • 09:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1121 (duration: 00m 53s)
  • 08:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1079 (duration: 00m 53s)
  • 08:32 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase API traffic db1079 after mysql upgrade (duration: 00m 53s)
  • 08:31 elukey: roll restart of Yarn Resource Managers on an-master100[1,2] to pick up new settings
  • 08:22 marostegui: Change abuse_filter_log indexes on s3 codfw, lag will appear on codfw - T187295
  • 08:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1079 after mysql upgrade (duration: 00m 54s)
  • 08:06 moritzm: installing glibc security updates for stretch
  • 07:47 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1079 in API after mysql upgrade (duration: 00m 53s)
  • 07:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1079 after mysql upgrade (duration: 00m 56s)
  • 07:08 marostegui: Stop MySQL on db1079 for mysql upgrade
  • 06:50 marostegui: Deploy schema change on db1079, this will generate lag on s7 on labs - T86342
  • 06:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1079 (duration: 00m 55s)
  • 06:18 kart_: Finished manual run of unpublished ContentTranslation draft purge script (T216983)
  • 05:56 marostegui: Upgrade MySQL on db1124 (Sanitarium) lag will be generated on s1,s3,s5,s8
  • 03:03 kart_: Manual run of unpublished ContentTranslation draft purge script (T216983)
  • 02:08 bstorm_: clouddb1002 is now in place to replace labsdb1004 as replica for toolsdb but not wikilabels postgres yet T193264
  • 01:43 twentyafterfour: phabricator upgrade completed without issues (actually completed at 01:23 UTC but I failed to hit enter and submit this message)
  • 01:20 twentyafterfour: deploying phabricator update 2019-02-27
  • 01:03 twentyafterfour: preparing to deploy phabricator-2019-02-27
  • 00:55 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.19/vendor/: vendor/ruflin/Elastica: Remove scalar return type hints (duration: 01m 33s)
  • 00:22 ebernhardson@deploy1001: Synchronized vendor/: Remove scalar type hints from ruflin/Elastica (duration: 00m 58s)
  • 00:10 ebernhardson@deploy1001: Synchronized wmf-config/CommonSettings.php: T215725 Remove mediawikiwiki from wgCentralAuthAutoCreateWikis (duration: 00m 54s)
  • 00:07 ebernhardson@deploy1001: Synchronized wmf-config/: T215684 Add config for switching Wikibase search to WikibaseCirrusSearch codebase (duration: 00m 55s)

2019-02-27

  • 21:57 XioNoX: delete local pref for peering sessions in eqiad - T204281
  • 21:44 eileen: civicrm revision is c81fe7a4fd, config revision is 050abdf9e8
  • 21:26 XioNoX: delete local pref for peering sessions in eqord - T204281
  • 20:53 XioNoX: delete local pref for peering sessions in codfw/eqdfw - T204281
  • 20:50 hashar: 1.33.0-wmf.19 not rolled to group1. Pending T217285 (Wikibase raising exception on commonswiki). To be figured out during European day time.
  • 20:50 eileen: civicrm revision changed from 224bf15206 to c81fe7a4fd, config revision is d1826e371b
  • 20:14 hashar@deploy1001: rebuilt and synchronized wikiversions files: (no justification provided)
  • 20:04 hashar@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.19 (duration: 00m 53s)
  • 20:04 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.19
  • 19:49 bstorm_: stopped slave on labsdb1004 for T193264
  • 19:43 bstorm_: downtimed labsdb1004 to stop mysql for transferring data for T193264
  • 19:32 SMalyshev: repooled wdqs1005, caught up
  • 19:26 herron: replacing kafka on logstash1004 with logstash1010 T213898
  • 18:56 SMalyshev: depooled wdqs1005 to let it catch up
  • 18:36 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@465673b]: Redeploy GUI for T217161 (duration: 10m 51s)
  • 18:28 cmjohnson1: powering off mw126[3-6] one at a time to move to different rack A5 T212348
  • 18:25 smalyshev@deploy1001: Started deploy [wdqs/wdqs@465673b]: Redeploy GUI for T217161
  • 18:21 cmjohnson1: powering off mw1262 to move to different rack A5 T212348
  • 18:15 cmjohnson1: powering off mw1261 to move to different rack A5 T212348
  • 17:57 niharika29@deploy1001: Synchronized php-1.33.0-wmf.19/extensions/Flow/: Make VisualEditor unwrap <section> tags T217206 (duration: 01m 00s)
  • 17:56 elukey: roll restart hadoop hdfs namenodes on an-master100[1,2] to pick up the new rack config of analytics1071
  • 17:37 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Welcome survey: add a control group to viwiki T216669 (duration: 00m 54s)
  • 17:34 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Stop collecting data for CitationUsage and CitationUsagePageLoad T213969 (duration: 00m 55s)
  • 17:22 elukey: drain + shutdown of analytics1071 to allow its move to A5 - T212348
  • 17:19 cmjohnson1: powering off wtp1030 to move to different rack A5 T212348
  • 17:14 cmjohnson1: powering off wtp1029 to move to different rack A5 T212348
  • 17:06 cmjohnson1: powering off wtp1029 to move to different rack A5 T212348
  • 17:05 RoanKattouw: Running foreachwikiindblist dblists/echo.dblist extensions/Echo/maintenance/removeOrphanedEvents.php on mwmaint1002
  • 16:58 hashar@deploy1001: Synchronized php-1.33.0-wmf.19/extensions/Score: Revert "beautify lilypond error message output" - T217241 (duration: 00m 56s)
  • 16:49 jijiki: Deploy LVS for eventgate-analytics - T211247
  • 16:26 volans: temporarily disabled puppet on icinga[12]001 to deploy g/493171
  • 16:21 volans: force-rebooting icinga1001 (to test some puppet changes) - T214760
  • 15:34 jbond42: rolling openssl security updates to jessie canary servers
  • 14:26 marostegui: Deploy schema change on abuse_filter_log on s7 codfw - lag will be generated on codfw - T187295
  • 14:01 marostegui: Change indexes on abuse_filter_log on db1089 - T187295
  • 14:00 moritzm: uploaded openssl 1.0.2r to jessie-wikimedia
  • 12:08 jbond42: correction: rolling updates of apache on mw api servers *not* jobrunners
  • 12:04 jbond42: rolling updates of apache on mw jobrunners
  • 11:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1077 after MySQL upgrade (duration: 00m 53s)
  • 11:28 godog: cleanup log4j from lvs eqiad / ipvsadm -D -t logstash.svc.eqiad.wmnet:4560
  • 11:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1077 after MySQL upgrade (duration: 00m 54s)
  • 11:17 godog: roll-restart pybal after removing logstash log4j service
  • 10:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1077 after MySQL upgrade (duration: 00m 54s)
  • 10:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low weight after MySQL upgrade (duration: 00m 53s)
  • 09:55 marostegui: Stop MySQL on db1077 for mysql upgrade - this will generate lag on labsdb:s3
  • 09:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 for MySQL upgrade (duration: 00m 53s)
  • 09:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 (duration: 00m 54s)
  • 09:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool in API db1082 after mysql upgrade (duration: 00m 53s)
  • 09:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1082 after mysql upgrade (duration: 00m 54s)
  • 09:05 marostegui: Stop MySQL on db1082 for mysql upgrade
  • 08:41 godog: enable mmjsonparse by default on kafka outputs - T213189
  • 08:40 marostegui: Deploy schema change on db1082 - will generate lag on labsdb:s5 - T86342
  • 08:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 for mysql upgrade (duration: 00m 54s)
  • 08:26 marostegui: Retroactive log, T216444 Global rename of Дагиров Умар → Takhirgeran Umar was done by alanajjar
  • 08:02 marostegui: Global rename of HeavyTony → QTHCCAN by alanajjar - T217222
  • 07:01 marostegui: Deploy schema change on s5 codfw master (db2052), this will generate lag on codfw - T86342
  • 06:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1076 (duration: 00m 55s)
  • 06:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1076 (duration: 01m 08s)
  • 05:05 kart_: Finished manual run of unpublished ContentTranslation draft purge script (T216983)
  • 04:58 SMalyshev: repooled wdqs1006
  • 03:09 kart_: Manual run of unpublished ContentTranslation draft purge script (T216983)
  • 00:38 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: [cirrus] decrease regex timeouts by 25% and drop timeout hack (duration: 00m 53s)
  • 00:30 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.19/skins/MinervaNeue/resources/skins.minerva.scripts/errorLogging.js: MinervaNeue: Allow us to distinguish errors for logged in users (duration: 00m 53s)
  • 00:30 bd808: Re-enabled puppet on labweb100[12]
  • 00:23 bd808: Disabled puppet on labweb100[12]
  • 00:15 bd808: Manually changed logging level and restarted Horizon on labweb100[12]
  • 00:15 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [cirrus] autocomplete: enable subphrase matching for officewiki (2/2) (duration: 00m 54s)
  • 00:14 ebernhardson@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 01s)
  • 00:07 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [cirrus] autocomplete: enable subphrase suggester builds on officewiki (1/2) (duration: 00m 54s)
  • 00:03 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: noop sync for labs files gerrit:493103 (duration: 00m 54s)

2019-02-26

  • 23:39 tgr: T217203 running mwscript ~/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'LaurenceKingPublishing' 'Fiona at Laurence King Publishing'
  • 23:37 tgr: T217203 running mwscript ~/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'Citycarclubfi' 'Urbaanimies'
  • 23:16 SMalyshev: depooled wdqs1006 to see if it catches up
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics upgrade -f eventgate-analytics-staging-values.yaml staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics cluster eqiad completed
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics upgrade -f eventgate-analytics-eqiad-values.yaml production stable/eventgate-analytics [namespace: eventgate-analytics, clusters: eqiad]
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics cluster codfw completed
  • 22:43 akosiaris@deploy1001: scap-helm eventgate-analytics upgrade -f eventgate-analytics-codfw-values.yaml production stable/eventgate-analytics [namespace: eventgate-analytics, clusters: codfw]
  • 22:42 akosiaris@deploy1001: scap-helm eventgate-analytics upgrade -f eventgate-analytics-staging-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 22:42 akosiaris@deploy1001: scap-helm eventgate-analytics upgrade -f eventgate-analytics-eqiad-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: eqiad]
  • 22:42 akosiaris@deploy1001: scap-helm eventgate-analytics upgrade -f eventgate-analytics-codfw-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: codfw]
  • 22:13 XioNoX: delete local pref for peering sessions in eqsin - T204281
  • 19:12 reedy@deploy1001: Synchronized php-1.33.0-wmf.19/extensions/EventBus/: T217145 (duration: 00m 54s)
  • 18:24 arlolra: Updated Parsoid to e82347d (T204608, T214099, T217093)
  • 18:17 arlolra@deploy1001: Finished deploy [parsoid/deploy@ae76aa2]: Updating Parsoid to e82347d (duration: 11m 03s)
  • 18:06 arlolra@deploy1001: Started deploy [parsoid/deploy@ae76aa2]: Updating Parsoid to e82347d
  • 16:38 cdanis: cdanis@krypton sudo apt-get remove grafana
  • 16:35 otto@deploy1001: scap-helm eventgate-analytics finished
  • 16:35 otto@deploy1001: scap-helm eventgate-analytics cluster codfw completed
  • 16:35 otto@deploy1001: scap-helm eventgate-analytics install -n production -f eventgate-analytics-codfw-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: codfw]
  • 16:35 otto@deploy1001: scap-helm eventgate-analytics finished
  • 16:35 otto@deploy1001: scap-helm eventgate-analytics cluster eqiad completed
  • 16:35 otto@deploy1001: scap-helm eventgate-analytics install -n production -f eventgate-analytics-eqiad-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: eqiad]
  • 16:34 otto@deploy1001: scap-helm eventgate-analytics install -n production eventgate-analytics-eqiad-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: eqiad]
  • 16:24 jijiki: Restarting memcached on mc1028 - T208844
  • 16:14 hashar@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.19 # T206673
  • 16:09 herron: elasticsearch stopped on logstash100[456] T213898
  • 16:07 otto@deploy1001: scap-helm eventgate-analytics finished
  • 16:07 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 16:07 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics --set main_app.version=v1.0.0-rc2 [namespace: eventgate-analytics, clusters: staging]
  • 16:01 otto@deploy1001: scap-helm eventgate-analytics finished
  • 16:01 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 16:01 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 16:00 herron: re-enabling ircecho
  • 16:00 hashar@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.19 and rebuild l10n cache # T206673 (duration: 58m 17s)
  • 15:47 akosiaris@deploy1001: scap-helm mathoid finished
  • 15:47 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 15:47 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 15:47 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml production stable/mathoid [namespace: mathoid, clusters: eqiad,codfw]
  • 15:43 akosiaris@deploy1001: scap-helm mathoid finished
  • 15:43 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 15:43 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-staging-values.yaml staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 15:21 godog: force puppet run on failed agents in codfw
  • 15:17 herron: stopped ircecho to squelch puppet run alerts
  • 15:13 godog: poweroff ms-be2030 - T204567
  • 15:02 hashar@deploy1001: Started scap: testwiki to php-1.33.0-wmf.19 and rebuild l10n cache # T206673
  • 15:02 otto@deploy1001: scap-helm eventgate-analytics finished
  • 15:02 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 15:02 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 14:58 otto@deploy1001: scap-helm eventgate-analytics finished
  • 14:58 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 14:58 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 14:36 hashar@deploy1001: Pruned MediaWiki: 1.33.0-wmf.19 (duration: 04m 42s)
  • 14:20 hashar: Applied 1.33.0-wmf.19 security patches | T206673
  • 14:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1122 (duration: 00m 45s)
  • 13:37 hashar: cutting deployment branch 1.33.0-wmf.19
  • 13:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 (duration: 00m 46s)
  • 12:14 moritzm: uploaded php7.2 7.2.15-1+0~20190209065123.16+stretch~1.gbp3ad8c0+wmf1 to component/php72 (T216712)
  • 11:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Full repool db1074 (duration: 00m 46s)
  • 11:12 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 38s)
  • 11:12 jijiki: Pooling thumbor2004 - T214597
  • 11:11 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 11:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1074 into API (duration: 00m 45s)
  • 10:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1074 (duration: 00m 46s)
  • 10:35 marostegui: Stop MySQL on db1074 for upgrade
  • 10:20 marostegui: Deploy schema change on db1074, this will generate lag on labsdb for s2 - T86342
  • 10:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1074 (duration: 00m 46s)
  • 10:12 godog: bounce gerrit on gerrit2001 and cobalt after https://gerrit.wikimedia.org/r/c/operations/puppet/+/492633 - T213899
  • 09:10 jynus: temporarily stop dbstore1001:s1 replication to perform new backup system test
  • 09:04 jijiki: Pooling thumbor1003
  • 08:48 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 49s)
  • 08:47 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 08:45 moritzm: installing elfutils security updates
  • 08:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 (duration: 00m 45s)
  • 08:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 (duration: 00m 46s)
  • 08:08 jijiki: Depool and reimage thumbor2004 - T214597
  • 08:07 jijiki: Pooling thumbor2003 - T214597
  • 08:04 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 30s)
  • 08:04 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 07:54 elukey: removed /rmstore-analytics-test-hadoop from zookeeper main-eqiad - T216952
  • 07:45 _joe_: publishing golang:1.11.5-1 docker image
  • 07:44 moritzm: installing tiff security updates
  • 07:02 marostegui: Deploy schema change on s2 codfw (this will generate lag on s2 codfw) T86342
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 (duration: 00m 45s)
  • 06:50 jijiki: Depool and reimage thumbor1003 and thumbor2003 - T214597
  • 06:46 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 07s)
  • 06:46 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 06:45 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 (duration: 00m 45s)
  • 06:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1088 (duration: 00m 45s)
  • 06:41 jijiki: Pooling thumbor1002
  • 06:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 (duration: 00m 46s)
  • 06:34 tgr: T215107 running mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki --ignorestatus 'The_Photographer' 'Wilfredor'
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1088 T86342 (duration: 00m 48s)
  • 06:17 marostegui: Change abuse_filter_log indexes on db1083 - T187295
  • 06:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 T187295 (duration: 00m 51s)
  • 06:10 tgr: T215107 running mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki 'The_Photographer' 'Wilfredor'
  • 04:25 kart_: Finished manual run of unpublished ContentTranslation draft purge script (T216983)
  • 04:24 eileen-sorting-k: civicrm revision changed from d1fc603677 to 224bf15206, config revision is d1826e371b
  • 03:06 kart_: Manual run of unpublished ContentTranslation draft purge script (T216983)

2019-02-25

  • 23:50 XioNoX: Re-enabled BGP to Zayo on cr2-codfw - T215193
  • 23:15 herron: service restarts to make logstash101[012] master eligible are taking longer than expected, leaving elasticsearch on logstash100[456] enabled overnight T213898
  • 22:56 mholloway-shell@deploy1001: Started restart [mobileapps/deploy@1ac3c38]: Restarting mobileapps on scb2003
  • 21:54 eileen: update process-control config revision is d1826e371b
  • 21:14 arlolra@deploy1001: Finished deploy [parsoid/deploy@cb62482]: Updating Parsoid to a8fe45e (duration: 04m 19s)
  • 21:11 herron: turning down elasticsearch service on logstash100[456] (data has been migrated to logstash101[012]) T213898
  • 21:10 arlolra@deploy1001: Started deploy [parsoid/deploy@cb62482]: Updating Parsoid to a8fe45e
  • 21:09 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@1ac3c38]: Update mobileapps to c3871cc (duration: 03m 48s)
  • 21:05 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@1ac3c38]: Update mobileapps to c3871cc
  • 19:58 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Use EventBus multi endpoint configuration for eventbus configs (duration: 00m 45s)
  • 19:53 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Swat! (duration: 00m 45s)
  • 19:46 reedy@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: Disable MFSpecialCaseMainPage for srwiki and enwikivoyage (duration: 00m 46s)
  • 19:41 vgutierrez: restarting pybal on lvs5003 - T213121
  • 19:35 reedy@deploy1001: Synchronized php-1.33.0-wmf.18/extensions/Renameuser: T215107 (duration: 00m 46s)
  • 19:31 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: labs! (duration: 00m 46s)
  • 18:44 krinkle@deploy1001: Synchronized php-1.33.0-wmf.18/includes/libs/objectcache/WANObjectCache.php: 79a1593cae48 / T203786 (duration: 00m 48s)
  • 18:18 jijiki: Pooling thumbor2001
  • 18:18 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 01m 09s)
  • 18:16 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 18:13 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@4c27682]: New GUI, Updater & Blazegraph builds (duration: 09m 53s)
  • 18:04 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@4c27682]: New GUI, Updater & Blazegraph builds
  • 17:59 jijiki: Depooling and reimaging thumbor1002 to stretch - T214597
  • 17:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
  • 17:58 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 17:42 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
  • 17:42 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 17:26 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
  • 17:26 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 16:48 thcipriani@deploy1001: Synchronized README: noop sync for scap 3.9.0-1 (duration: 00m 46s)
  • 16:43 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
  • 16:43 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 16:41 jijiki: Pooling thumbor1001
  • 16:40 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 04s)
  • 16:40 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 16:23 chasemp: reset 2fa for JBennett on phab with video confirmation
  • 16:21 jijiki: Depooling and reimaging thumbor2001 - T214597
  • 16:17 fsero: upload envoy 1.9.0 to stretch-wikimedia T215810
  • 15:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool in API db1085 after MySQL upgrade (duration: 00m 45s)
  • 15:35 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
  • 15:35 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 15:34 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
  • 15:34 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 15:33 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99)
  • 15:33 vgutierrez@cumin1001: START - Cookbook sre.hosts.decommission
  • 15:30 akosiaris@deploy1001: scap-helm citoid finished
  • 15:30 akosiaris@deploy1001: scap-helm citoid cluster staging completed
  • 15:30 akosiaris@deploy1001: scap-helm citoid install -n staging -f citoid-staging-values.yaml stable/citoid [namespace: citoid, clusters: staging]
  • 15:28 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 07s)
  • 15:28 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 15:27 jiji@deploy1001: deploy aborted: (no justification provided) (duration: 00m 04s)
  • 15:27 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 15:20 vgutierrez: shutting down certcentral VMs for decommission - T207389
  • 15:18 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase API traffic for db1085 after MySQL upgrade (duration: 00m 45s)
  • 15:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 after MySQL upgrade (duration: 00m 45s)
  • 14:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 after MySQL upgrade (duration: 00m 45s)
  • 14:49 jiji@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 15s)
  • 14:49 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 14:47 jiji@deploy1001: deploy aborted: (no justification provided) (duration: 00m 19s)
  • 14:46 jiji@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 14:32 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool into API db1085 after MySQL upgrade (duration: 00m 45s)
  • 14:15 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1085 after MySQL upgrade (duration: 00m 45s)
  • 14:04 marostegui: Stop MySQL on db1085 for mysql upgrade
  • 13:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1085 for MySQL upgrade and schema change (duration: 00m 46s)
  • 13:32 akosiaris: upgrade etherpad-lite to 1.7.5
  • 12:38 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 07s)
  • 12:38 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 12:27 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 05s)
  • 12:27 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 12:22 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 01m 15s)
  • 12:21 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 11:49 moritzm: rolling out intel-microcode 3.20180807a.2 on all jessie/stretch servers, tests on a number of previously unsupported servers with Westmere CPU were successful and I've verified that all other microcode files are identical compared to the current 3.20180807a.1 microcode
  • 11:19 jijiki: Reimaging thumbor1001 - T214597
  • 10:40 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546, T202497) (duration: 00m 46s)
  • 10:39 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546, T202497) (duration: 00m 46s)
  • 10:32 gtirloni: labstore1004 restarted nfsd and killed stuck rpc.mountd.real processes (T216988)
  • 10:16 jijiki: Depooling thumbor1001 to reimage - T214597
  • 09:54 marostegui: Deploy schema change on db1074, this will generate lag on labsdb:s2 - T187295
  • 09:07 marostegui@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Increase ParserCache TTL from 24 days to 30 - T210992 (duration: 00m 46s)
  • 08:52 marostegui: Deploy schema change on s2 on codfw master - lag will happen on s2 codfw - T187295
  • 08:49 _joe_: generating mcrouter certificate for mw2151 T192457
  • 07:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1104 after MySQL upgrade (duration: 00m 45s)
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 in API after MySQL upgrade (duration: 00m 45s)
  • 06:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 after MySQL upgrade (duration: 00m 45s)
  • 06:02 marostegui: Stop MySQL on db1104 for mysql upgrade
  • 06:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 for MySQL upgrade (duration: 00m 50s)

2019-02-24

  • 21:49 eileen: civicrm revision changed from 1b5d974569 to d1fc603677, config revision is 00f9c08766
  • 18:20 elukey: clean up 2017/2018 log files in /var/log/jmxtrans on kafka1013-22 - root partitions filling up
  • 18:15 elukey: clean up 2017/2018 log files in /var/log/jmxtrans - root partition almost filled up on kafka1012
  • 10:22 elukey: force remount of /mnt/hdfs on an-coord1001 (fuse-hdfs stuck)

2019-02-22

  • 18:02 gehel: rolling upgrade on elasticsearch / cirrus / eqiad completed - T215931
  • 18:00 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 18:00 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 17:33 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=0)
  • 17:33 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 17:33 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 17:33 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 17:33 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 17:14 bblack: cp5007: repooling into service - T216716
  • 17:13 bblack: cp5006: repooling into service - T216717
  • 17:06 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 17:06 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 16:29 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 16:29 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 15:33 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 15:32 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 15:15 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 15:15 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 14:23 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 14:22 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 14:03 moritzm: removed labvirt1008 from debmonitor (T216661)
  • 14:02 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 14:02 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 13:54 akosiaris: reboot helium for kernel/microcode updates
  • 13:25 moritzm: installing wireshark security updates
  • 13:19 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 13:17 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 13:09 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 13:09 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 13:01 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 13:00 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 12:56 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 12:48 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 12:43 moritzm: rebooting auth1002 for kernel update
  • 12:17 moritzm: rebooting tungsten to pick up updated microcode to address SSBD/L1TF
  • 12:13 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 12:12 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 12:12 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 11:54 moritzm: various reboots of servers with Westmere-EP CPUs to pick up updated microcode to address SSBD/L1TF
  • 11:41 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 11:41 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 11:34 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 11:34 moritzm: rebooting cp1008 for some microcode test
  • 11:33 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 11:32 jijiki: Pooling thumbor2002 after upgrade - T214597
  • 11:20 moritzm: imported intel-microcode 3.20180807a.2 for jessie-wikimedia (T216802)
  • 11:01 godog: swift eqiad set thumbor write ACLs for wikipedia-meta-local-thumb
  • 10:37 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 10:36 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 10:35 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 10:15 jijiki: Pooling thumbor1004 after upgrade - T214597
  • 09:55 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 09:51 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 09:51 moritzm: fixed package state on mw2167
  • 09:38 akosiaris@deploy1001: scap-helm citoid install -n staging -f citoid-staging-values.yaml stable/citoid [namespace: citoid, clusters: staging]
  • 09:33 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 09:33 moritzm: installing tor security update on torrelay1001
  • 09:33 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 09:32 _joe_: set pooled=inactive on mw1272, T211668
  • 09:26 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 09:22 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 16s)
  • 09:22 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 09:22 moritzm: updated tor packages to 0.3.5.8-1~d90.stretch+1
  • 09:18 gehel@cumin2001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 09:16 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 14s)
  • 09:16 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 09:16 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 09:16 gehel: starting rolling upgrade on elasticsearch / cirrus / eqiad - T215931
  • 08:52 godog: force ftpsync run on sodium after debian mirror update
  • 08:19 moritzm: installing uriparser security updates
  • 08:18 godog: temporarily stop prometheus global on prometheus2004 to take a snapshot
  • 07:47 moritzm: installing krb5 updates for jessie
  • 07:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after MySQL upgrade (duration: 00m 46s)
  • 07:28 elukey: manually delete WANCache:v:metawiki:translate-groups from memcache on mc1022 to test fix for T203786
  • 07:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give more traffic to es1013 after MySQL upgrade (duration: 00m 45s)
  • 07:15 _joe_: deactivating mw1272, memory problems
  • 07:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after MySQL upgrade (duration: 00m 45s)
  • 06:51 marostegui: Power cycle mw1272 as it crashed - T211668
  • 06:49 marostegui: Stop MySQL on es1013 to upgrade MySQL
  • 06:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool es1013 for MySQL upgrade (duration: 02m 50s)
  • 06:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1087 after MySQL upgrade (duration: 02m 51s)
  • 06:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1087 for MySQL upgrade (duration: 02m 53s)
  • 06:15 marostegui: Stop MySQL on db1087 for kernel and mysql upgrade
  • 03:26 XioNoX: delete old gr-1/0/0 from cr1-eqsin - T213121
  • 01:58 XioNoX: power-down cp5007 - T216716
  • 01:40 XioNoX: power-down cp5006 - T216717
  • 00:57 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Noop sync of labs settings (duration: 00m 44s)
  • 00:46 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T215931 [cirrus] Switch production search traffic to codfw (2/2) (duration: 00m 46s)
  • 00:45 ebernhardson@deploy1001: sync-file aborted: T215931 [cirrus] Switch production search traffic to codfw (2/2) (duration: 00m 05s)
  • 00:39 ebernhardson@deploy1001: Synchronized wmf-config/Wikibase.php: Deploy WikibaseCirrusSearch: Part III, Wikibase.php (duration: 00m 45s)
  • 00:27 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Deploy WikibaseCirrusSearch: Part II, InitialiseSettings.php (duration: 00m 46s)
  • 00:23 ebernhardson@deploy1001: Synchronized wmf-config/extension-list: Deploy WikibaseCirrusSearch: Part I, extensionlist (duration: 00m 46s)
  • 00:21 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T215931 [cirrus] Switch production search traffic to codfw (1/2) (duration: 00m 45s)
  • 00:18 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T215931 [cirrus] Switch production search traffic to codfw (1/2) (duration: 00m 46s)
  • 00:17 ebernhardson@deploy1001: sync-file aborted: T215931 (duration: 00m 00s)

2019-02-21

  • 22:25 tzatziki: change pw for NazarSusP
  • 22:17 volans: forcing a puppet run on A:ganeti
  • 20:35 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=0)
  • 20:18 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.18
  • 20:06 ladsgroup@deploy1001: Finished deploy [ores/deploy@5d937b1]: Drop accepting pickle altogether (T206333) (duration: 13m 17s)
  • 19:58 bblack: eqsin: repooling user traffic
  • 19:52 ladsgroup@deploy1001: Started deploy [ores/deploy@5d937b1]: Drop accepting pickle altogether (T206333)
  • 19:35 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Drop obsolete Wikibase configs (T213713), Part II (duration: 00m 53s)
  • 19:33 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Drop obsolete Wikibase configs (T213713), Part I (duration: 00m 52s)
  • 19:32 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 19:32 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 19:25 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 19:19 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT Set wmgWikibaseRepoIdGeneratorSeparateDbConnection to true for wikidata (T215147) (duration: 00m 56s)
  • 18:59 ladsgroup@deploy1001: Finished deploy [ores/deploy@2d84709]: Change default task serializer of celery from pickle to json (T206333) (duration: 16m 54s)
  • 18:46 jynus: shutting down db1114 T214720
  • 18:42 ladsgroup@deploy1001: Started deploy [ores/deploy@2d84709]: Change default task serializer of celery from pickle to json (T206333)
  • 18:33 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 18:30 robh: ignore icinga1001 alerts, rebooting it into hardware tests via T214760
  • 18:29 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 18:28 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 18:28 ladsgroup@deploy1001: Finished deploy [ores/deploy@5d50713]: (no justification provided) (duration: 14m 37s)
  • 18:13 ladsgroup@deploy1001: Started deploy [ores/deploy@5d50713]: (no justification provided)
  • 17:54 robh: cp5007 rebooting into bios update and hardware testing via T216716
  • 17:47 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 17:11 bblack: eqsin: restarting all varnish frontends to wipe cache after purge loss (site currently depooled) (skipping 5006/7 since they're being rebooted for bios flashing anyways)
  • 17:10 robh: rebooting cp5006 to flash bios in memory troubleshooting steps via T216717
  • 16:50 bblack: eqsin: restarting all varnish backends to wipe cache after purge loss (site currently depooled)
  • 16:41 volans: applied hot band-aid patch to spicerack/remote.py on cumin2001 ( https://gerrit.wikimedia.org/r/c/operations/software/spicerack/+/481858 )
  • 16:38 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 16:23 herron: updated phabricator.wikimedia.org spf record T216714
  • 16:22 fsero: uploading scap3 3.9.0.1 package to trusty, jessie and stretch T216666
  • 16:20 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 16:18 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 16:17 fsero: uploading scap3 3.9.0.1 package to trusty, jessie and stretch
  • 16:17 fsero: updating scap3 to 3.9.0-1
  • 15:57 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 15:52 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 15:23 moritzm: installing krb5 updates for jessie
  • 15:07 herron: migrating ES shards away from logstash100[456] with "cluster.routing.allocation.exclude._name" : "logstash1004-production-logstash-eqiad,logstash1005-production-logstash-eqiad,logstash1006-production-logstash-eqiad" T214608
  • 14:50 gehel@cumin2001: END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0)
  • 14:50 gehel@cumin2001: START - Cookbook sre.elasticsearch.force-shard-allocation
  • 14:41 bmansurov@deploy1001: Finished deploy [recommendation-api/deploy@600e689]: Update to 0bb0a07 (duration: 04m 59s)
  • 14:37 bblack: restart vhtcpd on cp5002 to debug multicast loss
  • 14:36 bmansurov@deploy1001: Started deploy [recommendation-api/deploy@600e689]: Update to 0bb0a07
  • 13:57 godog: depool and reimage logstash1007 - T213898
  • 13:25 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 13:20 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 16s)
  • 13:19 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 13:19 gehel@cumin2001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 13:19 jbond42: restarting hhvm and updating apache on deploy1001.eqiad.wmnet
  • 13:18 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 13:18 gehel: restarting rolling upgrade on elasticsearch / cirrus / codfw - T215931
  • 12:50 jbond42: restarting hhvm and updating apache on mwmaint1002.eqiad.wmnet
  • 12:44 zeljkof: EU SWAT finished
  • 12:42 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add img.raremaps.com at wgCopyUploadsDomains (T216638) (duration: 00m 52s)
  • 12:40 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 20s)
  • 12:39 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 12:38 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Throttle rule for National Gallery of Canada Library and Archives edit-a-thon (T216642) (duration: 00m 53s)
  • 12:33 arturo: disable puppet in cloudnet2001-dev to test T216497
  • 12:31 akosiaris@deploy1001: scap-helm mathoid finished
  • 12:31 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 12:30 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 12:30 akosiaris@deploy1001: scap-helm mathoid upgrade --recreate-pods -f mathoid-values.yaml production stable/mathoid [namespace: mathoid, clusters: eqiad,codfw]
  • 12:27 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: (no justification provided) (duration: 00m 38s)
  • 12:26 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: (no justification provided)
  • 12:24 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: l thumbor2002.codfw.wmnet (duration: 00m 04s)
  • 12:24 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: l thumbor2002.codfw.wmnet
  • 12:24 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: l thumbor2002 (duration: 00m 08s)
  • 12:24 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: l thumbor2002
  • 12:23 arturo: importing openstack mitaka packages to reprepro @ install1002 (T216497)
  • 12:17 arturo: enable puppet in install1002 (done testing T216497)
  • 12:14 zfilipin@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: SWAT: Disable mobile main page special casing on huwiki (T216563) (duration: 00m 54s)
  • 12:13 gilles@deploy1001: Finished deploy [3d2png/deploy@ca39432]: Updating repo (duration: 00m 29s)
  • 12:13 gilles@deploy1001: Started deploy [3d2png/deploy@ca39432]: Updating repo
  • 12:10 arturo: T216497 import reprepro key 7638D0442B90D010 (debian archive automatic signing key (8/jessie))
  • 12:01 arturo: disable puppet in install1002 to test T216497
  • 11:13 volans: upgraded spicerack to 0.0.19 on cumin[12]001
  • 11:11 volans: uploaded spicerack_0.0.19-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 10:55 akosiaris: upgrade mathoid staging+production to latest helm chart
  • 10:47 akosiaris@deploy1001: scap-helm mathoid finished
  • 10:47 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 10:47 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 10:47 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml production stable/mathoid [namespace: mathoid, clusters: eqiad,codfw]
  • 10:29 akosiaris@deploy1001: scap-helm mathoid finished
  • 10:29 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 10:29 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-staging-values.yaml staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 10:28 akosiaris@deploy1001: scap-helm mathoid finished
  • 10:28 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 10:28 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 10:27 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml stable/mathoid [namespace: mathoid, clusters: staging]
  • 10:26 akosiaris@deploy1001: scap-helm list finished
  • 10:26 akosiaris@deploy1001: scap-helm list cluster codfw completed
  • 10:26 akosiaris@deploy1001: scap-helm list cluster eqiad completed
  • 10:26 akosiaris@deploy1001: scap-helm list [namespace: list, clusters: eqiad,codfw]
  • 10:23 godog: on boron unblock trusty builds with umount /var/cache/pbuilder/base-trusty-amd64.cow/dev/ptmx
  • 10:04 akosiaris: create citoid namespace on kubernetes eqiad codfw staging clusters T213194
  • 10:04 akosiaris: create cxserver namespace on kubernetes eqiad codfw staging clusters T213195
  • 09:35 volans: force rebooting unresponsive icinga1001 T214760
  • 09:29 marostegui: Deploy schema change on s3 primary master (db1078) - T210713
  • 09:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1075 T210713 (duration: 00m 52s)
  • 09:14 moritzm: temporarily stop prometheus@labs.service on labmon for journald restarts (part of security update)
  • 08:40 marostegui: Deploy schema change on db1075 - T210713
  • 08:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1075 T210713 (duration: 00m 54s)
  • 08:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 T210713 (duration: 00m 53s)
  • 07:44 moritzm: rolling out remaining systemd security updates on jessie
  • 07:12 marostegui: Deploy schema change on db1077 - this will generate lag on labsdb:s3 T210713
  • 07:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 T210713 (duration: 00m 56s)
  • 07:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 T210713 (duration: 00m 55s)
  • 06:22 marostegui: Deploy schema change on db1123 - T210713
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 T210713 (duration: 00m 57s)
  • 05:46 bblack: repooling cp5010 - T214274
  • 05:42 bblack: removing cp5010 downtimes from icinga - T214274
  • 05:34 bblack: rebooting cp5010 for device name on swapped disk (depooled) - T214274
  • 04:30 kart_: Finished: Fifth manual run of unpublished draft purge script for ContentTranslation (T216470)
  • 04:16 XioNoX: Unplug Tata/NTT/PCCW from cr1-eqsin - T213121
  • 03:21 XioNoX: replace cp5010 disk 1 - T214274
  • 03:15 kart_: Fifth manual run of unpublished draft purge script for ContentTranslation (T216470)
  • 02:44 XioNoX: depool eqsin - T213121
  • 02:31 twentyafterfour: phabricator upgrade finished, service appears to be returned to normal
  • 01:43 twentyafterfour: running phabricator database schema changes
  • 01:38 twentyafterfour: now taking phabricator offline for upgrade
  • 01:15 twentyafterfour: Taking phabricator offline momentarily for upgrade
  • 01:01 twentyafterfour: set downtime in icinga for phab100*
  • 00:17 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable partial blocks on metawiki and mediawikiwiki (T216065) (duration: 00m 54s)
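
The 15:07 entry above quotes a "cluster.routing.allocation.exclude._name" setting used to drain shards from logstash100[456]. As a minimal sketch only, the snippet below builds a request body of the shape Elasticsearch's cluster update settings API (PUT _cluster/settings) accepts. The helper name build_exclude_settings and the choice of the "transient" scope are assumptions made for illustration; the log does not say which scope or which tool was actually used.

```python
import json


def build_exclude_settings(node_names, scope="transient"):
    """Build a cluster-settings body that excludes the given nodes by name.

    Hypothetical helper: the log only records the setting key and value,
    not how it was submitted. "scope" (transient vs. persistent) is an
    assumption and should be chosen by the operator.
    """
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    return {
        scope: {
            # Comma-separated node names, as quoted in the log entry.
            "cluster.routing.allocation.exclude._name": ",".join(node_names)
        }
    }


# Node names taken verbatim from the 15:07 log entry.
nodes = [
    f"logstash{n}-production-logstash-eqiad" for n in (1004, 1005, 1006)
]
body = build_exclude_settings(nodes)
print(json.dumps(body, indent=2))
```

Setting the same key to an empty string would lift the exclusion again once the hosts are ready to hold shards.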

2019-02-20

  • 23:59 ppchelko@deploy1001: Finished deploy [changeprop/deploy@5e4486a]: Purge varnish on revision restrictions (duration: 01m 23s)
  • 23:57 ppchelko@deploy1001: Started deploy [changeprop/deploy@5e4486a]: Purge varnish on revision restrictions
  • 21:48 eileen: civicrm revision changed from 165fbf5894 to 1b5d974569, config revision is ccefa3716b
  • 21:46 arlolra: Updated Parsoid to 9b204a0 (T153080, T169975, T215824)
  • 21:28 arlolra@deploy1001: Finished deploy [parsoid/deploy@c4574d1]: Updating Parsoid to 9b204a0 (duration: 09m 33s)
  • 21:19 arlolra@deploy1001: Started deploy [parsoid/deploy@c4574d1]: Updating Parsoid to 9b204a0
  • 21:08 _joe_: rolling restart of php-fpm to catch up with the tideways change
  • 20:35 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.18 (duration: 00m 53s)
  • 20:14 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.18/extensions/EventBus/includes/EventBusRCFeedEngine.php: Check for eventServiceName in config before accessing T216561 (duration: 00m 55s)
  • 18:30 fdans@deploy1001: Finished deploy [analytics/refinery@ccf837e]: deploying refinery for new wikis and changes in scripts (duration: 11m 13s)
  • 18:24 mobrovac@deploy1001: Finished deploy [restbase/deploy@80f518c]: Remove VE request logging - T215956 (duration: 20m 19s)
  • 18:19 fdans@deploy1001: Started deploy [analytics/refinery@ccf837e]: deploying refinery for new wikis and changes in scripts
  • 18:04 mobrovac@deploy1001: Started deploy [restbase/deploy@80f518c]: Remove VE request logging - T215956
  • 17:22 sbisson@deploy1001: Synchronized php-1.33.0-wmf.18/extensions/Flow/modules/mw.flow.Initializer.js: SWAT: Unbreak reply clicks with existing widget (duration: 00m 58s)
  • 17:08 hashar: contint1001: fix broken root ownership on zuul git deploy repo: sudo find /etc/zuul/wikimedia/.git -not -user zuul -exec chown zuul:zuul {} +
  • 16:49 herron: migrating es shards away from logstash100[56] with "cluster.routing.allocation.exclude._name" : "logstash1005-production-logstash-eqiad,logstash1006-production-logstash-eqiad" T214608
  • 16:40 twentyafterfour: started phd again, seems to be working now without killing the db
  • 16:38 bblack: multatuli: upgrade gdnsd to 3.0.0-1~wmf1
  • 16:36 godog: depool and reimage logstash1008 with stretch - T213898
  • 16:26 twentyafterfour: stopped phd on phab1001 and scheduled downtime in icinga
  • 16:24 bblack: authdns1001: upgrade gdnsd to 3.0.0-1~wmf1
  • 16:19 twentyafterfour: stopped phd on phab1002
  • 16:03 ottomata: removing spark 1 from Analytics cluster - T212134
  • 15:55 bblack: authdns2001: upgrade gdnsd to 3.0.0-1~wmf1
  • 15:37 fsero: restarting docker-registry service on systemd
  • 15:35 moritzm: temporarily stop prometheus instances on prometheus1004 for systemd upgrade/journald restart
  • 14:43 gehel@cumin2001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 14:35 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 14:35 volans: upgraded spicerack to 0.0.18 on cumin[12]001
  • 14:34 volans: uploaded spicerack_0.0.18-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 14:00 gehel@cumin2001: END (ERROR) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=97)
  • 14:00 gehel@cumin2001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 13:59 gehel: rolling upgrade of elasticsearch / cirrus / codfw to 5.6.14 - T215931
  • 13:51 godog: prometheus on prometheus2004 crashed/exited after journald upgrade -- starting up again now
  • 13:00 jbond42: rolling restarts for hhvm in eqiad
  • 12:28 volans: upgraded spicerack to 0.0.17 on cumin[12]001
  • 12:25 volans: uploaded spicerack_0.0.17-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 12:08 moritzm: restarted ircecho on kraz.wikimedia.org
  • 11:46 jbond42: rolling restarts for hhvm in codfw
  • 11:28 akosiaris: rebuild and re-upload rsyslog_8.38.0-1~bpo9+1wmf1_amd64.changes to apt.wikimedia.org/stretch-wikimedia to have mmkubernetes package
  • 10:36 marostegui: Deploy schema change on db1095:3313 - T210713
  • 10:04 marostegui: Deploy schema change on dbstore1004:3313 - T210713
  • 09:57 moritzm: installing systemd security updates on jessie hosts
  • 09:33 marostegui: Deploy schema change on db2043 (s3 codfw master), lag will be generated on s3 codfw - T210713
  • 09:06 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1109 (duration: 00m 52s)
  • 08:48 moritzm: powercycling rdb1001 for a test
  • 07:45 moritzm: installing gnupg2 updates on stretch
  • 07:14 marostegui: Deploy schema change on s1 primary master (db1067) - T210713
  • 07:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 T210713 (duration: 00m 52s)
  • 07:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 after kernel upgrade (duration: 00m 52s)
  • 06:54 oblivian@deploy1001: Synchronized wmf-config/profiler.php: Fix the tideways setup (duration: 00m 52s)
  • 06:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 after kernel upgrade (duration: 00m 52s)
  • 06:47 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 T210713 (duration: 00m 51s)
  • 06:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1119 T210713 (duration: 00m 51s)
  • 06:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 after kernel upgrade (duration: 00m 52s)
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 after kernel upgrade (duration: 00m 52s)
  • 06:18 marostegui: Stop MySQL on db1109 for kernel and mysql upgrade
  • 06:18 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 for kernel and mysql upgrade (duration: 00m 52s)
  • 06:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 T210713 (duration: 01m 05s)
  • 04:45 XioNoX: add avoid-paths WIRESTAR-OPTICALTEL to cr2-eqdfw
  • 02:15 mobrovac@deploy1001: Finished deploy [restbase/deploy@751dc5c]: Temporarily collect VE request logs for T215956 (duration: 22m 37s)
  • 01:52 mobrovac@deploy1001: Started deploy [restbase/deploy@751dc5c]: Temporarily collect VE request logs for T215956
  • 00:24 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.17/skins/MinervaNeue/resources/skins.minerva.content.styles/lists.less: Revert switch to outside list style from ordered lists (duration: 00m 52s)
  • 00:23 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.18/skins/MinervaNeue/resources/skins.minerva.content.styles/lists.less: Revert switch to outside list style from ordered lists (duration: 00m 59s)
  • 00:05 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: SWAT T215969 Return cirrussearch master timeout back to the default value (duration: 00m 57s)

2019-02-19

  • 23:51 ebernhardson: restarted ferm on relforge1001
  • 23:50 ebernhardson: temporarily stop ferm on relforge1001 to test where a connection is being blocked
  • 20:49 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.18
  • 20:34 thcipriani@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.18 and rebuild l10n cache (duration: 30m 31s)
  • 20:07 gehel@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=0)
  • 20:04 thcipriani@deploy1001: Started scap: testwiki to php-1.33.0-wmf.18 and rebuild l10n cache
  • 20:01 gehel@cumin1001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 19:57 thcipriani: restarting ci-jenkins for plugin update
  • 19:49 thcipriani@deploy1001: Pruned MediaWiki: 1.33.0-wmf.13 (duration: 11m 52s)
  • 19:39 gtirloni: re-pooled labsdb1011 T216481
  • 19:09 andrewbogott: rebooting cloudvirt1009 to poke around in the bios
  • 18:20 thcipriani: starting branch-cut for 1.33.0-wmf.18
  • 17:55 herron: temporarily increased eqiad logstash elasticsearch low disk watermark to 87% (will restore to 85% when eqiad expansion hosts are fully online)
  • 17:52 jijiki: Restarting memcache on mc1027 - T208844
  • 17:00 hashar: Offlined compiler1002.puppet-diffs.eqiad.wmflabs from Jenkins. Its disk is corrupt | T216513
  • 16:39 gtirloni: depooled labsdb1011 T216481
  • 16:33 moritzm: installing libssh update from stretch point release
  • 16:28 jforrester@deploy1001: Synchronized php-1.33.0-wmf.17/includes/specials/pagers/ActiveUsersPager.php: T216200 Hot deploy variable name fix for ActiveUsersPager query (duration: 00m 48s)
  • 16:26 herron: enabling elasticsearch on new eqiad hosts logstash101[0-2]
  • 16:18 gtirloni: re-pooled labsdb1010 T216481
  • 16:07 gehel@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 16:04 gehel@cumin1001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 15:47 jijiki: Reimaging thumbor2002 to stretch - T214597
  • 15:38 gehel@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 15:32 hashar: apt-get upgrade on compiler1001 and compiler1002.puppet-diffs.eqiad.wmflabs
  • 15:27 gehel@cumin1001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 15:25 hashar: Started instance compiler1002.puppet-diffs.eqiad.wmflabs via Horizon. It was in shutoff state | T216513
  • 15:10 _joe_: uploading tideways-xhprof_5.0.0~beta3 to reprepro T176916
  • 15:09 gtirloni: depooled labsdb1010 T216481
  • 14:53 jynus: stopping db2089 for hw maintenance T216240
  • 14:41 gehel@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 14:40 gehel@cumin1001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 14:36 gehel@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 14:35 gehel@cumin1001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 14:31 gehel@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-upgrade (exit_code=99)
  • 14:30 gehel@cumin1001: START - Cookbook sre.elasticsearch.rolling-upgrade
  • 14:30 gehel: rolling upgrade of elasticsearch on relforge - T215931
  • 14:16 jynus: stop db2090 for reboot testing T216240
  • 14:04 gtirloni: running `maintain-views --all-databases --replace-all --clean --debug` on labsdb1010 (T216481)
  • 13:44 Amir1: mwscript maintenance/createAndPromote.php --wiki=testwikidatawiki --force --interface-admin Ladsgroup
  • 13:43 Amir1: ladsgroup@mwmaint1002:~$ mwscript maintenance/createAndPromote.php --wiki=testwikidatawiki --force --sysop Ladsgroup (T215919)
  • 13:31 moritzm: installing rssh update for jessie
  • 13:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1118 T210713 (duration: 00m 46s)
  • 13:23 gtirloni: running `maintain-views --all-databases --replace-all --clean --debug` on labsdb1009 (T216481)
  • 12:57 zeljkof: EU SWAT finished
  • 12:56 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule for Kickstarter Edit-a-thon (T215839) (duration: 00m 43s)
  • 12:50 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set $wgArticleCountMethod = any on fiwikinews (T216333) (duration: 00m 45s)
  • 12:37 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespace Додатак on srwiktionary (T216343) (duration: 00m 46s)
  • 12:29 _joe_: creating gerrit repo operations/debs/tideways-xhprof T176916
  • 12:28 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule for WikiProject Women in red, enwiki (T215295) (duration: 00m 47s)
  • 12:19 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Modifying configuration about Chinese Wikiversity (T212919) (duration: 00m 48s)
  • 11:59 marostegui: Deploy schema change on db1118 - T210713
  • 11:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1118 T210713 (duration: 00m 46s)
  • 11:53 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1064 (duration: 00m 46s)
  • 11:49 moritzm: installing ruby-rack security updates
  • 11:26 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T187299 Launch performance perception survey on eswiki (duration: 00m 46s)
  • 11:22 jynus: stop and restart db1064
  • 11:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 T210713 (duration: 00m 46s)
  • 11:10 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1064 (duration: 00m 46s)
  • 11:05 marostegui: Deploy schema change on dbstore1002
  • 10:25 marostegui: Deploy schema change on db1083 - T210713
  • 10:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 T210713 (duration: 00m 46s)
  • 10:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 after kernel upgrade (duration: 00m 46s)
  • 09:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: More traffic to db1093 after kernel upgrade (duration: 00m 46s)
  • 09:42 mforns@deploy1001: Finished deploy [analytics/refinery@0d7ec19]: deploying refinery to update EL sanitization whitelist (duration: 07m 49s)
  • 09:34 mforns@deploy1001: Started deploy [analytics/refinery@0d7ec19]: deploying refinery to update EL sanitization whitelist
  • 09:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 T210713 (duration: 00m 45s)
  • 09:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1093 on API after kernel upgrade (duration: 00m 46s)
  • 09:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 after kernel upgrade (duration: 00m 46s)
  • 08:56 _joe_: experimenting with php-fpm configuration on mwdebug1001 for T176916
  • 08:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1093 for kernel upgrade (duration: 00m 45s)
  • 08:55 hashar: Cleaning contint1001 / partition
  • 08:50 marostegui: Deploy schema change on db1089 - T210713
  • 08:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 T210713 (duration: 00m 46s)
  • 08:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1106 T210713 (duration: 00m 49s)
  • 07:51 marostegui: Drop ep_* tables on s1 - T174802
  • 07:50 moritzm: installing systemd security updates on stretch
  • 07:46 marostegui: Reboot db1106 for kernel upgrade (and remove debug from kernel) T216240 T216273
  • 07:21 marostegui: Drop ep_* tables on s3 - T174802
  • 06:56 marostegui: Deploy schema change on db1106 - this will generate lag on labsdb:s1 T210713
  • 06:56 marostegui: Deploy schema change on db1106 - T210713
  • 06:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1106 T210713 (duration: 00m 52s)
  • 05:31 XioNoX: delete local pref for peering sessions in ulsfo - T204281
  • 05:17 XioNoX: deleted previously deactivated BGP_community_actions terms - T204281
  • 00:01 XioNoX: disable BGP to Zayo on cr2-codfw for intrusive testing - T215193

2019-02-18

  • 20:19 gtirloni: icinga2001 ran puppet ahead of schedule (enable tools-checker-toolsdb monitor)
  • 18:26 jynus: setting clouddb1001 in read_write mode
  • 18:14 volans: upgraded to spicerack 0.0.16-1 cumin[12]001
  • 18:12 volans: uploaded spicerack_0.0.16-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 18:08 jynus: killing mysql on labsdb1005
  • 18:08 jynus: disabled puppet and edited my.cnf on labsdb1005
  • 17:56 jynus: restarting labsdb1004
  • 17:53 jynus: set clouddb1001 in read_only=1
  • 17:50 jijiki: Reimaging thumbor1004 to stretch - T214597
  • 15:41 jynus: performing es2 & es3 backups into es2002
  • 15:21 jynus: move logical backups to subdirectory T210292
  • 14:29 moritzm: rebooting mw2167 for kernel tests
  • 13:59 marostegui: Drop ep_* tables from s7 - T174802
  • 13:25 jijiki: Depooling thumbor1004 to check if the rest of our hosts can handle the load without it - T214597
  • 12:34 moritzm: installing brltty bugfix update from stretch point release
  • 12:31 moritzm: upgrading stat1005 to buster
  • 12:28 XioNoX: update clouddb_return term from cloud-in4 on cr1/2-eqiad - T216353
  • 11:53 moritzm: installing hdparm bugfix update from stretch point release
  • 11:36 moritzm: installing uriparser security updates
  • 11:11 moritzm: installing c3p0 security updates
  • 10:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 T210713 (duration: 00m 46s)
  • 10:54 jijiki: Reimaging thumbor2002 to stretch - T214597
  • 10:40 marostegui: Drop tables ep_* from s2 (cswiki nlwiki ptwiki svwiki) T174802
  • 09:50 marostegui: Deploy schema change on db1105:3311 T210713
  • 09:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 T210713 (duration: 00m 46s)
  • 09:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 T210713 (duration: 00m 46s)
  • 09:28 marostegui: Drop ep_* from s6 (ruwiki) - T174802
  • 09:16 marostegui: Deploy schema change on db1099:3311 - T210713
  • 09:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 T210713 (duration: 00m 48s)
  • 09:08 marostegui: Deploy schema change on dbstore1003:3311 and dbstore1001:3311 - T210713
  • 08:27 marostegui: Drop ep_* tables from s5 (srwiki) - T174802
  • 08:23 marostegui: Deploy schema change on s1 codfw master (db2048), lag will be generated on s1 codfw - T210713
  • 07:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1119 after mysql upgrade (duration: 00m 46s)
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1119 into API service after mysql upgrade (duration: 00m 46s)
  • 06:49 marostegui: Reboot db2085 to disable debug mode on kernel T216273
  • 06:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1119 after mysql upgrade (duration: 00m 46s)
  • 06:29 marostegui: Stop MySQL on db1119 for mysql and kernel upgrade
  • 06:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 for mysql upgrade (duration: 01m 01s)
  • 05:55 marostegui: Deploy schema change on s8 primary master (db1071) - T210713
  • 05:52 marostegui: Set dbstore1002 on read only to start the migration T210478 T215589

2019-02-17

  • 21:20 bstorm_: The slave of labsdb1005.eqiad.wmnet is now clouddb1001.clouddb-services.eqiad.wmflabs
  • 13:14 XioNoX: add term labsdb_return to cloud-in4 - T216353

2019-02-16

  • 16:26 ariel@deploy1001: Finished deploy [dumps/dumps@8f83eea]: fix up multistream index file recombines for large files; better errors for misc dumps failures (duration: 00m 03s)
  • 16:25 ariel@deploy1001: Started deploy [dumps/dumps@8f83eea]: fix up multistream index file recombines for large files; better errors for misc dumps failures
  • 14:21 arturo: T194855 cloudvirt1020 is powered off, waiting for disk setup before installing
  • 00:20 XioNoX: add port 22 in cloud-in4 term labsdb

2019-02-15

  • 20:40 andrewbogott: enabled virtualization (all three settings) on cloudvirt1019
  • 19:41 arturo: T193264 reimaging cloudvirt1019 to get mitaka/stretch
  • 18:51 arturo: T193264 icinga downtime cloudvirt1019 for 1 week
  • 18:44 bstorm_: stopped replication and then mariadb on labsdb1004
  • 16:52 cdanis: correction, needed to increment version; adding backported rasdaemon 0.6.0-1.2+deb8u2 to jessie-wikimedia
  • 16:48 cdanis: adding backported rasdaemon 0.6.0-1.2+deb8u1 to jessie-wikimedia
  • 16:29 bblack: reprepro: uploaded gdnsd-3.0.0-1~wmf1 to stretch-wikimedia
  • 15:45 moritzm: rebooting auth1001 for kernel security update
  • 14:50 moritzm: installing unbound update from stretch point release
  • 14:45 moritzm: removed labvirt1012 from debmonitor (got renamed to cloudvirt1012) (T216190)
  • 14:06 moritzm: rebooting mwlog1001 for kernel security update
  • 13:54 moritzm: rebooting mwlog2001 for kernel security update
  • 13:46 jbond42: install tar security updates
  • 13:19 moritzm: rolling reboot of mwdebug servers in eqiad to pick up SSBD-enabled qemu
  • 13:12 gtirloni: reboot cloudvirt1020
  • 13:11 arturo: T216239 labvirt1019 has been drained of any workload
  • 13:06 moritzm: installing NSS security updates
  • 12:42 moritzm: installing squid3 security updates
  • 12:30 jynus: stop db2089 mysql instances for reboot testing T216240
  • 12:30 arturo: T216239 schedule 1week of icinga downtime for labvirt1019
  • 10:48 akosiaris: upgrade docker on contint2001 to 18.06.2 T216236
  • 10:42 akosiaris: upgrade docker on contint1001 to 18.06.2 T216236
  • 10:35 gtirloni: reboot cloudvirt1019
  • 09:44 gehel: repool maps100[12]
  • 09:33 moritzm: imported php-defaults debs to thirdparty/php72
  • 08:42 akosiaris: restart gerrit to pick up https://gerrit.wikimedia.org/r/490640 T177868
  • 08:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 (duration: 00m 46s)
  • 08:28 moritzm: rolling restart of apertium to pick up Python 3.4 security update
  • 07:55 godog: bounce prometheus@ops on prometheus2004 to take a snapshot
  • 06:41 marostegui: Stop puppet on labsdb1005 to leave "max_user_connections" on my.cnf - T216170 T216208
  • 06:39 marostegui: Restart labsdb1005 with max_user_connections = 20 T216208
  • 06:17 marostegui: Deploy schema change on db1109 - T210713
  • 06:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 49s)
  • 06:13 marostegui: Reload haproxy on dbproxy11 to repool labsdb1009
  • 00:39 mutante: puppetmaster1001: sudo puppet node clean bast3003.wikimedia.org ; sudo puppet node deactivate bast3003.wikimedia.org (T216199)
  • 00:15 jynus: setting labsdb1005 back into read-write

2019-02-14

  • 23:47 jynus: restarting labsdb1005 mysql in read only mode
  • 23:37 niharika29@deploy1001: Finished deploy [scholarships/scholarships@25ea138]: Update app with updated dependencies to mitigate PHPMailer error T215302 (duration: 00m 02s)
  • 23:37 niharika29@deploy1001: Started deploy [scholarships/scholarships@25ea138]: Update app with updated dependencies to mitigate PHPMailer error T215302
  • 22:07 andrewbogott: rebuilding labvirt1012 as cloudvirt1012, T216190
  • 20:38 bstorm_: Restarted mariadb on labsdb1005 for https://wikitech.wikimedia.org/wiki/Incident_documentation/20190214-labsdb1005
  • 20:09 ejegg: updated fundraising CiviCRM from 02ea871b88 to 165fbf5894
  • 19:42 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.17/extensions/GrowthExperiments/modules/help: SWAT: Help Panel: Fix IME broken in help panel search T216131 (duration: 00m 54s)
  • 19:14 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Stop NavPopups gadget conflict with PagePreviews on Wikivoyage T214878 (duration: 00m 54s)
  • 19:01 mutante: scandium - deleting parsoid clone dir and running puppet one more time, to fix permissions to allow wikidev
  • 18:52 mutante: scandium - deleting parsoid clone dir and running puppet one more time, to fix permissions to allow wikidev
  • 18:12 mutante: scandium - deleting parsoid clone dir and running puppet
  • 18:03 fsero: upgrading tiller to 2.12.2 on eqiad
  • 17:34 godog: bounce rsyslog on wezen/lithium, tls listener timeout in icinga
  • 16:59 moritzm: restarting apertium-apy on scb1001 to pick up Python security update
  • 16:39 marostegui: Depool labsdb1009 - T210713
  • 16:26 fsero: upgrading tiller on codfw
  • 16:11 fsero: updating tiller version on staging cluster
  • 16:10 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2085 - T214840 (duration: 00m 52s)
  • 15:50 fsero: building and publishing new tiller docker image on boron
  • 15:50 END: (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) (volans@cumin1001)
  • 15:43 START: - Cookbook sre.hosts.upgrade-and-reboot (volans@cumin1001)
  • 15:28 volans: upgraded spicerack to v0.0.15 on cumin[12]001
  • 15:26 volans: uploaded spicerack_0.0.15-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 15:12 marostegui: Clear idrac logs from db2085 - T214840
  • 14:45 godog: depool and stop logstash1009 for stretch reimage - T213898
  • 14:20 marostegui: Stop MySQL on db2085 for on-site maintenance - T214840
  • 14:12 jijiki: Enabling puppet on thumbor* servers - T214597
  • 13:39 arturo: T215892 icinga downtime cloudvirt1024 for 2 weeks
  • 12:22 zeljkof: EU SWAT finished
  • 12:21 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.17/extensions/ExternalGuidance/: SWAT: Fix the eventlogging schema definition as per manifest_version=2 (duration: 00m 55s)
  • 11:43 _joe_: restarting hhvm on mw1338, hot tc exhausted T216084
  • 11:04 _joe_: upgrading python3-etcd on stretch T209136
  • 11:03 jbond42: rolling security updates for curl
  • 11:02 jijiki: Disabling puppet on thumbor* servers - T214597
  • 10:59 moritzm: installing python3.4 security updates
  • 10:53 godog: bounce prometheus instances on prometheus2004 to take a snapshot
  • 08:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1106 T214840 (duration: 00m 52s)
  • 07:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1087 T210713 (duration: 00m 54s)
  • 07:36 marostegui: Stop MySQL on db1106 for reboot - T214840
  • 06:10 marostegui: Deploy schema change on db1087 with replication, lag will be generated on labsdb:s8 T210713
  • 06:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1087 T210713 (duration: 00m 55s)
  • 01:52 mutante: scandium - removing parsoid deploy dir and letting puppet re-clone it after merging gerrit fix 484602 - replace manual clone with proper puppetization (T201366)
  • 01:52 mutante: scandium - removing parsoid deploy dir and letting puppet re-clone it after merging gerrit fix 484602 - replace manual hack with proper puppet
  • 01:15 mutante: phab1001 - phabricator mail config converted to cluster.mailers to adjust to upstream change (T212989)
  • 00:36 bd808@deploy1001: Finished deploy [scholarships/scholarships@1d89fe2]: Live hack PHPMailer namespace T215302 (duration: 00m 02s)
  • 00:36 bd808@deploy1001: Started deploy [scholarships/scholarships@1d89fe2]: Live hack PHPMailer namespace T215302
  • 00:32 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable ORES (damaging only) on itwiki (T211032) (duration: 00m 53s)
  • 00:24 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable help panel search on cswiki and kowiki (T209301) (duration: 00m 55s)

2019-02-13

  • 23:42 niharika29@deploy1001: Finished deploy [scholarships/scholarships@1d89fe2]: Update scholarships app for 2019 cycle T215302 (duration: 00m 02s)
  • 23:42 niharika29@deploy1001: Started deploy [scholarships/scholarships@1d89fe2]: Update scholarships app for 2019 cycle T215302
  • 21:31 jijiki: Restarting nutcracker on scb100*.eqiad.wmnet
  • 20:54 mutante: ruthenium - shell access for parsoid-testers revoked by puppet, please use scandium.eqiad.wmnet (T201366)
  • 20:44 otto@deploy1001: Started restart [eventstreams/deploy@07033d4]: bouncing eventstreams to apply page-links-change stream config
  • 20:43 mutante: ms-be2021 - powercycling
  • 20:09 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.17 (duration: 00m 53s)
  • 20:08 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.17
  • 19:55 mforns@deploy1001: Finished deploy [analytics/refinery@5f1461e]: Deploying analytics refinery with refinery-source v0.0.85 jars (duration: 07m 36s)
  • 19:48 mforns@deploy1001: Started deploy [analytics/refinery@5f1461e]: Deploying analytics refinery with refinery-source v0.0.85 jars
  • 18:13 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1014 (duration: 00m 52s)
  • 18:06 godog: reimage prometheus2003 - T187987
  • 18:01 krinkle@deploy1001: Synchronized php-1.33.0-wmf.17/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Id70fdfa62ef / T215611 (duration: 00m 55s)
  • 17:49 marostegui: Stop MySQL on db1114 for onsite maintenance - T214720
  • 17:25 jijiki: Pooling mw1299 back - T215569
  • 17:06 cmjohnson1: db1106, troubleshooting idrac issue and updating f/w
  • 16:58 otto@deploy1001: scap-helm eventgate-analytics finished
  • 16:58 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 16:58 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 16:30 elukey: reimage stat1005 to Debian Buster (again)
  • 16:22 otto@deploy1001: scap-helm list finished
  • 16:22 otto@deploy1001: scap-helm list cluster staging completed
  • 16:22 otto@deploy1001: scap-helm list [namespace: list, clusters: staging]
  • 16:13 otto@deploy1001: scap-helm eventgate-analytics finished
  • 16:13 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 16:13 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 15:46 marostegui: Stop MySQL on db1106 for onsite maintenance - this will generate lag on s1 labs - T214840
  • 15:28 jynus: stop and upgrade es1014
  • 15:27 otto@deploy1001: scap-helm eventgate-analytics finished
  • 15:27 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 15:27 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 15:27 otto@deploy1001: scap-helm eventgate-analytics upgrade staging stable/eventgate-analytics [namespace: eventgate-analytics, clusters: eqiad,codfw]
  • 15:17 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 15:17 akosiaris@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 15:17 akosiaris@deploy1001: scap-helm eventgate-analytics install -f /srv/scap-helm/eventgate/eventgate-analytics-staging-values.yaml --set service.port=31193 ../ [namespace: eventgate-analytics, clusters: staging]
  • 15:16 moritzm: updated thirdparty/php72 component to PHP 7.2.15
  • 15:10 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 15:10 akosiaris@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 15:10 akosiaris@deploy1001: scap-helm eventgate-analytics install -f /srv/scap-helm/eventgate/eventgate-analytics-staging-values.yaml --set service.port=31193 ../ [namespace: eventgate-analytics, clusters: staging]
  • 15:09 akosiaris@deploy1001: scap-helm eventgate-analytics install -f /srv/scap-helm/eventgate/eventgate-analytics-staging-values.yaml ../ [namespace: eventgate-analytics, clusters: staging]
  • 15:08 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 15:08 akosiaris@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 15:08 akosiaris@deploy1001: scap-helm eventgate-analytics install --dry-run --debug -f /srv/scap-helm/eventgate/eventgate-analytics-staging-values.yaml ../ [namespace: eventgate-analytics, clusters: staging]
  • 15:05 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 15:05 akosiaris@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 15:05 akosiaris@deploy1001: scap-helm eventgate-analytics install --dry-run --debug -f eventgate-analytics-staging-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 15:05 akosiaris@deploy1001: scap-helm eventgate-analytics install -n staging -f eventgate-analytics-staging-values.yaml stable/eventgate-analytics --dry-run --debug [namespace: eventgate-analytics, clusters: staging]
  • 14:53 otto@deploy1001: scap-helm eventgate-analytics finished
  • 14:53 otto@deploy1001: scap-helm eventgate-analytics cluster staging completed
  • 14:53 otto@deploy1001: scap-helm eventgate-analytics install -n staging -f eventgate-analytics-staging-values.yaml stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]
  • 14:25 elukey: reimage stat1005 back to stretch to test GPU drivers
  • 14:06 godog: cancel https://integration.wikimedia.org/ci/job/operations-mw-config-composer-test-docker/12236 to unblock test-prio zuul queue
  • 14:05 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1120, depool es1014 (duration: 00m 52s)
  • 12:34 arturo: T216030 icinga downtime cloudvirt1018 for 2 weeks
  • 12:32 arturo: T216030 T216004 rebooting cloudvirt1018
  • 11:55 moritzm: installing avahi security updates
  • 11:49 jynus: stop and upgrade db1120
  • 11:43 moritzm: installing golang updates on jessie
  • 11:41 volans: upgraded spicerack on cumin[12]001 to v0.0.14
  • 11:38 volans: uploaded spicerack_0.0.14-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 11:33 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1120 (duration: 00m 53s)
  • 11:11 moritzm: installing postgis security updates
  • 09:46 moritzm: installing golang security updates
  • 09:33 gtirloni: labsdb1005 rebooted server
  • 09:26 gtirloni: labsdb1005 stopped mysql
  • 09:22 marostegui: Stop MySQL on db1106 - T214840
  • 09:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1106 T214840 (duration: 00m 53s)
  • 08:18 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 (duration: 00m 53s)
  • 06:46 vgutierrez: uploaded acme-chief 0.10 to apt.wikimedia.org (buster) - T215925
  • 06:18 marostegui: Deploy schema change on db1104 - T210713
  • 06:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 (duration: 01m 07s)
  • 06:12 marostegui: Stop MySQL on db2085 to keep debugging kernel issues - T214840
  • 01:31 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Add ExternalGuidance extension T213076 (part 3) (duration: 00m 53s)
  • 01:30 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add ExternalGuidance extension T213076 (part 2) (duration: 00m 53s)
  • 01:15 thcipriani@deploy1001: Finished scap: SWAT: Add ExternalGuidance extension T213076 (part I: build l10n and sync code) (duration: 27m 51s)
  • 00:47 thcipriani@deploy1001: Started scap: SWAT: Add ExternalGuidance extension T213076 (part I: build l10n and sync code)
  • 00:41 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.17/extensions/Thanks/modules/ext.thanks.mobilediff.css: SWAT: Follow ups to I807f729c1b1a9e9b5952685bb18f540f81d70f47 (duration: 00m 55s)
  • 00:27 XioNoX: merge VRRP Icinga Check

2019-02-12

  • 23:14 jforrester@deploy1001: Finished scap: Another full scap, hoping to find the new i18n in RL for T214482 T215471 T215472 (duration: 06m 01s)
  • 23:09 foks: removed 4 files for legal compliance
  • 23:08 jforrester@deploy1001: Started scap: Another full scap, hoping to find the new i18n in RL for T214482 T215471 T215472
  • 22:47 jforrester@deploy1001: Finished scap: Full scap for new i18n and code for T214482 T215471 T215472 (duration: 18m 03s)
  • 22:29 jforrester@deploy1001: Started scap: Full scap for new i18n and code for T214482 T215471 T215472
  • 21:38 robh: icinga1001 in hardware testing, don't mess with it T214760
  • 21:10 robh: working on troubleshooting icinga1001 via T214760
  • 20:58 jforrester@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/Wikibase/view/resources/resources.php: Hot-deploy I74f6389ae for other code, file 2 (duration: 00m 52s)
  • 20:57 jforrester@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/Wikibase/view/lib/resources.php: Hot-deploy I74f6389ae for other code, file 1 (duration: 00m 51s)
  • 20:52 jforrester@deploy1001: Synchronized php-1.33.0-wmf.16/resources/Resources.php: Hot-deploy If0d7b687e for other code (duration: 00m 54s)
  • 20:06 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Group0 to 1.33.0-wmf.17
  • 19:59 thcipriani@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.17 and rebuild l10n (duration: 18m 54s)
  • 19:40 thcipriani@deploy1001: Started scap: testwiki to php-1.33.0-wmf.17 and rebuild l10n
  • 19:37 thcipriani@deploy1001: Pruned MediaWiki: 1.33.0-wmf.12 (duration: 03m 10s)
  • 19:32 thcipriani@deploy1001: Pruned MediaWiki: 1.33.0-wmf.9 (duration: 10m 05s)
  • 18:48 thcipriani: make-wmf-branch 1.33.0-wmf.17
  • 17:54 chaomodus: notebook1003 - restarted nagios-nrpe-server T212824
  • 17:04 marostegui: Start MySQL again on db2085 for s1 and s8 - T214840
  • 16:18 akosiaris: refresh kubernetes default egress policy T211247
  • 15:58 akosiaris@deploy1001: scap-helm eventgate-analytics finished
  • 15:58 akosiaris@deploy1001: scap-helm eventgate-analytics cluster codfw completed
  • 15:58 akosiaris@deploy1001: scap-helm eventgate-analytics cluster eqiad completed
  • 15:58 akosiaris@deploy1001: scap-helm eventgate-analytics [namespace: eventgate-analytics, clusters: eqiad,codfw]
  • 15:46 akosiaris: create namespaces for eventgate-analytics on eqiad/codfw/staging cluster T211247 T213194
  • 15:45 moritzm: rebooting db2085 for some tests
  • 15:38 marostegui: Stop MySQL on db2085 - T214840
  • 15:38 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2085 - T214840 (duration: 00m 47s)
  • 15:30 otto@deploy1001: scap-helm --help finished
  • 15:30 otto@deploy1001: scap-helm --help cluster codfw completed
  • 15:30 otto@deploy1001: scap-helm --help cluster eqiad completed
  • 15:30 otto@deploy1001: scap-helm --help [namespace: --help, clusters: eqiad,codfw]
  • 15:03 ejegg: updated fundraising CiviCRM from a541a83cb2 to 02ea871b88
  • 14:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1092 (duration: 00m 46s)
  • 14:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: More api traffic to db1092 (duration: 00m 44s)
  • 14:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: More traffic to db1092 (duration: 00m 46s)
  • 13:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give some api traffic to db1092 (duration: 00m 46s)
  • 13:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 (duration: 00m 47s)
  • 13:39 vgutierrez: uploaded acme-chief 0.9 to apt.wikimedia.org (stretch) - T207389 T213737
  • 12:57 moritzm: installing openssl1.0 security updates
  • 12:30 zeljkof: EU SWAT finished
  • 12:30 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add https://polona.pl/ to $wgCopyUploadsDomains (T215501) (duration: 00m 46s)
  • 12:19 moritzm: install ghostscript security updates on scb*
  • 12:15 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create extendedconfirmed user group for viwiki (T215493) (duration: 00m 47s)
  • 12:10 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Rollbackers User Group Right on azwiki (T215200) (duration: 00m 47s)
  • 12:03 marostegui: Stop MySQL on db1092 to upgrade mysql and kernel
  • 11:27 moritzm: rebooting stat1005
  • 11:20 moritzm: installing ghostscript security updates on remaining thumbor hosts
  • 10:25 marostegui: Deploy schema change on db1092 T210713
  • 10:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092 (duration: 00m 46s)
  • 10:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 (duration: 00m 46s)
  • 10:00 moritzm: installing ghostscript security updates on thumbor1001
  • 09:36 moritzm: reimaging stat1005 to buster
  • 08:20 marostegui: Deploy schema change on db1101:3318 - T210713
  • 08:20 marostegui: Depool db1101:3318 - T210713
  • 08:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 (duration: 00m 46s)
  • 08:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 (duration: 00m 49s)
  • 07:49 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@125354e]: maintain symlink for old venv path with new virtualenv deploy script (duration: 03m 55s)
  • 07:46 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@125354e]: maintain symlink for old venv path with new virtualenv deploy script
  • 07:40 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@125354e]: testing simplified virtualenv deploy (take 2) (duration: 04m 14s)
  • 07:35 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@125354e]: testing simplified virtualenv deploy (take 2)
  • 07:31 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@125354e]: testing simplified virtualenv deploy (duration: 01m 07s)
  • 07:30 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@125354e]: testing simplified virtualenv deploy
  • 07:26 elukey: update analytics-in4 term mysql-dbstore on cr1/cr2 eqiad
  • 07:09 marostegui: Rename ep_* tables on db1089 (s1) - T174802
  • 06:33 kart_: Finished fourth manual run of unpublished draft purge script (T203059)
  • 06:14 marostegui: Deploy schema change on db1099:3318 T210713
  • 06:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 (duration: 00m 52s)
  • 06:04 kart_: Fourth manual run of unpublished draft purge script (T203059)
  • 02:18 thcipriani: restarting gerrit due to high load
  • 00:49 ebernhardson@deploy1001: Finished scap: SWAT: full sync for gerrit:489309 i18n (duration: 18m 20s)
  • 00:30 ebernhardson@deploy1001: Started scap: SWAT: full sync for gerrit:489309 i18n
  • 00:28 ebernhardson@deploy1001: Synchronized wmf-config/WikibaseSearchSettings.php: gerrit:489780 T214515 Promote new wbsearchentities profiles to default in de, fr, es (duration: 00m 46s)
  • 00:13 jforrester@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/CentralNotice/: SWAT Merge branch 'master' into wmf_deploy I8e52d222eb (duration: 00m 49s)
  • 00:05 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT Stop setting wgSessionsInObjectCache, it's being removed from MW I2946b5b9a (duration: 00m 47s)

2019-02-11

  • 23:22 cdanis: T214760 icinga2001% sudo killall nsca
  • 22:53 cdanis: icinga.w.o-->icinga2001 DNS change deployed T214760
  • 22:40 cdanis: icinga1001 now passive T214760
  • 22:34 cdanis: failing over icinga to icinga2001
  • 21:33 arlolra: Updated Parsoid to b4b9603 (T208901, T215537, T213468, T215638)
  • 21:24 arlolra@deploy1001: Finished deploy [parsoid/deploy@4e9b142]: Updating Parsoid to b4b9603 (duration: 09m 33s)
  • 21:22 otto@deploy1001: Synchronized wmf-config/CommonSettings.php: Use newer RCFeed config for EventBus based recentchange event - T215834 (duration: 00m 47s)
  • 21:20 ottomata: deploying mediawiki-config change for update to EventBus RCFeed config (no-op)
  • 21:16 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@c6a6285]: Weekly GUI deploy (duration: 11m 54s)
  • 21:14 arlolra@deploy1001: Started deploy [parsoid/deploy@4e9b142]: Updating Parsoid to b4b9603
  • 21:13 mobrovac@deploy1001: Finished deploy [citoid/deploy@0b91bea]: Use Zotero for DOIs and pass it the A-L header - T214766 T210806 T215755 (duration: 03m 47s)
  • 21:09 mobrovac@deploy1001: Started deploy [citoid/deploy@0b91bea]: Use Zotero for DOIs and pass it the A-L header - T214766 T210806 T215755
  • 21:04 smalyshev@deploy1001: Started deploy [wdqs/wdqs@c6a6285]: Weekly GUI deploy
  • 20:08 ppchelko@deploy1001: Finished deploy [changeprop/deploy@bdb4740]: Update dependencies, minor refactor, safer deduplication, T207329 (duration: 01m 37s)
  • 20:07 ppchelko@deploy1001: Started deploy [changeprop/deploy@bdb4740]: Update dependencies, minor refactor, safer deduplication, T207329
  • 19:42 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1106, db1118 with full weight (duration: 00m 46s)
  • 19:34 catrope@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: Remove main page special casing from lawiki (T215709) (duration: 00m 46s)
  • 19:28 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wgRestrictionLevels on Serbian projects (T215653) (duration: 00m 46s)
  • 19:16 catrope@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/GrowthExperiments/: Help panel search instrumentation (T211166) (duration: 00m 47s)
  • 19:08 catrope@deploy1001: Synchronized wmf-config/throttle.php: Lift account creation cap for edit-a-thon (T215069) (duration: 00m 47s)
  • 19:08 jijiki: Repooled thumbor1004 - T215411
  • 18:50 robh: thumbor1004 rebooted and updated firmware T215411
  • 16:49 jynus: stop, upgrade and restart db1106
  • 16:36 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011
  • 16:31 marostegui: Reverse password for globaldev user on dbstore1002 - T200801
  • 16:29 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 00m 52s)
  • 15:49 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1118 (duration: 00m 48s)
  • 15:24 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY (duration: 00m 47s)
  • 15:23 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - https://phabricator.wikimedia.org/T212308
  • 15:21 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase.php, add conditional setting of useEntitySourceBasedFederation (duration: 00m 47s)
  • 15:20 marostegui: Repool labsdb1010 - T212308
  • 15:19 jynus: add missing grants to db1118
  • 15:07 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Revert, second try (duration: 00m 47s)
  • 15:00 addshore@deploy1001: sync-file aborted: Wikibase.php, add conditional setting of useEntitySourceBasedFederation (duration: 00m 01s)
  • 14:55 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Revert (duration: 00m 45s)
  • 14:53 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool db1118 for the first time (duration: 00m 47s)
  • 14:51 mbsantos@deploy1001: Finished deploy [tilerator/deploy@d546183] (stretch): Updating maps2004 tilerator for the stretch migration work (duration: 00m 39s)
  • 14:50 mbsantos@deploy1001: Started deploy [tilerator/deploy@d546183] (stretch): Updating maps2004 tilerator for the stretch migration work
  • 14:48 mbsantos@deploy1001: Finished deploy [kartotherian/deploy@173adbe] (stretch): Updating maps2004 kartotherian for the stretch migration work (duration: 00m 21s)
  • 14:48 mbsantos@deploy1001: Started deploy [kartotherian/deploy@173adbe] (stretch): Updating maps2004 kartotherian for the stretch migration work
  • 14:47 moritzm: installing curl security updates on trusty
  • 14:21 marostegui: Remove staging from dbstore1003 - T210478
  • 14:16 godog: depool and take a snapshot of prometheus data for all instances on prometheus2003 - T187987
  • 14:09 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 - T212308
  • 14:08 marostegui: Deploy schema change on db1116:3318 - T210713
  • 12:21 godog: bounce rsyslogd on lithium / wezen, syslog tls listener stuck
  • 12:19 zeljkof: EU SWAT finished
  • 12:18 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for Senior Citizens Write Wikipedia course (T215618) (duration: 00m 48s)
  • 12:14 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Clean expired throttle rules (duration: 00m 48s)
  • 10:47 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 10:46 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 48s)
  • 10:41 jynus: upgrading mariadb client on cumin* hosts
  • 10:27 mvolz@deploy1001: scap-helm zotero finished
  • 10:27 mvolz@deploy1001: scap-helm zotero cluster codfw completed
  • 10:27 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 10:24 mvolz@deploy1001: scap-helm zotero finished
  • 10:24 mvolz@deploy1001: scap-helm zotero cluster eqiad completed
  • 10:24 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 10:19 marostegui: Add dbstore1005:3350 to tendril and zarcillo - T210478
  • 10:17 mvolz@deploy1001: scap-helm zotero finished
  • 10:17 mvolz@deploy1001: scap-helm zotero cluster staging completed
  • 10:17 mvolz@deploy1001: scap-helm zotero upgrade staging -f zotero-values-staging.yaml --version=0.0.1 stable/zotero [namespace: zotero, clusters: staging]
  • 10:17 jynus: restart db1114
  • 09:38 marostegui: Stop all mysql instances on dbstore1005 for reboot
  • 09:11 marostegui: Stop all mysql instances on dbstore1003 for reboot
  • 08:17 moritzm: removed cloudcontrol2001-dev.codfw.wmnet from debmonitor (actual hostname in use is cloudcontrol2001-dev.wikimedia.org)
  • 08:07 marostegui: Deploy schema change on s8 codfw master (db2045) - this will generate lag on codfw T210713
  • 07:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1100 (duration: 00m 46s)
  • 07:39 marostegui: Deploy schema change on s7 primary master (db1062) - T210713
  • 07:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give api traffic to db1100 (duration: 00m 46s)
  • 07:18 marostegui: Stop all mysql instances on dbstore1004 for a reboot
  • 07:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1100 with low weight (duration: 00m 46s)
  • 07:06 marostegui: Upgrade MySQL on db1100
  • 07:06 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1100 for mysql upgrade (duration: 00m 47s)
  • 07:00 marostegui: Restart icinga on icinga1001 - checks went awol
  • 06:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1079 (duration: 00m 48s)
  • 06:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1079 (duration: 00m 48s)
  • 06:14 marostegui@deploy1001: sync-file aborted: Depool db0179 (duration: 00m 01s)
  • 04:23 TimStarling: on mwmaint1002: running normalizeThrottleParameters.php --dry-run on all wikis (T209565)
  • 04:19 tstarling@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/AbuseFilter/maintenance/normalizeThrottleParameters.php: maintenance script update for new dry run (duration: 00m 47s)
  • 04:19 tstarling@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/WikimediaEvents/tests/phpunit/PageViewsTest.php: test-only undeployed change (duration: 00m 46s)
  • 04:18 tstarling@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/NavigationTiming/tests/ext.navigationTiming.test.js: test-only undeployed change (duration: 00m 51s)
  • 04:10 tstarling@deploy1001: sync-file aborted: test-only undeployed change (duration: 00m 12s)
  • 03:05 kartik@deploy1001: Finished deploy [cxserver/deploy@ee4a15a]: Update cxserver to 8928852 (T213256) (duration: 04m 08s)
  • 03:01 kartik@deploy1001: Started deploy [cxserver/deploy@ee4a15a]: Update cxserver to 8928852 (T213256)

2019-02-10

  • off: force rebooting mw1299, stuck again - T215569
  • off: forcing reboot of icinga1001 because it's stuck again (no ping, no ssh, CPU stuck messages on console) - T214760
  • 09:25 marostegui: Disable notifications for lag checks on dbstore1002 - T210478

2019-02-09

  • 21:42 Reedy: running `foreachwiki refreshImageMetadata.php --mediatype BITMAP --mime image/vnd.djvu --force` on mwmaint1002 T215635
  • 21:41 Reedy: refreshImageMetadata.php for commonswiki done T215635
  • 16:51 Jeff_Green: restarted icinga process on icinga1001 because of passive check alert-storm

2019-02-08

  • 23:23 Reedy: running `refreshImageMetadata.php --mediatype BITMAP --mime image/vnd.djvu --force` against commonswiki on mwmaint1002 T215635 (this time we mean it)
  • 22:56 Reedy: running `refreshImageMetadata.php --mediatype BITMAP --mime image/vnd.djvu` against commonswiki on mwmaint1002 T215635
  • 21:25 reedy@deploy1001: Synchronized multiversion/MWMultiVersion.php: Move variable (duration: 00m 49s)
  • 19:50 krinkle@deploy1001: Synchronized w/touch.php: Ia1e610a5f (duration: 00m 46s)
  • 19:49 krinkle@deploy1001: Synchronized w/robots.php: Ia1e610a5f (duration: 00m 46s)
  • 19:48 krinkle@deploy1001: Synchronized w/favicon.php: Ia1e610a5f (duration: 00m 46s)
  • 19:47 krinkle@deploy1001: Synchronized w/extract2.php: Ia1e610a5f (duration: 00m 48s)
  • 18:14 gtirloni: T213527 graphite2002 disabled puppet and commented prometheus_puppet_agent_stats cronjob due to cronspam
  • 18:08 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for s1 rc slaves (duration: 00m 49s)
  • 17:55 mutante: phab1001 - restart aphlict service
  • 17:52 mutante: phab1001 - restarting phd service
  • 17:49 arturo: T215605 add prometheus-openstack-exporter 0.0.8-4 to stretch-wikimedia
  • 17:47 mutante: phab1001 - restarting apache2 service for library upgrade
  • 17:42 mutante: graceful reload of apache on phabricator prod server (phab1001)
  • 17:27 XioNoX: merge Icinga: add ping check for ulsfo PDUs
  • 16:50 ejegg: updated payments-wiki-staging from 52a271e681 to 31647bc97e
  • 16:09 jynus: stopping s1 replication on dbstore1001 to speed up cloning T214720
  • 16:08 moritzm: imported git-fat 0.1.3-2+deb10u1 to buster-wikimedia (T213527)
  • 15:46 marostegui: Repool labsdb1009 - T212308
  • 15:33 _joe_: apt-get upgrade on mwmaint2001 to fix the php installation T215376
  • 15:31 moritzm: imported debmonitor 0.1.5-1+deb10u1 to buster-wikimedia (T213527)
  • 15:31 _joe_: upgraded all php extensions to php 7.2 compatible versions on mwmaint1002
  • 15:10 jijiki: Upgrading php-redis to 4.1.1 on mwmaint1002 - T215376
  • 14:51 marostegui: Reload haproxy on dbproxy1011 to depool labsdb1009 - https://phabricator.wikimedia.org/T212308
  • 13:56 moritzm: updated firmware-enriched buster netboot image to 20190208 daily build, the alpha5 image no longer works as Linux 4.19.16-1 bumped the ABI and migrated to testing yesterday
  • 13:45 jynus: racadm serveraction powercycle db1114
  • 13:39 onimisionipe: starting osm-initial-import for maps2004, the master newly migrated to stretch - T198622
  • 13:37 elukey: roll restart of aqs on aqs1* to pick up new druid backend changes
  • 13:05 arturo: T209029 reimaging cloudelastic1004
  • 12:54 ejegg: updated fundraising CiviCRM from 3a1bb82373 to a541a83cb2
  • 12:51 jynus: disabling notifications on db1114
  • 12:44 elukey@deploy1001: Synchronized wmf-config/db-eqiad.php: depooling db1114, host down (duration: 00m 47s)
  • 11:36 moritzm: reimage graphite2002 to buster
  • 11:08 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 fully (duration: 00m 47s)
  • 10:50 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099 (duration: 00m 47s)
  • 10:27 jijiki: Restarting memcached on mc1026 to apply '-R 200' - T208844
  • 10:23 godog: swift codfw-prod: more weight to ms-be2047 - T209395 T209921
  • 10:15 jynus: stop and upgrade db1099
  • 10:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3317 (duration: 00m 47s)
  • 09:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3317 (duration: 00m 46s)
  • 09:28 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099 (duration: 00m 46s)
  • 09:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1086 (duration: 00m 46s)
  • 09:16 moritzm: installing rssh security updates
  • 09:06 moritzm: installing libarchive security updates
  • 09:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: More traffic to db1086 (duration: 00m 47s)
  • 08:53 moritzm: reimage graphite2002 to buster
  • 08:50 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 with low load (duration: 00m 46s)
  • 08:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1086 (duration: 00m 47s)
  • 08:24 jynus: stop and upgrade db1083
  • 08:23 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 (duration: 00m 47s)
  • 08:15 marostegui: Upgrade MySQL on db1086
  • 08:05 marostegui: Upgrade MySQL on db1086 and deploy schema change
  • 08:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1086 (duration: 00m 46s)
  • 07:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Full repool db1094 (duration: 00m 47s)
  • 07:45 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1299.eqiad.wmnet
  • 07:41 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: More traffic to db1094 (duration: 02m 55s)
  • 07:27 marostegui@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1299.eqiad.wmnet
  • 07:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1094 (duration: 02m 56s)
  • 07:12 marostegui: Upgrade mysql and kernel on db1094
  • 06:58 marostegui: Deploy schema change on db1094 T210713
  • 06:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1094 (duration: 00m 46s)
  • 06:54 marostegui: Take a mysqldump from staging on dbstore1003 from dbstore1002 - T210478
  • 06:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 (duration: 00m 49s)
  • 06:29 marostegui: powercycle mw1299 - T215569
  • 06:21 marostegui: Deploy schema change on db1098:3317
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 (duration: 02m 58s)
  • 06:07 marostegui: Drop staging.mep_word_persistence from dbstore1002 T215450 T213706
  • 02:34 ejegg: updated fundraising CiviCRM from 08be00e87f to 3a1bb82373
  • 01:37 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw1299.eqiad.wmnet
  • 01:10 mutante: mw1299 has been down about 8 hours, does it need deployment.. depooling
  • 01:08 mutante: powercycle crashed mw1299 via mgmt (garbled console output) (T215569)
  • 00:22 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT gerrit:488588 phab:T214515 Turn off wikidata wbsearchentities ab test in de, fr, es (duration: 02m 55s)
  • 00:16 ebernhardson: scap sync timed out on mw1299.eqiad.wmnet
  • 00:15 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT gerrit:483044 T209873 Give protect right to centralnoticeadmin on Meta (duration: 02m 56s)

2019-02-07

  • 23:29 XioNoX: restart ps1-22-ulsfo
  • 23:23 reedy@deploy1001: Synchronized tests/dblistTest.php: Sync test (duration: 02m 55s)
  • 23:18 reedy@deploy1001: Synchronized README: must be up to date (duration: 02m 54s)
  • 22:48 reedy@deploy1001: Synchronized dblists/: alphasort dblists (duration: 02m 56s)
  • 21:43 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group2 wikis to 1.33.0-wmf.16 refs T206670
  • 21:38 robh: updating firmware on ps1-23-ulsfo via T209101; ps1-22-ulsfo update completed
  • 21:22 robh: updating firmware on ps1-22-ulsfo via T209101
  • 20:55 twentyafterfour: train status: deploying 1.33.0-wmf.16 to group2
  • 20:19 sbisson@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/WikibaseLexeme/src/DataAccess/Search/LexemeFulltextResult.php: SWAT: Fix fatal error - EmptySet does not exist anymore (duration: 03m 03s)
  • 19:45 sbisson@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/GrowthExperiments/: SWAT: Help Panel: Fix iOS scroll bug (duration: 03m 02s)
  • 19:28 sbisson@deploy1001: sync-file aborted: SWAT: GrowthExperiments: Enable search for help panel on testwiki (duration: 02m 22s)
  • 19:25 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: GrowthExperiments: Enable search for help panel on testwiki (duration: 03m 04s)
  • 18:32 mutante: LDAP - adding raz-shuty to group nda (T214488)
  • 17:06 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1085 (duration: 03m 03s)
  • 16:03 jynus: restart db1085, temporary s6 lag on wikireplicas
  • 15:55 gehel: starting reimage of maps2004 - T198622
  • 15:51 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1085 (duration: 00m 58s)
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on wikitech for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 8 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 7 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 6 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 5 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 4 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on remaining section 3 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 2 wikis for T215464. This may cause lag in codfw.
  • 15:16 anomie@mwmaint1002: Fixing log_search after migrateActors.php on section 1 wikis for T215464. This may cause lag in codfw.
  • 15:07 anomie@mwmaint1002: Fixing log_search after migrateActors.php on test wikis and mediawikiwiki for T215464. This may cause lag in codfw.
  • 15:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1101 (duration: 00m 55s)
  • 14:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101 after alter and mysql upgrade (duration: 00m 55s)
  • 14:34 jbond42: deploying security updates for libgd3
  • 12:42 Amir1: EU SWAT is done
  • 12:42 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Set EntityUsageTable addUsage batch size to 300, Part II (duration: 00m 54s)
  • 12:42 marostegui: Set dbstore1002 as IDEMPOTENT - T213670
  • 12:39 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set EntityUsageTable addUsage batch size to 300 (T215146), Part I (duration: 00m 55s)
  • 12:34 marostegui: Powercycle mw1299 as it is down and not responding
  • 12:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1101 after alter and mysql upgrade (duration: 03m 02s)
  • 12:26 ladsgroup@deploy1001: Synchronized wmf-config/interwiki.php: SWAT: Update interwiki cache to have yuewiktionary instead of zh-yue (T214400) (duration: 03m 04s)
  • 12:06 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4026.ulsfo.wmnet
  • 12:03 arturo: T214448 reimaging again cloudvirt200[1-3]-dev.codfw.wmnet
  • 11:55 marostegui: Stop MySQL on db1101:3317 and db1101:3318 for mysql upgrade
  • 11:37 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2055 (duration: 03m 02s)
  • 11:17 fsero: upgrade helm to 2.12.2 on deploy{1001,2001} and contint{1001,2001} T215244
  • 11:16 fsero: upgrade helm to 2.12.2 on deploy{1001,2001} and contint{1001,2001}
  • 10:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101 for alter and mysql upgrade (duration: 00m 56s)
  • 10:43 marostegui: Run mysqldump from dbstore1003 to dump dbstore1002:staging.mep_word_persistence - T215450
  • 09:49 marostegui: Deploy schema change on db1116 - T210713
  • 09:41 akosiaris: reboot mwdebug1001, mwdebug1002, mwdebug2001, mwdebug2002 for VCPU upgrade. T212955
  • 09:23 jynus: running alter table on db2055 for performance testing T212092
  • 09:15 fsero: uploading helm and tiller 2.12.2 deb package to stretch and jessie
  • 08:53 marostegui: Deploy schema change on s7 codfw master (db2047), this will generate lag on s7 codfw - T210713
  • 08:34 godog: swift codfw-prod: more weight to ms-be2047 - T209395 T209921
  • 08:14 marostegui: Deploy schema change on s4 primary master (db1068) - T210713
  • 08:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1081 (duration: 00m 54s)
  • 07:50 marostegui: Deploy schema change on db1081
  • 07:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1081 (duration: 00m 53s)
  • 07:48 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 20s)
  • 07:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 (duration: 00m 53s)
  • 07:42 reedy@deploy1001: Synchronized dblists/: Wikimania T215486 (duration: 00m 54s)
  • 07:03 marostegui: Deploy schema change on db1084 - T210713
  • 07:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 55s)
  • 06:48 marostegui: Restore consistency options on db2051
  • 06:14 marostegui: Ease consistency options on db2051 (s4 master) to let it catch up on replication
  • 04:35 tstarling@deploy1001: Synchronized wmf-config/set-time-limit.php: (no justification provided) (duration: 00m 54s)
  • 04:00 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable EP namespaces on wikis with no EP pages (duration: 00m 57s)
  • 01:31 eileen: civicrm revision changed from c5aec3ae76 to 08be00e87f, config revision is 306b4de48f
  • 01:24 eileen: civicrm revision changed from 6161a021c0 to c5aec3ae76, config revision is 306b4de48f
  • 01:05 twentyafterfour: US Evening SWAT is complete
  • 01:04 twentyafterfour: no phabricator deployment tonight
  • 01:04 eileen: civicrm revision changed from 613b388916 to 6161a021c0, config revision is 306b4de48f
  • 00:57 twentyafterfour@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT config change for Bug: T214003 (duration: 00m 53s)
  • 00:53 twentyafterfour@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/VisualEditor/: SWAT f89e12f to fix bug: T209610 (duration: 00m 55s)
  • 00:48 twentyafterfour@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/MobileFrontend/: SWAT dd8654a (duration: 01m 00s)
  • 00:47 twentyafterfour: syncing commit dd8654a for Bug: T209052
  • 00:24 twentyafterfour: running `mwscript migrateUserGroup.php commonswiki extended-uploader autopatrolled` on deploy1001

2019-02-06

  • 23:58 mutante: restarting icinga on icinga1001 to pick up new check command ?
  • 22:22 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.16 refs T206670 (duration: 00m 53s)
  • 22:22 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.16 refs T206670
  • 21:45 mutante: LDAP - adding brennen to wmf, releng, ciadmin - Welcome Brennen Bearnes, Software Engineer in Release Engineering (T215365 T214556)
  • 21:05 arlolra@deploy1001: Finished deploy [parsoid/deploy@a4acfa6]: Updating Parsoid to fb67a71 (duration: 03m 43s)
  • 21:04 Krinkle: krinkle@webperf1002 Kill xenon-log (pid 449). It seems its Redis TCP socket to mwlog1001 has been stuck since Dec 13, causing the process to indefinitely hang on listen()/socket.recv()
  • 21:01 arlolra@deploy1001: Started deploy [parsoid/deploy@a4acfa6]: Updating Parsoid to fb67a71
  • 20:49 mutante: LDAP - adding h78na to wmf - welcome Hana Worku, developer on the multimedia team (T215352)
  • 20:40 mutante: LDAP - adding egardner to wmf - welcome Eric Gardner, software engineer in Audiences (T214654)
  • 20:35 twentyafterfour: 1.33.0-wmf.16 has a significantly higher rate of "entire web request took longer than 60 seconds and timed out"
  • 20:03 twentyafterfour: Resuming the MediaWiki train for version 1.33.0-wmf.16. Will deploy Group0 wikis first and then catch up to group1 after a few minutes monitoring logs for stability.
  • 19:50 robh: updated firmware on cp4026 and re-seated (already well seated) dimm b3. errors have cleared for now T214516
  • 19:24 milimetric@deploy1001: Finished deploy [analytics/refinery@cd413dd]: Small bug fix for history checker (duration: 12m 45s)
  • 19:13 robh: taking cp4026 offline to flash firmware and reseat dimm for testing on T214516
  • 19:12 milimetric@deploy1001: Started deploy [analytics/refinery@cd413dd]: Small bug fix for history checker
  • 19:11 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@3272a46]: Add healthcheck plugin (no restart) cobalt T214326 (duration: 00m 09s)
  • 19:11 thcipriani@deploy1001: Started deploy [gerrit/gerrit@3272a46]: Add healthcheck plugin (no restart) cobalt T214326
  • 19:09 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@3272a46]: Add healthcheck plugin (no restart) gerrit2001 first (duration: 00m 10s)
  • 19:09 mutante: LDAP - adding afandian2 and toddleroux to nda (T214727)
  • 19:09 thcipriani@deploy1001: Started deploy [gerrit/gerrit@3272a46]: Add healthcheck plugin (no restart) gerrit2001 first
  • 19:04 jforrester@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/Flow/includes/Conversion/Utils.php: I405dd193 Update Parsoid Accept header to 2.0.0 so service can deploy (duration: 00m 54s)
  • 19:03 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/Flow/includes/Conversion/Utils.php: I405dd193 Update Parsoid Accept header to 2.0.0 so service can deploy (duration: 00m 56s)
  • 18:03 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removed namespace Коментар, added namespace Портал on srwikinews T214561 T214563 (duration: 00m 53s)
  • 18:01 mutante: LDAP - adding alaasarhan to wmde (T215066)
  • 17:57 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Changed wgImportSources for srwikinews T214562 (duration: 00m 53s)
  • 17:53 thcipriani@deploy1001: Synchronized dblists/s3.dblist: SWAT: dblists/s3.dblist: Fix sorting of list of wikis per alphabetical order (duration: 00m 54s)
  • 17:49 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/MobileFrontend: SWAT: VE: Load HTML in parallel with modules T209052 (duration: 00m 57s)
  • 17:40 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.16/extensions/MobileFrontend: SWAT: EditorOverlay: Pass constructor of itself to VisualEditorOverlay, not instance T215408 (duration: 00m 57s)
  • 17:10 jynus: setting db1111 in read-write mode
  • 16:24 moritzm: reimaging graphite2002 to buster
  • 16:19 jynus: running alter table on db2055 T93564
  • 16:14 gehel@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=eqiad
  • 15:44 papaul: powering down thumbor2002 for disk replacement
  • 15:42 moritzm: installing spice security updates
  • 15:41 andrewbogott: rebooting cloudvirt1015 to make sure that nothing drastic changes once libguestfs is installed T215423
  • 15:11 moritzm: installing libav security updates
  • 15:08 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2055 for performance testing T93564 (duration: 00m 55s)
  • 14:50 moritzm: draining restbase1018 for eventual reboot for kernel security update (bundled with Java update)
  • 14:36 moritzm: draining restbase1017 for eventual reboot for kernel security update (bundled with Java update)
  • 14:29 elukey: add term mysql-dbstore to analytics-in4/6 on cr1/2-eqiad to allow tcp connections to dbstore100[3-5] - T210478
  • 12:30 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet
  • 12:29 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet
  • 12:28 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet
  • 12:26 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet
  • 12:25 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3040.esams.wmnet
  • 12:24 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3040.esams.wmnet
  • 12:22 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3041.esams.wmnet
  • 12:22 Amir1: EU SWAT is done
  • 12:21 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3041.esams.wmnet
  • 12:20 vgutierrez: restarting varnish-fe safely across esams/text cluster - T215389
  • 12:19 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Use separate DB connection for ID insertions on testwikidatawiki (T215147), Part II (duration: 00m 54s)
  • 12:17 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use separate DB connection for ID insertions on testwikidatawiki (T215147), Part I (duration: 00m 55s)
  • 11:58 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3042.esams.wmnet
  • 11:57 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3042.esams.wmnet
  • 11:56 vgutierrez: restarting varnish-fe in cp3042 - T215389
  • 11:02 _joe_: restarting nginx safely across the appserver fleets in order to be able to run puppet without errors
  • 10:41 marostegui: Revoke access to testreduce from ruthenium on m5 - https://phabricator.wikimedia.org/T214740
  • 10:04 moritzm: reimaging graphite2002 to buster
  • 10:01 akosiaris: restart varnish-frontend on cp3030 T215389
  • 10:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 (duration: 00m 52s)
  • 09:33 marostegui: Remove wikiuser from dbstore1003-dbstore1005 T210478
  • 09:15 godog: swift codfw-prod: more weight for ms-be2047 - T209395 T209921
  • 09:00 marostegui: Create research_role on dbstore1003-1005 on all instances - T214469
  • 08:49 marostegui: Deploy schema change on db1091 - T210713
  • 08:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 (duration: 00m 53s)
  • 08:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1121 (duration: 00m 53s)
  • 07:51 marostegui: Deploy schema change on db1121 - this will generate lag on s4 labs - also upgrade MySQL on db1121 T210713
  • 07:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1121 (duration: 00m 54s)
  • 07:47 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 (duration: 00m 54s)
  • 07:19 marostegui: Deploy schema change on wikitech T210713
  • 07:14 marostegui: Stop 's4' slave on dbstore1002
  • 07:13 marostegui: Deploy schema change on db1103:3314 (db1097:3314 was also done previously) - T210713
  • 07:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 (duration: 00m 53s)
  • 07:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 (duration: 00m 56s)
  • 06:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 (duration: 01m 06s)
  • 04:39 mutante: reloaded icinga service, can't find new check command definition
  • 03:14 twentyafterfour@deploy1001: Finished scap: testwikis wikis to 1.33.0-wmf.16 refs T206670 (duration: 04m 18s)
  • 03:09 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.33.0-wmf.16 refs T206670
  • 03:05 mutante: actinium - gzipping and rotating some access logs
  • 03:01 twentyafterfour@deploy1001: Synchronized scap/plugins/updateinterwikicache.py: (no justification provided) (duration: 00m 55s)
  • 02:47 mutante: actinium - blocking a bad domain and restarting squid3
  • 02:40 twentyafterfour@deploy1001: Finished scap: sync and update localization for 1.33.0-wmf.16 (duration: 15m 50s)
  • 02:32 XioNoX: push firewall rule to pfw3-eqiad - T215364
  • 02:27 mutante: actinium - apt-get clean for 8% more disk space after icinga alert
  • 02:25 twentyafterfour@deploy1001: Started scap: sync and update localization for 1.33.0-wmf.16
  • 02:16 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.14 refs T206670
  • 02:12 eileen: civicrm revision changed from 6042acb363 to 613b388916, config revision is 306b4de48f
  • 02:02 twentyafterfour@deploy1001: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 01:22 XioNoX: remove peering4/6 prefix-list from routers
  • 01:07 XioNoX: add maintenance and rollback to junos operations class
  • 00:47 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.33.0-wmf.16 refs T206670
  • 00:33 niharika29@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/MobileFrontend/: EditorOverlay: captcha/abusefilter weren't being shown correctly T215101, T202374 (duration: 00m 50s)
  • 00:24 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Demystify Logstash debug level behavior (duration: 00m 51s)
  • 00:23 niharika29@deploy1001: Synchronized wmf-config/logging.php: Demystify Logstash debug level behavior (duration: 00m 46s)
  • 00:18 niharika29@deploy1001: Synchronized wmf-config/logging.php: Add PHP version to MW logs T215350 (duration: 00m 46s)
  • 00:16 niharika29@deploy1001: Synchronized wmf-config/CommonSettings.php: Preserve Composer's include paths - T215126, T215224 (duration: 01m 40s)

2019-02-05

  • 18:56 arlolra@deploy1001: Finished deploy [parsoid/deploy@a4acfa6]: (no justification provided) (duration: 02m 06s)
  • 18:53 arlolra@deploy1001: Started deploy [parsoid/deploy@a4acfa6]: (no justification provided)
  • 18:39 arlolra@deploy1001: Finished deploy [parsoid/deploy@a4acfa6]: Updating Parsoid to fb67a71 (duration: 09m 54s)
  • 18:29 arlolra@deploy1001: Started deploy [parsoid/deploy@a4acfa6]: Updating Parsoid to fb67a71
  • 18:26 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@2959e12]: Update mobileapps to 107c1b1 (T214714) (duration: 04m 43s)
  • 18:21 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@2959e12]: Update mobileapps to 107c1b1 (T214714)
  • 18:17 mutante: contint1001/contint2001 -manually deleting crontab lines unpuppetized in gerrit:488019 (T209361)
  • 18:13 Jeff_Green: authdns-update to deploy 7fee817fd3
  • 17:22 mutante: scandium - restart parsoid-vd service
  • 17:21 mutante: scandium -- copy /srv/visualdiff/testrecude/testrun.ids from ruthenium to the same location
  • 15:15 godog: force curator action 'replicas' to set older logstash indices to 1 replica - T213078
  • 14:30 marostegui: Deploy schema change on s4 codfw master with replication, lag will be generated on s4 codfw - T210713
  • 14:26 Jeff_Green: authdns-update for payments dev/testing hostname
  • 14:12 marostegui: Deploy schema change on db1066 (s2 master) - T210713
  • 14:05 marostegui: Delete non used grants from dbstore1002: log, warehouse,project_illustration, cognate\_wiktionary, datasets - T212487 T210478
  • 13:55 godog: swift codfw-prod: add ms-be2047 - T209395 T209921
  • 12:18 addshore: swat done
  • 12:18 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable confirmation prompt on rollback by default T215019 (duration: 00m 47s)
  • 11:35 moritzm: added firmware-enriched buster netboot image (T213546)
  • 11:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1074 (duration: 00m 46s)
  • 10:43 marostegui: Deploy schema change on db1074 with replication, lag will be generated on s2 - T210713
  • 10:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1074 (duration: 00m 47s)
  • 10:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3312 (duration: 00m 46s)
  • 09:42 hashar: contint1001: docker image prune -f
  • 09:34 marostegui: Deploy schema change on db1090:3312 - T210713
  • 09:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3312 (duration: 00m 45s)
  • 09:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1076 (duration: 00m 46s)
  • 09:11 marostegui: Start all slaves on dbstore1002 - T213670
  • 08:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1076 (duration: 00m 45s)
  • 08:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 (duration: 00m 46s)
  • 08:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1076 (duration: 00m 45s)
  • 07:56 marostegui: Upgrade MySQL and kernel on db1076
  • 07:44 marostegui: Deploy schema change on db1076 - T210713
  • 07:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1076 T210713 (duration: 00m 47s)
  • 07:13 marostegui: Taking mysqldump from dbstore1002.staging - T210478
  • 07:05 marostegui: Reboot mysql on db1117:3323 (this will make the dbproxies complain) T214248
  • 02:24 XioNoX: remove BGP session to as6412 on cr2-eqiad (gone from IX)
  • 02:21 XioNoX: delete 2nd as9121 router on cr2-esams
  • 00:47 XioNoX: add BGP sessions to AS64050 on cr1-eqsin
  • 00:24 maxsem@deploy1001: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/486405/ (duration: 00m 46s)
  • 00:11 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 46s)

2019-02-04

  • 22:05 mutante: scandium - systemctl start parsoid-vd (T201366)
  • 20:01 herron: manually ran puppet on mc1023
  • 19:50 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Clean-up: Stop setting wgParsoidWikiPrefix, unused since the Parsoid extension (duration: 00m 45s)
  • 19:45 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Clean-up: Stop setting wgFlowEventLogging, unread (duration: 00m 45s)
  • 19:39 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Clean-up: Stop setting values for wgEcho*FooterNotice*, unread (duration: 00m 46s)
  • 19:32 James_F: Manually purged atjwiki*.png logos for T215122.
  • 19:28 jforrester@deploy1001: Synchronized static/images/project-logos/atjwiki.png: SWAT: Milestone logo for atjwiki T215122, 1x (duration: 00m 46s)
  • 19:27 jforrester@deploy1001: Synchronized static/images/project-logos/atjwiki-1.5x.png: SWAT: Milestone logo for atjwiki T215122, 1.5x (duration: 00m 45s)
  • 19:26 jforrester@deploy1001: Synchronized static/images/project-logos/atjwiki-2x.png: SWAT: Milestone logo for atjwiki T215122, 2x (duration: 00m 44s)
  • 19:22 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T191039 Enable wgAbuseFilterRuntimeProfile on all wikis (duration: 00m 47s)
  • 19:19 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@8b2f078]: Weekly GUI deploy (duration: 09m 47s)
  • 19:09 smalyshev@deploy1001: Started deploy [wdqs/wdqs@8b2f078]: Weekly GUI deploy
  • 18:31 XioNoX: adding Papaul to root@wiki
  • 18:22 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Clean-up: Drop reading for wgEcho*FooterNotice*, unread (duration: 00m 46s)
  • 18:18 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Clean-up: Stop setting wgEchoConfig, unused since 2016 (duration: 00m 48s)
  • 18:11 jforrester@deploy1001: Synchronized dblists/: T213504: Finally, drop the wikidatarepo dblist (duration: 00m 45s)
  • 18:09 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T213504: Stop telling CommonSettings about the wikidatarepo dblist (duration: 00m 45s)
  • 18:06 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T213504: Unconfigure the wikidatarepo dblist (duration: 00m 46s)
  • 18:05 XioNoX: manually rotate log file wtmp on csw2-esams
  • 18:02 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T213504: Configure wikibaserepo dblist just like the wikidatarepo one (duration: 00m 46s)
  • 17:58 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T213504: Tell CommonSettings about the new wikibaserepo dblist (duration: 00m 47s)
  • 17:56 jforrester@deploy1001: Synchronized dblists/wikibaserepo.dblist: T213504: Create the new wikibaserepo dblist (duration: 00m 47s)
  • 17:25 papaul: powering down thumbor2002 for disk replacement
  • 17:10 XioNoX: revert ospf metrics to normal values on esams-eqiad Level3 link
  • 16:50 Lucas_WMDE: deployed patch for T212118
  • 12:41 Lucas_WMDE: EU SWAT done
  • 12:40 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix Wikidata base URI in client config (T198946) (duration: 00m 46s)
  • 12:34 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Populate wmgWikibaseRepoSpecialSiteLinkGroups for commonswiki (T213975) (duration: 00m 51s)
  • 11:04 moritzm: installing ghostscript security updates
  • 08:48 jynus: fixing dbstore1002 x1 replication
  • 07:56 vgutierrez: uploaded certcentral 0.8 to apt.wikimedia.org (stretch) - T209980 T213820 T213301

2019-02-03

  • 20:25 elukey: powercycle mw1272 - no ssh, no tty available via com2 - DIMM correctable errors + OEM errors registered in getsel
  • 18:56 elukey: started a tmux session on dbstore1002 to migrate all the tokudb tables of mediawikiwiki to InnoDB - (s3 replication broken)
  • 17:53 elukey: start all slaves on dbstore1002 (After a crash + recovery) + moved mediawikiwiki.revision_actor_temp to Innodb to unblock s3 slave replication (still broken though)
  • 04:55 legoktm@deploy1001: Synchronized wmf-config/extension-list: Remove WikibaseQuality from extensions-list (T208499) (duration: 00m 51s)
  • 01:10 elukey: powercycle mw1299 - can't ssh nor get a tty via console - racadm getsel shows "An OEM diagnostic event occurred."

2019-02-02

  • 20:42 chaomodus: restarted pdfrender on scb1003
  • 20:41 chaomodus: restarted pdfrender on scb1004
  • 20:06 chaomodus: parsoid had failed on scandium and was alerting; the parsoid-vd service was restarted and appears to have come back
  • 05:44 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/VisualEditor/lib/ve/src/ui/dialogs/ve.ui.FindAndReplaceDialog.js: b/src/ui/dialogs/ve.ui.FindAndReplaceDialog.js T214963 Hot-deploy VE fix to stop hitting user pref writes without debounce (duration: 01m 02s)

2019-02-01

  • 23:16 vgutierrez: restart pdfrender on scb1004
  • 21:57 ejegg: updated payments-wiki-staging from 7767c7027e to 52a271e681
  • 21:25 ejegg: updated payments-wiki-staging to fundraising/REL1_31 branch
  • 07:13 bawolff_: reset 2FA on wikitech for User:Cicalese

2019-01-31

  • 17:44 jynus: running alter table on metawiki.revision_actor_temp, trying to fix TokuDB horrible bugs
  • 15:54 jynus: stop, upgrade and restart db1117
  • 13:34 mvolz@deploy1001: scap-helm zotero finished
  • 13:34 mvolz@deploy1001: scap-helm zotero cluster codfw completed
  • 13:34 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 13:31 mvolz@deploy1001: scap-helm zotero finished
  • 13:31 mvolz@deploy1001: scap-helm zotero cluster eqiad completed
  • 13:31 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 13:19 mvolz@deploy1001: scap-helm zotero finished
  • 13:19 mvolz@deploy1001: scap-helm zotero cluster staging completed
  • 13:19 mvolz@deploy1001: scap-helm zotero upgrade staging -f zotero-values-staging.yaml --version=0.0.1 stable/zotero [namespace: zotero, clusters: staging]
  • 13:18 mvolz@deploy1001: scap-helm zotero upgrade staging -f zotero-values-staging.yaml stable/zotero [namespace: zotero, clusters: staging]
  • 12:54 jynus: stop, upgrade and restart db2044
  • 12:12 jynus: apply new grants to m5-master with replication T214740
  • 11:30 arturo: T215012 icinga downtime cloudvirt1015 for 4h while investigating issues
  • 11:24 arturo: T215012 reboot cloudvirt1015
  • 11:24 jynus: restart eventstreams on scb1002,3,4
  • 11:22 jynus: restart eventstreams on scb1001
  • 10:22 jynus: resetting to defaults innodb consistency options for db2048 T188327
  • 10:00 jynus: restarting pdfrender on scb1002,3,4
  • 09:54 jynus: restarting pdfrender on scb1001
  • 02:01 gtirloni: T215004 restarted gerrit (using 1200% cpu, 71% mem)

2019-01-30

  • 20:28 bawolff_: reset 2FA@wikitech for User:deigo
  • 18:25 ladsgroup@deploy1001: Finished deploy [ores/deploy@ad160b0]: (no justification provided) (duration: 12m 46s)
  • 18:12 ladsgroup@deploy1001: Started deploy [ores/deploy@ad160b0]: (no justification provided)
  • 18:03 jynus: reducing innodb consistency options for db2048 T188327
  • 17:36 XioNoX: deactivate/activate cr2-esams:xe-0/1/3
  • 17:28 akosiaris: restart pdfrender on scb1003, scb1004
  • 16:19 akosiaris: restart proton on proton1002
  • 15:52 jynus: stop, upgrade and restart db2037
  • 15:24 jynus: stop, upgrade and restart db2042
  • 14:27 jynus: stop, upgrade and restart db2034, this will cause some lag on x1-codfw
  • 13:53 jynus: stop, upgrade and restart db2069
  • 11:20 jynus: stop, upgrade and restart db2045, this will cause some lag on s8-codfw
  • 10:54 jynus: stop, upgrade and restart db2079
  • 10:33 jynus: stop, upgrade and restart db2039, this will cause some lag on s6-codfw
  • 10:03 jynus: stop, upgrade and restart db2052, this will cause some lag on s5-codfw
  • 09:31 jynus: stop, upgrade and restart db2089 (s5/s6)
  • 08:58 jynus: stop, upgrade and restart db2051, this will cause some lag on s4-codfw
  • 08:44 jynus: stop, upgrade and restart db2090

2019-01-29

  • 21:52 jijiki: Depooling thumbor2002 due to disc failure - T214813
  • 16:51 arturo: T214499 update Netbox status for cloudvirt1023/1024/1025/1026/1027 from PLANNED to ACTIVE. These servers are actually providing services already.
  • 10:05 jynus: stop, upgrade and restart db2065
  • 09:28 jynus: stop, upgrade and restart db2058
  • 09:12 jynus: stopping, upgrading and restarting db2035, this will cause lag on codfw-s2
  • 08:58 jynus: stop, upgrade and restart db2041
  • 08:38 jynus: stop, upgrade and restart db2056
  • 08:17 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 after crash (duration: 00m 52s)
  • 03:32 XioNoX: bump cr2-esams-cr2-eqiad ospf cost to 2000 for level3 link flapping

2019-01-28

  • 23:51 vgutierrez: restarting cp2014 - T214872
  • 21:02 Zoranzoki21: Done wikitext export of content of database for education program on srwiki - T174802 (duration: 8 minutes)
  • 20:54 Zoranzoki21: Starting wikitext export of content of database for education program on srwiki - T174802 (21:54 UTC+1)
  • 19:55 brion: running final pass of requeueTranscodes.php on all wikis to make sure stray missing VP9 transcodes are cleaned up (on mwmaint1002 in a tmux session)
  • 16:41 hashar: contint1001: cleaning up disk space on / (docker images)
  • 16:36 jynus: remove backups dir at dbstore2001 T214831
  • 15:22 thcipriani: restarting jenkins for update
  • 14:16 jynus: stop, upgrade and reboot db2048, this will cause general lag/read only on enwiki/s1-codfw for some minutes
  • 13:52 jynus: stop, upgrade and reboot db2092
  • 12:55 jynus: stop, upgrade and reboot db2085
  • 12:45 jynus: powercycle ms-be1034
  • 12:42 onimisionipe: restarting all elasticsearch instances on relforge1002 to test spicerack command
  • 11:21 jynus: stop, upgrade and reboot db2062
  • 10:45 jynus: stop, upgrade and reboot db2055

2019-01-27

  • 16:22 godog: powercycle ms-be1020 - T214778
  • 03:28 marostegui: Fix x1 on dbstore1002 - T213670
  • 02:24 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/WikibaseMediaInfo/src/WikibaseMediaInfoHooks.php: Hot-deploy Ic2b08cb27 in WBMI to fix Commons File page display (duration: 00m 49s)

2019-01-26

  • 11:06 volans: force rebooting icinga1001 (no ping, no ssh, stuck console)
  • 03:23 marostegui: Convert all tables on incubatorwiki to innodb to fix s3 thread - T213670
  • 00:03 XioNoX: split member-range ge-3/0/0 to ge-3/0/38 on asw-b-codfw

2019-01-25

  • 22:45 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@5e859c4]: Update mobileapps to a8834e8 (T214728) (duration: 03m 27s)
  • 22:42 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@5e859c4]: Update mobileapps to a8834e8 (T214728)
  • 21:56 krinkle@deploy1001: Synchronized wmf-config/flaggedrevs.php: I95c37d628557c (duration: 00m 46s)
  • 21:44 krinkle@deploy1001: Synchronized wmf-config/: Idb695dd033d42 (duration: 00m 46s)
  • 21:43 krinkle@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: Idb695dd033d42 (duration: 00m 47s)
  • 21:05 robh: cleared sel on db1068, it had a power redundancy loss event (old and resolved) that was triggering the icinga check
  • 20:04 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool db1106 as an extra api host (duration: 00m 46s)
  • 19:36 jynus: powercycle db1114 T214720
  • 19:21 jynus: disabling notifications on db1114
  • 19:21 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 46s)
  • 18:32 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@94b76f5]: Update mobileapps to 4c42e3d (T214714) (duration: 03m 33s)
  • 18:28 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@94b76f5]: Update mobileapps to 4c42e3d (T214714)
  • 17:17 chaomodus: notebook1003 restarted nagios-nrpe-server due to oom - T212824
  • 14:43 hashar: contint1001: stopping zuul-merger for cleanup duties
  • 09:48 marostegui: Add dbstore1005:3318 to tendril T210478
  • 08:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1105 (duration: 00m 45s)
  • 08:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1105:3312 (duration: 00m 45s)
  • 07:51 elukey: restart yarn/hdfs daemons on analytics1056 to pick up new disk settings - T214057
  • 07:40 elukey: drain + reboot analytics1054 after disk swap (verify reboot + restore correct fstab mountpoints) - T213038
  • 07:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1105:3312 (duration: 00m 45s)
  • 07:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1105 (duration: 00m 47s)
  • 06:53 marostegui: Stop MySQL on db1105 to upgrade MySQL
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully depool db1105 (duration: 00m 46s)
  • 06:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1122 T210713 (duration: 00m 47s)
  • 06:13 marostegui: Deploy schema change on db1122 - T210713
  • 06:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 T210713 (duration: 00m 48s)
  • 06:04 marostegui: Compress dbstore1002: staging.mep_word_persistence from Aria to InnoDB - T213706
  • 05:42 kartik@deploy1001: Finished deploy [cxserver/deploy@a5d7181]: Update cxserver to 356f0a1 (T213257, T213275) (duration: 04m 09s)
  • 05:38 kartik@deploy1001: Started deploy [cxserver/deploy@a5d7181]: Update cxserver to 356f0a1 (T213257, T213275)
  • 03:12 mutante: scandium sudo chgrp -R wikidev /srv/deployment/parsoid/deploy/ ; sudo chmod -R g+w /srv/deployment/parsoid/deploy/ (T201366)
  • 03:03 mutante: scandium - apt-get -t stretch-backports install npm ; run puppet ; remove manually created /apt/preferences.d/npm.pref ; puppet created npm_stretch_backports.pref ; puppet run without errors again (T201366)
  • 01:33 crusnov@deploy1001: Finished deploy [netbox/deploy@7770453]: Cleanup deploy - T212524 (duration: 00m 11s)
  • 01:33 crusnov@deploy1001: Started deploy [netbox/deploy@7770453]: Cleanup deploy - T212524
  • 01:28 crusnov@deploy1001: Finished deploy [netbox/deploy@7770453]: Upgrade netbox to 2.5.3 - T212524 Try 2 (duration: 00m 31s)
  • 01:27 crusnov@deploy1001: Started deploy [netbox/deploy@7770453]: Upgrade netbox to 2.5.3 - T212524 Try 2
  • 01:26 crusnov@deploy1001: Finished deploy [netbox/deploy@7770453]: Upgrade netbox to 2.5.3 - T212524 (duration: 07m 43s)
  • 01:18 crusnov@deploy1001: Started deploy [netbox/deploy@7770453]: Upgrade netbox to 2.5.3 - T212524
  • 00:46 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T214515 gerrit:486154: Turn on wbsearchentities ab test in de, fr, es (duration: 00m 46s)
  • 00:37 ebernhardson@deploy1001: Synchronized wmf-config/WikibaseSearchSettings.php: SWAT T214515 gerrit:484334: Add wbsearchentities profiles for de, fr, es (duration: 00m 45s)
  • 00:34 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/MobileFrontend/: SWAT T214606 gerrit:486392: MobileFrontend if wikidata description exists, set it as tagline (duration: 00m 47s)
  • 00:29 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.14/includes/Title.php: SWAT T210739 gerrit:486369: Clone the Title object to prevent mutation (duration: 00m 47s)
  • 00:20 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: SWAT T212788 gerrit:485609: autocomplete subphrase matching on wikitech and mw.org 2 of 2 (duration: 00m 45s)
  • 00:14 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T212788 gerrit:485608: autocomplete subphrase matching on wikitech and mw.org (duration: 00m 46s)
  • 00:01 arlolra: Updated Parsoid to 4772f44 (T214649, T214648)

2019-01-24

  • 23:54 arlolra@deploy1001: Finished deploy [parsoid/deploy@f9ef630]: Updating Parsoid to 4772f44 (duration: 11m 58s)
  • 23:42 arlolra@deploy1001: Started deploy [parsoid/deploy@f9ef630]: Updating Parsoid to 4772f44
  • 22:21 mutante: wikitech-static splitting apache2 config files into one file per vhost to make it possible for certbot to detect them
  • 22:11 mutante: wikitech-static attempted to use certbot with --authenticator webroot and --installer apache to make it properly work with certbot renew in the future. it created account in /etc/letsencrypt/ made backup in /root/; challenge fails though because all domains need to serve out of a webroot and there is status.wikimedia.org here as well. (T21640)
  • 22:08 mutante: wikitech-static - certbot was already installed but it wasn't used to generate the existing certs so just running certbot renew did not work, attempted to use certbot to renew but apache plugin missing, installed python-certbot-apache (T214640)
  • 21:40 twentyafterfour: Finished MediaWiki train for 1.33.0-wmf.14 (T206668) - there is no train next week so I'll be back with wmf.16 (T206670) in two weeks.
  • 21:16 twentyafterfour@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/AbuseFilter/includes/Views/AbuseFilterView.php: sync I67ca47 refs T206668 (duration: 00m 47s)
  • 20:47 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.14 refs T206668
  • 20:11 jforrester@deploy1001: Finished scap: Post-SWAT full sync for new i18n for T208097 (duration: 33m 54s)
  • 19:59 mutante: temp disabled puppet on phab1001 , applying ferm change to allow deployment servers to http to phab servers
  • 19:37 jforrester@deploy1001: Started scap: Post-SWAT full sync for new i18n for T208097
  • 19:35 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T213356 Enable WelcomeSurvey experiment 2 on viwiki (duration: 00m 53s)
  • 19:33 akosiaris: delete 8505 tickets from OTRS with customerID Mailer-Daemon@wizengo.ds.planet-work.net T214604 - correction
  • 19:32 akosiaris: delete 5076 tickets from OTRS with customerID Mailer-Daemon@wizengo.ds.planet-work.net T214604
  • 19:32 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/WikibaseMediaInfo/src/WikibaseMediaInfoHooks.php: SWAT T213885 Don't add mw:mediainfoView on File pages with no captions either (duration: 00m 51s)
  • 19:26 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/WikimediaMessages/i18n/wikimedia/en.json: SWAT T208097 WikimediaMessages: Add message for BlockAttacker password policy (duration: 00m 50s)
  • 19:25 arlolra: Updated Parsoid to f1d717f (T187958, T205337, T214103)
  • 19:23 akosiaris: delete 5076 tickets from OTRS with customerID MAILER-DAEMON@ubuntu.member.linode.com T214604
  • 19:23 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/AbuseFilter/includes/AbuseFilter.php: SWAT AbuseFilter Optionally pass the filter ID to checkConditions for error reporting I8510319c (duration: 00m 53s)
  • 19:19 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/GrowthExperiments/GrowthExperiments.alias.php: SWAT T213356 Add Special:WelcomeSurvey Vietnamese alias (duration: 00m 54s)
  • 19:12 marostegui: Convert dbstore1002 staging.organic_link from Aria to InnoDB - T213706
  • 19:03 arlolra@deploy1001: Finished deploy [parsoid/deploy@f2384f0]: Updating Parsoid to f1d717f (duration: 09m 41s)
  • 19:02 cdanis: T214529: cdanis@cp4026.ulsfo.wmnet ~ % sudo apt-get --purge remove edac-utils libsysfs2 libedac1
  • 18:53 arlolra@deploy1001: Started deploy [parsoid/deploy@f2384f0]: Updating Parsoid to f1d717f
  • 18:53 mutante: notebook1003 - restarted nagios-nrpe-server... T212824
  • 18:52 chaomodus: notebook1002: restarted nagios-nrpe-server due to oom
  • 18:49 cdanis: cp4026: T214529: apt-get install'ing edac-utils with new deps libedac1 libsysfs2
  • 18:37 onimisionipe: pooling maps1003 - stretch migration is complete. T198622
  • 18:22 onimisionipe@deploy1001: Finished deploy [kartotherian/deploy@26a8bbd] (stretch): Updating maps1001 to reflect latest changes (duration: 01m 24s)
  • 18:21 onimisionipe@deploy1001: Started deploy [kartotherian/deploy@26a8bbd] (stretch): Updating maps1001 to reflect latest changes
  • 18:19 mutante: deploying polygerrit (new gerrit UI) theme change to roughly match MediaWiki timeless theme (gerrit:482379) (shoutouts: paladox, thcipiriani)
  • 18:07 XioNoX: re-activate ping offload redirect for ping1001 restart
  • 18:03 moritzm: rebooting ping1001 to pick up SSBD-enabled qemu
  • 18:01 XioNoX: deactivate ping offload redirect for ping1001 restart
  • 17:58 moritzm: rebooting ping2001 to pick up SSBD-enabled qemu
  • 17:50 akosiaris: restart exim on mendelevium T214604
  • 17:44 akosiaris: block specific IPv4, IPv6 address on mx1001, mx2001 T214604
  • 17:35 akosiaris: freeze all current info@wikipedia.org emails on mx1001, mx2001 T214604
  • 17:31 moritzm: rebooting seaborgium to pick up SSBD-enabled qemu
  • 17:01 akosiaris: stop exim on mendelevium
  • 16:25 moritzm: rebooting serpens to pick up SSBD-enabled qemu
  • 15:45 reedy@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/WikimediaEvents/: Revive wgPoweredByHHVM (duration: 00m 55s)
  • 15:14 moritzm: rebooting pollux to pick up SSBD-enabled qemu
  • 14:50 godog: roll restart prometheus after https://gerrit.wikimedia.org/r/c/operations/puppet/+/486251 - T187987
  • 14:45 ariel@deploy1001: Finished deploy [dumps/dumps@25358e7]: fix up web links to multistream dump files (duration: 00m 03s)
  • 14:45 ariel@deploy1001: Started deploy [dumps/dumps@25358e7]: fix up web links to multistream dump files
  • 14:31 andrew@deploy1001: Finished deploy [horizon/deploy@94f3ec1]: Rolling out an upgraded proxy dashboard -- now use designate v2 API (duration: 03m 21s)
  • 14:28 andrew@deploy1001: Started deploy [horizon/deploy@94f3ec1]: Rolling out an upgraded proxy dashboard -- now use designate v2 API
  • 14:23 marostegui: Stop replication on all threads in dbstore1002 - T213706
  • 13:13 zeljkof: EU SWAT finished
  • 13:10 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure $wgSitename and $wgMetaNamespace for ur.wiktionary, ur.wikibooks and ur.wikiquote (T214290) (duration: 00m 53s)
  • 13:02 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Assign "suppressredirect" to rollbacker on newiki (T214012) (duration: 00m 53s)
  • 13:00 zeljkof: extending EU SWAT for 5-10 minutes
  • 12:53 reedy@deploy1001: Synchronized private/PrivateSettings.php: fix minor typo (duration: 00m 52s)
  • 12:46 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change $wgUploadNavigationUrl for the Persian (fa) Wikisource to Commons (T214048) (duration: 00m 53s)
  • 12:36 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add few domains at $wgCopyUploadsDomains and cleanup inline comments (T213961 T213632 T213649 T213924) (duration: 00m 53s)
  • 12:32 zfilipin@deploy1001: sync-file aborted: SWAT: Enable reference previews on beta (T213415) (duration: 00m 01s)
  • 12:28 jbond42: restarting pdns-recursor and ntp on dns1001 and dns1002 for a security update
  • 12:25 zfilipin@deploy1001: Synchronized wmf-config/: SWAT: Enable reference previews on beta (T213415) (duration: 00m 54s)
  • 12:17 zfilipin@deploy1001: Synchronized wmf-config/abusefilter.php: SWAT: Enable $wgAbuseFilterProfile on every wiki (T191039) (duration: 00m 54s)
  • 12:12 onimisionipe: initializing postgres replication for maps1001
  • 11:55 moritzm: installing memcached updates on dbmonitor*
  • 11:41 moritzm: installing polarssl security updates
  • 11:38 gehel: restart elasticsearch on elastic20205 to validate configuration change
  • 11:27 gehel: restarting blazegraph + updater on wdqs* for jvm upgrade
  • 11:26 moritzm: installing xen security updates (only some client libs are used)
  • 11:12 marostegui: Add dbstore1005:3318 to zarcillo - T210478
  • 11:08 moritzm: installing Java security updates on wdqs hosts
  • 10:59 arturo: T214299 additional reboot for cloudnet1004
  • 10:51 marostegui: Compress innodb tables on dbstore1005:3318 - T210478
  • 10:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1075 (duration: 00m 53s)
  • 10:37 moritzm: installing libsndfile security updates
  • 10:37 gehel: starting stretch upgrade on maps1001 - T198622
  • 10:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1075 (duration: 00m 52s)
  • 10:13 moritzm: installing libav security updates
  • 10:03 arturo: T214299 reimage cloudnet1004 to debian stretch
  • 09:58 moritzm: installing tiff security updates on trusty
  • 09:45 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103:3312 T210713 (duration: 00m 53s)
  • 09:43 marostegui: Deploy schema change on db1095:3312 - T210713
  • 09:30 marostegui: Deploy schema change on db1103:3312 - T210713
  • 09:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 T210713 (duration: 00m 53s)
  • 09:24 godog: temp stop prometheus@global on prometheus2003 to grab a snapshot
  • 08:51 dcausse: elasticsearch: deleting indices moved out of the search-chi@(eqiad|codfw) cluster (T214052)
  • 08:49 marostegui: Transfer s8 from db1116:3318 to dbstore1005:3318 T210478
  • 08:40 marostegui: Deploy schema change on s2 codfw master (db2035). this will generate lag on codfw - T210713
  • 08:30 marostegui: Deploy schema change on db1070 (s5 master) - T210713
  • 08:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 T210713 (duration: 00m 52s)
  • 08:18 marostegui: Deploy schema change on db1110 - T210713
  • 08:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1110 T210713 (duration: 00m 53s)
  • 08:08 oblivian@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Whitelist the php7 beta feature (duration: 00m 54s)
  • 07:58 marostegui: Compress InnoDB on dbstore1004 s2 and s3 - T210478
  • 07:53 marostegui: Deploy schema change on db1102:3315
  • 07:50 marostegui: Compress InnoDB tables on dbstore1005:3316 - T210478
  • 07:43 marostegui: Add dbstore1005:3316 to tendril and zarcillo - T210478
  • 07:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 T210713 (duration: 00m 52s)
  • 07:18 marostegui: Transfer s6 from dbstore1001 to dbstore1005 using mariabackup - T210478
  • 07:09 marostegui: Compress Aria tables to InnoDB on dbstore1002 staging database - T213706
  • 07:07 marostegui: Deploy schema change on db1082, this will generate lag on labsdb s5 - T210713
  • 07:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 T210713 (duration: 00m 52s)
  • 07:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 T210713 (duration: 00m 53s)
  • 06:55 marostegui: Transfer x1 from dbstore1001 to dbstore1005 using mariabackup - T210478
  • 06:51 marostegui: Deploy schema change on db1113:3315 - T210713
  • 06:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 T210713 (duration: 00m 53s)
  • 06:43 marostegui: Add dbstore1005:3320 to tendril and zarcillo - T210478
  • 06:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1100 T210713 (duration: 00m 52s)
  • 06:27 marostegui: Deploy schema change on db1100 - T210713
  • 06:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1100 T210713 (duration: 00m 53s)
  • 06:14 marostegui: Reboot dbstore1005 - T210478
  • 06:10 marostegui: Add dbstore1003:3311 to tendril - T210478
  • 05:03 tstarling@deploy1001: Synchronized wmf-config/profiler.php: gerrit 478137 (duration: 00m 53s)
  • 05:01 tstarling@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: gerrit 478137 (duration: 00m 53s)
  • 04:53 tstarling@deploy1001: Synchronized wmf-config/PhpAutoPrepend-labs.php: gerrit 477957 (duration: 00m 53s)
  • 04:52 tstarling@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: gerrit 477957 (duration: 00m 52s)
  • 04:51 tstarling@deploy1001: Synchronized wmf-config/LabsServices.php: gerrit 477957 (duration: 00m 52s)
  • 04:50 tstarling@deploy1001: Synchronized wmf-config/ProductionServices.php: gerrit 477957 (duration: 00m 56s)
  • 01:35 krinkle@deploy1001: Synchronized errorpages/: Ic093c3122f - rm php-fatal-error.html (duration: 00m 54s)
  • 01:01 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: 477956 and Aaron's 486134 (duration: 00m 52s)
  • 00:59 tstarling@deploy1001: Synchronized errorpages/hhvm-fatal-error.php: (no justification provided) (duration: 00m 53s)
  • 00:58 tstarling@deploy1001: Synchronized multiversion/MWRealm.php: (no justification provided) (duration: 00m 52s)
  • 00:57 tstarling@deploy1001: Synchronized src/ServiceConfig.php: gerrit 477956 (duration: 00m 53s)
  • 00:45 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.14/skins/MinervaNeue/includes/skins/minerva.mustache: SWAT: Restore banners to Wikivoyage project (duration: 00m 52s)
  • 00:42 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/MobileFrontend: SWAT: Explicitly pass in parseHTML T214451 (duration: 00m 55s)
  • 00:34 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/MobileFrontend: SWAT: Explicitly pass in parseHTML T214451 (duration: 00m 57s)

2019-01-23

  • 23:32 crusnov@deploy1001: Finished deploy [netbox/deploy@aa3c342]: Upgrade netbox to 2.5.3 - T212524 (duration: 04m 46s)
  • 23:28 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.14 refs T206668 (duration: 00m 52s)
  • 23:28 crusnov@deploy1001: Started deploy [netbox/deploy@aa3c342]: Upgrade netbox to 2.5.3 - T212524
  • 23:26 chaomodus: scap deploy netbox 2.5.3
  • 23:13 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/Translate/TranslateHooks.php: T214517 T214358 Hot-deploy Ic9d85fec1 to un-block train, hopefully (duration: 00m 53s)
  • 23:00 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: Hot-deploy I81165bf00 to use the right name and value for the cookie (duration: 00m 53s)
  • 22:08 chaomodus: proton1001 restarted nagios-nrpe-server which died from oom
  • 21:30 mutante: scandium - removing npm and nodejs*, testing puppetization to reinstall them
  • 20:50 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.13 refs T206668 (duration: 00m 52s)
  • 20:50 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.13 refs T206668
  • 20:43 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.14 refs T206668 (duration: 00m 52s)
  • 20:42 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.14 refs T206668
  • 20:33 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.33.0-wmf.14 refs T206668
  • 20:23 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.33.0-wmf.13 refs T206668
  • 20:21 twentyafterfour: rolling back because error rate increased significantly after promoting
  • 20:10 twentyafterfour: twentyafterfour@deploy1001 rebuilt and synchronized wikiversions files: group0 wikis to 1.33.0-wmf.14 refs T206668
  • 19:33 moritzm: rebooting dubnium to pick up SSBD-enabled qemu
  • 19:03 moritzm: rebooting puppetdb2001 to pick up SSBD-enabled qemu
  • 18:46 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Config: Disable showing 'depicts' statements on Commons for now via I66d97031 (duration: 00m 52s)
  • 18:44 jforrester@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: Hot-deploy Ief9c9155c to avoid auto-opting new accounts into PHP7 (duration: 00m 53s)
  • 18:35 anomie@deploy1001: Synchronized php-1.33.0-wmf.13/includes/page/WikiPage.php: Add even more temporary logging for T210739 (duration: 00m 54s)
  • 18:26 moritzm: rebooting mendelevium/ticket.wikimedia.org to pick up SSBD-enabled qemu
  • 18:10 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Reapply Enable the Welcome survey on viwiki (duration: 00m 53s)
  • 18:09 sbisson@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/GrowthExperiments/: SWAT: Help panel: ResourceLoaderHelpPanelModule handle help panel disabled (duration: 00m 54s)
  • 18:03 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns5002.wikimedia.org
  • 17:57 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns5002.wikimedia.org
  • 17:24 dcausse@deploy1001: Finished deploy [search/mjolnir/deploy@a141ad3]: fix retry_on_conflict (duration: 04m 21s)
  • 17:20 dcausse@deploy1001: Started deploy [search/mjolnir/deploy@a141ad3]: fix retry_on_conflict
  • 16:57 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns5001.wikimedia.org
  • 16:53 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns5001.wikimedia.org
  • 16:50 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns4002.wikimedia.org
  • 16:44 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns4002.wikimedia.org
  • 16:43 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns4001.wikimedia.org
  • 16:36 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns4001.wikimedia.org
  • 16:31 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns2002.wikimedia.org
  • 16:14 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns2002.wikimedia.org
  • 16:13 jbond42: rolling restarts of PDNS recursors/ntpd in codfw/esams/ulsfo/eqsin to pick up openssl security update
  • 16:02 jbond42: restarting ntpd on dns2001
  • 16:00 jbond@cumin1001: conftool action : set/pooled=yes; selector: name=dns2001.wikimedia.org
  • 15:57 jynus: adding dbstore1004:s2 to tendril
  • 15:43 jbond@cumin1001: conftool action : set/pooled=no; selector: name=dns2001.wikimedia.org
  • 15:20 marostegui: Truncate wmf_checksum table on dbstore1002 - T213670
  • 14:55 marostegui: Compress InnoDB on a few tables on dbstore1002 to gain some extra space - T213670
  • 14:18 marostegui: Convert tokudb tables into innodb on dbstore1002 - T213706
  • 13:47 marostegui: Convert a bunch of Aria tables to InnoDB on dbstore1002
  • 13:38 onimisionipe: repooling maps1002
  • 13:32 gehel: restarting kartotherian on maps100[234]
  • 13:30 gehel: restarting kartotherian on maps1003
  • 13:27 marostegui: Migrate some tokudb tables to innodb on dbstore1002 - T213706
  • 13:18 gehel: running cumin 'P{O:cache::upload} and A:eqiad' 'run-puppet-agent'
  • 13:10 zeljkof: EU SWAT finished
  • 12:36 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.14/extensions/AbuseFilter: SWAT: Re-fix the throttle script (T209565) (duration: 00m 55s)
  • 12:32 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/AbuseFilter/: SWAT: Re-fix the throttle script (T209565) (duration: 00m 54s)
  • 12:20 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add new namespace abbreviation for Swedish (sv) (T214329) (duration: 00m 53s)
  • 12:17 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix project talk namespace alias of Persian Wikipedia (T213733) (duration: 00m 53s)
  • 12:09 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Define ImportSources for nywiki (duration: 00m 54s)
  • 11:44 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: T214456 (duration: 00m 53s)
  • 11:04 arturo: T214299 reboot cloudnet2001-dev, cloudnet2002-dev and cloudnet1003 for new interface names
  • 11:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 T210713 (duration: 00m 52s)
  • 10:52 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 T210713 (duration: 00m 52s)
  • 10:39 arturo: updating puppet catalog compiler facts: `PUPPET_COMPILER=compiler1002.puppet-diffs.eqiad.wmflabs modules/puppet_compiler/files/compiler-update-facts`
  • 10:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 T210713 (duration: 00m 52s)
  • 10:33 Amir1: Deployed patch for T207814 on wmf.14
  • 10:31 Amir1: Deployed patch for T207814 on wmf.13
  • 10:12 marostegui: Deploy schema change on db1096:3315 - T210713
  • 10:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 T210713 (duration: 00m 53s)
  • 09:39 akosiaris: upgrade mathoid in eqiad and codfw to latest chart version
  • 09:38 akosiaris@deploy1001: scap-helm mathoid finished
  • 09:38 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 09:38 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 09:38 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml production stable/mathoid [namespace: mathoid, clusters: eqiad,codfw]
  • 09:30 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1003.eqiad.wmnet
  • 09:23 akosiaris@deploy1001: scap-helm mathoid finished
  • 09:23 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 09:23 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml --set resources.replicas=1 staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 09:22 akosiaris@deploy1001: scap-helm mathoid finished
  • 09:22 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 09:22 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 08:55 marostegui: Deploy schema change on s5 codfw master with replication, lag will be generated - T210713
  • 08:44 addshore: addshore@mwmaint1002:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki yuewiktionary --batch-size 1000 // T214400
  • 08:28 marostegui: Deploy schema change on db1061 (s6 primary master) - T210713
  • 08:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1088 T210713 (duration: 00m 55s)
  • 08:19 marostegui: Add dbstore1004:3314 to tendril - T210478
  • 08:18 marostegui: Add dbstore1004:3314 to zarcillo - T210478
  • 08:12 marostegui: Deploy schema change on db1088 T210713
  • 08:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1088 T210713 (duration: 00m 52s)
  • 08:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1093 T210713 (duration: 00m 52s)
  • 07:51 marostegui: Compress tables on dbstore1004:3314 - T210478
  • 07:48 marostegui: Deploy schema change on db1093 - T210713
  • 07:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1093 T210713 (duration: 00m 54s)
  • 07:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1085 T210713 (duration: 00m 52s)
  • 07:13 marostegui: Deploy schema change on db1085, this will generate lag on s6 labs - T210713
  • 07:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1085 T210713 (duration: 00m 53s)
  • 07:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 T210713 (duration: 00m 52s)
  • 06:53 marostegui: Deploy schema change on db1113:3316 - T210713
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 T210713 (duration: 00m 53s)
  • 06:25 marostegui: Stop s4 on db1102 to clone dbstore1004 - T210478
  • 06:16 marostegui@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Increase parsercache TTL keys from 22 to 24 days T210992 (duration: 01m 06s)
  • 04:05 tstarling@deploy1001: Finished scap: gerrit 480419 (duration: 19m 33s)
  • 03:45 tstarling@deploy1001: Started scap: gerrit 480419
  • 03:44 tstarling@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: gerrit 480419 (duration: 00m 52s)
  • 03:41 tstarling@deploy1001: Synchronized wmf-config/profiler.php: gerrit 480419 (duration: 00m 54s)
  • 03:40 tstarling@deploy1001: Synchronized wmf-config/CommonSettings.php: gerrit 480419 (duration: 00m 54s)
  • 03:38 tstarling@deploy1001: scap failed: average error rate on 9/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 03:36 tstarling@deploy1001: Synchronized wmf-config/arclamp.php: gerrit 480419 (duration: 00m 54s)
  • 03:32 tstarling@deploy1001: Synchronized php-1.33.0-wmf.13/LocalSettings.php: gerrit 480419 (duration: 00m 54s)
  • 03:29 tstarling@deploy1001: Synchronized php-1.33.0-wmf.14/LocalSettings.php: gerrit 480419 (duration: 00m 52s)
  • 03:27 tstarling@deploy1001: Synchronized src/XWikimediaDebug.php: gerrit 480419 (duration: 00m 55s)
  • 03:22 TimStarling: manually edited LocalSettings.php in php-1.33.0-wmf.13 and php-1.33.0-wmf.14 to use a relative path, like in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/480695/
  • 03:09 tstarling@deploy1001: Scap failed!: Call to mwscript eval.php returned: None
  • 01:15 mutante: scandium - puppet run now without errors for the first time for the parsoid testing role on stretch instead of jessie. nodejs 10. - @subbu @arlolra you can start using it to replace ruthenium (T201366)
  • 01:12 mutante: scandium - git cloning parsoid from gerrit - mediawiki/services/parsoid/deploy to /srv/deployment/parsoid/deploy ; still needs https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/484602/ (T201366)
  • 01:05 mutante: scandium - deleting /etc/apt/preferences.d/stretch_backports.pref ; apt-get remove nodejs ; apt-get install -t stretch-backports npm ; now has nodejs 10 and npm from backports installed (T201366)
  • 00:58 mutante: scandium - deleting /etc/apt/preferences.d/stretch_backports.pref ; apt-get remove nodejs
  • 00:52 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/ContentTranslation/scripts/purge-unpublished-drafts.php: SWAT T203059 ContentTranslation: Remove waitForReplication for dry-run (duration: 00m 55s)
  • 00:40 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T213851 Cirrus: Setup archive index shard/replica counts (duration: 00m 54s)
  • 00:05 gtirloni: T209527 disabled notifications for cloudstore100{8,9}

2019-01-22

  • 23:09 cstone: Updated payments-wiki from 7d4cd165d9 to ca7c280f3e
  • 22:22 twentyafterfour@deploy1001: Finished scap: testwikis wikis to 1.33.0-wmf.14 refs T206668 (duration: 43m 00s)
  • 21:39 twentyafterfour@deploy1001: Started scap: testwikis wikis to 1.33.0-wmf.14 refs T206668
  • 21:31 twentyafterfour@deploy1001: Synchronized wmf-config/CommonSettings.php: deploy I91e902 (duration: 01m 39s)
  • 20:26 gehel: resetting cassandra authentication on maps / eqiad
  • 20:25 milimetric@deploy1001: Finished deploy [analytics/refinery@d806b62]: Update jar versions on modified jobs (duration: 06m 48s)
  • 20:19 milimetric@deploy1001: Started deploy [analytics/refinery@d806b62]: Update jar versions on modified jobs
  • 20:07 onimisionipe@deploy1001: deploy aborted: Updating maps1002 to reflect latest changes (duration: 00m 01s)
  • 20:07 onimisionipe@deploy1001: Started deploy [kartotherian/deploy@e847e7b] (stretch): Updating maps1002 to reflect latest changes
  • 20:06 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=kartotherian,name=eqiad
  • 20:06 volans: running cumin 'P{O:cache::upload} and A:eqiad' 'run-puppet-agent'
  • 20:03 gehel: running nodetool repair on system_auth for maps / eqiad servers
  • 19:30 arturo: T214299 additional reboot for cloudnet1003
  • 19:03 onimisionipe@deploy1001: Finished deploy [kartotherian/deploy@e847e7b] (stretch): Updating maps1002 to reflect latest changes (duration: 01m 02s)
  • 19:02 onimisionipe@deploy1001: Started deploy [kartotherian/deploy@e847e7b] (stretch): Updating maps1002 to reflect latest changes
  • 18:56 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@0bcdd3f]: Update mobileapps to 0aac268 (fix pronunciation detection in mobile-sections T214338) (duration: 04m 00s)
  • 18:52 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@0bcdd3f]: Update mobileapps to 0aac268 (fix pronunciation detection in mobile-sections T214338)
  • 18:36 arturo: T214299 reimaging cloudnet1003 as debian stretch
  • 18:00 milimetric@deploy1001: Finished deploy [analytics/refinery@b07451e]: Denormalized job updates for actor/comment refactor (duration: 17m 24s)
  • 17:43 milimetric@deploy1001: Started deploy [analytics/refinery@b07451e]: Denormalized job updates for actor/comment refactor
  • 17:42 milimetric@deploy1001: Finished deploy [analytics/refinery@372c0b6]: Denormalized job updates for actor/comment refactor (duration: 02m 11s)
  • 17:40 milimetric@deploy1001: Started deploy [analytics/refinery@372c0b6]: Denormalized job updates for actor/comment refactor
  • 17:30 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@afca813]: Add the constraintsRunCheck job definition T204031 (duration: 00m 55s)
  • 17:29 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@afca813]: Add the constraintsRunCheck job definition T204031
  • 16:12 XioNoX: deactivate local pref for peering sessions in es/knams - T204281
  • 15:45 akosiaris: upgrade zotero to latest chart version
  • 15:44 akosiaris@puppetmaster1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=zotero
  • 15:43 akosiaris@deploy1001: scap-helm zotero finished
  • 15:43 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 15:43 akosiaris@deploy1001: scap-helm zotero install -f zotero-values-eqiad.yaml -n production stable/zotero [namespace: zotero, clusters: eqiad]
  • 15:42 akosiaris@deploy1001: scap-helm zotero upgrade -f zotero-values-eqiad.yaml production stable/zotero [namespace: zotero, clusters: eqiad]
  • 15:34 addshore: addshore@mwmaint1002:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki yuewiktionary // T214400 (1 row)
  • 15:32 akosiaris@puppetmaster1001: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=zotero
  • 15:31 akosiaris@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=zotero
  • 15:30 addshore: addshore@mwmaint1002:~$ mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki yuewiktionary --site-group wiktionary // T214400
  • 15:30 akosiaris@deploy1001: scap-helm zotero finished
  • 15:30 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 15:30 akosiaris@deploy1001: scap-helm zotero upgrade -f zotero-values-codfw.yaml production stable/zotero [namespace: zotero, clusters: codfw]
  • 15:29 addshore: addshore@mwmaint1002:~$ mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki yuewiktionary --site-group wiktionary
  • 15:14 godog: turn on partitions.auto for rsyslog output to kafka - T214309
  • 15:14 marostegui: Add dbstore1003:3317 to tendril - T210478
  • 15:13 mbsantos@deploy1001: Finished deploy [kartotherian/deploy@bb30697] (stretch): monkey patching geoshapes service for maps100[3-4] (duration: 01m 45s)
  • 15:11 mbsantos@deploy1001: Started deploy [kartotherian/deploy@bb30697] (stretch): monkey patching geoshapes service for maps100[3-4]
  • 15:11 akosiaris@puppetmaster1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=zotero
  • 15:11 akosiaris@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=zotero
  • 15:08 akosiaris@deploy1001: scap-helm zotero finished
  • 15:08 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 15:08 akosiaris@deploy1001: scap-helm zotero install -n production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 15:05 akosiaris@puppetmaster1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=zotero
  • 14:56 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@6cdece9]: Remove reviewers-by-blame from deployment cobalt no restart required (duration: 00m 11s)
  • 14:56 anomie@deploy1001: Synchronized php-1.33.0-wmf.13/includes/page/WikiPage.php: Add more temporary logging for T210739 (duration: 00m 47s)
  • 14:56 thcipriani@deploy1001: Started deploy [gerrit/gerrit@6cdece9]: Remove reviewers-by-blame from deployment cobalt no restart required
  • 14:54 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@6cdece9]: Remove reviewers-by-blame from deployment gerrit2001 no restart required (duration: 00m 10s)
  • 14:54 thcipriani@deploy1001: Started deploy [gerrit/gerrit@6cdece9]: Remove reviewers-by-blame from deployment gerrit2001 no restart required
  • 14:45 onimisionipe: starting init of postgres replication on maps1002 - T198622
  • 14:34 gehel: monkey patch kartotherian configuration to re-add proxy on maps100[34] - T214350
  • 14:18 akosiaris@deploy1001: scap-helm mathoid finished
  • 14:18 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 14:18 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 14:18 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml production stable/mathoid [namespace: mathoid, clusters: eqiad,codfw]
  • 14:17 akosiaris: upgrade mathoid to the latest chart version (0.0.15)
  • 14:17 akosiaris: upgrade blubberoid to the latest chart version (0.0.5)
  • 14:17 akosiaris@deploy1001: scap-helm mathoid finished
  • 14:17 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 14:17 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml --set resources.replicas=1 staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 14:15 akosiaris@deploy1001: scap-helm mathoid finished
  • 14:15 akosiaris@deploy1001: scap-helm mathoid cluster staging completed
  • 14:15 akosiaris@deploy1001: scap-helm mathoid install -n staging -f mathoid-values.yaml --version=0.0.12 stable/mathoid [namespace: mathoid, clusters: staging]
  • 14:15 akosiaris@deploy1001: scap-helm mathoid install -n staging -f mathoid-values.yaml --version=0.0.12 stable/mathoid [namespace: mathoid, clusters: staging]
  • 14:14 akosiaris@deploy1001: scap-helm mathoid upgrade -f mathoid-values.yaml staging stable/mathoid [namespace: mathoid, clusters: staging]
  • 14:10 akosiaris@deploy1001: scap-helm blubberoid finished
  • 14:10 akosiaris@deploy1001: scap-helm blubberoid cluster staging completed
  • 14:10 akosiaris@deploy1001: scap-helm blubberoid install -n staging -f blubberoid-values.yaml stable/blubberoid [namespace: blubberoid, clusters: staging]
  • 14:04 akosiaris@deploy1001: scap-helm blubberoid finished
  • 14:04 akosiaris@deploy1001: scap-helm blubberoid cluster codfw completed
  • 14:04 akosiaris@deploy1001: scap-helm blubberoid cluster eqiad completed
  • 14:04 akosiaris@deploy1001: scap-helm blubberoid install -n production -f blubberoid-values.yaml stable/blubberoid [namespace: blubberoid, clusters: eqiad,codfw]
  • 14:04 akosiaris@deploy1001: scap-helm blubberoid upgrade -f blubberoid-values.yaml production stable/blubberoid [namespace: blubberoid, clusters: eqiad,codfw]
  • 13:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1098:3316 T210713 (duration: 00m 45s)
  • 13:55 godog: bump logstash kafka consumer threads - T214309
  • 13:41 marostegui: Stop replication in sync on dbstore1001:3316 and db1098:3316
  • 13:35 Amir1: running extensions/Wikibase/lib/maintenance/populateSitesTable.php on all.dblist (T211530)
  • 13:30 Amir1: EU SWAT is finished
  • 13:29 ladsgroup@deploy1001: Synchronized langlist: SWAT: Add yue to langlist (T211530) (duration: 00m 46s)
  • 13:26 moritzm: installing apt security updates for jessie
  • 13:19 Amir1: ladsgroup@mwmaint1002:~$ mwscript namespaceDupes.php fawiki --fix (T213733)
  • 13:18 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add new synonyms for namespaces in Persian (fa) (T213733) (duration: 00m 47s)
  • 13:13 moritzm: installing apt security updates for trusty
  • 13:07 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable page issues improvements on English Wikipedia (T210554) (duration: 00m 46s)
  • 12:52 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use new logos in IS.php (T150618) (duration: 00m 47s)
  • 12:40 gehel: start stretch upgrade for maps1002 - T198622
  • 12:36 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Upload HD logos for several projects (T150618) (duration: 00m 46s)
  • 12:29 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove ability for bureaucrats on outreachwiki to remove bureaucrat flag (T214133) (duration: 00m 46s)
  • 12:21 moritzm: installing apt security updates for stretch
  • 12:20 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create extra namespace in kawiktionary (T212956) (duration: 00m 46s)
  • 12:13 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable transwiki user group on ne.wikipedia (T214036) (duration: 00m 47s)
  • 12:09 jynus: running mariabackup on dbstore1001:s1
  • 12:02 Lucas_WMDE: tried and failed to deploy patch for T212118
  • 10:55 marostegui: Deploy schema change on db1098:3316 - T210713
  • 10:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 T210713 (duration: 00m 45s)
  • 10:20 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T204031 wikidata: post edit constraint jobs on 25% of edits (duration: 00m 45s)
  • 10:15 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209504 Decrease WBQualityConstraintsTypeCheckMaxEntities from 300 to 150 (duration: 00m 47s)
  • 10:08 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T204031 wikidata: post edit constraint jobs on 10% of edits (duration: 00m 47s)
  • 09:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 T210713 (duration: 00m 47s)
  • 09:56 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=maps,name=maps1003.eqiad.wmnet
  • 09:55 gehel: repooling maps1003 after upgrade to stretch - T198622
  • 09:40 marostegui: Deploy schema change on db1096:3316 - T210713
  • 09:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 T210713 (duration: 00m 48s)
  • 09:23 jynus: stop, upgrade and restart db1097
  • 08:55 dcausse: elasticsearch: closing indices in search-chi@(eqiad|codfw) moved to other elastic instances (T214052)
  • 08:53 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097 (duration: 00m 45s)
  • 08:42 moritzm: installing policykit-1 security updates on trusty
  • 08:26 marostegui: Deploy schema change on dbstore1001:3316 - T210713
  • 08:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3317 T210478 (duration: 00m 48s)
  • 08:14 marostegui: Compress s7 on dbstore1003 - T210478
  • 06:42 marostegui: Deploy schema change on db1078 (s3 master) - T85757
  • 06:36 marostegui: Stop MySQL on db1090:3317 to clone dbstore1003 - T210478
  • 06:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3317 T210478 (duration: 00m 49s)
  • 05:45 kartik@deploy1001: Finished deploy [cxserver/deploy@e0ca16b]: Update cxserver to c5ff0bf (duration: 04m 15s)
  • 05:40 kartik@deploy1001: Started deploy [cxserver/deploy@e0ca16b]: Update cxserver to c5ff0bf
  • 02:17 onimisionipe: restarting tilerator on maps100[1-2]
  • 00:38 chaomodus: stat1007 nagios-nrpe-server was off and alerted, restarting fixed it

2019-01-21

  • 22:33 krinkle@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/TemplateData/includes/api/ApiTemplateData.php: I7647ddfc47 - T213953 (duration: 00m 47s)
  • 19:35 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2040 (duration: 00m 45s)
  • 19:23 jynus: mysql.py -h db1115 zarcillo -e "UPDATE masters SET instance = 'db2047' WHERE section = 's7' and dc = 'codfw'" T214264
  • 18:55 jynus: stop and upgrade db2040 T214264
  • 18:52 onimisionipe: pool maps1003 - postgresql sql lag issues have been fixed
  • 18:24 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2040, promote db2047 to s7 master (duration: 00m 46s)
  • 17:51 jynus: stop and apply puppet changes to db2047 T214264
  • 17:44 jynus: stop replication on db2040 for master switch T214264
  • 17:16 jynus: stop and upgrade db2054
  • 16:03 arturo: T214303 reimaging/renaming labtestneutron2002.codfw.wmnet (jessie) to cloudnet2002-dev.codfw.wmnet (stretch)
  • 15:58 onimisionipe: reinitializing slave replication(postgres) on maps1003
  • 15:52 jynus: stop and upgrade db2061
  • 15:19 dcausse: closing frwikiquote_* indices on elasticsearch search-chi@codfw (T214052)
  • 15:11 dcausse: closing frwikiquote_* indices on elasticsearch search-chi@eqiad (T214052)
  • 13:58 marostegui: Compress enwiki on dbstore1003:3311 - T210478
  • 12:36 jijiki: Restarting memcached on mc1025 to apply '-R 200' - T208844
  • 11:25 onimisionipe: depool maps1003 to fix replication lag issues
  • 10:51 elukey: disable puppet fleetwide to ease the merge/deploy of a puppet admin module change - T212949
  • 10:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 - T85757 (duration: 00m 44s)
  • 10:33 jynus: upgrade and restart db2047 T214264
  • 10:26 addshore@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/ArticlePlaceholder/includes/AboutTopicRenderer.php: T213739 Pass a usageAccumulator to SidebarGenerator (duration: 00m 47s)
  • 10:19 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 00m 45s)
  • 09:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1089 (duration: 00m 45s)
  • 09:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly Repool db1089 T210478 (duration: 00m 45s)
  • 09:30 marostegui: Compress a few tables on dbstore1003:3315 - T210478
  • 08:35 marostegui: Stop replication db1077 to deploy schema change - T85757
  • 08:32 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 - T85757 (duration: 00m 46s)
  • 08:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 - T85757 (duration: 00m 48s)
  • 08:10 moritzm: installing OpenSSL security updates
  • 07:39 marostegui: Stop replication on db1124:3313 to fix triggers - T85757
  • 07:00 marostegui: Stop MySQL on db1089 to clone dbstore1003 - T210478
  • 07:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 T210478 (duration: 00m 47s)
  • 06:54 marostegui: Deploy schema change on db1123 - T85757
  • 06:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 - T85757 (duration: 00m 50s)
  • 06:47 marostegui: Drop tag_summary table from db1023, db1077, db1075 and db1078 T212255
  • 06:45 vgutierrez@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5010.eqsin.wmnet
  • 06:32 marostegui: Drop tag_summary table from db1095:3313 - T212255
  • 06:27 marostegui: Drop tag_summary table from dbstore1002:s3 - T212255
  • 06:12 marostegui: Drop tag_summary table from s3 codfw - T212255
  • 06:09 marostegui: Drop tag_summary table from s8 - T212255

2019-01-20

  • 15:13 marostegui: Force WriteBack on db2040 - T214264
  • 01:07 cdanis: cdanis@wdqs1004.eqiad.wmnet /var/log/wdqs % sudo service wdqs-blazegraph restart

2019-01-19

  • 22:12 ariel@deploy1001: Finished deploy [dumps/dumps@ab79bbb]: multistream dumps in parallel, recombine gz and multistream without decompression (duration: 00m 03s)
  • 22:12 ariel@deploy1001: Started deploy [dumps/dumps@ab79bbb]: multistream dumps in parallel, recombine gz and multistream without decompression
  • 20:34 gtirloni: upgraded and rebooted labstore200{3,4}
  • 12:34 onimisionipe: pool maps1003 - stretch migration is complete T198622
  • 12:08 elukey: run 'start all slaves' on dbstore1002 after crash
  • 08:42 marostegui: Fixing dbstore1002 x1 replication T213670
  • 07:36 elukey: restart pdfrender on scb1004
  • 05:55 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@af21320]: test swapping venv build to scap fetch/script step (duration: 00m 14s)
  • 05:55 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@af21320]: test swapping venv build to scap fetch/script step
  • 05:55 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@af21320]: test swapping venv build to scap fetch/script step (duration: 00m 15s)
  • 05:55 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@af21320]: test swapping venv build to scap fetch/script step
  • 05:46 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@af21320]: test swapping venv build to scap fetch/script step (duration: 00m 13s)
  • 05:46 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@af21320]: test swapping venv build to scap fetch/script step
  • 05:25 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@af21320]: bump discovery analytics to latest (duration: 00m 17s)
  • 05:25 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@af21320]: bump discovery analytics to latest
  • 05:18 legoktm@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/JsonConfig/includes/JCCache.php: Revert "JCCache: Explicit load the main slot to avoid API warnings" - T214179 (duration: 00m 58s)

2019-01-18

  • 23:57 mobrovac@deploy1001: Finished deploy [restbase/deploy@f24d681]: Deploy latest version to restbase1016 (was out of rotation), take #3 (duration: 01m 01s)
  • 23:56 mobrovac@deploy1001: Started deploy [restbase/deploy@f24d681]: Deploy latest version to restbase1016 (was out of rotation), take #3
  • 23:55 mobrovac@deploy1001: Finished deploy [restbase/deploy@f24d681]: Deploy latest version to restbase1016 (was out of rotation), take #2 (duration: 00m 18s)
  • 23:54 mobrovac@deploy1001: Started deploy [restbase/deploy@f24d681]: Deploy latest version to restbase1016 (was out of rotation), take #2
  • 23:53 mobrovac@deploy1001: Finished deploy [restbase/deploy@f24d681]: Deploy latest version to restbase1016 (was out of rotation) - T212418 (duration: 00m 34s)
  • 23:53 mobrovac@deploy1001: Started deploy [restbase/deploy@f24d681]: Deploy latest version to restbase1016 (was out of rotation) - T212418
  • 20:47 mobrovac: restbase/cassandra bootstrap restbase1016-c - T212418
  • 20:47 mobrovac: restbase/cassandra bootstrap restbase1016-c
  • 17:24 godog: bootstrap cassandra-b on restbase1016 - T212418
  • 17:06 marostegui: Reload haproxy on dbproxy1009 after rack a2 maintenance
  • 16:14 arturo: T214167 reimage+rename labtestneutron2001.codfw.wmnet (jessie) to cloudnet2001-dev.codfw.wmnet (stretch)
  • 15:36 moritzm: rebooting mwdebug servers in codfw to pick up SSBD-enabled qemu
  • 15:27 moritzm: rebooting elnath to pick up SSBD-enabled qemu
  • 13:41 marostegui: reload haproxy on dbproxy1004
  • 13:18 godog: start cassandra-a on restbase1016 - T212418
  • 13:07 mbsantos@deploy1001: Finished deploy [kartotherian/deploy@0d11a2b] (stretch): Updating stretch instance with latest code, maps1003 has wrong dependencies installed (duration: 00m 45s)
  • 13:06 mbsantos@deploy1001: Started deploy [kartotherian/deploy@0d11a2b] (stretch): Updating stretch instance with latest code, maps1003 has wrong dependencies installed
  • 12:50 moritzm: uploaded ferm 2.4-1+wmf1 to buster-wikimedia (T213527)
  • 11:46 moritzm: copied prometheus-rsyslog-exporter from stretch-wikimedia to buster-wikimedia
  • 11:09 marostegui: Deploy schema change on db2039 (s6 codfw master) - T210713
  • 10:54 marostegui: Deploy schema change on dbstore2001:3316 - T210713
  • 10:42 jynus: killing and removing data from db1118
  • 10:41 marostegui: Deploy schema change on db2076 - T210713
  • 10:29 vgutierrez: restarting pybal in lvs2002 - T214072
  • 10:23 marostegui: Deploy schema change on db2087:3316 - T210713
  • 10:23 vgutierrez: restarting pybal in lvs2005 - T214072
  • 10:02 marostegui: Add dbstore1003:3315 to zarcillo - T210478
  • 09:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 - T210478 (duration: 00m 45s)
  • 09:57 marostegui: Add dbstore1003:3315 to tendril - T210478
  • 09:53 marostegui: Deploy schema change on db2089 - T210713
  • 09:35 marostegui: Deploy schema change on db2067 - T210713
  • 09:29 _joe_: uploading python{,3}-pygerrit2 to stretch-wikimedia, T214149
  • 09:23 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Add migrated wikis from s3 to s5 to codfw config T184805 (duration: 00m 45s)
  • 09:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 after mysql upgrade (duration: 00m 46s)
  • 08:12 godog: depool and take snapshots of prometheus data on prometheus2003 to test v2 conversion - T187987
  • 07:31 moritzm: rolling restart of AQS to pick up OpenSSL security updates for nodejs
  • 07:30 marostegui: Stop MySQL on db1113:3315 and db1113:3316 to clone dbstore1003 and for mysql and kernel upgrade
  • 07:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 for mysql upgrade (duration: 00m 45s)
  • 07:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 - T210478 (duration: 00m 46s)
  • 07:16 moritzm: installing OpenSSL security updates
  • 06:54 marostegui: Drop table tag_summary from s7 - T212255
  • 06:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1075 and db1103 after DC hw maintenance (duration: 00m 44s)
  • 06:46 marostegui: Deploy schema change on dbstore1002:s3 - T85757
  • 06:29 marostegui: Deploy schema change on db1075 - T85757
  • 06:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool DBs on A2 rack T213748 (duration: 00m 47s)
  • 00:00 ejegg: updated payments-wiki from c455bbc6bb to 7d4cd165d9

2019-01-17

  • 23:02 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@6b344ca]: Update mobileapps to 258d76b page summary changes, 2nd try (duration: 02m 03s)
  • 23:00 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@6b344ca]: Update mobileapps to 258d76b page summary changes, 2nd try
  • 19:29 catrope@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/GrowthExperiments/: Make welcome survey C unescapable (T213958) (duration: 00m 52s)
  • 19:17 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Update groupOverrides for Serbian wikis (T213055, T213059, T213063, T213065, T213679, T213680, T213681, T213682, T213684, T213685, T213686, T213687, T213824, T213825, T213826, T213827, T213828, T213829, T213830, T213832) (duration: 00m 53s)
  • 19:02 ppchelko@deploy1001: Finished deploy [restbase/deploy@f24d681]: Update recommendation api endpoints (duration: 20m 26s)
  • 18:42 ppchelko@deploy1001: Started deploy [restbase/deploy@f24d681]: Update recommendation api endpoints
  • 18:22 vgutierrez: running ipvsadm -D -t 10.2.1.29:1968 in lvs2003 - T214041
  • 18:19 vgutierrez: running ipvsadm -D -t 10.2.1.29:1968 in lvs2006 - T214041
  • 18:18 bmansurov@deploy1001: Finished deploy [recommendation-api/deploy@5ba7582]: Update to I25c97e (duration: 05m 36s)
  • 18:12 bmansurov@deploy1001: Started deploy [recommendation-api/deploy@5ba7582]: Update to I25c97e
  • 17:52 elukey: re-enable eventlogging mysql clients and db1108's el replication after db1107 maintenance
  • 17:38 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: ConstraintsCheckJobs on wikidatawiki (5% of edits) T204031 (duration: 00m 52s)
  • 17:25 dcausse: restarting mjolnir services on all elastic* nodes
  • 17:19 dcausse@deploy1001: Finished deploy [search/mjolnir/deploy@85aec7a]: fix multi-instances support (duration: 03m 42s)
  • 17:15 dcausse@deploy1001: Started deploy [search/mjolnir/deploy@85aec7a]: fix multi-instances support
  • 16:57 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/VisualEditor/modules/ve-mw/: T213922: Revert 48db45df7602 for wmf.12 (duration: 00m 52s)
  • 16:56 jforrester@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/VisualEditor/modules/ve-mw/: T213922: Revert 48db45df7602 for wmf.13 (duration: 00m 51s)
  • 16:46 dcausse@deploy1001: Finished deploy [search/mjolnir/deploy@42414ca]: add support for multi-instances setup (duration: 04m 59s)
  • 16:45 paravoid: updating ps1-a3-eqiad's SNMP communities to the new ones
  • 16:41 dcausse@deploy1001: Started deploy [search/mjolnir/deploy@42414ca]: add support for multi-instances setup
  • 16:28 fsero: uncordoned kubernetes1001
  • 16:27 fsero@puppetmaster1001: conftool action : set/pooled=yes; selector: name=kubernetes1001.eqiad.wmnet
  • 16:19 moritzm: rebooting roentgenium (failoid node in eqiad) to enable SSBD-enabled qemu
  • 16:18 cmjohnson1: ps1-a2-eqiad removing redundant power from side A to replace blown fuse
  • 16:16 moritzm: rebooting tureis (failoid node in codfw) to enable SSBD-enabled qemu
  • 15:12 moritzm: rebooting archiva1001 (archiva.wikimedia.org) to enable SSBD-enabled qemu
  • 14:49 moritzm: rebooting darmstadtium (docker registry) to enable SSBD-enabled qemu
  • 14:36 jbond42: rolling out update for debdeploy 0.0.99.6-1 -> 0.0.99.7-1 T207845
  • 14:24 anomie: Restarting migrateActors.php on s3
  • 14:19 marostegui: Drop empty frimpressions database from m2 - T213973
  • 14:04 vgutierrez: running ipvsadm -D -t 10.2.2.29:1968 in lvs1016 - T214041
  • 14:03 vgutierrez: running ipvsadm -D -t 10.2.2.29:1968 in lvs1006 - T214041
  • 14:01 gehel: pooling maps1004 (first time after stretch upgrade) - T198622
  • 13:46 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: dc=.*,service=.*,cluster=kubernetes,name=kubernetes1001.eqiad.wmnet
  • 13:38 gehel: starting upgrade to stretch for maps1003 - T198622
  • 12:59 addshore: swat done!
  • 12:58 fsero@puppetmaster1001: conftool action : set/pooled=no; selector: name=kubernetes1001.eqiad.wmnet
  • 12:58 addshore@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/Wikibase/view/resources/jquery/wikibase/jquery.wikibase.badgeselector.js: T213998 Fix js type error when adding badges to items (duration: 00m 53s)
  • 12:53 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T210381: [cirrus] Enable CirrusSearchCrossClusterSearch (duration: 00m 51s)
  • 12:46 dcausse@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/UploadWizard/: T214007: Don't reuse existing input object (duration: 00m 53s)
  • 12:41 gtirloni: imported nfsd-ldap_1.2+deb9u1 in stretch-wikimedia (T209527)
  • 12:41 fsero: poweroff kubernetes1001 - T213859
  • 12:40 dcausse@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/CirrusSearch/: Hack around cross cluster search bug (duration: 00m 59s)
  • 12:34 gehel: shutting down relforge1001 for PDU swap - T213859
  • 12:33 akosiaris@deploy1001: Finished deploy [citoid/deploy@269c9c7]: (no justification provided) (duration: 00m 48s)
  • 12:32 akosiaris@deploy1001: Started deploy [citoid/deploy@269c9c7]: (no justification provided)
  • 12:29 dcausse@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/CirrusSearch/: Hack around cross cluster search bug (duration: 01m 00s)
  • 12:25 godog: poweroff restbase1010 / restbase1011 before A3 maint - T213859
  • 12:19 jynus: killing migrateActors.php --wiki=ptwiki on mwmaint, was using outdated db config T188327
  • 12:17 jijiki: poweroff rdb1005.eqiad.wmnet before A3 maint - T213859
  • 12:11 godog: poweroff ms-be1019 / ms-be1044 / ms-be1045 before A2 maint - T213748
  • 12:09 mvolz@deploy1001: scap-helm zotero finished
  • 12:09 mvolz@deploy1001: scap-helm zotero cluster codfw completed
  • 12:09 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 12:08 elukey: stop mariadb and shutdown db1107 to ease rack a2 maintenance
  • 12:04 mvolz@deploy1001: scap-helm zotero finished
  • 12:04 mvolz@deploy1001: scap-helm zotero cluster eqiad completed
  • 12:04 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 11:56 mvolz@deploy1001: scap-helm zotero finished
  • 11:56 mvolz@deploy1001: scap-helm zotero cluster staging completed
  • 11:56 mvolz@deploy1001: scap-helm zotero upgrade staging -f zotero-values-staging.yaml --version=0.0.1 stable/zotero [namespace: zotero, clusters: staging]
  • 11:55 arturo: T209527 copy nfsd-ldap between jessie-wikimedia and stretch-wikimedia in reprepro. It will require a rebuild though bc updated build-deps/deps
  • 11:55 mvolz@deploy1001: scap-helm zotero upgrade staging -f zotero-values-staging.yaml stable/zotero [namespace: zotero, clusters: staging]
  • 11:43 marostegui: Poweroff db1082 db1081 db1080 db1079 db1075 db1074 es1012 es1011 - T213748
  • 11:36 mvolz@deploy1001: scap-helm zotero finished
  • 11:36 mvolz@deploy1001: scap-helm zotero cluster codfw completed
  • 11:36 mvolz@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 11:16 onimisionipe: shutdown elastic103[0-5] to prepare for T213859
  • 11:09 elukey: stop eventlogging on eventlog1002 and eventlogging replication on db1108 as prep step for db1107 maintenance
  • 10:55 marostegui: Lag will be generated on labs due to maintenance on sanitarium db masters
  • 10:54 marostegui: Stop MySQL on db1082 db1081 db1080 db1079 db1075 db1074 es1012 es1011 - T213748
  • 10:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool DBs on A2 rack T213748 (duration: 00m 54s)
  • 10:39 moritzm: installing libcaca security updates
  • 10:30 arturo: T213859 icinga downtime cloudservices1004 for 1 day
  • 10:29 moritzm: installing ruby-loofah security updates
  • 10:09 marostegui: Stop MySQL on db1103:3312 and db1103:3314, also poweroff the server - T213859
  • 10:08 moritzm: installing krb5 security updates on trusty
  • 10:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103 - T213859 (duration: 00m 53s)
  • 09:59 marostegui: Poweroff dbproxy1001 dbproxy1002 dbproxy1003 for a3 maintenance - T213859
  • 09:25 marostegui: Poweroff dbstore1003 for hw maintenance T213859
  • 09:24 moritzm: power off graphite1003 for later hw maintenance (T213859)
  • 09:18 marostegui: Deploy schema change on db1095:3313 - T85757
  • 09:02 vgutierrez: rolling NIC firmware upgrade cp[1081-1090] - T203194
  • 08:42 jijiki: Enabling puppet on rdb1005 and switch redis::misc::master to rdb1006 - T213859
  • 08:37 moritzm: installing remaining systemd security updates on stretch
  • 08:32 jijiki: Restarting nutcracker on scb100* for 484572 - T213859
  • 08:32 jynus: stop, upgrade and restart db1075
  • 08:31 marostegui: Deploy schema change on s3 codfw, lag will be generated - T85757
  • 08:28 marostegui: Drop table tag_summary from enwiki - T212255
  • 08:24 jijiki: Disabling puppet on rdb1005 and switch redis::misc::master to rdb1006 - T213859
  • 07:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1123 (duration: 00m 53s)
  • 07:20 marostegui: Change thread_pool_stall_limit on db1075 and db1078 - T213858
  • 07:18 marostegui: Enable GTID on db1075 - T213858
  • 07:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove s3 read-only T213858 (duration: 00m 30s)
  • 07:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Switchover s3 master eqiad from db1075 to db1078 T213858 (duration: 00m 30s)
  • 07:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Set s3 on read-only T213858 (duration: 00m 31s)
  • 07:00 marostegui: Start s3 failover T213858
  • 06:30 marostegui: Disable puppet on db1075 and db1078 - T213858
  • 06:26 marostegui: Enable GTID back on all hosts but db1075 db1078 - T213858
  • 06:19 marostegui: Change s3 topology to get ready for s3 failover - T213858
  • 06:14 marostegui: Disable gtid on s3 hosts - T213858
  • 06:10 marostegui: Downtime s3 hosts for 2 hours - T213858
  • 04:12 ppchelko@deploy1001: Finished deploy [mobileapps/deploy@89c4d8d]: revert new summary (duration: 01m 55s)
  • 04:10 ppchelko@deploy1001: Started deploy [mobileapps/deploy@89c4d8d]: revert new summary
  • 04:02 cdanis@deploy1001: Started restart [parsoid/deploy@4b82683]: (no justification provided)

2019-01-16

  • 23:25 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@0ff39e2]: Deployment attempt with decreased worker count (duration: 04m 08s)
  • 23:21 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@0ff39e2]: Deployment attempt with decreased worker count
  • 23:10 Krinkle: krinkle@tungsten:/srv/: rm -rf xhprof; for T196406
  • 21:35 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@c1b6b32]: Rollback update to 1a1f824 (duration: 01m 59s)
  • 21:33 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@c1b6b32]: Rollback update to 1a1f824
  • 21:29 ppchelko@deploy1001: deploy aborted: log (duration: 00m 02s)
  • 21:29 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@da83637]: log
  • 21:28 bmansurov@deploy1001: Finished deploy [recommendation-api/deploy@da83637]: Update to 1a1f824 (duration: 06m 14s)
  • 21:22 bmansurov@deploy1001: Started deploy [recommendation-api/deploy@da83637]: Update to 1a1f824
  • 21:17 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@6b344ca]: Update mobileapps to 258d76b page summary changes (duration: 06m 31s)
  • 21:10 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@6b344ca]: Update mobileapps to 258d76b page summary changes
  • 20:20 dduvall@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.13 (duration: 00m 51s)
  • 20:19 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.13
  • 19:48 gehel: switching wdqs categories traffic to new second instance, puppet will be disabled during the operation on all wdqs nodes - T213212
  • 19:29 thcipriani: restarting ci jenkins for upgrade
  • 19:13 thcipriani: restarting gerrit on cobalt for 2.15.8 upgrade
  • 19:12 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@cec7995]: Gerrit to 2.15.8 on cobalt (duration: 00m 10s)
  • 19:12 thcipriani@deploy1001: Started deploy [gerrit/gerrit@cec7995]: Gerrit to 2.15.8 on cobalt
  • 19:09 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@cec7995]: Gerrit to 2.15.8 on gerrit2001 only (duration: 00m 11s)
  • 19:09 thcipriani@deploy1001: Started deploy [gerrit/gerrit@cec7995]: Gerrit to 2.15.8 on gerrit2001 only
  • 19:04 thcipriani: starting gerrit upgrade to 2.15.8
  • 18:56 mutante: upgraded jenkins version for jessie and stretch in apt.wikimedia.org to latest LTS
  • 18:16 addshore: deploy slot done
  • 18:13 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: ConstraintsCheckJobs enabled on wikidatawiki (1% of edits) T204031 (duration: 00m 51s)
  • 18:07 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@0aa107a]: Re-deploy for fixing vars.sh (duration: 11m 49s)
  • 18:03 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: ConstraintsCheckJobs enabled on testwikidatawiki T204031 (duration: 00m 52s)
  • 17:55 smalyshev@deploy1001: Started deploy [wdqs/wdqs@0aa107a]: Re-deploy for fixing vars.sh
  • 17:53 jynus: stop, upgrade and restart db1111
  • 17:36 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: [cirrus] Start using replica group settings (take 2) (T210381) (duration: 00m 51s)
  • 17:35 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [cirrus] Start using replica group settings (take 2) (T210381) (duration: 00m 51s)
  • 17:22 vgutierrez: rolling NIC firmware upgrade cp[1077-1080] - T203194
  • 17:18 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: EditorJourney: Enable data collection for viwiki T213348 (duration: 00m 52s)
  • 17:07 anomie@deploy1001: Synchronized php-1.33.0-wmf.12/includes/page/WikiPage.php: Add temporary logging for T210739 (duration: 00m 53s)
  • 17:05 vgutierrez: upgrading NIC firmware in cp1076 - T203194
  • 17:01 gehel@deploy1001: Finished deploy [wdqs/wdqs@6685dc0]: multi instance fixes (duration: 00m 27s)
  • 17:01 gehel@deploy1001: Started deploy [wdqs/wdqs@6685dc0]: multi instance fixes
  • 16:58 gehel@deploy1001: Finished deploy [wdqs/wdqs@6685dc0]: multi instance fixes (duration: 10m 29s)
  • 16:53 jynus: stop, upgrade and restart db1112
  • 16:47 gehel@deploy1001: Started deploy [wdqs/wdqs@6685dc0]: multi instance fixes
  • 16:45 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 (duration: 00m 52s)
  • 16:45 vgutierrez: upgrading NIC firmware on cp1075 - T203194
  • 16:08 jynus: upgrade and stop db1123
  • 16:02 jbond42: Import new debdeploy 0.0.99.7 packages for trusty T207845
  • 15:59 jbond42: Import new debdeploy 0.0.99.7 packages for buster T207845
  • 15:59 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 (duration: 00m 52s)
  • 15:58 otto@deploy1001: Finished deploy [analytics/superset/deploy@f73b897]: bump to 0.26.3-wikimedia2 with chart format string fix (duration: 00m 36s)
  • 15:57 otto@deploy1001: Started deploy [analytics/superset/deploy@f73b897]: bump to 0.26.3-wikimedia2 with chart format string fix
  • 15:56 jbond42: Import new debdeploy 0.0.99.7 packages for jessie T207845
  • 15:41 jbond42: Import new debdeploy 0.0.99.7 packages for stretch T207845
  • 15:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 T209815 (duration: 00m 52s)
  • 15:12 addshore: addshore@mwmaint1002:~$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=labswiki Matthias_Geisler // T213928
  • 14:56 jynus: stop upgrade db1125 (this may cause temp. lag on labsdb hosts for s7, s6, s4, s2)
  • 14:35 otto@deploy1001: Started deploy [analytics/superset/deploy@UNKNOWN]: attempt to deploy 0.26.3-wikimedia1
  • 14:29 jynus: stop upgrade db1124 (this may have temp. lag on labsdb hosts for s1, s3, s5, s8)
  • 14:20 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1019 fully (duration: 00m 52s)
  • 14:05 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1019 with low load (duration: 00m 52s)
  • 13:15 marostegui: Stop MySQL on db1078 and power it off for firmware update - T209815
  • 13:15 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 T209815 (duration: 00m 52s)
  • 13:12 dcausse: eu SWAT done
  • 13:06 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 fully (duration: 00m 52s)
  • 12:41 addshore@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/WikibaseQualityConstraints: gerrit:484654 T204031 T204022 Fix constraintsRunCheck Job class & test (duration: 00m 54s)
  • 12:40 addshore@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/WikibaseQualityConstraints: gerrit:484654 T204031 T204022 Fix constraintsRunCheck Job class & test (duration: 00m 57s)
  • 12:25 reedy@deploy1001: Synchronized wmf-config/throttle.php: T213848 (duration: 00m 53s)
  • 12:21 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy the FileExporter as a beta feature on all Wikimedia wikis (T213425) (duration: 00m 53s)
  • 12:12 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Partial Blocks on itwiki (T210444) (duration: 00m 53s)
  • 12:12 jynus: upgrade and restart db1095
  • 11:02 fsero: draining kubernetes1001 for maintenance T213859
  • 10:59 addshore: slot done
  • 10:59 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgWBQualityConstraintsEnableConstraintsCheckJobs false (duration: 00m 51s)
  • 10:53 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgWBQualityConstraintsEnableConstraintsCheckJobs true wd (duration: 00m 52s)
  • 10:48 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgWBQualityConstraintsEnableConstraintsCheckJobs true testwd (duration: 00m 52s)
  • 10:38 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wikidatawiki, wgWBQualityConstraintsEnableConstraintsCheckJobsRatio 1% T204031 gerrit:484621 (duration: 00m 52s)
  • 10:28 godog: restart rsyslog on wezen, tls listener stuck
  • 10:25 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low load (duration: 00m 51s)
  • 10:19 elukey: executed kafka preferred-replica-election on the logging Kafka cluster as an attempt to spread load more uniformly
  • 10:19 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: testwikidatawiki, wgWBQualityConstraintsEnableConstraintsCheckJobsRatio 100 T204031 gerrit:484621 (duration: 00m 52s)
  • 10:18 addshore@deploy1001: sync-file aborted: testwikidatawiki, wgWBQualityConstraintsEnableConstraintsCheckJobsRatio 100 T204031 gerrit:484621 (duration: 00m 02s)
  • 10:14 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: testwikidatawiki, wgWBQualityConstraintsEnableConstraintsCheckJobsRatio 50 T204031 gerrit:484621 (duration: 00m 52s)
  • 10:13 addshore@deploy1001: sync-file aborted: testwikidatawiki, wgWBQualityConstraintsEnableConstraintsCheckJobsRatio 50 T204031 gerrit:484621 (duration: 00m 00s)
  • 10:03 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY, gerrit:484621 (duration: 00m 52s)
  • 09:53 godog: upgrade controller firmware on ms-be1016 - T213856
  • 09:47 jynus: upgrade and restart db1077
  • 09:42 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 (duration: 00m 52s)
  • 09:29 marostegui: Stop s3 actor-migration script in order to allow s3 to catch up and to avoid lag during the failover - T188327 T213858
  • 09:17 godog: powercycle ms-be1016 - T213856
  • 09:16 marostegui: Stop replication in sync on dbstore1002:x1 and db2034 - T213670
  • 09:10 dcausse: T210381: elasticsearch search cluster, creating completion suggester indices on psi&omega elastic instances in eqiad&codfw
  • 09:00 godog: test roll-restart rsyslog on mw hosts in eqiad - T211124
  • 08:58 akosiaris@deploy1001: scap-helm zotero finished
  • 08:58 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 08:58 akosiaris@deploy1001: scap-helm zotero install -n production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 08:57 marostegui: Re-point m3-master from dbproxy1003 to dbproxy1008 - T213865
  • 08:53 moritzm: installing systemd security updates for stretch
  • 08:53 akosiaris: depool zotero eqiad for helm release cleanup
  • 08:47 akosiaris: repool zotero in codfw
  • 08:42 filippo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Default to new logging infrastructure - T211124 (duration: 01m 05s)
  • 08:40 akosiaris@deploy1001: scap-helm zotero finished
  • 08:40 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 08:40 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 08:30 akosiaris@deploy1001: scap-helm zotero finished
  • 08:30 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 08:30 akosiaris@deploy1001: scap-helm zotero install -n production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 08:25 akosiaris@deploy1001: scap-helm zotero finished
  • 08:25 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 08:25 akosiaris@deploy1001: scap-helm zotero install -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 08:24 marostegui: Drop table tag_summary from s4 - T212255
  • 08:19 elukey: convert aria tables to innodb on dbstore1002 - T213706
  • 08:18 akosiaris: depool codfw zotero for helm release cleanups
  • 08:15 marostegui: Upgrade MySQL on db2043 (s3 codfw master)
  • 08:11 elukey: drop unneeded tables from the staging db on dbstore1002 according to T212493#4883535
  • 07:36 vgutierrez: powercycling cp1088 - T203194
  • 07:27 marostegui: Drop table tag_summary from s2 - T212255
  • 07:14 marostegui: Upgrade MySQL on db2050 and db2036
  • 06:07 SMalyshev: started transfer wdqs2005->2006
  • 06:06 marostegui: Deploy schema change on db1067 (s1 primary master) - T85757
  • 06:01 SMalyshev: depooling wdqs2005 and wdqs2006 for T213854
  • 01:02 SMalyshev: repooled wdqs200[45] for now, 2006 still not done, will get to it later today
  • 00:15 mobrovac@deploy1001: Finished deploy [restbase/deploy@a04ebdd]: Restart RESTBase to pick up the fact that restbase1016 is not there - T212418 (duration: 21m 34s)

2019-01-15

  • 23:54 mobrovac@deploy1001: Started deploy [restbase/deploy@a04ebdd]: Restart RESTBase to pick up the fact that restbase1016 is not there - T212418
  • 22:53 tzatziki: removing one file for legal compliance
  • 22:50 jforrester@deploy1001: Synchronized php-1.33.0-wmf.13/extensions/WikibaseMediaInfo/resources/filepage/CaptionsPanel.js: Hot-deploy Ibb1f763f to unbreak setting captions on WikibaseMediaInfo (duration: 00m 51s)
  • 22:39 SMalyshev: repooled wdqs1008
  • 21:49 XioNoX: re-activate BGP to Zayo on cr1-eqiad - T212791
  • 21:39 SMalyshev: depooling wdqs2005 for T213854
  • 21:23 mutante: contint1001 rmdir /srv/org/wikimedia/integration/coverage ; rmdir /srv/org/wikimedia/integration/logs (T137890)
  • 21:21 mutante: doc.wikimedia.org httpd config has been removed from contint1001, is now on doc1001
  • 21:13 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.13
  • 21:09 dduvall@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.13 and rebuild l10n cache (duration: 32m 42s)
  • 20:36 dduvall@deploy1001: Started scap: testwiki to php-1.33.0-wmf.13 and rebuild l10n cache
  • 20:33 dduvall@deploy1001: Pruned MediaWiki: 1.33.0-wmf.8 (duration: 03m 04s)
  • 20:30 dduvall@deploy1001: Pruned MediaWiki: 1.33.0-wmf.6 (duration: 09m 15s)
  • 19:36 SMalyshev: started copying wdqs1008->wdqs2004 for T213854
  • 19:28 SMalyshev: depooling wdqs1008 and wdqs2004 for DB copying for T213854
  • 18:52 bblack: authdns-update for https://gerrit.wikimedia.org/r/c/operations/dns/+/484546 (make normal git stuff match manual changes already in place)
  • 18:44 hashar: [2019-01-15 18:44:06,959] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.15.6-5-g4b9c845200 ready
  • 18:43 hashar: Restarting Gerrit to catch up with a DNS change with the database
  • 18:43 volans: restarted debmonitor on debmonitor1001
  • 18:40 bblack: DNS manually updated for m1-master -> dbproxy1006 and m2-master -> dbproxy1007
  • 17:26 godog: roll-restart logstash in eqiad - T213081
  • 17:21 godog: depool logstash1007 before restarting logstash - T213081
  • 17:13 godog: set partitions to 3 for existing kafka-logging topics - T213081
  • 17:06 XioNoX: move back cr1-eqiad:xe-4/1/3 to xe-3/3/1 - T212791
  • 16:57 XioNoX: move cr1-eqiad:xe-3/3/1 to xe-4/1/3 - T212791
  • 16:52 jynus: stop db1115 for hw maintenance
  • 16:50 godog: roll-restart kafka-logging in eqiad to apply new topic defaults - T213081
  • 16:00 jynus: stop es1019 for hw maintenance T213422
  • 15:53 dcausse: T210381: elastic search clusters, catching up updates since first import on new psi&omega clusters in eqiad&codfw (from mwmaint1002)
  • 15:10 fdans@deploy1001: Finished deploy [analytics/superset/deploy@UNKNOWN]: reverting deploy of 0.26.3-wikimedia1 (duration: 00m 32s)
  • 15:10 fdans@deploy1001: Started deploy [analytics/superset/deploy@UNKNOWN]: reverting deploy of 0.26.3-wikimedia1
  • 15:02 fdans@deploy1001: Finished deploy [analytics/superset/deploy@9d6156a]: reverting deploy of 0.26.3-wikimedia1 (duration: 06m 06s)
  • 15:01 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103 (duration: 00m 48s)
  • 14:56 fdans@deploy1001: Started deploy [analytics/superset/deploy@9d6156a]: reverting deploy of 0.26.3-wikimedia1
  • 14:41 fdans@deploy1001: Finished deploy [analytics/superset/deploy@408a30e]: deploying 0.26.3-wikimedia1 (duration: 00m 36s)
  • 14:40 fdans@deploy1001: Started deploy [analytics/superset/deploy@408a30e]: deploying 0.26.3-wikimedia1
  • 14:14 moritzm: rebooting acamar
  • 13:53 marostegui: Downtime db1115 and es1019 for 4 hours - T196726 T213422
  • 13:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1119 T85757 (duration: 00m 46s)
  • 13:15 marostegui: Deploy schema change on db1119 - T85757
  • 13:15 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 T85757 (duration: 00m 46s)
  • 13:00 elukey: restart memcached on mc1024 to pick up new settings (-R 200) - T208844
  • 12:47 dcausse: EU SWAT done
  • 12:36 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T210381: [cirrus] Start writing to psi & omega (take 2) (2/2) (duration: 00m 45s)
  • 12:33 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T210381: [cirrus] Start writing to psi & omega (take 2) (1/2) (duration: 00m 45s)
  • 12:15 onimisionipe: starting upgrade of prometheus-elasticsearch-exporter for eqiad T210592
  • 12:14 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Change links of wgGEHelpPanelLinks for kowiki T209467 (duration: 00m 46s)
  • 12:09 dcausse@deploy1001: Synchronized wmf-config/CommonSettings.php: [cirrus] Add cirrussearch-big-indices tag T210381 (duration: 00m 46s)
  • 12:06 jynus: upgrade and restart db1103
  • 12:03 onimisionipe: starting upgrade of prometheus-elasticsearch-exporter for codfw T210592
  • 11:50 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103 (duration: 00m 45s)
  • 11:44 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 fully (duration: 00m 45s)
  • 11:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 T85757 (duration: 00m 45s)
  • 11:02 jynus: dropping database test on db1124:s5 with replication
  • 11:01 elukey: run 'apt-get purge tmpreaper' on mw1297,1298,2150,2151,2244,2245 (all role spare) to avoid daily cronspam
  • 10:58 END: (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) (volans@cumin2001)
  • 10:57 marostegui: Deploy schema change on db1083 - T85757
  • 10:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 T85757 (duration: 00m 46s)
  • 10:53 START: - Cookbook sre.hosts.upgrade-and-reboot (volans@cumin2001)
  • 10:49 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 with low load (duration: 00m 45s)
  • 10:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 T85757 (duration: 00m 45s)
  • 10:20 marostegui: Deploy schema change on db1080 - T85757
  • 10:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 T85757 (duration: 00m 45s)
  • 10:19 jynus: upgrade and restart db1091
  • 10:16 moritzm: installing zeromq3 security updates on stretch (jessie/trusty not affected)
  • 10:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 T85757 (duration: 00m 45s)
  • 09:51 marostegui: Deploy schema change on db1114 - T85757
  • 09:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1114 T85757 (duration: 00m 45s)
  • 09:45 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1106 T85757 (duration: 00m 46s)
  • 09:25 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 (duration: 00m 46s)
  • 09:20 addshore: deploy slot done
  • 09:18 jynus: upgrade and restart db2078
  • 09:10 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wgWBQualityConstraintsTypeCheckMaxEntities 300, T209504 (duration: 00m 46s)
  • 09:06 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209922 Add WikibaseQualityConstraints configs in testwikidatawiki (duration: 00m 47s)
  • 08:38 marostegui: Stop replication on s1 on all labs hosts - T85757
  • 08:28 marostegui: Deploy schema change on db1106 - T85757
  • 08:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1106 T85757 (duration: 00m 45s)
  • 08:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 T85757 (duration: 00m 46s)
  • 08:02 marostegui: Deploy schema change on db1089 - T85757
  • 08:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 T85757 (duration: 00m 45s)
  • 07:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 T85757 (duration: 00m 46s)
  • 07:28 marostegui: Drop tag_summary from wikitech - T212255
  • 07:20 marostegui: Drop tag_summary from s5 - T212255
  • 07:07 marostegui: Deploy schema change on db1099:3311 - T85757
  • 07:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 T85757 (duration: 00m 45s)
  • 06:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool pc1007 in pc1 - T208383 (duration: 00m 49s)
  • 02:12 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@c920aec]: Re-deploy namespace script (duration: 08m 42s)
  • 02:04 smalyshev@deploy1001: Started deploy [wdqs/wdqs@c920aec]: Re-deploy namespace script
  • 01:54 mutante: wdqs1009 - icinga alerts about Blazegraph process for wdqs categories. starting wdqs blazegraph... already running
  • 01:12 mutante: cp1078 - bnxt_en - TX timeout detected - Host cp1078 is DOWN - powercycled via mgmt (T203194)
  • 00:44 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Welcome survey experiment 2: 50% variation A, 50% variation C (duration: 00m 46s)
  • 00:37 catrope@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/GrowthExperiments/: Make welcome survey config use array_plus_2d (duration: 00m 46s)
  • 00:34 catrope@deploy1001: Synchronized php-1.33.0-wmf.12/resources/lib/ooui/oojs-ui-core.js: OOUI backport (T213544) (duration: 00m 46s)
  • 00:08 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Improve list of privileged groups (duration: 00m 46s)

2019-01-14

  • 23:49 gehel@deploy1001: Finished deploy [wdqs/wdqs@59d5f40]: New wdqs startup script for multi-instance (duration: 09m 53s)
  • 23:39 gehel@deploy1001: Started deploy [wdqs/wdqs@59d5f40]: New wdqs startup script for multi-instance
  • 23:30 mutante: doc1001 - disabling puppet, testing apache config change 483775
  • 23:12 ejegg: updated fundraising CiviCRM from 5580f0b11c to 6042acb363
  • 22:39 andrewbogott: upgraded packages and MW version on wikitech-static
  • 21:30 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@89c4d8d]: Update mobileapps to f2658de (fix ITN explore feed for dawiki) (duration: 03m 51s)
  • 21:26 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@89c4d8d]: Update mobileapps to f2658de (fix ITN explore feed for dawiki)
  • 20:37 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/resources/Resources.php: Hot-deploy I18193b19 to add missing message for OOUI v0.30.0 (duration: 00m 47s)
  • 20:27 gehel@deploy1001: Finished deploy [wdqs/wdqs@f71131e]: upgrading wdqs1010 to latest version (duration: 00m 24s)
  • 20:27 gehel@deploy1001: Started deploy [wdqs/wdqs@f71131e]: upgrading wdqs1010 to latest version
  • 20:08 gehel: disabling puppet on all wdqs servers to deploy T213234
  • 19:58 dcausse: Morning SWAT done
  • 19:37 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Clean-up: Explain why WBMI wikis don't need wmgWikibaseRepoEntityNamespaces set (duration: 00m 46s)
  • 19:32 XioNoX: re-deactivate BGP to Zayo on cr1-eqiad - T212791
  • 19:29 dcausse@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/GrowthExperiments/includes/WelcomeSurvey.php: Welcome survey: ignore check confirmed email (duration: 00m 45s)
  • 19:28 XioNoX: re-activate BGP to Zayo on cr1-eqiad - T212791
  • 19:19 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1081 with low load (duration: 00m 47s)
  • 19:09 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T204016: Remove old ArticleCreationWorkflows config (duration: 00m 46s)
  • 18:48 jynus: stop, upgrade and restart db1081
  • 18:45 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1081 (duration: 00m 46s)
  • 18:18 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@f71131e]: Category script and GUI updates, blazegraph launcher updates and moved RWStore from scap to puppet (duration: 10m 56s)
  • 18:07 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@f71131e]: Category script and GUI updates, blazegraph launcher updates and moved RWStore from scap to puppet
  • 17:25 addshore: deploy slot done
  • 17:22 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T201831 T201838 wmgWikibaseMaxItemIdForNewPropertyIdHtmlFormatter fully on (duration: 00m 46s)
  • 17:13 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T201831 T201838 wmgWikibaseMaxItemIdForNewPropertyIdHtmlFormatter 3000 (duration: 00m 46s)
  • 17:11 addshore@deploy1001: sync-file aborted: T201831 T201838 wmgWikibaseMaxItemIdForNewPropertyIdHtmlFormatter 3000 (duration: 00m 01s)
  • 17:09 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: T201831 T201838 Introduce wmgWikibaseMaxItemIdForNewPropertyIdHtmlFormatter PT 2/2 (duration: 00m 45s)
  • 17:08 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T201831 T201838 Introduce wmgWikibaseMaxItemIdForNewPropertyIdHtmlFormatter PT 1/2 (duration: 00m 47s)
  • 16:56 ejegg: re-enabled fundraising scheduled jobs
  • 16:43 mobrovac@deploy1001: scap-helm -h finished
  • 16:43 mobrovac@deploy1001: scap-helm -h cluster codfw completed
  • 16:43 mobrovac@deploy1001: scap-helm -h cluster eqiad completed
  • 16:43 mobrovac@deploy1001: scap-helm -h [namespace: -h, clusters: eqiad,codfw]
  • 16:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 T85757 (duration: 00m 45s)
  • 16:02 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db1105:3311 T85757 (duration: 00m 46s)
  • 15:57 akosiaris@deploy1001: scap-helm zotero finished
  • 15:57 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 15:57 akosiaris@deploy1001: scap-helm zotero [namespace: zotero, clusters: eqiad]
  • 15:45 anomie: Running cleanupUsersWithNoIds.php on labswiki and labtestwiki, apparently they were left out when that was done for all other wikis (and so caused issues with the migrateActors.php run).
  • 15:44 fsero: downscaling old zotero-production-645dccfb64 replicaset on eqiad
  • 15:33 vgutierrez: rolling restart of cp1076-cp1090 to upgrade to kernel 4.9.144 - T203194
  • 15:17 ejegg: disabled fundraising scheduled jobs
  • 15:16 marostegui: Deploy schema change on db1105:3311 - T85757
  • 15:16 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db1105:3311 T85757 (duration: 00m 46s)
  • 15:08 volans: testing switchdc cookbooks in DRY-RUN mode w/ latest spicerack T205884 (no real changes expected)
  • 15:04 akosiaris: upgrade zotero pods to 2019-01-14-115905-candidate in eqiad T213693
  • 15:04 akosiaris@deploy1001: scap-helm zotero finished
  • 15:04 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 15:04 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 15:02 moritzm: imported debdeploy 0.0.99.6-1+deb10u1 for buster-wikimedia (T213527)
  • 15:02 vgutierrez: upgrading kernel in cp1075 to 4.9.144-1 - T203194
  • 15:00 moritzm: ran systemctl reset-failed on relforge1001
  • 14:57 marostegui: Drop table tag_summary from s6 - T212255
  • 14:52 akosiaris: upgrade zotero pods to 2019-01-14-115905-candidate in codfw T213693
  • 14:51 akosiaris@deploy1001: scap-helm zotero finished
  • 14:51 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 14:51 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on wikitech for T188327. This may cause lag in codfw.
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on section 8 wikis for T188327. This may cause lag in codfw.
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on section 7 wikis for T188327. This may cause lag in codfw.
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on section 6 wikis for T188327. This may cause lag in codfw.
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on section 5 wikis for T188327. This may cause lag in codfw.
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on section 4 wikis for T188327. This may cause lag in codfw.
  • 14:42 anomie@mwmaint1002: Running migrateActors.php on section 2 wikis for T188327. This may cause lag in codfw.
  • 14:41 anomie@mwmaint1002: Running migrateActors.php on section 1 wikis for T188327. This may cause lag in codfw.
  • 14:41 anomie@mwmaint1002: Running migrateActors.php on remaining section 3 wikis for T188327. This may cause lag in codfw.
  • 14:39 volans: updated python3-phabricator on cumin[12]001 T205884
  • 14:36 volans: uploaded python{,3}-phabricator 0.7.0-2~wmf1 to apt.w.o T205884 (upstream removes egg files)
  • 14:18 dcausse: elasticsearch (search cluster): pre-populating omega & psi clusters in eqiad & codfw (from mwmaint1002 and mwmaint2001 respectively) (T210381)
  • 14:13 akosiaris@deploy1001: scap-helm zotero finished
  • 14:13 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 14:13 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 14:11 akosiaris@deploy1001: scap-helm zotero upgrade production --debug -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 14:10 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 14:04 marostegui: Add pc1007 to tendril and zarcillo - T208383
  • 13:51 akosiaris@deploy1001: scap-helm zotero finished
  • 13:51 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 13:51 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 13:49 Jeff_Green: authdns update for T210445
  • 13:48 dcausse: creating testcommonswiki index in the omega search-elastic cluster (eqiad & codfw)
  • 13:42 akosiaris@deploy1001: scap-helm zotero finished
  • 13:42 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 13:42 akosiaris@deploy1001: scap-helm zotero upgrade production --dry-run --debug -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 13:41 akosiaris: rollback zotero codfw deployment
  • 13:37 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 13:37 akosiaris@deploy1001: scap-helm zotero upgrade -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 13:10 jijiki: Restarted nrpe on proton1002
  • 13:03 zeljkof: eu swat finished
  • 13:03 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add http://mbc.cyfrowemazowsze.pl to $wgCopyUploadsDomains (T212469) (duration: 00m 46s)
  • 12:56 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Localisation of Babel categories on nap.wikipedia.org (T123188) (duration: 00m 44s)
  • 12:48 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure $wgImportSources for ne.wiktionary (T213023) (duration: 00m 45s)
  • 12:44 zfilipin@deploy1001: sync-file aborted: SWAT: Configure $wgNamespaceAliases for yue.wiktionary (T212678) (duration: 00m 01s)
  • 12:37 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure $wgNamespaceAliases for yue.wiktionary (T212678) (duration: 00m 45s)
  • 12:27 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure $wgAddGroups, $wgRemoveGroups and $wgImportSources for ur.wiki (T212612) (duration: 00m 46s)
  • 12:19 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect user right to patroller user group at zh.wikivoyage (T212272) (duration: 00m 46s)
  • 12:10 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create Portal namespace on shn.wikipedia (T212992) (duration: 00m 46s)
  • 12:05 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule for Berklee College of Music library (T213311) (duration: 00m 52s)
  • 11:20 volans: installed spicerack 0.0.13 on cumin1001 - T205884
  • 10:39 moritzm: start installing systemd security updates for stretch
  • 10:13 volans: installed spicerack 0.0.13 on cumin2001 for final testing - T205884
  • 10:11 volans: uploaded spicerack_0.0.13-1_amd64.deb to apt.wikimedia.org stretch-wikimedia T205884
  • 10:07 moritzm: install tmpreaper security updates on remaining hosts
  • 09:51 marostegui: Running aria_chk for all myisam tables on dbstore1002 T213670
  • 09:37 marostegui: Running aria_chk for all linter tables on dbstore1002 - T213670
  • 08:44 marostegui: Stop mysql on dbstore1002 - T213670
  • 08:38 marostegui: Stop MySQL on pc2010 to clone pc1007 - T208383
  • 07:48 elukey: executed bmc-device --debug --cold-reset on dbstore1002 - "No more sessions available" for mgmt

2019-01-13

  • 16:33 hoo: Updated operations/dumps/dcat (559dee37452..a86285f4e7) on snapshot1008

2019-01-12

  • 21:46 akosiaris: restart all zotero pods in eqiad
  • 16:12 moritzm: rebooting mw2167 for a test
  • 02:16 legoktm@deploy1001: Synchronized docroot/mediawiki.org/keys: Add Mukunda's new subkey that was used for the 1.32 release - T213521 (duration: 00m 47s)

2019-01-11

  • 21:56 jforrester@deploy1001: Finished scap: Full scap sync to update wmf.12 i18n for the weekend Idf2a67860f (duration: 19m 12s)
  • 21:37 jforrester@deploy1001: Started scap: Full scap sync to update wmf.12 i18n for the weekend Idf2a67860f
  • 18:43 legoktm@deploy1001: Synchronized wmf-config/CommonSettings.php: Update ExtensionDistributor for 1.32 release - https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/483735 (duration: 00m 46s)
  • 18:07 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2060 T210713 (duration: 00m 46s)
  • 17:10 marostegui: Deploy schema change on db2060 - T210713
  • 16:55 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2060 T210713 (duration: 00m 46s)
  • 16:53 marostegui: Defragment change_tag table on db2060 - T210713
  • 14:37 jynus: upgrade and restart db2091 (s2, s4)
  • 14:12 jynus: updating mariadb client packages on cumin* hosts
  • 11:36 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: repool es1018 fully (duration: 00m 46s)
  • 11:21 jynus: stop, upgrade and reboot es2017
  • 11:04 jynus: stop, upgrade and reboot es2016
  • 10:51 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: repool es1018 with low load (duration: 00m 46s)
  • 10:31 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: repool es2013 (duration: 00m 45s)
  • 10:30 jynus: upgrade and restart es1018
  • 09:58 jynus: upgrade and reboot es2013
  • 09:53 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: depool es2013 (duration: 00m 45s)
  • 09:49 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: depool es2013 (duration: 00m 47s)
  • 09:32 jynus: reset iLo on db2053
  • 08:49 moritzm: installing tmpreaper security updates
  • 02:40 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Ib87407165382 (duration: 00m 46s)
  • 01:20 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T211993 Enable GrowthExperiments help panel for 50% of new users on cswiki and kowiki (duration: 00m 46s)
  • 01:05 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T211993 Enable GrowthExperiments help panel on cswiki and kowiki (duration: 00m 45s)
  • 01:03 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/WikimediaEvents/includes/PageViews.php: SWAT: T213186 GrowthExperiments: Support templates for help desk title (duration: 00m 46s)
  • 00:50 XioNoX: bump prefix limit for AS6939 in eqsin
  • 00:18 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/AbuseFilter/includes/AbuseFilterHooks.php: T213453: Use slot in onEditFilterMergedContent and newVariableHolderForEdit in AbuseFilter (duration: 00m 47s)
  • 00:12 James_F: 482373 is live on mwdebug1002 for extensive checks.
  • 00:08 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT Help panel: Set help desk page correctly on kowiki Ia94cfc571 (duration: 00m 46s)

2019-01-10

  • 23:45 Krinkle: krinkle@tungsten: upgrade xhgui to include upstream f039fb9f99f - T213218
  • 23:45 Krinkle: upgraded xhgui to upstream 2965240c91e52 (current upstream master) - T213218
  • 23:36 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: T213497 [Commons, TestCommons] Don't use Wikibase entity search (duration: 00m 46s)
  • 22:57 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/Wikibase/repo/includes/EditEntity/MediawikiEditFilterHookRunner.php: T213453: Pass slotrole into EditFilterMergedContent hook in Wikibase repo (duration: 00m 47s)
  • 20:47 marxarelli: both mediawiki error rates and 500 response rates have subsided back to pre-deploy levels
  • 20:19 marxarelli: seeing increase in "60 second timed out" error rate and rise in 503 rate, as was the case with group1 deployment. continuing to monitor
  • 20:11 gehel: restart blazegraph on wdqs1009 to validate new config
  • 20:02 tgr@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/WikimediaEvents/modules/ve-wme/campaigns.js: SWAT: Remove unnecessary addPlugin wrapper (T213338) (duration: 00m 53s)
  • 19:50 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove AICaptcha settings (T186244) (duration: 00m 52s)
  • 19:47 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Whitelist *.*.archive.org in wgCopyUploadsDomains (T207581) (duration: 00m 53s)
  • 19:41 tgr: ran mwscript namespaceDupes.php bnwikibooks --fix (238 links fixed)
  • 19:41 volans: installed spicerack 0.0.12-1 on cumin2001 T205884
  • 19:39 volans: uploaded spicerack_0.0.12-1_amd64.deb to apt.wikimedia.org stretch-wikimedia T205884
  • 19:39 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Note that namespaceDupes.php maintenance script run will be needed after the deployment. (T203534) (duration: 00m 53s)
  • 19:14 marostegui: Deploy schema change on dbstore1001 - T85757
  • 19:13 marostegui: Deploy schema change on dbstore1002 - T85757
  • 18:57 tzatziki: deleting three files for legal compliance
  • 18:52 anomie@mwmaint1002: Running migrateActors.php on test wikis and mediawikiwiki for T188327. This may cause lag in codfw.
  • 18:47 marostegui: Deploy schema change on s1 codfw master (db2048) with replication, this will generate lag on s1 codfw - T85757
  • 18:46 marostegui: Stop replication on s1 codfw master for a schema change - T85757
  • 18:37 marostegui: Stop replication on s8 codfw master for a schema change - T85757
  • 18:30 marostegui: Upgrade mysql and kernel on db2060
  • 18:30 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2053, db2060 for kernel and mysql upgrade (duration: 00m 51s)
  • 18:13 marostegui: Stop MySQL on db2046 for kernel upgrade
  • 18:12 marostegui: The above change was db2053 and not db2060
  • 18:11 marostegui: Stop MySQL on db2053 and db2060 for mysql and kernel upgrade
  • 18:11 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2053, db2060 for kernel and mysql upgrade (duration: 00m 53s)
  • 17:50 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: repool es2015 (duration: 00m 53s)
  • 17:49 marostegui: Deploy schema change on db2053 - T210713
  • 17:33 marostegui: Deploy schema change on db2046 - T210713
  • 16:59 jynus: stop and upgrade es2015
  • 16:52 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: depool es2015 (duration: 00m 52s)
  • 16:41 onimisionipe: data transfer from wdqs1004 -> wdqs1006 completed! - T213361
  • 16:32 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T159708 Enable Structured Data on Commons, captions-only (duration: 00m 53s)
  • 16:17 James_F: T180981 Placed patch to enable WBMI on Commons on mwdebug1002
  • 16:13 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T180981 Add Commons to wikis with WikibaseMediaInfo installed (duration: 00m 52s)
  • 16:11 jforrester@deploy1001: Synchronized dblists/wikidatarepo.dblist: T180981 Add Commons to wikis with WikibaseRepo installed (duration: 00m 54s)
  • 16:04 James_F: T180981 Placed patch to install but not enable WBMI on Commons on mwdebug1002
  • 15:56 marostegui: Deploy schema change on db1068 (s4 master) - T86338
  • 15:31 fsero: rolling back last zotero codfw deployment
  • 15:27 marostegui: Deploy schema change on db1067 (s1 master) - T86338 T202167
  • 15:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 T86338 T202167 (duration: 00m 49s)
  • 15:24 addshore: T208330, MariaDB [testcommonswiki]> TRUNCATE TABLE wb_terms; # Was https://phabricator.wikimedia.org/P7973
  • 15:22 fsero@deploy1001: scap-helm zotero upgrade production -f /srv/scap-helm/zotero/zotero-values-codfw.yaml /srv/deployment-charts/charts/zotero-0.0.1.tgz [namespace: zotero, clusters: codfw]
  • 15:21 fsero@deploy1001: scap-helm zotero upgrade -f /srv/scap-helm/zotero/zotero-values-codfw.yaml /srv/deployment-charts/charts/zotero-0.0.1.tgz [namespace: zotero, clusters: codfw]
  • 15:20 addshore@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/Wikibase/repo/includes/Content: T208330 dont write to wb_terms for mediainfo (duration: 00m 54s)
  • 15:12 addshore@deploy1001: Synchronized php-1.33.0-wmf.9/extensions/Wikibase/repo/includes/Content: T208330 dont write to wb_terms for mediainfo (duration: 00m 55s)
  • 14:59 marostegui: Deploy schema change on db1080 - T86338 T202167
  • 14:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 T86338 T202167 (duration: 00m 52s)
  • 14:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 T86338 T202167 (duration: 00m 52s)
  • 14:42 fsero@deploy1001: scap-helm zotero finished
  • 14:42 fsero@deploy1001: scap-helm zotero cluster staging completed
  • 14:42 fsero@deploy1001: scap-helm zotero upgrade staging -f /srv/scap-helm/zotero/zotero-values-staging.yaml /srv/deployment-charts/charts/zotero-0.0.1.tgz [namespace: zotero, clusters: staging]
  • 14:36 fsero@deploy1001: scap-helm zotero finished
  • 14:36 fsero@deploy1001: scap-helm zotero cluster staging completed
  • 14:36 fsero@deploy1001: scap-helm zotero upgrade staging -f /srv/scap-helm/zotero/zotero-values-staging.yaml /srv/deployment-charts/charts/zotero-0.0.1.tgz [namespace: zotero, clusters: staging]
  • 14:35 fsero@deploy1001: scap-helm zotero upgrade staging -f /srv/scap-helm/zotero/zotero-values-staging.yaml [namespace: zotero, clusters: staging]
  • 14:33 fsero@deploy1001: scap-helm -h finished
  • 14:33 fsero@deploy1001: scap-helm -h cluster staging completed
  • 14:33 fsero@deploy1001: scap-helm -h [namespace: -h, clusters: staging]
  • 14:33 marostegui: Deploy schema change on db1114 - T86338 T202167
  • 14:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1114 T86338 T202167 (duration: 00m 53s)
  • 14:14 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: depool es1019 (duration: 00m 53s)
  • 13:51 arturo: T212302 icinga downtime for 2h cloudvirt[1013,1024,1026-1030].eqiad.wmnet bc wrong puppet code
  • 13:24 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: depool es1018 (duration: 00m 52s)
  • 13:10 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2012 (duration: 00m 52s)
  • 13:01 zeljkof: EU SWAT finished
  • 13:01 zfilipin@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: SWAT: Remove main page special casing from ruwikibooks and ruwikiquote (T212849) (duration: 00m 52s)
  • 12:58 zfilipin@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: SWAT: Remove main page special casing from eswiki (T212849) (duration: 00m 53s)
  • 12:53 zfilipin@deploy1001: Synchronized dblists/mobilemainpagelegacy.dblist: SWAT: Turn off main page special casing for svwiki (T213018) (duration: 00m 52s)
  • 12:46 zfilipin@deploy1001: Synchronized dblists/flow.dblist: SWAT: Disable unused Flow extension on ur.wikibooks (T207627) (duration: 00m 55s)
  • 12:42 onimisionipe: starting data transfer from wdqs1004 -> wdqs1006 - T213361
  • 12:34 onimisionipe: starting data transfer from wdqs1003 -> wdqs1006 - T213361 - aborted (nodes are in different clusters)
  • 12:28 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Re-enable QuickSurveys extension on enwiki (T209882) (duration: 00m 52s)
  • 12:20 jynus: stop and upgrade es2012
  • 12:12 zfilipin@deploy1001: Synchronized dblists/flow.dblist: SWAT: Reverted "Revert "Disable unused Flow extension on de.wikiversity"" (T207626) (duration: 00m 53s)
  • 12:01 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool es2012 (duration: 00m 52s)
  • 11:54 onimisionipe: starting data transfer from wdqs1003 -> wdqs1006 - T213361
  • 10:59 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209857 Increase CPU benchmark sampling rate (duration: 00m 53s)
  • 10:58 fsero: uploaded docker-registry_2.7.0~rc0~wmf1-1 debian package to reprepro for stretch-wikimedia (done yesterday at 17:21 UTC; forgot to log it)
  • 10:26 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209857 Run CPU benchmark for a portion of navtiming pageloads (duration: 00m 52s)
  • 10:10 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209857 Run CPU benchmark for a portion of navtiming pageloads (duration: 00m 53s)
  • 09:52 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T187299 Decrease ruwiki navtiming rate (duration: 00m 52s)
  • 09:45 gilles@deploy1001: Synchronized tests/InitialiseSettingsTest.php: T211395 T211529 tests: Assert that extra namespaces have correspondent talk namespaces (duration: 00m 56s)
  • 09:34 moritzm: updated thirdparty/php72 component for stretch-wikimedia to 7.2.13
  • 01:41 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Make GrowthExperiments config wmf.12-proof (duration: 00m 52s)
  • 01:21 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert latest config patch (caused fatal errors on kowiki) (duration: 00m 52s)
  • 00:58 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure help desk page for help panel correctly on kowiki (T213186) (duration: 00m 53s)
  • 00:56 cstone: updated fundraising tools from 5f44d9dd43 to da82ed111d
  • 00:34 catrope@deploy1001: Synchronized php-1.33.0-wmf.12/includes/MovePage.php: Fix missing ATOMIC_CANCELABLE in MovePage::move() (T213168) (duration: 00m 53s)
  • 00:20 catrope@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/GrowthExperiments/: Help panel fixes (T212973, T212890, T213186) (duration: 00m 54s)
  • 00:13 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable EventLogging for GrowthExperiments help panel (T211991) (duration: 00m 54s)

2019-01-09

  • 23:51 mutante: thumb1004 - still needs broken RAM replaced, expired downtime, re-ACKed (T207721)
  • 23:39 mutante: mw2151 - change netbox status from active to staged - it's not actually active, it's role(spare) and was jessie (T192457)
  • 23:34 mutante: reinstalling mw2151.codfw.wmnet because it was the very last mw* host on jessie
  • 21:20 bblack: multatuli (ns2) - upgrade gdnsd to 9949 beta release
  • 21:04 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@bfa9241]: Increase concurrency for categoryMembershipJob T192691 (duration: 00m 45s)
  • 21:04 James_F: Creating Wikibase repo tables on Commons for T68108
  • 21:03 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@bfa9241]: Increase concurrency for categoryMembershipJob T192691
  • 21:00 James_F: Running rebuildall on TestCommons
  • 20:53 bblack: authdns1001 (ns0) - upgrade gdnsd to 9949 beta release
  • 20:45 James_F: Created Wikibase repo tables on TestCommons
  • 20:11 dduvall@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.12 (duration: 00m 53s)
  • 20:10 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.12
  • 19:28 crusnov@deploy1001: Finished deploy [netbox/deploy@7fe39e1]: Deploy Django security upgrade (duration: 04m 33s)
  • 19:23 crusnov@deploy1001: Started deploy [netbox/deploy@7fe39e1]: Deploy Django security upgrade
  • 19:01 ejegg: updated standalone SmashPig deploy from 25713ca232 to 78b92b7fef
  • 18:43 bblack: authdns2001 (ns1) - upgrade gdnsd to 9949 beta release
  • 18:26 XioNoX: add bgp sessions to AS31800 on cr1-eqsin
  • 18:19 marostegui: Rename table tag_summary on enwiki on db1089 - T212255
  • 18:18 XioNoX: add bgp sessions to AS38895 on cr1-eqsin
  • 18:04 marostegui: Drop valid_tag from s3 master (db1075) - T212254
  • 17:39 tarrow: That last one was SWAT: T209504 Increase PHP constraint check entities to 150
  • 17:36 tarrow@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 53s)
  • 17:28 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - T86338
  • 17:18 James_F: Ran `namespaceDupes.php --wiki=bewikibooks` on mwmaint1002, no change
  • 17:16 bblack: uploaded gdnsd-2.99.9949-beta-1+wmf1 to reprepro for stretch-wikimedia
  • 17:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 T86338 T202167 (duration: 00m 52s)
  • 16:29 marostegui: Deploy schema change on db1083 - T86338 T202167
  • 16:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 T86338 T202167 (duration: 00m 53s)
  • 16:17 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 with full weight (duration: 00m 53s)
  • 16:11 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/Wikibase/repo/RepoHooks.php: T213227 RepoHooks::onApiCheckCanExecute: Only fail if the edit is for our entity's slot (duration: 00m 54s)
  • 15:50 marostegui: Drop valid_tag tables from db1095 (s3) - T212254
  • 15:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1106 T86338 T202167 (duration: 00m 51s)
  • 15:23 jijiki: restarting scb* pdfrender
  • 15:10 marostegui: Deploy schema change on db1106 (sanitarium s1 master) with replication, lag will be generated on s1 labs - T86338 T202167
  • 15:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1106 T86338 T202167 (duration: 00m 52s)
  • 14:39 elukey: restart Hadoop HDFS namenodes on an-master100[1,2] to complete decom of analytics1028->41
  • 14:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 T212254 (duration: 00m 53s)
  • 14:36 volans@deploy1001: Finished deploy [debmonitor/deploy@0f096de]: Deploy Django security upgrade (duration: 01m 50s)
  • 14:34 volans@deploy1001: Started deploy [debmonitor/deploy@0f096de]: Deploy Django security upgrade
  • 14:28 marostegui: Drop valid_tag table on db1077 with replication (lag will be generated on labs s3) - T212254
  • 14:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 T212254 (duration: 00m 52s)
  • 13:32 urandom: forcing removal of restbase1016-c (host down way too long to salvage) -- T212418
  • 13:29 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 with low weight (duration: 00m 52s)
  • 13:26 zeljkof: EU SWAT finished
  • 13:22 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.9/: SWAT: Fix order of arguments in ChangeTags::getPrevTags ([T212703]) (duration: 05m 50s)
  • 13:08 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.12/: SWAT: Fix order of arguments in ChangeTags::getPrevTags ([T212703]) (duration: 06m 54s)
  • 13:00 zeljkof: extending eu swat for 5-10 minutes
  • 12:51 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable signature button in toolbar for the "Arbitration" namespace in ruwiki (T213049) (duration: 00m 52s)
  • 12:44 moritzm: installing OpenSSL 1.0.2 security updates for stretch
  • 12:40 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable reader trust survey (T209882) (duration: 01m 07s)
  • 12:02 gehel: repool wdqs100[78] - data import complete - T213210
  • 11:55 jynus: enabling gtid on db1124:s5
  • 11:54 jynus: enabling gtid on db1082
  • 11:23 jynus: stopping db1082 and db2052 s5 replication in sync to migrate db1124:s5 master
  • 10:30 moritzm: fixed package installation status on db2062
  • 10:01 volans: upgraded spicerack to 0.0.11 on cumin2001 T205884
  • 10:00 volans: uploaded spicerack_0.0.11 to apt.wikimedia.org stretch-wikimedia T205884
  • 09:44 hashar: Some CI npm jobs get broken due to a faulty node module. https://phabricator.wikimedia.org/T213249
  • 09:38 banyek: repooling labsdb1010 - T210693
  • 09:26 banyek: dropping materialized views on labsdb1010 - T210693
  • 09:26 banyek: depooled labsdb1010
  • 08:28 moritzm: installing openssl security updates on stretch-based DB servers
  • 07:55 moritzm: installing libseccomp updates from stretch point release
  • 07:43 hashar: contint1001: restarted Zuul to take into account SMTP configuration | https://gerrit.wikimedia.org/r/376739 | T93414
  • 06:03 kartik@deploy1001: Finished deploy [cxserver/deploy@1098942]: Update cxserver to 656c468 (duration: 04m 08s)
  • 05:59 kartik@deploy1001: Started deploy [cxserver/deploy@1098942]: Update cxserver to 656c468
  • 01:15 jforrester@deploy1001: Synchronized php-1.33.0-wmf.12/extensions/Wikibase/repo/RepoHooks.php: T213227 Don't have onApiCheckCanExecute die for inactive entity types (duration: 00m 53s)
  • 01:04 jforrester@deploy1001: Synchronized docroot/: T187716 Remove mobilelanding.php, no longer pointed to by Apache (duration: 00m 52s)
  • 00:58 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [Wikimania] Add 2019 content to default search (duration: 00m 53s)
  • 00:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T202683 [Wikimania] Create year namespaces for each Wikimania, 2005–2019 (duration: 00m 53s)
  • 00:34 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Make password policy and logging code saner (duration: 00m 52s)
  • 00:33 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make password policy and logging code saner (duration: 00m 55s)

2019-01-08

  • 23:44 SMalyshev: repooled wdqs1004
  • 23:35 eileen: process-control config revision is 9dc6e63fcd
  • 23:00 XioNoX: Update pfw3-codfw/eqiad security policies - T213100
  • 22:39 XioNoX: deactivate policy-statement BGP_fundraising_aggregates term nat on pfw3-eqiad/codfw - T211028
  • 22:29 gehel: starting data copy from wdqs1007 to wdqs1008 (both will be depooled) - T213217
  • 22:27 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: TestCommons: Add default search NSes (duration: 00m 51s)
  • 22:22 James_F: Ran /docroot/noc/createTxtFileSymlinks.sh for new dblist
  • 22:21 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Use new wikidatarepo dblist where appropriate (duration: 00m 52s)
  • 22:20 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: dblists: Load wikibaserepo (duration: 00m 52s)
  • 22:15 jforrester@deploy1001: scap failed: average error rate on 9/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 22:14 jforrester@deploy1001: Synchronized dblists/wikidata.dblist: dblists: Remove testcommons from wikidata list (duration: 00m 52s)
  • 22:13 jforrester@deploy1001: Synchronized dblists/wikidatarepo.dblist: dblists: Add wikidatarepo list (duration: 00m 53s)
  • 22:12 urandom: forcing removal of restbase1016-b (host down way too long to salvage) -- T212418
  • 22:08 marostegui: Drop valid_tag table from db2043 with replication (s3 codfw master - lag will be generated) - T212254
  • 22:03 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: cleanup - Idfa129a65a41 (duration: 00m 53s)
  • 21:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 T212254 (duration: 00m 52s)
  • 21:49 marostegui: Drop valid_tag table from db1078 (s3) - T212254
  • 21:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 T212254 (duration: 00m 53s)
  • 21:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 T212254 (duration: 00m 53s)
  • 21:38 marostegui: Drop valid_tag table from db1123 (s3) - T212254
  • 21:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 T212254 (duration: 00m 53s)
  • 21:31 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.12
  • 21:03 dduvall@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.12 and rebuild l10n cache (duration: 39m 22s)
  • 20:42 ejegg: updated payments-wiki from b8acb95a2a to c455bbc6bb
  • 20:24 dduvall@deploy1001: Started scap: testwiki to php-1.33.0-wmf.12 and rebuild l10n cache
  • 20:24 gehel: starting data copy from wdqs1004 to wdqs1007 (both will be depooled) - T213217
  • 20:21 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: TestCommons: Don't enable entities, we're not Wikidata.org (duration: 01m 44s)
  • 20:11 XioNoX: change BGP_fundraising_aggregates term nat from static to aggregate on pfw3-eqiad - T211028
  • 19:51 ejegg: updated fundraising CiviCRM from b8e3a71845 to 5580f0b11c
  • 19:48 krinkle@deploy1001: Finished deploy [performance/navtiming@68fd54d]: (no justification provided) (duration: 00m 05s)
  • 19:48 krinkle@deploy1001: Started deploy [performance/navtiming@68fd54d]: (no justification provided)
  • 19:48 dduvall@deploy1001: Pruned MediaWiki: 1.33.0-wmf.12 (duration: 06m 26s)
  • 19:11 arlolra: Updated Parsoid to 2c5dc7b (T197616, T205491, T209772, T199926, T209194, T204622)
  • 19:06 marostegui: Drop valid_tag table from s1 - T212254
  • 19:00 arlolra@deploy1001: Finished deploy [parsoid/deploy@4b82683]: Updating Parsoid to 2c5dc7b (duration: 10m 40s)
  • 18:54 XioNoX: make pfw3-codfw source NAT similar to pfw3-eqiad - T211028
  • 18:54 ejegg: updated SmashPig standalone install from fb3268897b to 25713ca232
  • 18:50 marostegui: Drop valid_tag table from s4 - T212254
  • 18:50 XioNoX: add NAT workaround to pfw3-eqiad - T211028
  • 18:49 arlolra@deploy1001: Started deploy [parsoid/deploy@4b82683]: Updating Parsoid to 2c5dc7b
  • 18:38 XioNoX: temporarily permit ssh from frpm1001 to pfw3-eqiad on pfw3-eqiad
  • 18:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 T86338 T202167 (duration: 00m 45s)
  • 18:27 jynus: restarting s5 replication on labsdb1009/10/11
  • 17:41 moritzm: installing libseccomp updates from stretch point release
  • 17:40 mobrovac@deploy1001: Finished deploy [restbase/deploy@503b29c]: Add test-commons and nap.wikisource, take #2 (duration: 02m 29s)
  • 17:38 mobrovac@deploy1001: Started deploy [restbase/deploy@503b29c]: Add test-commons and nap.wikisource, take #2
  • 17:37 mobrovac@deploy1001: Finished deploy [restbase/deploy@503b29c]: Add test-commons and nap.wikisource - T210752 T197616 (duration: 96m 50s)
  • 17:33 _joe_: applying the new apache configuration to jobrunners in eqiad
  • 17:24 elukey: roll restart of aqs on aqs100* to pick up new Druid settings
  • 17:20 _joe_: depooling mw1299 for testing of the apache change
  • 17:16 SMalyshev: restarted Blazegraph wdqs1006 due to unresponsiveness (caused by load?)
  • 16:56 urandom: forcing removal of restbase1016-a (host down way too long to salvage) -- T212418
  • 16:56 jynus: changing db1124:s5 replication to db2066
  • 16:55 marostegui: Deploy schema change on db1105:3311 T86338 T202167
  • 16:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 T86338 T202167 (duration: 00m 44s)
  • 16:54 jynus: stopping s5 replication on labsdb1009/10/11 to prevent undoable mistakes
  • 16:34 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2019 - T212833 (duration: 02m 51s)
  • 16:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 T86338 T202167 (duration: 00m 45s)
  • 16:12 XioNoX: add BGP sessions to AS64050 in AMS-IX
  • 16:04 marostegui: Drop valid_tag table from s7 - T212254
  • 16:00 mobrovac@deploy1001: Started deploy [restbase/deploy@503b29c]: Add test-commons and nap.wikisource - T210752 T197616
  • 15:59 marostegui: Deploy schema change on db1089 T86338 T202167
  • 15:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 T86338 T202167 (duration: 00m 45s)
  • 15:45 marostegui: Drop valid_tag table from s2 - T212254
  • 15:32 marostegui: Stop MySQL on es2019 for upgrade - T212833
  • 15:23 godog: briefly stop carbon daemons on graphite1004 to move /srv/whisper -> /srv/carbon/whisper
  • 15:17 marostegui: Increase connections from 10 to 50 for recommendationapiservice on m2 - T212154
  • 15:10 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool es2019 - T212833 (duration: 00m 44s)
  • 15:04 hashar: Restarted CI Jenkins
  • 13:02 zeljkof: EU SWAT finished
  • 12:59 jynus: transferring db1102:s5 mariadb datadir to db1082
  • 12:57 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Give all users (including IPs) the pagequality right in plwikisource (T212478) (duration: 00m 45s)
  • 12:45 akosiaris@deploy1001: scap-helm zotero finished
  • 12:45 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 12:45 akosiaris@deploy1001: scap-helm zotero install --name production2 -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 12:44 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow ptwikis bureaucrats to grant/revoke rollbacker user group (T212735) (duration: 00m 45s)
  • 12:39 akosiaris@deploy1001: scap-helm zotero upgrade production2 -f zoterov2-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 12:29 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use localized wgMetaNamespace and wgMetaNamespaceTalk in satwiki (T211294) (duration: 00m 45s)
  • 12:23 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for students writing Wikipedia program (T212226) (duration: 00m 44s)
  • 12:14 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for University of Southern California editathon (T212917) (duration: 00m 45s)
  • 12:07 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T212768 [cirrus] re-enable HHVM connection pooling (duration: 00m 45s)
  • 12:01 mobrovac@deploy1001: Finished deploy [restbase/deploy@503b29c] (dev-cluster): Add test-commons and nap.wikisource (duration: 12m 38s)
  • 11:49 mobrovac@deploy1001: Started deploy [restbase/deploy@503b29c] (dev-cluster): Add test-commons and nap.wikisource
  • 11:46 mobrovac@deploy1001: Synchronized wmf-config/CommonSettings.php: Increase time out on the MW side to 60s - T204183 (duration: 00m 51s)
  • 11:36 akosiaris@deploy1001: scap-helm zotero finished
  • 11:36 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 11:36 akosiaris@deploy1001: scap-helm zotero upgrade production -f zoterov2-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 11:35 akosiaris@deploy1001: scap-helm zotero finished
  • 11:35 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 11:35 akosiaris@deploy1001: scap-helm zotero upgrade production -f zoterov2-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 11:33 mobrovac@deploy1001: Started restart [electron-render/deploy@94d27d7]: Electron struggling, restart - T213154
  • 11:29 oblivian@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=zotero,name=codfw
  • 11:24 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=zotero,name=codfw
  • 11:07 jynus: stopping and restarting db1102 (s5, s4) for upgrade
  • 11:04 moritzm: rebooting mw1261
  • 10:48 moritzm: installing libseccomp updates from stretch point release
  • 10:34 dcausse: elastic@eqiad setting crosscluster conf on production search cluster (T213150)
  • 10:25 banyek: executing schema change on db1062 - T85757
  • 09:39 foks: reset user email for Zergiorubio
  • 09:26 akosiaris@deploy1001: scap-helm zotero finished
  • 09:26 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 09:26 akosiaris@deploy1001: scap-helm zotero install --name production2 -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 09:22 jynus: stop replication on db1124:s5 T213108
  • 09:21 akosiaris@deploy1001: scap-helm zotero finished
  • 09:21 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 09:21 akosiaris@deploy1001: scap-helm zotero install --name production2 -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 09:19 hashar: gerrit: resaved configuration for All-Projects by changing "Max Reviewers" from 3 to 4. Might enable adding reviewers automatically based on git blame. See task for config diff # T101131
  • 09:12 mobrovac@deploy1001: Finished deploy [cpjobqueue/deploy@f91cf04]: Increase the concurrency of categoryMembershipJob - T192691 (duration: 00m 59s)
  • 09:12 mobrovac@deploy1001: Started deploy [cpjobqueue/deploy@f91cf04]: Increase the concurrency of categoryMembershipJob - T192691
  • 05:39 SMalyshev: restarted some Blazegraph servers as precaution against corruption issues
  • 04:26 onimisionipe: depooling wdqs1008 - T213134
  • 03:23 kartik@deploy1001: Finished deploy [cxserver/deploy@b669f95]: Update cxserver to d6b1d6f (duration: 05m 00s)
  • 03:18 kartik@deploy1001: Started deploy [cxserver/deploy@b669f95]: Update cxserver to d6b1d6f
  • 00:22 gehel: restarting tilerator on all maps servers
  • 00:06 gehel: depooling wdqs1007 (something looks like DB corruption)

2019-01-07

  • 23:56 eileen: update civicrm revision changed from bcb4b7a7d1 to b8e3a71845, config revision is 260be32d0a
  • 22:08 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: TestCommons: Re-enable uploading of files, accidentally prevented (duration: 00m 44s)
  • 21:19 XioNoX: push NAT changes to pfw3-eqiad - T211028
  • 21:16 awight@deploy1001: Finished deploy [ores/deploy@9253beb]: T212530: new ORES models; revscoring 2.3.0 (duration: 15m 28s)
  • 21:13 mforns@deploy1001: Finished deploy [analytics/refinery@faac592]: deploying analytics/refinery to account with refinery-source v0.0.83 (duration: 06m 52s)
  • 21:06 mforns@deploy1001: Started deploy [analytics/refinery@faac592]: deploying analytics/refinery to account with refinery-source v0.0.83
  • 21:00 awight@deploy1001: Started deploy [ores/deploy@9253beb]: T212530: new ORES models; revscoring 2.3.0
  • 20:19 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: TestCommons: Final go-switch for WBMI Ie52b8af006ba (duration: 00m 45s)
  • 19:52 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove redundant namespace talk definitions (T206952) (duration: 00m 44s)
  • 19:46 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set $wgMetaNamespace for bewikibooks (T212665) (duration: 00m 45s)
  • 19:43 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WikibaseRepo and WikibaseMediaInfo on testcommonswiki (duration: 00m 44s)
  • 19:42 XioNoX: push firewall change to pfw3-codfw/eqiad - T211712
  • 19:40 catrope@deploy1001: Synchronized wmf-config/Wikibase.php: Set empty clientDbList for testcommonswiki (duration: 00m 44s)
  • 19:38 catrope@deploy1001: Synchronized dblists/wikidata.dblist: Enable Wikidata on testcommonswiki (duration: 00m 44s)
  • 19:28 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add importupload to sysops on testcommons (duration: 00m 45s)
  • 19:14 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable Flow beta feature on viwikisource (T212929) (duration: 00m 45s)
  • 19:13 catrope@deploy1001: Synchronized dblists/flow.dblist: Enable Flow on viwikisource (T212929) (duration: 00m 45s)
  • 19:11 RoanKattouw: Ran emptyUserGroup.php for autoreview, reviewer and editor groups on srwikinews (T212058)
  • 18:51 XioNoX: re-deactivate bgp sessions to Zayo on cr1-eqiad - T212791
  • 18:20 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@d8f911c]: new GUI, Updater & Blazegraph build (duration: 10m 13s)
  • 18:18 XioNoX: activate bgp sessions to Zayo on cr1-eqiad - T212791
  • 18:10 jynus: manually creating tables on es1015, es1017 with replication for testcommonswiki
  • 18:10 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@d8f911c]: new GUI, Updater & Blazegraph build
  • 18:07 onimisionipe@deploy1001: deploy aborted: (no justification provided) (duration: 00m 04s)
  • 18:06 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@d8f911c]: (no justification provided)
  • 18:05 XioNoX: deactivate bgp sessions to Zayo on cr1-eqiad T212791
  • 17:35 akosiaris: restart pdfrender on scb1004
  • 17:35 akosiaris: restart pdfrender
  • 17:23 kartik@deploy1001: Finished deploy [cxserver/deploy@594420b]: Update cxserver to 7632c43 (duration: 04m 06s)
  • 17:19 kartik@deploy1001: Started deploy [cxserver/deploy@594420b]: Update cxserver to 7632c43
  • 16:24 jynus: shutting down mariadb again and rebooting db1107
  • 16:15 jynus: starting mariadb on db1107
  • 16:12 onimisionipe: starting inplace reindexing for enwiki - T212224
  • 16:07 volans: powercycle db1107
  • 16:03 elukey: stop eventlogging mysql consumers on eventlog1002 and eventlogging replication on db1108 due to issues with db1107
  • 16:02 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 00m 45s)
  • 15:46 cmjohnson1: replacing bad fuse on the PDU rack A2 eqiad
  • 14:19 moritzm: added jbond to WMF-LDAP group in Phabricator (T213079)
  • 13:56 ariel@deploy1001: Finished deploy [dumps/dumps@acd9bca]: logging and quiet mode for adds-changes and other dumps (duration: 00m 05s)
  • 13:56 ariel@deploy1001: Started deploy [dumps/dumps@acd9bca]: logging and quiet mode for adds-changes and other dumps
  • 13:02 zeljkof: EU SWAT finished
  • 13:01 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: cirrus: increase number of shards (T212224) (duration: 00m 44s)
  • 12:48 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Restrict moving categories for users at srwiki (T213050) (duration: 00m 44s)
  • 12:40 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Cleanup old throttle rules (duration: 00m 44s)
  • 12:34 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: To lift a cap on account creation from IP for mrwiki community (T212921) (duration: 00m 43s)
  • 12:30 Zoranzoki21: tools.zoranzoki21wiki Archived https://www.mediawiki.org/w/index.php?title=Extension:Woopra (https://www.wikidata.org/wiki/Q21679347) - T212994
  • 12:29 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable reader trust survey (T209882) (duration: 00m 45s)
  • 12:21 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Quiz extension on ru.wikibooks (T212622) (duration: 00m 45s)
  • 12:15 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect user right to editor user group at pl.wikisource (T212655) (duration: 00m 44s)
  • 12:11 gtirloni: disabled notifications for cloudvirt0124 (T212360)
  • 12:11 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable extendedmover user group at en.wiktionary (T212662) (duration: 00m 46s)
  • 12:07 kartik@deploy1001: Finished deploy [cxserver/deploy@2d54a64]: Deploy Google Translation (T90208) (duration: 05m 07s)
  • 12:02 kartik@deploy1001: Started deploy [cxserver/deploy@2d54a64]: Deploy Google Translation (T90208)
  • 10:36 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1079 after schema change - T85757 (duration: 00m 44s)
  • 10:31 filippo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Move group1 to new logging infrastructure - T211124 (duration: 00m 45s)
  • 10:30 banyek: repooling db1079 after schema change - T85757
  • 10:27 banyek: restarting replication on db1079 - T85757
  • 09:55 banyek: executing schema change on db1079 with replication enabled - T85757
  • 09:53 banyek: stopping replication on db1079 - T85757
  • 09:47 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1079 for schema change - T85757 (duration: 01m 02s)
  • 09:36 banyek: depooling db1079 for schema change - T85757
  • 08:30 moritzm: rolling restart of swift backend servers to pick up OpenSSL security update
  • 07:24 elukey: restart pdfrender on scb1002

2019-01-06

  • 14:50 ariel@deploy1001: Finished deploy [dumps/dumps@cb30b6c]: check xml files for closing mediawiki tag (duration: 00m 06s)
  • 14:50 ariel@deploy1001: Started deploy [dumps/dumps@cb30b6c]: check xml files for closing mediawiki tag

2019-01-05

  • 20:23 elukey: manual cleanup of big logs under /var/log/.. on analytics-tool1002 because the root partition was almost full

2019-01-04

  • 23:07 mutante: scandium apt-get remove nodejs nodes-legacy ; puppet agent -tv - after merging gerrit:482150 this fixed "you have held broken packages" issue, now we are at a puppet dependency cycle with apt::pin T201366
  • 15:42 bawolff@deploy1001: Synchronized private/PrivateSettings.php: T212667 - More aggressive anti-spam measures for account creation on kowiki (duration: 00m 48s)
  • 14:08 moritzm: rebooting etcd1001-1003 to pick up SSBD-enabled qemu
  • 13:52 moritzm: rebooting etcd1004-1006 to pick up SSBD-enabled qemu
  • 13:33 moritzm: rebooting kubernetes staging etcd hosts to pick up SSBD-enabled qemu
  • 13:11 moritzm: rebooting kubernetes staging master to pick up SSBD-enabled qemu
  • 12:57 moritzm: rebooting kubernetes staging workers for kernel security update
  • 11:58 moritzm: installing libsndfile security updates
  • 11:33 moritzm: installing jasper security updates
  • 11:31 moritzm: installing libdatetime-timezone-perl updates for recent tz changes
  • 10:47 arturo: T212898 reimaging cloudvirt1024 as stretch
  • 10:46 moritzm: rolling restart of swift proxies to pick up OpenSSL update
  • 09:57 jijiki: restarting thumbor services to pick up 481141
  • 09:50 onimisionipe: restarting nginx on all wdqs hosts
  • 09:40 banyek: executing schema change on dbstore1002 - T85757
  • 09:13 moritzm: restarting nginx on puppetdb hosts to pick up new OpenSSL
  • 09:03 banyek: executing schema change on db1116 - T85757
  • 08:44 moritzm: restarting nginx on francium to pick up new OpenSSL
  • 08:16 elukey: restart eventlogging daemons on eventlog1002 to pick up openssl updates
  • 07:56 moritzm: installing OpenSSL security updates
  • 00:07 mutante: an-coord1001 - apt-get clean to free disk space, reacting to Icinga alert for running out of disk

2019-01-03

  • 23:08 volans: restarted pdfrender on scb1004
  • 22:29 volans: restarted all slaves on dbstore1002 (relayed from banyek)
  • 22:14 banyek: stopping all slaves on dbstore1002 (NOT labsdb)
  • 22:14 banyek: stopping all slaves on labsdb1002
  • 20:50 reedy@deploy1001: Synchronized multiversion/MWMultiVersion.php: Fix error for testcommons (duration: 00m 44s)
  • 20:46 reedy@deploy1001: Synchronized dblists/group0.dblist: Add testcommonswiki to group0 (duration: 00m 43s)
  • 20:43 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 02m 05s)
  • 20:24 reedy@deploy1001: Synchronized wmf-config/db-codfw.php: T197616 (duration: 00m 44s)
  • 20:23 reedy@deploy1001: Synchronized wmf-config/db-eqiad.php: T197616 (duration: 00m 44s)
  • 20:13 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T197616 (duration: 00m 44s)
  • 20:12 reedy@deploy1001: Synchronized multiversion/MWMultiVersion.php: T197616 (duration: 00m 44s)
  • 20:11 reedy@deploy1001: rebuilt and synchronized wikiversions files: T197616
  • 20:09 reedy@deploy1001: Synchronized dblists/: T197616 (duration: 00m 45s)
  • 18:51 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@1182b3b]: Update mobileapps to f6ad0e5: Set timeout for backend /page/html requests, part 2 (duration: 05m 27s)
  • 18:46 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@1182b3b]: Update mobileapps to f6ad0e5: Set timeout for backend /page/html requests, part 2
  • 18:37 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@c470ed2]: Update mobileapps to f6ad0e5: Set timeout for backend /page/html requests (duration: 04m 11s)
  • 18:33 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@c470ed2]: Update mobileapps to f6ad0e5: Set timeout for backend /page/html requests
  • 18:21 volans: restart pdfrender on scb1003
  • 17:58 ariel@deploy1001: Finished deploy [dumps/dumps@10dc8ad]: return properly if commands failed (duration: 00m 08s)
  • 17:58 ariel@deploy1001: Started deploy [dumps/dumps@10dc8ad]: return properly if commands failed
  • 16:32 XioNoX: remove old 10.64.22.0/24 IPs from cloud-instance-transport1-b-eqiad - T207663
  • 16:22 moritzm: rebooting kubernetes workers in eqiad for kernel security update
  • 16:02 arturo: reimaging cloudvirt1013 cloudvirt1026-1028 to stretch
  • 15:48 moritzm: restart parsoid on wtp1025 to pick up OpenSSL update for nodejs
  • 15:43 jijiki: Enabled puppet on mw servers after merging 481796 - T197616
  • 15:31 jijiki: Disabling puppet on mw servers to test 481796 - T197616
  • 15:14 ejegg: updated Fundraising CiviCRM from b33dcd3c94 to bcb4b7a7d1
  • 14:37 moritzm: rebooting kubernetes workers in codfw for kernel security update
  • 14:37 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1101:3317 after schema change - T85757 (duration: 00m 44s)
  • 14:32 banyek: repooling db1101:3317 after schema change - T85757
  • 14:21 moritzm: rebooting kubernetes masters in eqiad to pick up SSBD-enabled qemu
  • 14:14 moritzm: rebooting kubernetes masters in codfw to pick up SSBD-enabled qemu
  • 14:05 arturo: T209616 reimage cloudvirt1029 as debian stretch
  • 13:43 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1101:3317 for schema change - T85757 (duration: 00m 44s)
  • 13:41 banyek: depooling db1101:3317 for schema change - T85757
  • 13:38 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1098:3317 after schema change - T85757 (duration: 00m 44s)
  • 13:34 banyek: repooling db1098:3317 after schema change - T85757
  • 13:24 kartik@deploy1001: Finished deploy [cxserver/deploy@3b2ede7]: Update cxserver to 2369a18 (duration: 04m 30s)
  • 13:20 kartik@deploy1001: Started deploy [cxserver/deploy@3b2ede7]: Update cxserver to 2369a18
  • 12:58 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1098:3317 for schema change - T85757 (duration: 00m 45s)
  • 12:55 banyek: depooling db1098:3317 for schema change - T85757
  • 12:54 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1094 after schema change - T85757 (duration: 00m 45s)
  • 12:49 banyek: repooling db1094 after schema change - T85757
  • 12:41 arturo: T212302 reimaging again cloudvirt1030 to test final puppet code
  • 12:33 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1094 for schema change - T85757 (duration: 00m 46s)
  • 12:28 banyek: depooling db1094 for schema change - T85757
  • 12:27 moritzm: restarting tor on torrelay1001 to pick up OpenSSL security update
  • 11:02 _joe_: manually reloading icinga to pick up changes to commands.cfg
  • 10:55 moritzm: installing apache updates on puppetmasters
  • 10:22 moritzm: installing ghostscript security updates on jessie
  • 09:51 elukey: restart memcached on mc1023 to apply -R 200 - T208844
  • 09:46 moritzm: remove imagemagick remnants from ATS hosts (obsoleted by upstream packaging change which dropped the webp plugin)
  • 09:39 moritzm: installing nginx updates on puppetdb*
  • 09:26 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: repool es2019 - T212833 (duration: 01m 33s)
  • 09:18 banyek: repooling es2019 - T212833
  • 08:46 moritzm: rolling restart of proton to pick up OpenSSL update
  • 08:35 banyek: depooled es2019 as host was unresponsive - T212833
  • 08:35 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: depool es2019, host is unresponsive - T212833 (duration: 00m 49s)
  • 08:11 moritzm: installing OpenSSL security updates
  • 00:21 mutante: notebook1004 - started nagios-nrpe-server one more time

2019-01-02

  • 23:59 mutante: notebook1004 keeps running out of memory from some user actions, which kills nagios-nrpe-server and causes a bunch of Icinga alerts
  • 23:39 mutante: notebook1004 - systemctl start nagios-nrpe-server
  • 23:39 mutante: notebook1004 - systemctl status nagios-nrpe-server
  • 20:59 herron@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,service=parsoid,name=wtp1028.eqiad.wmnet
  • 20:59 herron: repooling wtp1028 T212624
  • 20:52 herron: rebooting wtp1028 — looking for POST errors T212624
  • 20:05 Krinkle: mwmaint1002: foreachwikiindblist s5 deleteEqualMessages.php
  • 20:04 Krinkle: mwmaint1002: foreachwikiindblist s2 deleteEqualMessages.php
  • 18:35 volans: restarting icinga on icinga1001 T212669
  • 16:50 XioNoX: create BGP sessions to AS3214 in AMS-IX
  • 16:46 XioNoX: remove BGP sessions to AS42949 in AMS-IX (leaving the IX)
  • 16:43 XioNoX: remove BGP sessions to AS6866 in AMS-IX (leaving the IX)
  • 16:33 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1090:3317 after schema change - T85757 (duration: 00m 46s)
  • 16:30 arturo: reimaging cloudvirt1030 with stretch, server cleanup after puppet refactoring
  • 16:29 moritzm: restarting Superset to pick up openssl security update
  • 16:25 moritzm: restarting Hue to pick up openssl security update
  • 16:23 arturo: T212302 re-enable puppet in all {cloud,lab}virt* servers, all was fine
  • 16:22 banyek: repooling db1090:3317 after schema change (T85757)
  • 16:11 arturo: T212302 disable puppet in all {cloud,lab}virt* servers to merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/481194/
  • 15:39 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1090:3317 for schema change - T85757 (duration: 00m 44s)
  • 15:34 moritzm: installing OpenSSL security updates
  • 15:31 banyek: depooling db1090:3317 for schema change (T85757)
  • 15:13 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1086 after schema change - T85757 (duration: 00m 44s)
  • 15:07 banyek: repooling db1086 after schema change (T85757)
  • 14:49 banyek: executing schema change on db1086 - T85757
  • 14:48 moritzm: installing ghostscript security update for jessie
  • 14:47 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1086 for schema change - T85757 (duration: 00m 45s)
  • 14:38 banyek: depooling db1086 for schema change (T85757)
  • 14:15 ema: cp hosts: upgrade OpenSSL from 1.1.0f to 1.1.0j
  • 13:39 moritzm: installing ghostscript update for stretch
  • 13:33 moritzm: installing libav security updates
  • 13:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1119 T86338 T202167 (duration: 00m 44s)
  • 13:17 moritzm: installing openjpeg2 security updates
  • 13:17 banyek: executing schema change on db2040 (s7 codfw master) replication lag could be expected on codfw - T85757
  • 13:13 banyek: stopping replication on db2077 prior to executing schema change on codfw s7 master (db2040) - T85757
  • 13:06 marostegui: Deploy schema change on db1119 - T86338 T202167
  • 13:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 T86338 T202167 (duration: 00m 45s)
  • 13:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 T86338 T202167 (duration: 00m 47s)
  • 12:00 moritzm: rebooting labtestpuppetmaster2001 for kernel security update
  • 11:53 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe1006.eqiad.wmnet
  • 11:51 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1006.eqiad.wmnet
  • 11:50 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe1006.codfw.wmnet
  • 11:46 ema: replace TLS certificates on ms-fe eqiad hosts T212215
  • 11:41 moritzm: rebooting labtestweb2001 for kernel security update
  • 11:24 marostegui: Deploy schema change on db1099:3311 - T86338 T202167
  • 11:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 T86338 T202167 (duration: 00m 45s)
  • 11:17 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2006.codfw.wmnet
  • 11:10 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2006.codfw.wmnet
  • 10:59 ema: replace TLS certificates on ms-fe codfw hosts T212215
  • 10:52 moritzm: rebooting centrallog1001 for kernel security update
  • 10:48 volans: testing the new spicerack package on cumin2001, in the unlikely event you need to use spicerack cookbooks today please use cumin1001
  • 10:45 godog: ms-be2018 Flashing Smart Array P840 in Slot 3 [ 3.00 -> 6.60 ]
  • 10:43 moritzm: removed labvirt1013 from debmonitor, got renamed in T212513
  • 10:42 volans: uploaded spicerack_0.0.10-1_amd64.deb to apt.wikimedia.org stretch-wikimedia
  • 10:03 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2096 (duration: 00m 44s)
  • 09:50 marostegui: Stop MySQL on db2096 for kernel and mysql upgrade
  • 09:49 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2096 (duration: 00m 45s)
  • 09:48 marostegui@deploy1001: sync-file aborted: Depool db2096 (duration: 00m 01s)
  • 09:18 moritzm: installing c3p0 security updates
  • 09:07 Zoranzoki21: Drop valid_tag from s8 by Marostegui - T212254
  • 09:06 godog: eqiad-prod: final weight for ms-be10[44-50].eqiad.wmnet - T209618
  • 08:56 moritzm: installing libarchive security updates
  • 07:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 - T212692 (duration: 00m 46s)
  • 07:30 marostegui: Fix login.logging table on db1078 - T212692
  • 07:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T212692 (duration: 00m 47s)
  • 07:01 marostegui: Deploy schema change on s1 codfw master (lag will be generated on s1 codfw) - T202167 T86338
  • 06:54 marostegui: Drop empty valid_tag table from labswiki labtestwiki - T212254
  • 06:49 marostegui: Drop empty valid_tag table from s5 - T212254
  • 06:25 marostegui: Drop valid_tag from s6 - T212254
  • 06:15 marostegui: Fix last chunks on db1124:338 - T212574

