Server admin log/Archive 34

2018-05-31

  • 23:59 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 23:38 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 22s)
  • 23:36 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 23:35 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 06m 57s)
  • 23:28 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 23:02 pnorman@tin: Finished deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 22s)
  • 22:59 pnorman@tin: Started deploy [tilerator/deploy@709ca69] (cleartables): Redeploy to 2004 to try to reproduce error
  • 22:51 pnorman@tin: Finished deploy [tilerator/deploy@2a26f1e] (cleartables): Redeploy to 2004 to try to reproduce error (duration: 02m 14s)
  • 22:48 pnorman@tin: Started deploy [tilerator/deploy@2a26f1e] (cleartables): Redeploy to 2004 to try to reproduce error
  • 22:11 mholloway-shell@tin: Finished deploy [tilerator/deploy@2a26f1e] (cleartables): (no justification provided) (duration: 12m 46s)
  • 21:58 mholloway-shell@tin: Started deploy [tilerator/deploy@2a26f1e] (cleartables): (no justification provided)
  • 21:55 jynus: temporarily reducing s4-codfw-master consistency to alleviate lag (binlog_sync, flush_log)
  • 21:00 mutante: dzahn@neodymium:~$ sudo wmf-auto-reimage-host --new phab1002.eqiad.wmnet (T196019)
  • 20:57 mholloway-shell@tin: Finished deploy [tilerator/deploy@2a26f1e] (cleartables): (no justification provided) (duration: 00m 07s)
  • 20:57 mholloway-shell@tin: Started deploy [tilerator/deploy@2a26f1e] (cleartables): (no justification provided)
  • 19:47 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.6
  • 19:42 mholloway-shell@tin: Started deploy [tilerator/deploy@2a26f1e] (cleartables): (no justification provided)
  • 19:40 mholloway-shell@tin: Started deploy [tilerator/deploy@2a26f1e] (cleartables): (no justification provided)
  • 19:31 mholloway-shell@tin: Started deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided)
  • 19:29 mholloway-shell@tin: Finished deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided) (duration: 00m 06s)
  • 19:29 mholloway-shell@tin: Started deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided)
  • 19:27 mholloway-shell@tin: Started deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided)
  • 19:25 mholloway-shell@tin: Finished deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided) (duration: 00m 05s)
  • 19:25 mholloway-shell@tin: Started deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided)
  • 19:11 mholloway-shell@tin: Finished deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided) (duration: 00m 06s)
  • 19:11 mholloway-shell@tin: Started deploy [tilerator/deploy@UNKNOWN] (cleartables): (no justification provided)
  • 19:04 ariel@tin: Finished deploy [dumps/dumps@038c8b3]: tempdir split into subdirs (duration: 00m 04s)
  • 19:04 ariel@tin: Started deploy [dumps/dumps@038c8b3]: tempdir split into subdirs
  • 16:07 addshore: WikibaseLexeme slot done (7 min overrun)
  • 16:06 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: REVERT Load WikibaseLexeme on testwiki (again) T195615 (duration: 01m 21s)
  • 16:03 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: REVERT Load WikibaseLexeme on group0 T195615 (duration: 01m 21s)
  • 15:59 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Load WikibaseLexeme on group0 T195615 (duration: 01m 18s)
  • 15:52 andrewbogott: rebooting labtestservices2001 to troubleshoot unknown load problems
  • 15:50 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Load WikibaseLexeme on testwiki (again) T195615 (duration: 01m 21s)
  • 15:46 herron: enabling localhost:25 exim smtp listeners in production realm T175361
  • 15:44 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/WikibaseLexeme: Only add repo-specific entity type definition elements in Repo context T195615 (duration: 01m 32s)
  • 15:42 addshore@tin: Synchronized php-1.32.0-wmf.6/extensions/WikibaseLexeme: Only add repo-specific entity type definition elements in Repo context T195615 (duration: 01m 32s)
  • 15:31 jynus: reimage db2079
  • 14:52 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: NOOP patch and revert Load WikibaseLexeme on testwiki (sanity) (duration: 01m 22s)
  • 14:38 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/WikibaseLexeme/src/WikibaseLexemeHooks.php: T195615 Don't run repo-only hooks on clients (duration: 01m 20s)
  • 14:36 addshore@tin: Synchronized php-1.32.0-wmf.6/extensions/WikibaseLexeme/src/WikibaseLexemeHooks.php: T195615 Don't run repo-only hooks on clients (duration: 01m 24s)
  • 14:27 addshore@tin: Synchronized wmf-config/Wikibase.php: Wikibase.php shift around the loading of WikibaseLexeme (duration: 01m 22s)
  • 14:21 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY gerrit (duration: 01m 21s)
  • 14:20 addshore@tin: Synchronized wmf-config/Wikibase-labs.php: BETA ONLY gerrit (duration: 01m 21s)
  • 14:08 ottomata: beginning restarts of Kafka main-eqiad to enable SSL port - T193778
  • 13:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool sanitarium masters - T190704 (duration: 01m 21s)
  • 13:23 hashar@tin: Synchronized php-1.32.0-wmf.6/vendor: WikibaseLexeme: Encoding problems in labels - T195359 (duration: 03m 12s)
  • 12:54 akosiaris: reimage kubernetes200{3,4}.codfw.wmnet
  • 12:30 marostegui: Stop replication on all sanitarium masters - T190704
  • 11:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool sanitarium masters - T190704 (duration: 01m 22s)
  • 10:45 elukey: removed Pivot from thorium (pivot.wikimedia.org now simply redirects to Turnilo)
  • 09:45 marostegui: Stop MySQL on db2062 to clone db2092
  • 09:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 for cloning db2092 (duration: 01m 22s)
  • 09:40 volans: restarting Icinga, issues processing the command file
  • 09:35 jynus: testing icinga alerting
  • 09:28 elukey: reimage druid1006 to Debian Stretch
  • 09:11 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 fully (duration: 01m 21s)
  • 08:42 gehel: power off elastic2018 - T196045
  • 08:39 Amir1: ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --sleep 2 --check-old
  • 08:09 gehel: power reset elastic2018
  • 08:08 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,service=mathoid,cluster=scb,name=scb.*
  • 08:07 akosiaris@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,service=mathoid,cluster=scb,name=scb.*
  • 07:54 akosiaris: reimage kubernetes1004 without swap
  • 07:48 ema: power-cycle elastic2018
  • 07:41 marostegui: Stop Replication on db2066
  • 07:40 akosiaris: re-enable puppet across the fleet
  • 07:30 akosiaris: disable puppet for https://gerrit.wikimedia.org/r/#/c/436468/ merge cross fleet
  • 07:27 marostegui: Deploy schema change on s5 codfw master (db2052) this will generate lag on codfw - T191316 T192926 T89737 T195193
  • 07:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 after alter table (duration: 01m 22s)
  • 06:38 marostegui: Deploy schema change on db2066 - T191316 T192926 T89737 T195193
  • 06:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 for alter table (duration: 01m 21s)
  • 06:30 marostegui: Deploy schema change on db2092:3315 and db2094:3315 - T191316 T192926 T89737 T195193
  • 06:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2084:3315 after alter table (duration: 01m 28s)
  • 06:09 akosiaris: reimage ganeti1003, ganeti1007 to stretch
  • 05:59 elukey: reimage druid1005 to Debian Stretch
  • 05:51 elukey: delete /tmp/scap_l10n_1501525840,scap_l10n_1501525840,l10nstuff,l10nstuff3 from tin to free some space in the root partition (1.9G left)
  • 05:40 elukey: restart pdfrender on scb1001
  • 05:25 marostegui: Deploy schema change on db2084:3315 - T191316 T192926 T89737 T195193
  • 05:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 for alter table (duration: 01m 27s)
  • 05:13 moritzm: installing python-crypto security updates on trusty
  • 04:19 l10nupdate@tin: ResourceLoader cache refresh completed at Thu May 31 04:19:57 UTC 2018 (duration 14m 48s)
  • 04:18 moritzm: rebooting wtp2001/wtp2015 for microcode updates
  • 04:05 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.6) (duration: 18m 15s)
  • 03:03 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.5) (duration: 16m 18s)
  • 02:09 pnorman@tin: Finished deploy [tilerator/deploy@2a26f1e] (cleartables): Deploy style with fewer fonts (duration: 00m 24s)
  • 02:08 pnorman@tin: Started deploy [tilerator/deploy@2a26f1e] (cleartables): Deploy style with fewer fonts
  • 02:02 pnorman@tin: Finished deploy [tilerator/deploy@78448de] (cleartables): Deploy style with fewer fonts (duration: 00m 25s)
  • 02:01 pnorman@tin: Started deploy [tilerator/deploy@78448de] (cleartables): Deploy style with fewer fonts
  • 01:39 pnorman@tin: Finished deploy [tilerator/deploy@fad9969] (cleartables): Deploy updated stylesheet (duration: 00m 24s)
  • 01:39 pnorman@tin: Started deploy [tilerator/deploy@fad9969] (cleartables): Deploy updated stylesheet
  • 01:38 pnorman@tin: Finished deploy [tilerator/deploy@78d1b82] (cleartables): Deploy updated stylesheet (duration: 00m 05s)
  • 01:38 pnorman@tin: Started deploy [tilerator/deploy@78d1b82] (cleartables): Deploy updated stylesheet
  • 01:30 pnorman@tin: Finished deploy [tilerator/deploy@78d1b82] (cleartables): Deploy updated stylesheet (duration: 00m 25s)
  • 01:30 pnorman@tin: Started deploy [tilerator/deploy@78d1b82] (cleartables): Deploy updated stylesheet
  • 01:05 mutante: $lang.planet.wikimedia.org is changing software from planet-venus to rawdog

2018-05-30

  • 22:40 thcipriani@tin: Synchronized php-1.32.0-wmf.6/skins/Timeless/includes/TimelessTemplate.php: Fix condition for "emptyPortlet" class T196026 (duration: 01m 21s)
  • 22:22 thcipriani@tin: Synchronized php-1.32.0-wmf.6/extensions/GlobalPreferences/includes/Hooks.php: SWAT: Do not type hint PreferencesFormPreSave hook against PreferencesForm T196023 (duration: 01m 22s)
  • 21:47 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.32.0-wmf.6
  • 20:58 thcipriani@tin: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.6
  • 20:45 thcipriani@tin: Finished scap: testwiki to php-1.32.0-wmf.6 and rebuild l10n cache (duration: 114m 02s)
  • 19:38 pnorman@tin: Finished deploy [tilerator/deploy@9e40702]: Restore test2001 test2002 (duration: 00m 27s)
  • 19:37 pnorman@tin: Started deploy [tilerator/deploy@9e40702]: Restore test2001 test2002
  • 19:33 pnorman@tin: Started deploy [tilerator/deploy@bc35971] (cleartables): Use parameterized dbname on test2004
  • 19:31 pnorman@tin: Finished deploy [tilerator/deploy@bc35971] (cleartables): Use parameterized dbname on test2004 (duration: 00m 13s)
  • 19:31 pnorman@tin: Started deploy [tilerator/deploy@bc35971] (cleartables): Use parameterized dbname on test2004
  • 18:51 thcipriani@tin: Started scap: testwiki to php-1.32.0-wmf.6 and rebuild l10n cache
  • 18:05 Amir1: Morning SWAT is done
  • 18:04 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Update Wikidata wgPropertySuggesterDeprecatedIds (duration: 01m 01s)
  • 17:54 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on a bunch of additional wikis (T195263) (duration: 01m 02s)
  • 17:47 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable the datetime selector on Sp:Block on all Wikimedia wikis (T193785) (duration: 01m 03s)
  • 17:42 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgCookieSetOnIpBlock on test wiki (T195930) (duration: 01m 03s)
  • 16:38 moritzm: upgrading openjdk-7 on conf100[1-3]
  • 16:21 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with full weight, repool db1089 with low (duration: 01m 02s)
  • 15:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 after alter table (duration: 01m 01s)
  • 15:43 moritzm: installing spice security updates
  • 15:22 jynus: stop and reimage db1082
  • 15:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with higher load, depool db1082 (duration: 01m 01s)
  • 15:09 marostegui: Deploy schema change on db2038 - T191316 T192926 T89737 T195193
  • 15:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2038 for alter table (duration: 01m 01s)
  • 14:21 herron: changing logstash elasticsearch index prefix for syslogs to 'logstash-syslog' https://gerrit.wikimedia.org/r/431860
  • 14:11 ottomata: enabling SSL port for Kafka main-codfw cluster (take 2 :) ) T193778
  • 13:48 raynor: EU SWAT finished
  • 13:46 pmiazga@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Remove unused PopupsAnonsExperimentalGroupSize config variable (T173952) (duration: 01m 01s)
  • 13:44 pmiazga@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove unused PopupsAnonsExperimentalGroupSize config variable (T173952) (duration: 01m 02s)
  • 13:42 ppchelko@tin: Finished deploy [changeprop/deploy@4503987]: Fix a bug with no content in ores response (duration: 01m 38s)
  • 13:40 ppchelko@tin: Started deploy [changeprop/deploy@4503987]: Fix a bug with no content in ores response
  • 13:39 ppchelko@tin: Finished deploy [eventstreams/deploy@14e0b03]: Recreate config with new puppet and restart service T167180 (duration: 02m 07s)
  • 13:37 ppchelko@tin: Started deploy [eventstreams/deploy@14e0b03]: Recreate config with new puppet and restart service T167180
  • 13:32 elukey: reboot analytics1002 (Hadoop master node standby) to pick up new cpu microcode
  • 13:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low load (duration: 00m 59s)
  • 13:27 ppchelko@tin: Finished deploy [changeprop/deploy@43310d4]: Emit revision-score event T167180 (duration: 01m 32s)
  • 13:26 ppchelko@tin: Started deploy [changeprop/deploy@43310d4]: Emit revision-score event T167180
  • 13:24 akosiaris: emptying ganeti1003, ganeti1007 for stretch upgrade
  • 13:22 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add WMDS support question feed to mediawikiwiki RSS whitelist (T185087) (duration: 01m 01s)
  • 13:12 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable detection of changes in moved paragraphs on most wikis (T195375) (duration: 01m 02s)
  • 13:10 marostegui: Deploy schema change on dbstore2001:3315 - T191316 T192926 T89737 T195193
  • 13:00 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2089:3315 after alter table (duration: 01m 02s)
  • 12:40 akosiaris: reimage ganeti1002, ganeti1005 as stretch
  • 12:35 elukey: reboot analytics1029,1042,1070 to pick up the new cpu-microcode
  • 12:34 gehel: starting elasticsearch cluster restart on codfw - T193734
  • 11:04 jynus: stop and reimage db1089
  • 10:44 marostegui: Deploy schema change on db2089:3315 - T191316 T192926 T89737 T195193
  • 10:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2089:3315 for alter table (duration: 01m 01s)
  • 10:42 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 (duration: 01m 01s)
  • 10:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2059 after alter table (duration: 01m 02s)
  • 09:47 marostegui: Deploy schema change on db2059 - T191316 T192926 T89737 T195193
  • 09:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2059 for alter table (duration: 01m 02s)
  • 09:24 XioNoX: peering setup with RIPE RIS in eqsin
  • 08:47 elukey: reimage druid1004 to Debian Stretch
  • 08:38 marostegui: Stop and reboot db2094 and db2095 for testing - T190704
  • 08:22 akosiaris: upgrade python3-tornado on scb1001 and restart apertium-apy. T194883
  • 07:38 Nikerabbit: running refresh-translatable-pages.php for wikis having Translate
  • 07:14 moritzm: installing git security updates
  • 06:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool sanitarium masters for s1, s3, s5, s8 - T190704 (duration: 01m 01s)
  • 06:43 marostegui: Stop db1087 and db2045 in sync - T190704
  • 06:31 marostegui: Stop db1082 and db2052 in sync - T190704
  • 06:25 marostegui: Stop db1077 and db2043 in sync - T190704
  • 06:17 elukey: reimage druid1001 to Debian stretch
  • 06:11 marostegui: Stop db1106 and db2048 in sync - T190704
  • 06:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool sanitarium masters for s1, s3, s5, s8 - T190704 (duration: 01m 00s)
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool sanitarium masters for s2, s4, s6, s7 - T190704 (duration: 01m 00s)
  • 05:50 elukey: restart Kafka mirror maker on kafka10[12-23] - failures to consume after rebalance
  • 05:41 marostegui: Stop db1079 and db2040 in sync - T190704
  • 05:31 marostegui: Stop db1085 and db2039 in sync - T190704
  • 05:23 marostegui: Stop db1074 and db2035 in sync - T190704
  • 05:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool sanitarium masters for s2, s4, s6, s7 - T190704 (duration: 01m 03s)
  • 05:06 marostegui: Deploy schema change on s1 primary master db1052 - T188299
  • 03:15 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 30 03:15:44 UTC 2018 (duration 14m 3s)
  • 03:01 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.5) (duration: 12m 57s)
  • 01:08 mutante: built and added rawdog_2.22-1-wmf1 to apt.wikimedia.org, upgraded rawdog on planet2001. Unpacking rawdog (2.22-1-wmf1) over (2.22-1) (T180498)
  • 00:45 pnorman@tin: Finished deploy [tilerator/deploy@bc35971]: Use parameterized dbname on test2004 (duration: 00m 21s)
  • 00:45 pnorman@tin: Started deploy [tilerator/deploy@bc35971]: Use parameterized dbname on test2004
  • 00:41 pnorman@tin: Finished deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004 (duration: 00m 12s)
  • 00:41 pnorman@tin: Started deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004
  • 00:28 pnorman@tin: Finished deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004 (duration: 00m 05s)
  • 00:28 pnorman@tin: Started deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004
  • 00:25 pnorman@tin: Finished deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004 (duration: 02m 17s)
  • 00:22 pnorman@tin: Started deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004
  • 00:22 pnorman@tin: Finished deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004 (duration: 00m 21s)
  • 00:21 pnorman@tin: Started deploy [tilerator/deploy@a194185]: Deploy test of stretch build of tilerator to test2004

2018-05-29

  • 23:10 pnorman@tin: Finished deploy [kartotherian/deploy@2b75c93]: Deploy test of stretch build of Kartotherian to test2004 (duration: 00m 23s)
  • 23:10 pnorman@tin: Started deploy [kartotherian/deploy@2b75c93]: Deploy test of stretch build of Kartotherian to test2004
  • 22:09 mutante: boron - apt-get build-dep rawdog (installed libtidy5 python-feedparser python-tidylib)
  • 21:31 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.32.0-wmf.5
  • 21:06 thcipriani@tin: Synchronized php-1.32.0-wmf.5/extensions/VisualEditor/lib/ve: SWAT: Update VE core submodule to wmf/1.32.0-wmf.5 HEAD (9032a90ca) T195514 (duration: 01m 23s)
  • 20:38 volans@tin: Finished deploy [debmonitor/deploy@361c94a]: Initial sync (3) (duration: 50m 13s)
  • 19:48 volans@tin: Started deploy [debmonitor/deploy@361c94a]: Initial sync (3)
  • 19:16 thcipriani: cutting branch for wmf.6, will not deploy wmf.6 as wmf.5 is not currently on group2 as the train is blocked on T195514 which is blocked on T195868 which is blocked on T195906
  • 18:56 volans@tin: Finished deploy [debmonitor/deploy@fd06bd3]: Initial sync (2) (duration: 01m 42s)
  • 18:55 volans@tin: Started deploy [debmonitor/deploy@fd06bd3]: Initial sync (2)
  • 18:22 volans@tin: Finished deploy [debmonitor/deploy@e2efb6b]: Initial sync (duration: 00m 35s)
  • 18:21 volans@tin: Started deploy [debmonitor/deploy@e2efb6b]: Initial sync
  • 18:04 arlolra@tin: Finished deploy [parsoid/deploy@e87f54d]: Reverting Parsoid to fd49ab4 (duration: 02m 29s)
  • 18:01 arlolra@tin: Started deploy [parsoid/deploy@e87f54d]: Reverting Parsoid to fd49ab4
  • 17:59 ottomata: beginning rolling restarts of kafka main-codfw to enable SSL listener
  • 17:36 arlolra@tin: Finished deploy [parsoid/deploy@e87f54d]: Updating Parsoid to bf3a2fd2 (duration: 09m 48s)
  • 17:32 moritzm: repooled mw2182 (was down for hardware maintenance)
  • 17:26 arlolra@tin: Started deploy [parsoid/deploy@e87f54d]: Updating Parsoid to bf3a2fd2
  • 17:25 foks: Removing 2FA - T187312
  • 17:12 bsitzmann@tin: Finished deploy [mobileapps/deploy@ac4c6be]: Update mobileapps to b2fb793 (T192664) (duration: 06m 28s)
  • 17:05 bsitzmann@tin: Started deploy [mobileapps/deploy@ac4c6be]: Update mobileapps to b2fb793 (T192664)
  • 16:59 elukey: roll restart of kafka mirror maker on kafka-jumbo100* to pick up the new zookeeper settings
  • 16:56 XioNoX: bounced analytics1031 switchport to fix weird issue of that host not being able to receive traffic from analytics1001
  • 16:44 elukey: roll restart of kafka mirror maker on kafka100[1-3] to pick up new zk settings
  • 16:32 thcipriani: upgrading blubber to 0.4.0 for integration machines
  • 15:57 addshore: really done with wb_terms related syncs now
  • 15:52 addshore@tin: Synchronized wmf-config/Wikibase.php: Revert - Dont load PropertySuggester T195520 (duration: 01m 19s)
  • 15:48 elukey: roll restart kafka on kafka-jumbo* to pick up new zookeeper settings
  • 15:45 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/PropertySuggester: Use CirrusSearch for PropertySuggester (duration: 01m 21s)
  • 15:29 addshore: Wikibase - Re enable wb_terms things window - 2 more patches inbound...
  • 15:27 elukey: restart hadoop yarn/hdfs daemons to pick up the new zookeeper settings
  • 15:11 addshore: Wikibase - Re enable wb_terms things window done
  • 14:57 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase: TermSqlIndex::getMatchingTerms actually execute select (duration: 02m 19s)
  • 14:57 marostegui: Move s6 topology back to its normal status
  • 14:49 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase: TermSqlIndex::getMatchingTerms actually execute select (duration: 02m 18s)
  • 14:32 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase: Re add TermSqlIndex::getMatchingTerms select, but dont call (duration: 02m 18s)
  • 14:29 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase: Re add TermSqlIndex::getMatchingTerms select, but dont call (duration: 02m 13s)
  • 14:24 elukey: roll restart kafka on kafka100[1-3] (job queues) to pick up the new zookeeper settings
  • 14:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool all databases in row C - T187962 (duration: 01m 19s)
  • 14:13 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase: track all wb_terms table access via statsd (duration: 02m 19s)
  • 14:13 gehel: completed rolling restart of relforge for plugin upgrade - T193734
  • 14:10 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase: track all wb_terms table access via statsd (duration: 02m 21s)
  • 14:07 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/CentralNotice: Convert numerical URL parameters to numbers for AndyRussG (was left on tin) (duration: 01m 25s)
  • 14:03 elukey: swap zookeeper from conf1003 to conf1006
  • 13:56 XioNoX: rolling back ns0 and ping1001 redirects - T187962
  • 13:47 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create 2 extra namespaces for bdwikimedia (T195700) (duration: 01m 39s)
  • 13:42 moritzm: upgrading remaining job runners in eqiad to hhvm-wikidiff 1.7.0
  • 13:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Revert "Revert "Temp rate limit for arwiki due to mass vandalism""" (T192668) (duration: 01m 51s)
  • 13:32 volans: restarted ircecho
  • 13:28 volans: puppet run on failed hosts completed
  • 13:25 moritzm: powered down mw2182 for hardware diagnosis
  • 13:19 gehel: rolling restart of relforge for plugin upgrade - T193734
  • 13:16 volans: running puppet on failed only hosts
  • 13:12 volans: stopped ircecho temporarily
  • 12:54 moritzm: installing xdg-utils security updates
  • 11:21 marostegui: Restart MySQL on db1125 - T195595
  • 11:14 moritzm: upgrading snapshot hosts to hhvm-wikidiff 1.7.0 (HHVM is unused, just for completeness)
  • 11:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Disable read only on s6 T194939 T187962 (duration: 01m 37s)
  • 11:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Enable read only on s6 T194939 T187962 (duration: 01m 35s)
  • 10:55 XioNoX: Eqiad row C server move starting - T187962
  • 10:53 XioNoX: Eqiad row C server move starting
  • 10:35 moritzm: upgrading mw1308-mw1311 to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 10:09 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all jobs to EventBus file 2/2 - T190327 T195500 (duration: 01m 47s)
  • 10:06 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all jobs to EventBus file 1/2 - T190327 T195500 (duration: 01m 39s)
  • 10:05 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c6dc83d]: Enable all jobs apart from exceptions for everything. T190327 (duration: 00m 58s)
  • 10:04 ppchelko@tin: Started deploy [cpjobqueue/deploy@c6dc83d]: Enable all jobs apart from exceptions for everything. T190327
  • 09:20 XioNoX: redirect ns0 to baham - T187962
  • 09:16 XioNoX: disable ping1001 redirect - T187962
  • 09:13 marostegui: Downtime s6 replicas for 4 hours - T195595
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool all databases in row C - T187962 (duration: 01m 35s)
  • 09:05 moritzm: upgrading labweb servers to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 08:40 jynus: performing topology changes on s6 ahead of a possible failover
  • 08:24 moritzm: upgrading remaining API servers in eqiad to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 07:56 moritzm: upgrading mw1276-mw1290 to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 07:49 elukey: reimage druid1002 to debian stretch
  • 07:47 gilles@tin: Synchronized wmf-config/InitialiseSettings.php: T187299 Launch performance survey on ruwiki (duration: 01m 50s)
  • 07:26 moritzm: upgrading remaining app servers in eqiad to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 06:52 elukey: roll restart hadoop master daemons to pick up the new zookeeper settings
  • 05:20 marostegui: Restart MySQL on db2045 (s8 codfw master) - T195598
  • 05:14 marostegui: Stop MySQL on db2094 and db2095 for testing - T190704
  • 04:12 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 29 04:12:10 UTC 2018 (duration 14m 32s)
  • 03:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.5) (duration: 14m 29s)
  • 02:59 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 13m 18s)

2018-05-28

  • 20:14 twentyafterfour: Test failures on https://gerrit.wikimedia.org/r/#/c/435825/ are preventing deployment of the fix for a critical deployment blocker (see T195514) 1.32.0-wmf.5 still blocked refs T191051
  • 20:10 twentyafterfour: train still held up by test failures: https://gerrit.wikimedia.org/r/#/c/435825/
  • 20:02 elukey: restart kafka on kafka1003 as attempt to solve the under-replicated partitions warning
  • 19:22 twentyafterfour@tin: Synchronized php-1.32.0-wmf.5/extensions/CentralNotice/: sync wmf.5 CentralNotice for AndyRussG (duration: 01m 25s)
  • 19:12 elukey: roll restart of kafka-mirror maker (main eqiad -> jumbo) on kafka-jumbo* for zookeeper conf updates
  • 19:07 twentyafterfour: attempting to get the wmf.5 train back on track. Deploying a fix for T195514 (https://gerrit.wikimedia.org/r/c/435292/) to unblock T191051
  • 18:16 elukey: restart kafka mirror maker on kafka1012->14 - failed after the last round of kafka restarts
  • 17:26 elukey: roll restart of kafka on kafka-jumbo* to pick up the new zookeeper settings
  • 17:20 gehel@tin: Finished deploy [wdqs/wdqs@0e40344]: WDQS updater and GUI (duration: 08m 59s)
  • 17:19 elukey: restart kafka on kafka1012->23 to pick up the new zookeeper settings
  • 17:11 gehel@tin: Started deploy [wdqs/wdqs@0e40344]: WDQS updater and GUI
  • 17:08 moritzm: upgrading mwdebug servers in codfw to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 16:48 moritzm: upgrading codfw video scalers to hhvm-wikidiff 1.7.0
  • 16:31 elukey: roll restart kafka on kafka100[1-3] to pick up new zookeeper settings
  • 16:21 elukey: zookeeper cluster restart completed (main-eqiad / conf1*)
  • 16:18 elukey: stop and mask zookeeper on conf1002
  • 16:16 elukey: restart prometheus-burrow-exporter on kafkamon*
  • 16:12 moritzm: upgrading job runners in codfw to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 16:02 marostegui: Stop MySQL on db2095 for testing - T190704
  • 15:59 elukey: swap zookeeper from conf1002 to conf1005
  • 15:56 marostegui: Reboot db1124 and db1125 for more testing - T190704
  • 15:49 _joe_: uploading cergen 0.2.3
  • 14:44 Amir1: EU SWAT is done
  • 14:39 ladsgroup@tin: Synchronized php-1.32.0-wmf.4/extensions/ArticlePlaceholder: Add config variable to disable SearchHookHandler (T195753) (duration: 01m 21s)
  • 14:34 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Disable search integration with Article Placeholder temporarily (T195753) (duration: 01m 20s)
  • 14:29 ladsgroup@tin: Synchronized php-1.32.0-wmf.5/extensions/ArticlePlaceholder: Add config variable to disable SearchHookHandler (T195753) (duration: 01m 18s)
  • 14:23 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable Special:ItemDisambiguation in Wikidata (T195756) (duration: 01m 20s)
  • 14:21 ema: cp1045,cp2001,cp3007,cp5001: reboot with intel-microcode T127825
  • 14:00 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable "File mover" flag on zh.wikipedia (T195247) (duration: 01m 19s)
  • 13:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: New protection level on the Hungarian Wikipedia - trusted (T194568) (duration: 01m 20s)
  • 13:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Revert "Enable $wgUseRCPatrol on azwiki"" (T194389) (duration: 01m 20s)
  • 13:34 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use uploaded HD logos for yiwikisource (T193562) (duration: 01m 19s)
  • 13:28 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Upload new logos for yiwikisource (T193562) (duration: 01m 19s)
  • 13:21 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable template editor group on newiki (T195557) (duration: 01m 21s)
  • 13:13 volans: restarted pdfrender on scb1002
  • 13:03 moritzm: upgrading app servers in codfw to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 12:09 arturo: T194665 aborrero@install1002:~$ sudo -i reprepro --noskipold -C 'thirdparty/mono-project-stretch' update stretch-wikimedia
  • 12:08 arturo: T194665 aborrero@install1002:~$ sudo -i reprepro --noskipold -C 'thirdparty/mono-project-jessie' update jessie-wikimedia
  • 12:05 arturo: aborrero@install1002:~$ sudo -i reprepro --noskipold -C 'thirdparty/mono-project-trusty' update trusty-wikimedia
  • 11:31 ema: cp1008: reboot with intel-microcode T127825
  • 11:27 moritzm: upgrading API servers in codfw to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 11:10 marostegui: Stop MySQL on db2092 to copy its content to db2094 - T190704
  • 10:36 moritzm: upgrading mw1266-mw1275 to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 10:14 moritzm: upgrading eqiad video scalers to hhvm-wikidiff 1.7.0
  • 10:09 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 20s)
  • 10:08 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 21s)
  • 09:07 moritzm: upgrading mw1299-mw1306 to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 09:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 after alter table (duration: 01m 20s)
  • 09:04 marostegui: Deploy schema change on s1 primary master (db1052) - T191519
  • 09:00 marostegui: Deploy schema change on db1052 (s1 primary master) - T190148
  • 08:27 moritzm: upgrading mw1221-mw1235 to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 07:40 XioNoX: enable backup tunnel routing between cr2-ulsfo and cr1-eqdfw - T195584
  • 07:31 marostegui: Deploy schema change on db1083 - T190148 T191519 T188299
  • 07:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 for alter table (duration: 01m 20s)
  • 07:29 moritzm: upgrading mw1238-mw1258 to hhvm-wikidiff 1.7.0 (HHVM bytecode cache needs to be pruned during rollout)
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 after alter table (duration: 01m 22s)
  • 07:10 moritzm: uploaded hhvm-wikidiff2 1.7.0 (source package name php-wikidiff2) to apt.wikimedia.org
  • 06:44 moritzm: installing ruby-loofah security updates
  • 06:36 elukey: reimage druid1003 to Debian Stretch (Analytics cluster, backend for Pivot/Turnilo)
  • 05:25 marostegui: Stop MySQL on db2075 to copy its content to db2095 - T190704
  • 05:23 marostegui: Deploy schema change on db1106 with replication, this will generate lag on labs - T190148 T191519 T188299
  • 05:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 for alter table (duration: 01m 24s)
  • 03:14 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 14m 30s)

2018-05-26

  • 21:47 reedy@tin: Synchronized composer.json: (no justification provided) (duration: 01m 19s)
  • 21:45 reedy@tin: Synchronized multiversion/: multiversion (duration: 01m 21s)
  • 21:42 reedy@tin: Synchronized vendor/: canhasvendor (duration: 01m 46s)
  • 19:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 after alter table (duration: 01m 21s)
  • 14:25 marostegui: Add tmp1 index back on db1101:3318 - T194273
  • 14:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 for alter table (duration: 01m 07s)
  • 14:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 after alter table (duration: 01m 20s)
  • 09:56 marostegui: Add tmp1 index back on db1087 (sanitarium master), this will generate lag on labsdb hosts - T194273
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 for alter table (duration: 01m 20s)
  • 09:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 after alter table (duration: 01m 21s)
  • 05:21 marostegui: Add tmp1 index back on db1099:3318 - T194273
  • 05:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 for alter table (duration: 01m 21s)
  • 05:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 after alter table (duration: 01m 22s)

2018-05-25

  • 22:57 mutante: apt.wikimedia.org - import jenkins-debian-glue_0.18.4-wmf3 for jessie-wikimedia (T193910)
  • 21:27 legoktm@tin: Synchronized php-1.32.0-wmf.5/skins/MonoBook/: Temporarily remove responsive support (T195625) (duration: 01m 21s)
  • 20:33 mutante: LDAP: added user wmde-leszek to group 'nda' (T195358)
  • 17:46 XenoRyet: updated civicrm from 4d797fc592 to 0b97f1f5b2
  • 15:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1119 after alter table (duration: 01m 20s)
  • 13:47 akosiaris: repool ulsfo, links have been stable for quite a few hours
  • 13:27 marostegui: Deploy schema change on db1119 - https://phabricator.wikimedia.org/T190148 https://phabricator.wikimedia.org/T191519 https://phabricator.wikimedia.org/T188299
  • 13:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1119 for alter table (duration: 01m 20s)
  • 13:15 marostegui: Add indexes back on s8 codfw primary master (db2045) this will generate lag on codfw - T194273
  • 13:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 after alter table (duration: 01m 20s)
  • 12:28 moritzm: fixed dpkg installation state on mx2001
  • 11:32 akosiaris: switch to SSH RSA 2048 bit keys for eqiad ganeti intracluster communication
  • 11:22 akosiaris: upgrade eqiad ganeti cluster to ganeti 2.15.2-7+deb9u1~bpo8+1
  • 11:21 akosiaris: rebalance row_B codfw ganeti nodegroup. Cluster is now fully upgraded to stretch
  • 11:18 akosiaris: powercycling ms-be1034, box is unresponsive, tons of logs "sd 0:1:0:1: rejecting I/O to offline device"
  • 10:37 XioNoX: test force mtu 1400 between cp1074 and cp3039 - T195365
  • 09:20 marostegui: Deploy schema change on db1105:3311 - T190148 T191519 T188299
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 for alter table (duration: 01m 20s)
  • 09:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1119 after alter table (duration: 01m 19s)
  • 09:10 marostegui@tin: scap failed: average error rate on 9/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after alter table (duration: 01m 20s)
  • 08:44 marostegui: Stop MySQL on db1120 to transfer its content to db1125 - T190704
  • 08:39 marostegui: Add tmp1 index back on db1092 - https://phabricator.wikimedia.org/T194273
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 for alter table (duration: 01m 20s)
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 after alter table (duration: 01m 20s)
  • 08:26 gilles@tin: Synchronized wmf-config/InitialiseSettings.php: T187299 Launch performance survey on frwiki (duration: 01m 22s)
  • 07:05 jynus: stop db1117:m2 to clone it to db1065
  • 06:57 marostegui: Deploy schema change on db1114 - T190148 T191519 T188299
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 for alter table (duration: 01m 20s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 after alter table (duration: 01m 20s)
  • 06:37 jynus: reimage db1065 after raid rebuild
  • 05:57 marostegui: Add tmp1 index back on dbstore1002 - T194273
  • 05:25 marostegui: Stop MySQL on db1116 to copy its content to db1124 - T190704
  • 05:23 marostegui: Deploy schema change on db1089 - T190148 T191519 T188299
  • 05:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 for alter table (duration: 01m 20s)
  • 05:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 after alter table (duration: 01m 21s)
  • 05:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 for alter table (duration: 01m 20s)
  • 05:05 marostegui: Add tmp1 index back to db1109 - T194273
  • 04:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 after alter table (duration: 01m 21s)
  • 03:10 papaul: OS install on db209[4-5]

2018-05-24

  • 23:13 ladsgroup@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase: Dumps: Allow several --entity-type arguments (T195420) (duration: 02m 25s)
  • 22:21 ladsgroup@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: Log the query that would hit wb_terms. (T195520) (duration: 01m 20s)
  • 22:11 ladsgroup@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: Log the query that would hit wb_terms. (T195520) (duration: 01m 21s)
  • 20:58 marostegui: Add tmp1 index back on db1104
  • 20:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 for alter table (duration: 01m 20s)
  • 20:56 legoktm@tin: Synchronized php-1.32.0-wmf.5/extensions/VisualEditor/: no-op, sync with git state (duration: 01m 21s)
  • 20:54 legoktm@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: no-op, sync with git state (duration: 01m 20s)
  • 20:41 legoktm@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: no-op, sync with git state (duration: 01m 20s)
  • 20:17 legoktm@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: wmf.5 this time (duration: 01m 19s)
  • 20:14 legoktm@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: Add debug logging (duration: 01m 19s)
  • 20:11 legoktm@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase/lib/includes/Store/Sql/TermSqlIndex.php: Disable TermSqlIndex::getMatchingTerms (duration: 01m 20s)
  • 19:49 ejegg: updated payments-wiki from c81e25f8d3 to 43989ebc96
  • 19:44 reedy@tin: Synchronized wmf-config/Wikibase.php: Disable PropSuggester (duration: 01m 21s)
  • 19:30 twentyafterfour@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 19:14 twentyafterfour: Starting the MediaWiki train for Thursday May 24, today I will be deploying wmf/1.32.0-wmf.5 to all wikis
  • 18:31 mutante: netmon1002, netmon2001: systemctl mask uwsgi; systemctl reset-failed - to fix Icinga alert about broken DPKG since last netbox deploy and to match existing status on labtestweb2001
  • 18:21 XioNoX: repool esams
  • 18:01 no_justification: gerrit restarting on cobalt, back soon
  • 17:50 mutante: mwdebug1001 - apt-get remove to clean up packages in "non ii" states
  • 16:59 anomie: Running populateExternallinksIndex60.php on group 1 for T59176
  • 16:50 XioNoX: depooled esams - investigating issues
  • 16:31 anomie: Running deduplicateArchiveRevId.php on group 2 for T193180
  • 16:28 jynus: manually failover the backup host for m2 to db1117:3322
  • 16:22 mutante: tin apt-get clean saved 7% disk space on / - fixing disk space alert
  • 16:19 marostegui: Deploy schema change on db1099:3311 - T191519 T188299 T190148
  • 16:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 for alter table (duration: 01m 13s)
  • 16:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 after alter table (duration: 01m 17s)
  • 16:15 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/Wikibase/repo/maintenance/dispatchChanges.php: Dont use WikibaseRepo class in dispatchChanges constructor (duration: 10m 28s)
  • 16:13 mutante: deploy2001 - let mwdeploy own .~tmp~ in /srv/mediawiki-staging
  • 15:48 cmjohnson: swapped failed disk 0 db1065
  • 15:06 moritzm: installing glibc updates from stretch point release
  • 14:57 marostegui: Deploy schema change on dbstore1001:s1 - T191519 T188299 T190148
  • 14:56 papaul: shutting down elastic2020 for maintenance
  • 14:51 moritzm: upgrading mwdebug servers in eqiad to wikidiff 1.7.0
  • 14:09 akosiaris: empty ganeti2004 for stretch reimage
  • 14:00 mutante: deploy2001: scap pull to sync. then add as scap master and host (gerrit:433616)
  • 13:37 akosiaris: rebalance row_A codfw ganeti nodegroup. Fully upgraded to stretch now
  • 13:33 marostegui: Deploy schema change on db1067 - T191519 T188299 T190148
  • 13:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 for alter table (duration: 01m 00s)
  • 13:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1080 after alter table (duration: 00m 56s)
  • 13:27 anomie: Running deduplicateArchiveRevId.php on group 1 for T193180
  • 13:25 moritzm: upgrading mw1262-mw1265 (canary hosts) to wikidiff 1.7.0
  • 13:13 moritzm: upgrading mw1261 (canary host) to wikidiff 1.7.0
  • 11:31 moritzm: rebooting mw1261, mw1276, mw1319, mw1312, mw1258, mw1221 to use Intel microcode updates
  • 11:19 marostegui: Deploy schema change on db1080 - T191519 T188299 T190148
  • 11:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 for alter table (duration: 01m 08s)
  • 11:10 marostegui: Deploy schema change on dbstore1002:s1 - T191519 T188299 T190148
  • 11:05 XioNoX: GTT work on eqiad-esams link starting soon
  • 10:55 krinkle@tin: Synchronized php-1.32.0-wmf.4/includes/resourceloader/ResourceLoaderUserModule.php: T195380 (duration: 01m 08s)
  • 10:53 Krinkle: Unexpected dirty git status at tin:/srv/mediawiki-staging/php-1.32.0-wmf.4/extensions/JADE (1 file is locally deleted, but not committed)
  • 10:51 krinkle@tin: Synchronized php-1.32.0-wmf.5/includes/resourceloader/ResourceLoaderUserModule.php: T195380 (duration: 01m 08s)
  • 09:55 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all jobs to EventBus for everything except wikipedia, commons and wikidata, file 2/2 - T190327 (duration: 01m 06s)
  • 09:55 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b537fa1]: Switch all non-special jobs for everything except wikipedia, commons and wikidata T190327 (duration: 00m 42s)
  • 09:54 ppchelko@tin: Started deploy [cpjobqueue/deploy@b537fa1]: Switch all non-special jobs for everything except wikipedia, commons and wikidata T190327
  • 09:53 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all jobs to EventBus for everything except wikipedia, commons and wikidata, file 1/2, take #2 - T190327 (duration: 01m 08s)
  • 09:37 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all jobs to EventBus for everything except wikipedia, commons and wikidata, file 1/2 - T190327 (duration: 01m 09s)
  • 09:33 pnorman@tin: Finished deploy [kartotherian/deploy@9fc09ef]: Do a fresh deploy of Kartotherian to production (duration: 02m 35s)
  • 09:31 pnorman@tin: Started deploy [kartotherian/deploy@9fc09ef]: Do a fresh deploy of Kartotherian to production
  • 09:27 pnorman@tin: Finished deploy [kartotherian/deploy@9fc09ef]: Do test deploy of kartotherian to maps-test2004 (duration: 00m 22s)
  • 09:27 pnorman@tin: Started deploy [kartotherian/deploy@9fc09ef]: Do test deploy of kartotherian to maps-test2004
  • 09:26 pnorman@tin: Finished deploy [kartotherian/deploy@9fc09ef]: Do test deploy of kartotherian to maps-test2004 (duration: 00m 23s)
  • 09:25 pnorman@tin: Started deploy [kartotherian/deploy@9fc09ef]: Do test deploy of kartotherian to maps-test2004
  • 09:25 ppchelko@tin: Finished deploy [cpjobqueue/deploy@f66dacb]: Correctly commit offsets for multi-topic rules (duration: 00m 49s)
  • 09:24 ppchelko@tin: Started deploy [cpjobqueue/deploy@f66dacb]: Correctly commit offsets for multi-topic rules
  • 08:55 pnorman@tin: Finished deploy [tilerator/deploy@9e40702] (cleartables): Deploy scap fixes to cleartables map test server in verbose mode (duration: 00m 13s)
  • 08:55 pnorman@tin: Started deploy [tilerator/deploy@9e40702] (cleartables): Deploy scap fixes to cleartables map test server in verbose mode
  • 08:55 pnorman@tin: Finished deploy [tilerator/deploy@9e40702] (cleartables--force): Deploy scap fixes to cleartables map test server in verbose mode (duration: 00m 05s)
  • 08:55 pnorman@tin: Started deploy [tilerator/deploy@9e40702] (cleartables--force): Deploy scap fixes to cleartables map test server in verbose mode
  • 08:51 pnorman@tin: Finished deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server in verbose mode (duration: 00m 13s)
  • 08:51 pnorman@tin: Started deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server in verbose mode
  • 08:47 pnorman@tin: Finished deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server in verbose mode (duration: 00m 05s)
  • 08:47 pnorman@tin: Started deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server in verbose mode
  • 08:45 marostegui: Deploy schema change on s1 codfw primary master (db2048), this will generate lag on codfw - T191519 T188299 T190148
  • 08:45 pnorman@tin: Finished deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server in verbose mode (duration: 00m 01s)
  • 08:45 pnorman@tin: Started deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server in verbose mode
  • 08:40 ayounsi@tin: Finished deploy [netbox/deploy@ac54feb]: Adding service name in scap.cfg (duration: 00m 33s)
  • 08:39 ayounsi@tin: Started deploy [netbox/deploy@ac54feb]: Adding service name in scap.cfg
  • 08:35 pnorman@tin: Finished deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server go 3 (duration: 00m 23s)
  • 08:35 pnorman@tin: Started deploy [tilerator/deploy@9e40702]: Deploy scap fixes to cleartables map test server go 3
  • 08:32 gilles@tin: Synchronized wmf-config/InitialiseSettings.php: T187299 Launch performance survey on cawiki and enwikivoyage (duration: 01m 08s)
  • 08:21 gilles: Deployment of cawiki and enwikivoyage performance survey
  • 08:16 jynus: stop db2037 to clone it to db2078 and upgrade
  • 08:05 marostegui: Deploy schema change on s8 primary master (db1071) - T191519 T188299 T190148 T194270
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 after alter table (duration: 01m 09s)
  • 07:48 jynus: stop db2042 to clone it to db2078 and upgrade
  • 07:36 marostegui: Deploy schema change on db1109 - T191519 T188299 T190148 T194270
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 for alter table (duration: 01m 08s)
  • 07:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 after alter table (duration: 01m 29s)
  • 07:17 moritzm: installing procps security updates
  • 06:47 marostegui: Deploy schema change on db1104 - T191519 T188299 T190148 T194270
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 for alter table (duration: 01m 08s)
  • 06:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 after alter table (duration: 01m 08s)
  • 06:24 moritzm: installing remaining curl security updates in eqiad
  • 06:17 marostegui: Deploy schema change on db1087, this will generate lag on labs on s8 - T191519 T188299 T190148 T194270
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 for alter table (duration: 01m 09s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 (duration: 01m 09s)
  • 04:04 l10nupdate@tin: ResourceLoader cache refresh completed at Thu May 24 04:04:35 UTC 2018 (duration 7m 16s)
  • 03:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.5) (duration: 13m 29s)
  • 03:06 mutante: added wmde-fisch to LDAP group nda (T195223)
  • 03:01 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 12m 48s)
  • 00:15 ebernhardson@tin: Synchronized php-1.32.0-wmf.4/extensions/CirrusSearch/includes/Job/CheckerJob.php: Drop cirrus checker job metastore transition check (duration: 01m 08s)
  • 00:14 ebernhardson@tin: Synchronized php-1.32.0-wmf.5/extensions/CirrusSearch/includes/Job/CheckerJob.php: Drop cirrus checker job metastore transition check (duration: 01m 08s)
  • 00:10 ebernhardson: upgrade cirrussearch metastore to 1.0 on eqiad and codfw
  • 00:09 ebernhardson@tin: Synchronized php-1.32.0-wmf.5/extensions/CirrusSearch/: SWAT: Convert cirrus metastore to single type (duration: 01m 24s)
  • 00:07 ebernhardson@tin: Synchronized php-1.32.0-wmf.4/extensions/CirrusSearch/: SWAT: Convert cirrus metastore to single type (duration: 01m 24s)
  • 00:00 ebernhardson@tin: Synchronized php-1.32.0-wmf.4/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: SWAT: T195323: MWSaveDialog: Fix typo in no-categories branch (duration: 01m 07s)

2018-05-23

  • 23:49 ebernhardson@tin: Synchronized php-1.32.0-wmf.4/extensions/VisualEditor/modules/ve-mw/ui/tools/ve.ui.MWPopupTool.js: SWAT: Fix typo in API call for version number help (duration: 01m 08s)
  • 23:45 ebernhardson@tin: Synchronized php-1.32.0-wmf.5/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: SWAT: T195323: MWSaveDialog: Fix typo in no-categories branch (duration: 01m 08s)
  • 23:42 ebernhardson@tin: Synchronized php-1.32.0-wmf.4/extensions/MobileFrontend/resources/mobile.editor.common/editor.less: SWAT: T194832: Fix layout of editor switcher dropdown (duration: 01m 08s)
  • 23:10 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T152296: Remove dangerous unused botadmin group at mlwik{tionary|isource} (duration: 01m 10s)
  • 21:29 mutante: arming keyholder on deploy2001
  • 20:57 mutante: rebooting deploy2001
  • 20:56 arlolra@tin: Finished deploy [parsoid/deploy@de18a58]: Reverting Parsoid deploy (duration: 02m 37s)
  • 20:53 arlolra@tin: Started deploy [parsoid/deploy@de18a58]: Reverting Parsoid deploy
  • 20:52 pnorman@tin: Finished deploy [tilerator/deploy@63617a9]: Deploy scap fixes to cleartables map test server go 2 (duration: 00m 23s)
  • 20:52 pnorman@tin: Started deploy [tilerator/deploy@63617a9]: Deploy scap fixes to cleartables map test server go 2
  • 20:43 pnorman@tin: Finished deploy [tilerator/deploy@18faaa6]: Deploy scap fixes to cleartables map test server (duration: 00m 08s)
  • 20:42 pnorman@tin: Started deploy [tilerator/deploy@18faaa6]: Deploy scap fixes to cleartables map test server
  • 20:30 bsitzmann@tin: Finished deploy [mobileapps/deploy@5896151]: Update mobileapps to 29ebe0f (T192664) (duration: 08m 54s)
  • 20:28 arlolra: Updated Parsoid to dccfeafd (T157418, T194777, T195317, T195174, T194763, T194658)
  • 20:21 bsitzmann@tin: Started deploy [mobileapps/deploy@5896151]: Update mobileapps to 29ebe0f (T192664)
  • 20:18 arlolra@tin: Finished deploy [parsoid/deploy@de18a58]: Updating Parsoid to dccfeafd (duration: 13m 09s)
  • 20:05 arlolra@tin: Started deploy [parsoid/deploy@de18a58]: Updating Parsoid to dccfeafd
  • 19:36 mutante: reinstalling naos as deploy2001, booting to PXE (T193916)
  • 19:23 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.5 refs T191051 (duration: 01m 08s)
  • 19:21 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.5 refs T191051
  • 18:24 twentyafterfour@tin: Synchronized README: testing sync-masters without naos (duration: 01m 09s)
  • 17:54 twentyafterfour: Finished SWATting, thanks everyone!
  • 17:52 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings.php: sync https://gerrit.wikimedia.org/r/#/c/433546/ for SWAT refs T194871 (duration: 01m 19s)
  • 17:47 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4/extensions/Translate/MessageCollection.php: syncing https://gerrit.wikimedia.org/r/#/c/434700/ refs T195347 (duration: 01m 19s)
  • 17:45 twentyafterfour@tin: Synchronized php-1.32.0-wmf.5/extensions/Translate/MessageCollection.php: syncing https://gerrit.wikimedia.org/r/#/c/434699/ refs T195347 (duration: 01m 20s)
  • 17:18 krinkle@tin: Synchronized w/robots.php: Ib1c6d676 - Bye, wgTitle (duration: 01m 20s)
  • 17:17 krinkle@tin: Synchronized w/extract2.php: Ib1c6d676 - Bye, wgTitle (duration: 01m 19s)
  • 17:14 twentyafterfour@tin: Synchronized wmf-config/Wikibase-production.php: SWAT deploying https://gerrit.wikimedia.org/r/#/c/434658/ refs T194273 (duration: 01m 20s)
  • 16:31 SMalyshev: starting wikidata full reindex for T163642
  • 16:11 addshore@tin: Finished scap: WikibaseLexeme: Do not refer to the spelling variant as language, T193603, Patch 1, Patch 2 (duration: 103m 38s)
  • 16:10 anomie: Running deduplicateArchiveRevId.php on group 0 for T193180
  • 15:53 anomie: Running populateExternallinksIndex60.php on group 0 for T59176
  • 15:30 jynus: stop db2044 for cloning to db2078 + upgrade
  • 14:52 jynus: restart db2078 for upgrade and to convert it to multiinstance
  • 14:30 moritzm: installing curl security updates on mediawiki canaries along with HHVM restart to pick up new library version
  • 14:28 addshore@tin: Started scap: WikibaseLexeme: Do not refer to the spelling variant as language, T193603, Patch 1, Patch 2
  • 14:12 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable HD logos for wikimania2018wiki T194340 (duration: 01m 20s)
  • 14:05 zeljkof: EU SWAT finished
  • 14:01 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Disable wikidiff2 inline moved paragraphs by default (T194271) (duration: 01m 18s)
  • 13:55 jynus: stopping db1065 database to move it to m2
  • 13:47 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change logo assets for wikimania2018wiki (T194340) (duration: 01m 20s)
  • 13:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable wikidiff2 inline moved paragraphs on production (T194271) (duration: 01m 21s)
  • 13:22 zfilipin@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Revert "Disable wikidiff2 inline moved paragraphs by default" (T194271) (duration: 01m 19s)
  • 13:17 zfilipin@tin: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 13:05 jynus: starting logical dump of m3-master
  • 13:00 jynus: starting logical dump of m2-master
  • 12:47 addshore: addshore@terbium:~$ echo 'https://www.wikidata.org/wiki/Special:EntityData/L1.rdf' | mwscript purgeList.php
  • 12:43 addshore@tin: Synchronized wmf-config/Wikibase.php: Fix disabledRdfExportEntityTypes for wikidata T168260 (duration: 01m 20s)
  • 12:16 XioNoX: set normal metric on codfw-eqsin link
  • 12:06 elukey: upgrade druid public to druid 0.11 (druid100[4-6])
  • 11:45 akosiaris: reimage ganeti2002, ganeti2006 as debian stretch
  • 11:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@59674ba]: Correctly commit pending messages for multi-topic rules (duration: 00m 47s)
  • 11:44 ppchelko@tin: Started deploy [cpjobqueue/deploy@59674ba]: Correctly commit pending messages for multi-topic rules
  • 11:33 addshore: addshore@mw1317:~$ scap pull # It seemed to be missing changes.....
  • 11:30 moritzm: installing curl security updates
  • 11:09 addshore: Lexeme deploy window probably done (unless something explodes)
  • 10:42 addshore: addshore@terbium mwscript extensions/WikibaseLexeme/maintenance/createBlacklistedLexemes.php --wiki testwikidatawiki
  • 10:40 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable WikibaseLexeme on wikidata.org T191457 T168260 https://gerrit.wikimedia.org/r/#/c/434453 (duration: 01m 20s)
  • 10:32 jynus: stop db1065 for cloning (proxies will complain)
  • 10:27 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ffdc19a]: Logging improvements (duration: 00m 47s)
  • 10:26 ppchelko@tin: Started deploy [cpjobqueue/deploy@ffdc19a]: Logging improvements
  • 10:21 jynus: restarting db1117 for setup and upgrade
  • 10:17 dcausse: manually updating mapping on wikidatawiki elastic indices to add new lexeme fields
  • 10:15 addshore: updateSearchIndexConfig.php with --justMapping failed :(
  • 10:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 after alter table (duration: 01m 20s)
  • 10:14 addshore: addshore@wasat:~$ mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki wikidatawiki --justMapping --cluster codfw
  • 09:35 addshore: Finished, forceSearchIndex.php --wiki testwikidatawiki
  • 09:34 moritzm: upgrading remaining job runners in eqiad to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 09:30 marostegui: Deploy schema change on db1101:3318 - T191519 T188299 T190148 T194270
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 for alter table (duration: 01m 20s)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 after alter table (duration: 01m 19s)
  • 09:02 addshore: addshore@terbium:~$ mwscript extensions/CirrusSearch/maintenance/forceSearchIndex.php --wiki testwikidatawiki --queue
  • 09:00 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: [testwikidata] Add Lexeme NS to ContentNamespaces (duration: 01m 20s)
  • 08:31 moritzm: upgrading remaining app/API servers in codfw to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 08:21 marostegui: Deploy schema change on db1099:3318 - T191519 T188299 T190148 T194270
  • 08:20 addshore: all the updateSearchIndexConfig runs from wasat also had --cluster=codfw
  • 08:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 for alter table (duration: 01m 20s)
  • 08:18 addshore: addshore@wasat:~$ mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki testwikidatawiki --reindexAndRemoveOk --indexIdentifier=now
  • 08:18 addshore: addshore@terbium:~$ mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki testwikidatawiki --reindexAndRemoveOk --indexIdentifier=now
  • 08:17 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: testwikidata: Add Lexeme NS to wgNamespacesToBeSearchedDefault (duration: 01m 19s)
  • 08:06 addshore: addshore@wasat:~$ mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki testwikidatawiki --reindexAndRemoveOk --indexIdentifier=now --reindexProcesses=10
  • 08:00 moritzm: upgrading remaining app servers in eqiad to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 07:54 addshore: addshore@terbium:~$ mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki testwikidatawiki --reindexAndRemoveOk --indexIdentifier=now
  • 07:52 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: testwikidata: Add Property NS to wgNamespacesToBeSearchedDefault (duration: 01m 20s)
  • 07:45 addshore@tin: Synchronized php-1.32.0-wmf.5/extensions/WikibaseLexeme: Use ISO code und for missing language code (duration: 01m 30s)
  • 07:44 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/WikibaseLexeme: Use ISO code und for missing language code (duration: 01m 31s)
  • 07:38 moritzm: upgrading remaining API servers in eqiad to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 07:29 elukey: upload druid debs 0.11.0-3 to stretch-wikimedia
  • 06:52 moritzm: upgrading mw1299-mw1306 (job runners) to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 06:49 marostegui: Re-add indexes on wb_terms on db1092 - T194273
  • 06:44 elukey: restart zookeeper on druid100[4-6] for openjdk-8 upgrades
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 for alter table (duration: 01m 20s)
  • 06:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 after alter table (duration: 01m 20s)
  • 05:43 marostegui: Deploy schema change on db1092 - T191519 T188299 T190148 T194273 T194270
  • 05:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 for alter table (duration: 01m 27s)
  • 05:21 marostegui: Deploy schema change on dbstore1002:s8 - T191519 T188299 T190148 T194273 T194270
  • 05:12 marostegui: Deploy schema change on s8 codfw primary master (db2045), this will generate lag on codfw - T194273
  • 05:06 marostegui: Deploy schema change on s3 primary master (db1075) - T191519 T188299 T190148
  • 04:04 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 23 04:04:43 UTC 2018 (duration 7m 22s)
  • 03:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.5) (duration: 16m 39s)
  • 02:58 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 10m 54s)

2018-05-22

  • 23:29 maxsem@tin: Synchronized dblists/categories-rdf.dblist: https://gerrit.wikimedia.org/r/#/c/433740/ (duration: 01m 17s)
  • 22:59 twentyafterfour: train for group0 1.32.0-wmf.5 completed. Tune in tomorrow for more excitement!
  • 22:58 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.32.0-wmf.5 refs T191051
  • 22:54 twentyafterfour@tin: Finished scap: testwikis wikis to 1.32.0-wmf.5 refs T191051 (duration: 76m 01s)
  • 21:38 twentyafterfour@tin: Started scap: testwikis wikis to 1.32.0-wmf.5 refs T191051
  • 21:13 twentyafterfour@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_2833418486" --threads=10 --lang en --quiet' returned non-zero exit status 255 (duration: 39m 54s)
  • 21:12 ejegg: updated CiviCRM from 5b8c868a00 to 4d797fc592
  • 20:33 twentyafterfour@tin: Started scap: testwikis wikis to 1.32.0-wmf.5 refs T191051
  • 19:45 pnorman@tin: Finished deploy [tilerator/deploy@18faaa6]: Update tilerator, fix variable substitution (duration: 02m 46s)
  • 19:43 pnorman@tin: Started deploy [tilerator/deploy@18faaa6]: Update tilerator, fix variable substitution
  • 19:39 pnorman@tin: Finished deploy [tilerator/deploy@18faaa6]: (no justification provided) (duration: 00m 29s)
  • 19:38 pnorman@tin: Started deploy [tilerator/deploy@18faaa6]: (no justification provided)
  • 19:35 pnorman@tin: Finished deploy [tilerator/deploy@18faaa6]: (no justification provided) (duration: 00m 25s)
  • 19:35 pnorman@tin: Started deploy [tilerator/deploy@18faaa6]: (no justification provided)
  • 19:34 pnorman@tin: Finished deploy [tilerator/deploy@a4a3fc7]: (no justification provided) (duration: 46m 19s)
  • 19:17 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4/extensions/ContentTranslation/: sync https://gerrit.wikimedia.org/r/#/c/434529/ to fix T194810 (duration: 01m 02s)
  • 19:14 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group2 wikis to 1.32.0-wmf.4 refs T191050
  • 19:03 twentyafterfour: Today's train: Promoting group2 wikis to 1.32.0-wmf.4 followed by group0 to 1.32.0-wmf.5
  • 18:48 pnorman@tin: Started deploy [tilerator/deploy@a4a3fc7]: (no justification provided)
  • 18:44 pnorman: tilerator deploying a4a3fc7
  • 17:03 marostegui: Deploy schema change on s8 codfw primary master (db2045), this will generate lag on codfw - T194270
  • 16:52 elukey: restart zookeeper on druid100[1,3] to complete the openjdk-8 upgrade
  • 16:51 addshore: addshore@terbium:~$ mwscript extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --wiki testwikidatawiki --reindexAndRemoveOk --indexIdentifier=now
  • 16:43 elukey: upload druid debs 0.11.0-3 to jessie-wikimedia
  • 16:25 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2064 from config - T195228 (duration: 01m 18s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2064 from config - T195228 (duration: 01m 19s)
  • 16:14 moritzm: upgrading remaining video scalers to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 15:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@4312549]: Increase the concurrency for low traffic topics (duration: 00m 41s)
  • 15:19 ppchelko@tin: Started deploy [cpjobqueue/deploy@4312549]: Increase the concurrency for low traffic topics
  • 14:47 addshore@tin: Synchronized wmf-config/Wikibase.php: Wikidata dispatch, set defaults for dispatchChanges settings (duration: 01m 19s)
  • 14:29 addshore: SWAT done
  • 14:28 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T193680 Use uploaded logo in wgLogo for liwikibooks (duration: 01m 16s)
  • 14:26 addshore@tin: Synchronized static/images/project-logos/liwikibooks.png: SWAT: T193680 Change liwikibooks logo (duration: 01m 18s)
  • 14:24 addshore: that last also included https://gerrit.wikimedia.org/r/#/c/434495/ - Use correct logo-size for wikimania2018wiki
  • 14:24 addshore@tin: Synchronized static/images/project-logos/wikimania2018wiki.png: SWAT: Change logo for wikimania2018wiki T194340 (duration: 01m 19s)
  • 14:07 elukey: upgrading druid on druid100[123] to 0.11
  • 13:51 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Revert Revert Temp rate limit for arwiki due to mass vandalism (duration: 01m 18s)
  • 13:46 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Revert Enable $wgUseRCPatrol on azwiki (duration: 01m 19s)
  • 13:39 elukey: upload druid 0.11 debs to jessie|stretch wikimedia
  • 13:26 moritzm: upgrading API servers in codfw to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 13:25 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable $wgUseRCPatrol on azwiki T194389 (duration: 01m 20s)
  • 13:21 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert: Temp rate limit for arwiki due to mass vandalism T192668 (duration: 01m 18s)
  • 13:18 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/WikibaseLexeme: Add L171081 to clearBlacklistedLexemes (duration: 01m 28s)
  • 13:13 moritzm: upgrading labweb servers to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 13:06 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/WikibaseLexeme: Handle invalid lexemeId in data when using wbeditentity new=form && Lexeme term languages: codes beyond MW default (duration: 01m 28s)
  • 13:01 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/WikibaseLexeme: Lemma validation: language covered in deserializer (duration: 01m 30s)
  • 12:59 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/WikibaseLexeme: Use the same language validation for representations and lemmas (duration: 01m 25s)
  • 12:57 moritzm: installing xdg-utils security updates on trusty
  • 12:55 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/WikibaseLexeme: Use ChangeOps consistently throughout API (duration: 01m 30s)
  • 12:42 addshore@tin: Synchronized wmf-config: #1 #2 BETA ONLY profiler stuff (duration: 01m 20s)
  • 12:33 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/Wikibase/repo: API: when validating change op make sure the edited entity is also validated T190928 (duration: 01m 52s)
  • 12:18 gehel: set `unchecked_tombstone_compaction=true` for maps eqiad - T194966
  • 12:15 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Set wgLexemeLanguageCodePropertyId for wikidatawiki T194248 (duration: 01m 19s)
  • 12:11 moritzm: installing imagemagick security updates
  • 12:07 addshore@tin: Finished scap: WikimediaMessages - wikidata-copyright, include the lexeme namespace T169333 (duration: 56m 45s)
  • 11:31 moritzm: upgrading application servers in deployment-prep to wikidiff 1.7.0 (T190717)
  • 11:10 addshore@tin: Started scap: WikimediaMessages - wikidata-copyright, include the lexeme namespace T169333
  • 11:06 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Add 171081 to wmgWikibaseIdBlacklist for wikibase-lexeme T194248 T187060 (duration: 01m 19s)
  • 11:04 ppchelko@tin: Started restart [cpjobqueue/deploy@b45cd3b]: KafkaConsumer is not connected error
  • 10:32 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch cross-wiki posting jobs to EventBus - T190327 (duration: 01m 18s)
  • 10:31 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b45cd3b]: Switch cross-wiki posting jobs for everything T175210 (duration: 01m 03s)
  • 10:30 ppchelko@tin: Started deploy [cpjobqueue/deploy@b45cd3b]: Switch cross-wiki posting jobs for everything T175210
  • 10:24 moritzm: upgrading video scalers in codfw to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 10:23 moritzm: upgrading snapshot hosts to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 09:56 jynus: stop and reimage db2043
  • 09:47 moritzm: upgrading mw123[8-9], mw1266-mw1275 to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 09:22 moritzm: upgrading mw1280-mw1290 to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 07:50 jynus: stop and reimage db2050
  • 07:42 jynus: stop and reimage db2057
  • 07:21 marostegui: Deploy schema change on s8 codfw master (db2045) with replication, this will generate lags on codfw - T191519 T188299 T190148
  • 07:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 after alter table (duration: 01m 19s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T193835 (duration: 01m 19s)
  • 05:10 marostegui: Deploy schema change on db1078 - T191519 T188299 T190148
  • 05:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table (duration: 01m 44s)
  • 03:39 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 22 03:39:38 UTC 2018 (duration 7m 36s)
  • 03:32 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 12m 10s)
  • 02:48 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 12m 02s)

2018-05-21

  • 22:50 Krinkle: webperf1001: restart navtiming service, to test with new ipv6 capabilities
  • 22:13 mutante: enabled IPv6 on webperf machines
  • 21:59 foks: adding email to User:Quuxplusone - T194929
  • 21:24 ottomata: granted User:CN=kafka_fundraising_client read permissions for group fundraising* on kafka-jumbo (for kafkatee webrequest consumption): kafka acls --add --allow-principal User:CN=kafka_fundraising_client --consumer --topic '*' --group 'fundraising*'
  • 20:32 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.4 refs T191050 (duration: 01m 18s)
  • 20:31 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.4 refs T191050
  • 19:19 gehel: clearing cassandra snapshots on maps* nodes to regain some space - T194966
  • 17:13 gehel@tin: Finished deploy [wdqs/wdqs@e01dd03]: new WDQS GUI version (duration: 08m 44s)
  • 17:04 gehel@tin: Started deploy [wdqs/wdqs@e01dd03]: new WDQS GUI version
  • 16:53 fdans@tin: Finished deploy [analytics/refinery@16cb3be]: deploying new jar for upgraded ua parsing (duration: 05m 40s)
  • 16:47 fdans@tin: Started deploy [analytics/refinery@16cb3be]: deploying new jar for upgraded ua parsing
  • 15:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 with low load (duration: 01m 18s)
  • 15:23 demon@tin: Pruned MediaWiki: 1.32.0-wmf.2 [keeping static files] (duration: 01m 56s)
  • 15:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool fully db1109 (duration: 01m 18s)
  • 14:57 demon@tin: Pruned MediaWiki: 1.31.0-wmf.30 (duration: 03m 19s)
  • 14:50 jynus: stop and reimage db1074
  • 14:46 demon@tin: Pruned MediaWiki: 1.31.0-wmf.29 (duration: 03m 29s)
  • 14:42 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 and pool db1077 with 100% weight (duration: 01m 20s)
  • 14:41 demon@tin: Pruned MediaWiki: 1.31.0-wmf.28 (duration: 03m 54s)
  • 14:08 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low weight (duration: 01m 20s)
  • 13:37 jynus: stop and reimage db2035
  • 13:22 bawolff@tin: Synchronized wmf-config/InitialiseSettings.php: Increase edit rate limits on commons (T194864) (duration: 01m 24s)
  • 12:51 jynus: stop and reimage db1077
  • 10:20 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 20s)
  • 10:18 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 22s)
  • 09:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 with full weight (duration: 01m 21s)
  • 09:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 with full weight, repool db1105 with low weight (duration: 01m 21s)
  • 08:02 marostegui: Deploy schema change on db1077 with replication, this will generate lag on labs - T191519 T188299 T190148
  • 08:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table (duration: 01m 22s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1123 after alter table (duration: 01m 21s)
  • 05:54 marostegui: Restart MySQL on db2075 and db2092 for testing
  • 05:50 marostegui: Restart MySQL on db1116 and db1120 for testing
  • 05:40 marostegui: Deploy schema change on db1123 - T191519 T188299 T190148
  • 05:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1123 for alter table (duration: 01m 20s)
  • 05:27 marostegui: Stop MySQL and reboot db1067 - T194852
  • 05:11 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2064 - T195228 (duration: 01m 44s)
  • 05:09 marostegui: Deploy schema change on s7 primary master (db1062) - T191519 T188299 T190148
  • 04:14 l10nupdate@tin: ResourceLoader cache refresh completed at Mon May 21 04:14:03 UTC 2018 (duration 7m 40s)
  • 04:06 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 15m 28s)
  • 03:08 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 14m 23s)
  • 01:20 ottomata: bouncing main -> jumbo MirrorMaker with increased max.request.size - T189464

2018-05-20

  • 12:21 arturo: reboot labtestneutron2002.codfw.wmnet
  • 10:50 tstarling@tin: Synchronized wmf-config/CommonSettings.php: Scribunto maxLangCacheSize (duration: 01m 23s)
  • 08:24 ariel@tin: Finished deploy [dumps/dumps@5438d41]: sync after reimage of snapshot1007 (duration: 00m 03s)
  • 08:24 ariel@tin: Started deploy [dumps/dumps@5438d41]: sync after reimage of snapshot1007

2018-05-19

  • 13:36 mholloway-shell@tin: Finished deploy [kartotherian/deploy@58d2b0a]: Update maps/kartotherian/package to 7520fa5 (duration: 02m 28s)
  • 13:33 mholloway-shell@tin: Started deploy [kartotherian/deploy@58d2b0a]: Update maps/kartotherian/package to 7520fa5
  • 08:31 volans: restarted pdfrender on scb1001

2018-05-18

  • 21:45 tgr: updated some OAuth consumer for a hackathon project: update oauth_registered_consumer set oarc_callback_url = 'http://localhost' where oarc_consumer_key = '2828bd9ca9bdcd81a960721819f25e90';
  • 21:17 demon@tin: Finished deploy [gerrit/gerrit@a07d943]: quota plugin (duration: 00m 11s)
  • 21:16 demon@tin: Started deploy [gerrit/gerrit@a07d943]: quota plugin
  • 19:36 Reedy: tstarling manually loaded Tor Exit Nodes on wikitech
  • 19:01 chasemp: unlock bawolff gerrit account
  • 18:11 mutante: neon, netmon1002 - start ferm service
  • 18:09 mutante: einsteinium: started ferm service
  • 18:03 jynus: stop replication and start schema change on db1105
  • 17:48 faidon@tin: Finished deploy [netbox/deploy@90164b3]: Update netbox to 2.2.4 + WMF patches (duration: 00m 05s)
  • 17:48 faidon@tin: Started deploy [netbox/deploy@90164b3]: Update netbox to 2.2.4 + WMF patches
  • 17:47 faidon@tin: Finished deploy [netbox/deploy@90164b3]: Update netbox to 2.2.4 + WMF patches (duration: 00m 05s)
  • 17:47 faidon@tin: Started deploy [netbox/deploy@90164b3]: Update netbox to 2.2.4 + WMF patches
  • 17:45 faidon@tin: Finished deploy [netbox/deploy@90164b3]: Update netbox to 2.2.4 + WMF patches (duration: 00m 32s)
  • 17:45 faidon@tin: Started deploy [netbox/deploy@90164b3]: Update netbox to 2.2.4 + WMF patches
  • 16:00 gehel: reduce replication of maps v4 keyspace to 3
  • 15:59 reedy@tin: Synchronized php-1.32.0-wmf.4/extensions/VisualEditor/: Fix dialog (duration: 01m 19s)
  • 15:45 reedy@tin: Synchronized wmf-config/throttle.php: throttling! (duration: 01m 22s)
  • 15:44 marostegui: Stop MySQL and reboot db1067 - T194852
  • 15:38 reedy@tin: Synchronized php-1.32.0-wmf.3/extensions/VisualEditor/: Fix dialog (duration: 01m 25s)
  • 15:31 gehel: rolling restart of cassandra on maps1* (repair was started on each node, instead of sequentially)
  • 15:23 gehel: clear cassandra snapshots on maps1002
  • 13:59 marostegui: Manually fail disk #6 on db1066 - T194870
  • 13:19 bawolff_: reset 2FA for Trizek_(WMF)
  • 12:26 jynus: stop and reimage db2041
  • 10:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 with low load (duration: 01m 20s)
  • 09:47 marostegui: Stop MySQL on db1116 for testing
  • 09:44 marostegui: Stop MySQL on db2092 for testing
  • 09:43 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 09:33 gehel: cleared v3 snapshot on maps servers
  • 09:25 jynus: stop and reimage db1085
  • 09:24 gehel: drop v3 keyspace on cassandra maps (unused since migration to i18n)
  • 09:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 (duration: 01m 20s)
  • 08:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:s6 and other vslow hosts (duration: 01m 21s)
  • 06:14 XioNoX: bumping eqsin-codfw link OSPF metric to 5000 (due to packet loss on link)
  • 05:37 marostegui: Stop MySQL on db1120 to copy its content to db2075 - T190704
  • 05:33 marostegui: Deploy schema change on dbstore1002:s3 - T191519 T188299 T190148
  • 05:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T193847 (duration: 01m 22s)
  • 05:24 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011
  • 05:18 marostegui: Stop MySQL and reboot db1067 - T194852
  • 01:34 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4: sync wmf.4 to deploy https://gerrit.wikimedia.org/r/#/c/433673/ (duration: 09m 54s)
  • 01:27 twentyafterfour: syncing wmf.4 again to deploy https://gerrit.wikimedia.org/r/#/c/433673/ refs T194900 T191050
  • 00:19 mutante: rdb2004 - down in Icinga since >1d, nothing on console, don't see a SAL entry. powercycling

2018-05-17

  • 23:35 twentyafterfour: MediaWiki Train for 1.32.0-wmf.4 remains blocked by critical bugs, see T191050 for a list of blockers.
  • 23:34 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.3 refs T191050 (duration: 01m 20s)
  • 23:32 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.3 refs T191050
  • 23:29 twentyafterfour: rolling back
  • 23:29 twentyafterfour: still seeing Notice: Undefined variable: nonce in /srv/mediawiki/php-1.32.0-wmf.4/includes/resourceloader/ResourceLoaderClientHtml.php on line 272
  • 23:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.4 refs T191050 (duration: 01m 17s)
  • 23:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.4 refs T191050
  • 23:22 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4/: sync https://gerrit.wikimedia.org/r/#/c/433673/ refs T194900 (duration: 09m 54s)
  • 22:53 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/433673/ refs T194900 T191050
  • 19:46 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.3 (duration: 01m 20s)
  • 19:44 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.3
  • 19:41 twentyafterfour: rolling back due to spike of undefined variable notices in resourceloader and ApiCSPReport.php
  • 19:39 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.4 (duration: 01m 21s)
  • 19:38 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.4
  • 19:33 twentyafterfour: getting the train back on track. Starting with group1 to 1.32.0-wmf.4 right now, will do all wikis to wmf.4 after verifying that group1 looks stable.
  • 19:28 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4/extensions/Echo/: unbreak T194848 (duration: 01m 24s)
  • 19:11 twentyafterfour: train is still blocked by T194848
  • 17:23 arlolra: Updated Parsoid to fd49ab4 (T194821, T194687)
  • 17:15 arlolra@tin: Finished deploy [parsoid/deploy@091b891]: Updating Parsoid to fd49ab4 (duration: 09m 35s)
  • 17:06 arlolra@tin: Started deploy [parsoid/deploy@091b891]: Updating Parsoid to fd49ab4
  • 16:11 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 T174047 T194341
  • 16:04 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 (duration: 01m 21s)
  • 15:29 marostegui: Manually fail disk #6 on db1064 to get it replaced
  • 15:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 with full weight (duration: 01m 21s)
  • 15:00 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 14:39 papaul: shutting down furud for shelves swap
  • 14:35 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 https://phabricator.wikimedia.org/T174047 https://phabricator.wikimedia.org/T194341
  • 14:17 marostegui: Manually fail disk #2 on db1064 to get it replaced
  • 14:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1066 IP - T193847 (duration: 01m 21s)
  • 13:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db1066 IP - T193847 (duration: 01m 17s)
  • 13:50 marostegui: Power off db1066 for a rack change - T193847
  • 13:46 marostegui: Stop MySQL on db1066 for a rack change - T193847
  • 13:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for a rack change - T193847 (duration: 01m 21s)
  • 13:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 (duration: 01m 21s)
  • 13:36 jynus: restarted db1105 by mistake, turning it back on
  • 13:15 jynus: stop and reimage db1106
  • 12:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 20s)
  • 12:50 marostegui: Deploy schema change on s3 codfw primary master (db2043) this will generate lag on codfw - T191519 T188299 T190148
  • 11:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090:3317 after alter table (duration: 01m 21s)
  • 11:08 marostegui: Stop MySQL and poweroff db1067 - T194852
  • 10:12 mobrovac@tin: Finished deploy [citoid/deploy@8a26508]: Update citoid to 2f35126 - T179123 T185217 (duration: 02m 52s)
  • 10:09 mobrovac@tin: Started deploy [citoid/deploy@8a26508]: Update citoid to 2f35126 - T179123 T185217
  • 09:30 reedy@tin: Synchronized wmf-config/throttle.php: Throttle for Barcelona Hackathon (duration: 01m 22s)
  • 08:41 jynus: stop and reimage db2049
  • 08:04 jynus: stop and reimage db2056
  • 07:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 with low load (duration: 01m 20s)
  • 07:25 elukey: bounced all the prometheus burrow exporters on kafkamon* hosts to refresh their metrics and drop old/expired cgroups
  • 07:22 marostegui: Deploy schema change on db1090:3317 - T191519 T188299 T190148
  • 07:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090:3317 for alter table (duration: 01m 21s)
  • 07:19 marostegui@tin: Synchronized wmf-config/db-codfw.php: Specify the sanitarium masters in codfw (duration: 01m 21s)
  • 07:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 after alter table (duration: 01m 20s)
  • 07:05 jynus: stop and reimage db1093
  • 07:02 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 01m 22s)
  • 05:52 marostegui: Disable BBU auto-learn on new hosts - T192979
  • 05:31 marostegui: Deploy schema change on db1079 with replication (this will generate lag on labs s7) - T191519 T188299 T190148
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 for alter table (duration: 01m 22s)
  • 05:20 marostegui: Force BBU learn cycle on db1054 - T194867
  • 05:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 after alter table (duration: 01m 48s)
  • 05:09 marostegui: Deploy schema change on s4 primary master (db1068) - T191519 T188299 T190148
  • 04:10 l10nupdate@tin: ResourceLoader cache refresh completed at Thu May 17 04:10:13 UTC 2018 (duration 7m 42s)
  • 04:02 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 14m 08s)
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 14m 27s)
  • 00:57 mutante: installing OS on webperf1002, webperf2002

2018-05-16

  • 21:58 cmjohnson1: swapping PEM 1 asw-c8-eqiad
  • 20:13 marostegui: Force WriteBack policy on db1067 - T194852
  • 19:38 twentyafterfour: The train for 1.32.0-wmf.4 is blocked by fatals in Echo extension. See T194848
  • 19:28 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4/extensions/CongressLookup: sync CongressLookup extension on wmf.4 refs T191050 (duration: 01m 22s)
  • 17:35 milimetric@tin: Finished deploy [analytics/refinery@a205447]: Fix drop partitions script (duration: 15m 12s)
  • 17:33 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on wikis with < 100 ns0 errors in high priority cats (T193685) (duration: 01m 22s)
  • 17:20 milimetric@tin: Started deploy [analytics/refinery@a205447]: Fix drop partitions script
  • 17:16 milimetric@tin: Started deploy [analytics/refinery@15be6ae]: Fix drop partitions script
  • 16:38 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with low weight (duration: 01m 21s)
  • 16:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db1067 IP - T193835 (duration: 01m 21s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db1067 IP - T193835 (duration: 01m 17s)
  • 16:22 elukey: upgrade burrow on kafkamon1001 from 1.0 to 1.1
  • 16:18 marostegui: Power off db1067 for rack move - T193835
  • 16:18 godog: move ms-be1036 for asw2-c-eqiad - T187962
  • 16:03 godog: move ms-be1035 for asw2-c-eqiad - T187962
  • 16:03 _joe_: installing cergen 0.2.2 on the puppetmaster frontends
  • 15:50 andrewbogott: upgrading kernel, microcode and rebooting labnet1002
  • 15:50 jynus: stop and reimage db1088
  • 15:49 godog: move ms-be1034 for asw2-c-eqiad - T187962
  • 15:36 godog: move ms-be1025 for asw2-c-eqiad - T187962
  • 15:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 (duration: 03m 26s)
  • 15:27 _joe_: uploading cergen 0.2.2 to stretch, jessie
  • 15:18 godog: move ms-be1024 for asw2-c-eqiad - T187962
  • 15:17 elukey: upgrade burrow from 1.0.0 to 1.1.0 on kafkamon* hosts
  • 15:17 elukey: upload burrow 1.1.0 to stretch|jessie-wikimedia
  • 15:17 bstorm_: labsdb1004 reboot and upgrade kernel to 4.9.0-0.bpo.6-amd64
  • 15:10 godog: pool ms-fe1008 for asw2 move - T187962
  • 15:02 marostegui: Stop MySQL on labsdb1004
  • 15:01 marostegui: Stop MySQL on db1067 - T193835
  • 14:54 godog: depool ms-fe1008 for asw2 move - T187962
  • 14:50 moritzm: upgrading codfw job runners to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 14:49 godog: pool ms-fe1007 for asw2 move - T187962
  • 14:19 filippo@neodymium: conftool action : set/pooled=no; selector: name=ms-fe1007.eqiad.wmnet
  • 14:17 godog: depool ms-fe1007 for asw2 move - T187962
  • 14:04 jynus: stop and reimage db2048 (s1 master)
  • 13:56 marostegui: Restart mysql on db1116 for testing
  • 13:54 marostegui: Deploy schema change on db1098:3317 - T191519 T188299 T190148
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for alter table (duration: 01m 20s)
  • 13:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 after alter table (duration: 01m 21s)
  • 13:43 addshore: SWAT all done
  • 13:42 addshore@tin: Synchronized php-1.32.0-wmf.3/autoload.php: SWAT: Deduplicate archive.ar_rev_id (duration: 01m 21s)
  • 13:41 addshore@tin: Synchronized php-1.32.0-wmf.3/includes/installer/DatabaseUpdater.php: SWAT: Deduplicate archive.ar_rev_id (duration: 01m 21s)
  • 13:40 ottomata: rolling restart of Kafka jumbo brokers to apply jdk.tls.namedGroups=secp256r1 https://phabricator.wikimedia.org/T182993
  • 13:39 addshore@tin: Synchronized php-1.32.0-wmf.3/maintenance/: SWAT: Deduplicate archive.ar_rev_id (duration: 01m 30s)
  • 13:38 addshore@tin: Synchronized php-1.32.0-wmf.4/autoload.php: SWAT: Deduplicate archive.ar_rev_id (duration: 01m 21s)
  • 13:37 jynus: restart dbstore2002 for upgrade
  • 13:36 addshore@tin: Synchronized php-1.32.0-wmf.4/includes/installer/DatabaseUpdater.php: SWAT: Deduplicate archive.ar_rev_id (duration: 01m 21s)
  • 13:34 addshore@tin: Synchronized php-1.32.0-wmf.4/maintenance/: SWAT: Deduplicate archive.ar_rev_id (duration: 01m 31s)
  • 13:26 addshore@tin: Synchronized wmf-config/Wikibase.php: SWAT: Configure WikibaseLexeme after Repo & Client T191458 T194250 (duration: 01m 21s)
  • 13:20 addshore@tin: Synchronized php-1.32.0-wmf.4/extensions/ContentTranslation/modules/ve-cx/init/ve.init.mw.CXTarget.js: SWAT: Fix mistake in 84caceee that causes exceptions with MT card T194811 (duration: 01m 21s)
  • 13:17 jynus: restarting mariadb processes on dbstore2001 T194516
  • 13:07 moritzm: rebooting deployment-ms-be03 for tests related to IBPB passthrough
  • 12:56 jynus: stop and reimage db2039 (s6 master)
  • 11:28 addshore: WikibaseLexeme slot done
  • 11:10 moritzm: rebooting multatuli for some tests
  • 11:01 addshore: addshore@terbium mwscript extensions/WikibaseLexeme/maintenance/createBlacklistedLexemes.php --wiki testwikidatawiki
  • 10:57 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Enable WikibaseLexeme on test.wikidata.org T191458 (duration: 01m 20s)
  • 10:47 addshore@tin: Synchronized wmf-config/Wikibase.php: Prepare Lexeme config for test.wikidata.org T194250 PT 2/2 (duration: 01m 21s)
  • 10:45 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Prepare Lexeme config for test.wikidata.org T194250 PT 1/2 (duration: 01m 22s)
  • 10:41 moritzm: uploaded linux 4.9.88-1+deb9u1~bpo8+1 to apt.wikimedia.org/jessie-wikimedia
  • 10:34 jynus: stop and reimage db2046
  • 10:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 as it will be moved to a different rack - T193835 (duration: 01m 21s)
  • 10:14 marostegui: Drop unused tables msg_resource msg_resource_links from s2 - T194663
  • 09:57 jynus: stop and reimage db2053
  • 09:25 moritzm: upgrading video scalers to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 09:22 marostegui: Deploy schema change on db1101:3317 - T191519 T188299 T190148
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 for alter table (duration: 01m 21s)
  • 09:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after alter table (duration: 01m 21s)
  • 08:54 jynus: restart dbproxy1002 for upgrade
  • 08:48 moritzm: upgrading mw1221-mw1235 to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 08:45 jynus: restart dbproxy1003 for upgrade
  • 08:20 marostegui: Stop MySQL on db1120 on s2, s4, s6 and s7 to copy its content to db2075 - T190704
  • 07:57 _joe_: depooled ULSFO from live traffic in dns
  • 07:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2075 to convert it to temporary sanitarium (duration: 01m 20s)
  • 07:36 moritzm: installing systemd updates from stretch SUA
  • 07:33 marostegui: Deploy schema change on db1094 - T191519 T188299 T190148
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 for alter table (duration: 01m 20s)
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 after alter table (duration: 01m 34s)
  • 07:25 marostegui: Drop unused tables msg_resource msg_resource_links from s1 - T194663
  • 07:18 moritzm: upgrading mw1240-mw1258 to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 06:56 moritzm: upgrading mwdebug servers to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 06:45 marostegui: Drop unused tables msg_resource msg_resource_links from s3 codfw - T194663
  • 06:25 marostegui: Drop unused tables msg_resource msg_resource_links from s7 - T194663
  • 06:19 marostegui: Drop unused tables msg_resource msg_resource_links from s4 - T194663
  • 06:19 elukey: update analytics-in4 on cr1/cr2 eqiad to allow conf100[4-6] (new zookeeper hosts)
  • 06:17 marostegui: Drop unused tables msg_resource msg_resource_links from s8 - T194663
  • 06:15 marostegui: Drop unused tables msg_resource msg_resource_links from s6 - T194663
  • 06:11 marostegui: Drop unused tables msg_resource msg_resource_links from s5 - T194663
  • 05:54 elukey: removed acpi_power_meter manually from conf1004 (blacklisted module in puppet), ACPI errors in dmesg
  • 05:22 marostegui: Deploy schema change on db1086 - T191519 T188299 T190148
  • 05:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 for alter table (duration: 01m 23s)
  • 05:14 marostegui: Deploy schema change on s2 primary master db1054 - T191519 T188299 T190148
  • 04:43 kartik@tin: Finished deploy [cxserver/deploy@7e898c7]: Update cxserver to 112a1a1 (T191285) (duration: 03m 52s)
  • 04:39 kartik@tin: Started deploy [cxserver/deploy@7e898c7]: Update cxserver to 112a1a1 (T191285)
  • 03:46 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 16 03:46:12 UTC 2018 (duration 7m 41s)
  • 03:38 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.4) (duration: 16m 26s)
  • 02:40 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 08m 25s)

2018-05-15

  • 23:33 mutante: creating ganeti VM webperf2002.eqiad.wmnet on ganeti2004 (link: private, row: A, cpus: 4, ram: 8, disk: 50) (T194390)
  • 23:11 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2139.codfw.wmnet
  • 23:10 mutante: mw2139 - reimaged, scap pull, apache-fast-test baseurls from naos, repooled with confctl (T194426)
  • 23:06 mutante: creating ganeti VM webperf1002.eqiad.wmnet on ganeti1004 (link: private, row: A, cpus: 4, ram: 8, disk: 50) (T194390)
  • 22:39 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploying GlobalPreferences T190425 (duration: 01m 21s)
  • 22:21 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.4 refs T191050
  • 22:18 twentyafterfour@tin: Synchronized php-1.32.0-wmf.4/includes/api/ApiLogin.php: fix syntax error (duration: 01m 39s)
  • 21:34 twentyafterfour@tin: Finished scap: testwikis to 1.32.0-wmf.4 refs T191050 (duration: 66m 19s)
  • 20:27 twentyafterfour@tin: Started scap: testwikis to 1.32.0-wmf.4 refs T191050
  • 20:22 mutante: [radium:~] $ sudo apt-get autoremove
  • 19:53 cwd: re-enabled process-control
  • 19:42 twentyafterfour@tin: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_770814178" --threads=10 --lang en --quiet' returned non-zero exit status 255 (duration: 02m 47s)
  • 19:40 cwd: disabled process-control for sql trigger refresh
  • 19:39 twentyafterfour@tin: Started scap: testwikis wikis to 1.32.0-wmf.4
  • 19:05 mutante: mw2139 - wmf-auto-reimage --conftool --new (because it got "Failed to icinga_downtime" and has a new mainboard) (T194426)
  • 19:03 mutante: mw2139 - wmf-auto-reimage --conftool --no-verify (T194426)
  • 17:27 twentyafterfour: branching 1.32.0-wmf.4 refs T191050
  • 17:15 ottomata: rolling restart kafka-jumbo100[456]
  • 16:36 elukey: rolling restart of aqs on aqs* nodes to pick up the new druid config
  • 16:27 milimetric@tin: Finished deploy [analytics/refinery@679cf09]: Update partition drop script after schema change (duration: 07m 13s)
  • 16:25 mobrovac@tin: Started restart [cpjobqueue/deploy@58935d5]: (no justification provided)
  • 16:21 mobrovac@tin: Started restart [changeprop/deploy@e468d8e]: (no justification provided)
  • 16:20 milimetric@tin: Started deploy [analytics/refinery@679cf09]: Update partition drop script after schema change
  • 16:12 elukey: roll restart kafka on kafka-jumbo to pick up new zookeeper settings
  • 16:11 joal@tin: Finished deploy [analytics/aqs/deploy@a736558]: Deploying druid-configuration patch (duration: 05m 47s)
  • 16:05 joal@tin: Started deploy [analytics/aqs/deploy@a736558]: Deploying druid-configuration patch
  • 15:52 elukey: rolling restart of hadoop master daemons to pick up new zookeeper settings
  • 15:20 elukey: roll restart of Kafka Analytics to pick up new zookeeper settings
  • 14:59 elukey: roll restart of kafka daemons on kafka100[1-3] to pick up new zookeeper settings and group.initial.rebalance.delay.ms = 10s
  • 14:28 mobrovac@tin: Started restart [changeprop/deploy@e468d8e]: Restart after Kafka settings change
  • 14:28 mobrovac@tin: Started restart [cpjobqueue/deploy@58935d5]: Restart after Kafka settings change
  • 14:19 ottomata: temporarily disabling puppet on analytics1003 to run refine-eventbus after jumbo based camus eventbus import finishes
  • 14:14 elukey: swap conf1001 with conf1004 in the zookeeper main eqiad's config + roll restart of the service
  • 14:10 mobrovac@tin: Started restart [cpjobqueue/deploy@58935d5]: Restart after Kafka settings change
  • 14:09 mobrovac@tin: Started restart [changeprop/deploy@e468d8e]: Restart after Kafka settings change
  • 14:00 andrewbogott: rebooting labnet1001
  • 13:50 elukey: roll restart of kafka main codfw (kafka200[1-3]) to pick up group.initial.rebalance.delay.ms = 10s
  • 13:31 jynus: stop db2055 for reimage
  • 13:09 chasemp: disable puppet for all openstack things in eqiad
  • 13:07 andrewbogott: stopping nodepool and puppet on labnodepool1001 for T193579
  • 12:59 andrewbogott: stopping puppet on labnet1001 and 1002, silencing icinga for T193579
  • 12:42 jynus: stop db2060 for reimage
  • 12:14 moritzm: uploaded intel-microcode 20180425 for jessie-wikimedia/stretch-wikimedia
  • 10:57 jynus: stop db2067 for reimage
  • 10:49 joal@tin: Finished deploy [analytics/refinery@25abeec]: Fix for regular weekly deploy (duration: 06m 45s)
  • 10:46 jynus: stop db2066 for reimage
  • 10:43 joal@tin: Started deploy [analytics/refinery@25abeec]: Fix for regular weekly deploy
  • 10:16 jynus: stop db2065 for reimage
  • 10:15 moritzm: installing uwsgi security update on graphite servers in eqiad
  • 10:07 moritzm: installing php5 security updates on trusty
  • 09:51 moritzm: upgrading API server canaries to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 09:47 jynus: stop and restart db2091 for upgrade
  • 09:36 joal@tin: Finished deploy [analytics/refinery@b2f4c3c]: Regular weekly deploy (duration: 05m 38s)
  • 09:30 joal@tin: Started deploy [analytics/refinery@b2f4c3c]: Regular weekly deploy
  • 09:26 jynus: stop and restart db2088 for upgrade
  • 09:21 moritzm: upgrading app server canaries to HHVM 3.18.5+dfsg-1+wmf8+deb9u1
  • 09:03 jynus: stop db2061 for reimage
  • 08:42 jynus: stop db2068 for reimage
  • 03:11 l10nupdate@tin: ResourceLoader cache refresh completed at Tue May 15 03:10:59 UTC 2018 (duration 7m 11s)
  • 03:03 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 06m 32s)
  • 02:33 bawolff@tin: Finished scap: Backport https://gerrit.wikimedia.org/r/#/c/433096/ - log js loads of unregistered user js subpages (duration: 56m 27s)
  • 01:37 bawolff@tin: Started scap: Backport https://gerrit.wikimedia.org/r/#/c/433096/ - log js loads of unregistered user js subpages
  • 01:17 bawolff@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/433095/ log security channel (duration: 01m 02s)

2018-05-14

  • 23:39 bawolff@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/433089/ - re-enable sending csp logs to logstash (duration: 01m 01s)
  • 23:36 bawolff@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/432334/ Use english messages in badpass log (duration: 01m 16s)
  • 22:24 eileen: update process-control to 97fd42b3ab disabling dedupe catch up
  • 21:54 mutante: mwdebug1001 - temp modifying apache 08-wikimedia.conf to test gerrit:429863
  • 21:03 arlolra: Updated Parsoid to 945ed23 (T194082, T194083, T194084)
  • 20:54 arlolra@tin: Finished deploy [parsoid/deploy@28fcc4e]: Updating Parsoid to 945ed23 (duration: 11m 29s)
  • 20:42 arlolra@tin: Started deploy [parsoid/deploy@28fcc4e]: Updating Parsoid to 945ed23
  • 20:09 bsitzmann@tin: Finished deploy [mobileapps/deploy@ccffa6b]: Update mobileapps to 39c16e4 (T193440 T193439 T194065) (duration: 07m 49s)
  • 20:02 bsitzmann@tin: Started deploy [mobileapps/deploy@ccffa6b]: Update mobileapps to 39c16e4 (T193440 T193439 T194065)
  • 19:11 urandom: rolling cassandra restart, restbase dev environment
  • 18:20 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Create the eventcoordinator user group on enwiki - T193075 (duration: 01m 02s)
  • 17:12 gehel@tin: Finished deploy [wdqs/wdqs@ef37bf7]: new WDQS GUI version (duration: 08m 43s)
  • 17:03 gehel@tin: Started deploy [wdqs/wdqs@ef37bf7]: new WDQS GUI version
  • 16:19 elukey: umount/remount /mnt/hdfs on stat1005 to pick up new openjdk upgrades
  • 15:11 jynus: stop db2064 for reimage
  • 14:56 marostegui: Drop unused table long_run_profiling from enwiki - T194661
  • 14:45 moritzm: rebooting labtestnet2002 for some microcode/kernel tests
  • 14:44 mobrovac@tin: Finished deploy [restbase/deploy@75dc661]: API: Add /transform/list/tool/{tool}{/from}{/to}, take #2 (duration: 03m 58s)
  • 14:40 mobrovac@tin: Started deploy [restbase/deploy@75dc661]: API: Add /transform/list/tool/{tool}{/from}{/to}, take #2
  • 14:39 mobrovac@tin: Finished deploy [restbase/deploy@75dc661]: API: Add /transform/list/tool/{tool}{/from}{/to} - T163203 (duration: 23m 35s)
  • 14:30 milimetric@tin: Finished deploy [analytics/refinery@541823e]: deploying refinery to update python logic for cron jobs (duration: 19m 46s)
  • 14:30 otto@tin: Finished deploy [analytics/turnilo/deploy@9b2c8f0]: initial deploy (duration: 00m 28s)
  • 14:29 otto@tin: Started deploy [analytics/turnilo/deploy@9b2c8f0]: initial deploy
  • 14:19 moritzm: rebooting silver for some microcode/kernel tests
  • 14:15 mobrovac@tin: Started deploy [restbase/deploy@75dc661]: API: Add /transform/list/tool/{tool}{/from}{/to} - T163203
  • 14:10 milimetric@tin: Started deploy [analytics/refinery@541823e]: deploying refinery to update python logic for cron jobs
  • 13:58 herron: added temporary iptables drop rules on fermium for IPs with many hits logged against the list subscribe rate limit
  • 13:53 reedy@tin: Synchronized php-1.32.0-wmf.3/extensions/CirrusSearch: Partially revert deprecation of global namespace handling in prefix (duration: 01m 20s)
  • 13:50 reedy@tin: Synchronized wmf-config/throttle.php: T194630 (duration: 01m 02s)
  • 13:41 jynus: stop db2063 for reimage
  • 13:18 marostegui: Deploy schema change on dbstore1002:s7 - T191519 T188299 T190148
  • 13:08 akosiaris: remove ganeti2003, ganeti2007 from the ganeti cluster. stretch reimaging in progress
  • 13:06 akosiaris: reboot ganeti2008 for kernel upgrade
  • 12:38 kartik@tin: Finished deploy [cxserver/deploy@a7ef01b]: Update cxserver to 176b507 (duration: 03m 26s)
  • 12:35 kartik@tin: Started deploy [cxserver/deploy@a7ef01b]: Update cxserver to 176b507
  • 11:23 akosiaris: upload apertium-streamparser to apt.wikimedia.org/jessie-wikimedia/main T192978
  • 10:17 moritzm: installing systemd updates from stretch SUA update
  • 09:41 moritzm: installing gunicorn security updates
  • 09:28 moritzm: installing ghostscript security updates on trusty
  • 09:23 moritzm: installing libmad security updates
  • 09:11 moritzm: installing wavpack security updates for stretch (jessie/trusty not affected)
  • 08:41 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf8+deb9u1 to apt.wikimedia.org
  • 07:57 marostegui: Deploy schema change on s7 codfw master (db2040) with replication, this will generate lag on codfw - T191519 T188299 T190148
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1091 after alter table (duration: 01m 02s)
  • 06:38 elukey: rolling restart of cassandra on aqs* for openjdk-8 upgrades
  • 06:35 moritzm: installing wget security updates on trusty (Debian already fixed)
  • 06:16 marostegui: Drop unused flaggedrevs from s3 testwiki - T174801
  • 05:22 marostegui: Deploy schema change on db1091 - T191519 T188299 T190148
  • 05:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 for alter table (duration: 01m 03s)
  • 03:18 l10nupdate@tin: ResourceLoader cache refresh completed at Mon May 14 03:18:25 UTC 2018 (duration 7m 25s)
  • 03:11 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 16m 20s)

2018-05-13

  • 21:10 bawolff: Reset some botpasswords associated with T194204
  • 20:09 bawolff: Deployed patch for T194605

2018-05-12

  • 06:50 foks: rm 2FA from User:Fatemi (at 6:38 PM UTC, Friday May 11)

2018-05-11

  • 21:28 herron: updated lists rate limit to 1 subscribe per 60 minutes as a temporary measure until problem requests slow down T194032
  • 20:59 herron: deployed rate limiting for POST requests to mailman list subscription URIs https://gerrit.wikimedia.org/r/432168 T194032
  • 20:27 chasemp: temp changes on fermium for T194032
  • 19:26 chasemp: on fermium
  • 19:26 chasemp: disable puppet and temp block a few IPs I believe are bad actors hammering mailman
  • 18:21 chasemp: change 'Advertise this list when people ask what lists are on this machine' to no for cloud-admin-l and cloud-admin-feed
  • 17:21 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 with full weight (duration: 01m 02s)
  • 16:26 jynus: restarting mariadb@s8 at dbstore2001
  • 16:20 jynus: removing x1 from dbstore2001
  • 14:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1076 with low load, increase db1066 load (duration: 01m 02s)
  • 14:30 jynus: reset labsdb proxies to their config defaults after rolling restart for upgrade
  • 14:28 elukey: restart kafka brokers on kafka10[20,22,23] to pick up openjdk-7 security upgrades
  • 14:14 andrewbogott: rebooting labvirt1001 for T194258
  • 14:13 elukey: restart Hadoop daemons on analytics100[12] for openjdk security upgrades
  • 13:39 jynus: stopping and restarting labsdb1009 for upgrade
  • 13:32 jynus: reloading haproxy configuration for dbproxy1011 to point to labsdb1011
  • 12:09 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1066 with low load (duration: 01m 02s)
  • 11:48 Amir1: making change_tag_def table on all wikis (T194302)
  • 10:07 elukey: reimage analytics1052 to Debian Stretch (Hadoop Journal node)
  • 09:30 jynus: stopping db1076 for maintenance
  • 09:25 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 (duration: 01m 02s)
  • 07:59 elukey: reimage analytics1035 to Debian Stretch
  • 07:00 jynus: depool and upgrade/restart of dbproxy1011
  • 02:09 ejegg: updated payments-wiki from 55d0808853 to c81e25f8d3

2018-05-10

  • 23:18 ejegg: updated CiviCRM from 4c6a9c9c1c to 5b8c868a00
  • 23:09 thcipriani@tin: Synchronized dblists/categories-rdf.dblist: SWAT: Add wikis with more than 1000 categories to categories dump T194139 (duration: 01m 02s)
  • 20:27 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2140.codfw.wmnet
  • 20:27 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2140.co3dfw.wmnet
  • 20:26 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2144.codfw.wmnet
  • 20:24 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2202.codfw.wmnet
  • 20:17 sbisson@tin: Finished deploy [kartotherian/deploy@06572bb]: Kartotherian: add fonts for Syriac and Inuktitut (duration: 03m 40s)
  • 20:13 sbisson@tin: Started deploy [kartotherian/deploy@06572bb]: Kartotherian: add fonts for Syriac and Inuktitut
  • 20:05 XenoRyet: updated payments-wiki from 4a8aada491 to 55d0808853
  • 20:02 twentyafterfour: New branch 1.32.0-wmf.3 appears to be stable on all wikis. This completes the train for the week. Tune in again next week, same bat time, same bat channel. refs T191049
  • 19:57 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.3
  • 19:53 twentyafterfour: all deployment blockers resolved, proceeding to deploy mediawiki 1.32.0-wmf.3 to all wikis
  • 19:14 twentyafterfour@tin: Synchronized php-1.32.0-wmf.3/includes/libs/rdbms/: deploy https://gerrit.wikimedia.org/r/#/c/432415/ refs T194308 (duration: 01m 23s)
  • 18:56 maxsem@tin: Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/432416/ labs only (duration: 01m 20s)
  • 18:30 sbisson@tin: Finished deploy [tilerator/deploy@a5ec109]: Tilerator: load source and style from yaml (duration: 08m 45s)
  • 18:21 sbisson@tin: Started deploy [tilerator/deploy@a5ec109]: Tilerator: load source and style from yaml
  • 18:16 sbisson@tin: Finished deploy [kartotherian/deploy@3aa87ff]: Kartotherian: load style from yaml (duration: 06m 16s)
  • 18:09 sbisson@tin: Started deploy [kartotherian/deploy@3aa87ff]: Kartotherian: load style from yaml
  • 18:02 ppchelko@tin: Finished deploy [restbase/deploy@fb306e3]: Logging improvements (duration: 15m 33s)
  • 17:59 jynus@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1123, Remove db1072 (duration: 01m 15s)
  • 17:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1072 (duration: 01m 20s)
  • 17:46 ppchelko@tin: Started deploy [restbase/deploy@fb306e3]: Logging improvements
  • 15:54 elukey: drain and reimage analytics1028 to Debian Jessie (Hadoop Journal node)
  • 15:00 elukey: rolling restart of Hadoop HDFS datanodes on analytics workers to pick up the new openjdk-8 security upgrades
  • 14:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1123 with low weight (duration: 01m 20s)
  • 14:38 elukey: rolling restart of Hadoop Yarn nodemanagers on analytics worker nodes for openjdk-8 security upgrades
  • 14:32 jynus: restarting db1123
  • 14:28 otto@tin: Started restart [eventlogging/eventbus@aa9eb2c]: apply log level changes
  • 14:27 ottomata: rolling restart of eventbus service to apply new log level settings
  • 14:07 elukey: restart hive/oozie Hadoop daemons on analytics1003 for openjdk-8 upgrades
  • 13:56 mutante: mw2139,mw2140,mw2144,mw2202 - reinstall with --no-verify - last few special cases that didn't have puppet certs or failed before - after that all appservers done
  • 13:46 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2138.codfw.wmnet
  • 13:43 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2137.codfw.wmnet
  • 13:42 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2136.codfw.wmnet
  • 13:41 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2135.codfw.wmnet
  • 13:31 elukey: rolling restart of kafka on kafka-jumbo1* for openjdk-8 security upgrades
  • 13:25 zeljkof: EU SWAT finished
  • 13:19 elukey: reimage analytics1029 to Debian Stretch
  • 13:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: cawiki: remove gendered namespace aliases, already on MW core (T113616) (duration: 01m 20s)
  • 13:06 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable TemplateStyles for nowiki (T193786) (duration: 01m 20s)
  • 12:21 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1123 (duration: 01m 19s)
  • 12:07 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1123 (duration: 01m 19s)
  • 12:04 Reedy: that was for "Add throttle rule for Netherlands Hackathon 2018 - Women Tech Storm"
  • 12:04 reedy@tin: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 48s)
  • 11:46 elukey: reimage analytics1030/31 to Debian Stretch
  • 11:07 volans: updated puppet compiler facts
  • 09:07 jynus: stop db1072 for maintenance
  • 08:10 jynus: shutdown and restart labsdb1010 for upgrade
  • 07:12 jynus: depool labsdb1010 from wikireplicas for maintenance
  • 03:04 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.2) (duration: 15m 35s)
  • 02:51 mutante: mw2137,mw2138,mw2139 - reinstall with stretch (last message was wrong, already running)
  • 02:48 mutante: mw2136,mw2137,mw2138 - reinstall with stretch
  • 02:44 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2141.codfw.wmnet
  • 02:41 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2142.codfw.wmnet
  • 02:39 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2143.codfw.wmnet
  • 01:16 twentyafterfour@tin: Synchronized php-1.32.0-wmf.3/includes/libs/rdbms: sync https://gerrit.wikimedia.org/r/#/c/432297/ refs T194308 T191049 (duration: 01m 24s)
  • 00:33 mutante: mw2135, mw2136, mw2137 - reinstall with stretch, depooled, downtimed
  • 00:28 hoo@tin: Synchronized php-1.32.0-wmf.3/extensions/Wikibase/lib/WikibaseLib.php: Remove wgHooks entry for GalleryGetModes (T194316) (duration: 01m 16s)
  • 00:26 hoo@tin: Synchronized php-1.32.0-wmf.2/extensions/Wikibase/lib/WikibaseLib.php: Remove wgHooks entry for GalleryGetModes (T194316) (duration: 01m 20s)
  • 00:16 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Fix notices about missing index "max" (duration: 01m 20s)
  • 00:06 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES on arwiki (T192498) (duration: 01m 20s)
  • 00:01 twentyafterfour: no phabricator deployment this evening.

2018-05-09

  • 23:53 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES on cawiki, lvwiki, huwiki (T192501, T192499, T192496) (duration: 01m 29s)
  • 23:24 awight@tin: Finished deploy [ores/deploy@bf182e2]: Rollback ores1001 (duration: 00m 03s)
  • 23:24 awight@tin: Started deploy [ores/deploy@bf182e2]: Rollback ores1001
  • 23:13 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on some wikibooks wikis (T192821) (duration: 01m 21s)
  • 23:06 awight@tin: Finished deploy [ores/deploy@bf1e2b1]: ORES: drafttopic (duration: 25m 51s)
  • 23:02 mutante: mw2141,mw2143,mw2142 - reinstalling with stretch - mw2144: puppet cert not found
  • 22:59 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2147.codfw.wmnet
  • 22:58 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2146.codfw.wmnet
  • 22:56 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2145.codfw.wmnet
  • 22:40 awight@tin: Started deploy [ores/deploy@bf1e2b1]: ORES: drafttopic
  • 22:35 awight@tin: Finished deploy [ores/deploy@1b13ef1]: ORES: drafttopic (duration: 03m 15s)
  • 22:33 addshore@tin: Synchronized wmf-config/extension-list: extension-list Add WikibaseLexeme to extension-list (duration: 01m 19s)
  • 22:32 awight@tin: Started deploy [ores/deploy@1b13ef1]: ORES: drafttopic
  • 22:28 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Add default edit rate limit of 90 edits/minute for all users (except wikidata) (duration: 01m 19s)
  • 22:20 awight@tin: Finished deploy [ores/deploy@2a09939]: ORES: force git-lfs install (take 3) (duration: 03m 05s)
  • 22:18 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T184745 T191459 BETA ONLY Enable WikibaseLexeme on BETA wikidatawiki (duration: 01m 19s)
  • 22:17 awight@tin: Started deploy [ores/deploy@2a09939]: ORES: force git-lfs install (take 3)
  • 22:09 awight@tin: Finished deploy [ores/deploy@c0db102]: ORES: force git-lfs install (take 2) (duration: 03m 15s)
  • 22:08 addshore@tin: Synchronized wmf-config/Wikibase-labs.php: T184745 BETA ONLY WikibaseLexeme config (duration: 01m 20s)
  • 22:06 awight@tin: Started deploy [ores/deploy@c0db102]: ORES: force git-lfs install (take 2)
  • 22:06 addshore@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T184745 BETA ONLY WikibaseLexeme config (duration: 01m 19s)
  • 22:05 awight@tin: Finished deploy [ores/deploy@c0db102]: ORES: force git-lfs install (duration: 02m 50s)
  • 22:03 awight@tin: Started deploy [ores/deploy@c0db102]: ORES: force git-lfs install
  • 21:27 ejegg: re-started refund queue consumer
  • 21:25 ejegg: updated CiviCRM from ca8acdccf6 to 4c6a9c9c1c
  • 21:20 reedy@tin: Synchronized php-1.32.0-wmf.3/includes/DefaultSettings.php: Add default edit rate limit of 90 edits/minute for all users (duration: 01m 20s)
  • 21:18 reedy@tin: Synchronized php-1.32.0-wmf.2/includes/DefaultSettings.php: Add default edit rate limit of 90 edits/minute for all users (duration: 01m 20s)
  • 20:59 ejegg: disabled refund queue consumer
  • 20:30 maxsem@tin: Synchronized php-1.32.0-wmf.2/extensions/CongressLookup/: https://gerrit.wikimedia.org/r/#/c/432146/ (duration: 01m 20s)
  • 20:28 maxsem@tin: Synchronized php-1.32.0-wmf.3/extensions/CongressLookup/: https://gerrit.wikimedia.org/r/#/c/432146/ (duration: 01m 19s)
  • 20:21 arlolra: Updated Parsoid to 5ce2608 (T194081, T188118)
  • 20:17 arlolra@tin: Finished deploy [parsoid/deploy@181e3b1]: Updating Parsoid to 5ce2608 (duration: 08m 52s)
  • 20:16 ejegg: restarted refund queue consumer
  • 20:08 arlolra@tin: Started deploy [parsoid/deploy@181e3b1]: Updating Parsoid to 5ce2608
  • 20:01 twentyafterfour@tin: Synchronized php: group1 wikis to 1.32.0-wmf.3 (duration: 01m 20s)
  • 20:00 twentyafterfour: group1 to 1.32.0-wmf.3 refs T191049
  • 20:00 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.3
  • 20:00 ejegg: updated CiviCRM from 4ff35ad4 to ca8acdccf6
  • 19:53 ottomata: rolling restart eventbus service to deploy logstash config
  • 19:51 ejegg: updated SmashPig standalone from 99585c8084 to 2b4186715e
  • 19:51 otto@tin: Finished deploy [eventlogging/eventbus@aa9eb2c]: logstash T193230 (duration: 01m 17s)
  • 19:49 otto@tin: Started deploy [eventlogging/eventbus@aa9eb2c]: logstash T193230
  • 19:49 mutante: graphite1001/2001 - rm check_uwsgi-coal NRPE check config, reloading nagios-nrpe-server (T194283)
  • 19:48 otto@tin: Finished deploy [eventlogging/eventbus@aa9eb2c]: logstash T193230 (duration: 00m 15s)
  • 19:48 otto@tin: Started deploy [eventlogging/eventbus@aa9eb2c]: logstash T193230
  • 19:38 otto@tin: Finished deploy [eventlogging/eventbus@c70e8c5]: logstash - T193230 (duration: 03m 33s)
  • 19:36 mutante: graphite1001, graphite2001 - deleting uwsgi-coal and coal systemd unit files; systemctl daemon-reload (T194283)
  • 19:35 otto@tin: Started deploy [eventlogging/eventbus@c70e8c5]: logstash - T193230
  • 19:26 mutante: mw2145, mw2146, mw2147 - reinstall with stretch, depooled, downtimed
  • 19:23 Krinkle: Stop and disable coal-web (uwsgi-coal) service on graphite1001/graphite2001 (T194283)
  • 19:23 Krinkle: Disable coal service on graphite1001/graphite2001 (T194283)
  • 18:41 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2212.codfw.wmnet
  • 18:39 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2213.codfw.wmnet
  • 18:37 sbisson@tin: Finished deploy [tilerator/deploy@a86f8f8]: Make tilerator store up to zoom 15 (duration: 06m 36s)
  • 18:37 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2214.codfw.wmnet
  • 18:31 sbisson@tin: Started deploy [tilerator/deploy@a86f8f8]: Make tilerator store up to zoom 15
  • 18:29 sbisson@tin: Finished deploy [kartotherian/deploy@ef61ad7]: Make kartotherian serve up to z15 (duration: 02m 20s)
  • 18:27 sbisson@tin: Started deploy [kartotherian/deploy@ef61ad7]: Make kartotherian serve up to z15
  • 18:22 imarlier@tin: Finished deploy [performance/coal@8e57e4a]: Deploy only to webperf (duration: 00m 06s)
  • 18:22 imarlier@tin: Started deploy [performance/coal@8e57e4a]: Deploy only to webperf
  • 18:19 moritzm: installing jenkins security updates on releases*
  • 18:04 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/432030/ (duration: 01m 21s)
  • 18:00 maxsem@tin: Finished scap: Deploy CongressLookup on testwiki T194230 (duration: 105m 19s)
  • 16:56 ottomata: disabled 0.9 MirrorMaker on kafka102[023], enabled 1.x MirrorMaker on kafka-jumbo*
  • 16:39 gehel: restarting blazegraph and updater on wdqs1003
  • 16:15 maxsem@tin: Started scap: Deploy CongressLookup on testwiki T194230
  • 15:56 urandom: starting revision cleanup job, wikipedia_T_mobile__ng_lead keyspace - T192689
  • 15:31 thcipriani: upgrading jenkins on contint2001/contint1001
  • 15:26 vgutierrez: Replacing lvs1003 with lvs1016 - T184293
  • 15:22 mutante: mw2212,mw2213,mw2214 - reinstall with stretch
  • 14:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@58935d5]: Allow protocol version negotiation. T167039 (duration: 00m 34s)
  • 14:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@58935d5]: Allow protocol version negotiation. T167039
  • 14:41 ppchelko@tin: Finished deploy [changeprop/deploy@e468d8e]: Allow protocol version negotiation. T167039 (duration: 00m 53s)
  • 14:40 ppchelko@tin: Started deploy [changeprop/deploy@e468d8e]: Allow protocol version negotiation. T167039
  • 14:27 ejegg: disabled refund queue consumer
  • 13:59 ottomata: beginning upgrade of Kafka main-eqiad cluster from 0.9.0.1 to 1.1.0 - T167039
  • 13:55 milimetric@tin: Finished deploy [analytics/refinery@a5a8cbc]: Renaming geoeditors druid datasource (duration: 05m 27s)
  • 13:49 milimetric@tin: Started deploy [analytics/refinery@a5a8cbc]: Renaming geoeditors druid datasource
  • 13:28 zeljkof: EU SWAT finished
  • 13:27 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable mapframe on all but a few wikis (T191585) (duration: 01m 20s)
  • 13:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove unused wgKartographerDfltStyle setting (T191655) (duration: 01m 20s)
  • 13:10 dcausse@tin: Synchronized wmf-config/InitialiseSettings.php: T192064 [cirrus] Increase the number of shards for wikidatawiki_content, enwiki_general (duration: 01m 20s)
  • 13:08 elukey: reimage analytics103[2,3] to Debian Stretch
  • 13:05 milimetric@tin: Finished deploy [analytics/refinery@640bc35]: Renaming geoeditors druid datasource (duration: 06m 11s)
  • 12:59 milimetric@tin: Started deploy [analytics/refinery@640bc35]: Renaming geoeditors druid datasource
  • 11:26 moritzm: installing wget security updates on Debian systems
  • 11:14 moritzm: reimage mw2206 (earlier reimage failed since the host lacked a puppet cert)
  • 11:03 moritzm: updated jenkins packages
  • 10:55 moritzm: reimaging mw2246 to stretch (video scaler with a deprecated one-off partman recipe)
  • 10:50 demon@tin: Synchronized wmf-config/missing.php: remove vendor dep (duration: 01m 20s)
  • 10:45 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 01m 19s)
  • 10:40 demon@tin: Synchronized multiversion/getMWVersion: remove vendor dep (duration: 03m 27s)
  • 10:30 godog: cycle-load edac kernel modules for scb1002 to reset counters
  • 10:28 godog: cycle-load edac kernel modules for cp1068 to reset counters
  • 10:02 demon@tin: Synchronized docroot/search.wikimedia.org/index.php: improve 5xx/4xx error handling (duration: 01m 27s)
  • 09:29 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1064 with full weight (duration: 01m 27s)
  • 08:52 moritzm: reimaging mw1336, mw1337 (job runners) to stretch
  • 08:19 no_justification: gerrit: restarting for version bump 2.14.7 -> 2.14.8
  • 08:18 jynus: stop and upgrade db1053
  • 08:17 demon@tin: Finished deploy [gerrit/gerrit@c421c91]: 2.14.7 -> 2.14.8 (duration: 00m 11s)
  • 08:17 demon@tin: Started deploy [gerrit/gerrit@c421c91]: 2.14.7 -> 2.14.8
  • 07:12 moritzm: reimaging mw2152 to stretch (video scaler with a deprecated one-off partman recipe)
  • 07:10 moritzm: reimaging mw2162 to stretch (last jessie job runner in codfw)
  • 07:05 moritzm: reimaging mw1334, mw1335 (job runners) to stretch
  • 06:04 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2211.codfw.wmnet
  • 06:01 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2210.codfw.wmnet
  • 06:00 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2209.codfw.wmnet
  • 05:26 marostegui: Stop MySQL on db2092 to do some clean ups
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 after alter table (duration: 01m 36s)
  • 05:19 marostegui: Stop slave on db1116:s3 to do some gtid cleanups and tests
  • 04:02 l10nupdate@tin: ResourceLoader cache refresh completed at Wed May 9 04:02:56 UTC 2018 (duration 7m 26s)
  • 03:55 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.3) (duration: 16m 34s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.2) (duration: 09m 06s)
  • 02:09 maxsem@tin: Synchronized wmf-config: Preparation for T194230 (duration: 01m 15s)
  • 02:06 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Preparation for T194230 (duration: 01m 16s)
  • 02:02 maxsem@tin: Synchronized php-1.32.0-wmf.2/extensions/CongressLookup/: Preparation for T194230 (duration: 01m 17s)
  • 02:00 maxsem@tin: Synchronized php-1.32.0-wmf.3/extensions/CongressLookup/: Preparation for T194230 (duration: 01m 22s)
  • 01:36 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2211.codfw.wmnet
  • 01:36 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2210.codfw.wmnet
  • 01:36 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2209.codfw.wmnet
  • 01:36 XioNoX: progressively push updated BGP_sanitize_in prefix-length-range to routers - T190317
  • 01:33 mutante: mw2209,mw2210,mw2211 - reinstall with stretch
  • 01:27 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2204.codfw.wmnet
  • 01:26 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2203.codfw.wmnet
  • 01:22 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2208.codfw.wmnet
  • 01:20 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2207.codfw.wmnet
  • 01:19 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2205.codfw.wmnet
  • 00:54 ejegg: running refund queue consumer overtime
  • 00:35 maxsem@tin: Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/432022/ - noop in prod (duration: 01m 20s)

2018-05-08

  • 23:58 thcipriani@tin: Synchronized php-1.32.0-wmf.2/extensions/WikibaseQualityConstraints/src/ConstraintCheck/Helper/LoggingHelper.php: SWAT: Do not try to access null message message key T194140 (duration: 01m 32s)
  • 23:42 XioNoX: progressively push BGP_sanitize_in as-path too-many-hops to routers - T190317
  • 23:25 ejegg: updated CiviCRM from d9d412e496 to 4ff35ad4bb
  • 23:22 thcipriani@tin: Synchronized wmf-config: SWAT: Add string and external-id types to Wikibase indexing T163642 T99899 (duration: 01m 26s)
  • 23:12 XioNoX: lowering ospf metric of ulsfo-codfw to 390
  • 22:53 awight@tin: Finished deploy [ores/deploy@bf182e2]: Rollback ores1002 to master (duration: 00m 19s)
  • 22:52 awight@tin: Started deploy [ores/deploy@bf182e2]: Rollback ores1002 to master
  • 22:49 awight@tin: Finished deploy [ores/deploy@5b27205]: Deploy LFS files to ores1002 (duration: 01m 59s)
  • 22:47 awight@tin: Started deploy [ores/deploy@5b27205]: Deploy LFS files to ores1002
  • 22:35 XioNoX: remove PREFERRED-TRANSIT Tele2-DTAG from esams/knams routers
  • 22:05 XioNoX: progressively push updated BGP_sanitize_in bogon ASN filters to routers - T190317
  • 21:59 twentyafterfour: MediaWiki train for 1.32.0-wmf.3 group0 is complete. Will resume with group1 tomorrow, same bat time, same bat channel (refs T191049)
  • 21:45 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.32.0-wmf.3
  • 21:36 eileen: civicrm revision changed from 81e54c850d to d9d412e496, config revision is 96fbded693
  • 21:09 twentyafterfour@tin: Finished scap: testwikis wikis to 1.32.0-wmf.3 (duration: 114m 21s)
  • 20:59 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2204.codfw.wmnet
  • 20:58 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2203.codfw.wmnet
  • 20:24 milimetric@tin: Finished deploy [analytics/refinery@2a4633c]: Deploying renamed geowiki jobs as geoeditors (duration: 07m 07s)
  • 20:17 milimetric@tin: Started deploy [analytics/refinery@2a4633c]: Deploying renamed geowiki jobs as geoeditors
  • 20:02 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2201.codfw.wmnet
  • 20:01 mutante: mw2205,mw2206,mw2207 - reinstalling with stretch - mw2202 - wmf-auto-reimage failed: Timeout of 60 minutes reached waiting for reboot
  • 19:15 twentyafterfour@tin: Started scap: testwikis wikis to 1.32.0-wmf.3
  • 19:14 twentyafterfour: testwikis to 1.32.0-wmf.3 - https://gerrit.wikimedia.org/r/#/c/431821/ refs T191049
  • 19:11 twentyafterfour: updated mediawiki changelog https://www.mediawiki.org/wiki/MediaWiki_1.32/wmf.3/Changelog refs T191049
  • 18:55 mutante: mw2202, mw2203, mw2204 - reinstall with stretch
  • 18:52 marostegui: Manually fail disk #7 on db1073 to get it replaced
  • 18:22 mutante: mwmaint1001 - rebooting
  • 18:18 twentyafterfour: Branching MediaWiki master to wmf/1.32.0-wmf.3 refs T191049
  • 18:08 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2252.codfw.wmnet
  • 18:07 andrew@tin: Finished deploy [horizon/deploy@9245ca9]: rolling out member dashboard (duration: 03m 18s)
  • 18:06 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2251.codfw.wmnet
  • 18:03 andrew@tin: Started deploy [horizon/deploy@9245ca9]: rolling out member dashboard
  • 17:00 bawolff: Clear botpassword throttle for User:TaxonBot (T194160)
  • 16:45 thcipriani@tin: Synchronized README: Testing Scap 3.8.1-1 (duration: 01m 02s)
  • 16:43 godog: upload scap 3.8.1-1 - T127762
  • 16:40 XioNoX: re-pooling lvs2001 - T193677
  • 16:36 mutante: mwmaint1001 - reinstalling one more time after proxysql issues are resolved, PXE booting (T192092)
  • 16:27 mutante: mw2251,mw2252,mw2201 - reinstall with stretch
  • 16:22 herron: cleared low count edac counters on hosts mw2205 dbstore1002 db1051 elastic1029 T183177
  • 16:19 urandom: force (split) compaction of wikipedia_T_mobile__ng_lead.data, restbase1016 - T192689
  • 16:15 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2223.codfw.wmnet
  • 16:14 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2222.codfw.wmnet
  • 16:10 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2215.codfw.wmnet
  • 16:09 XioNoX: failing traffic over lvs2004 - T193677
  • 16:04 ppchelko@tin: Finished deploy [changeprop/deploy@e468d8e]: Allow protocol version negotiation. Codfw only. T167039 (duration: 01m 03s)
  • 16:03 ppchelko@tin: Started deploy [changeprop/deploy@e468d8e]: Allow protocol version negotiation. Codfw only. T167039
  • 16:01 ppchelko@tin: Finished deploy [cpjobqueue/deploy@58935d5]: Allow protocol version negotiation. Codfw only. T167039 (duration: 00m 42s)
  • 16:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@58935d5]: Allow protocol version negotiation. Codfw only. T167039
  • 15:53 mutante: switching performance.wikimedia.org from graphite to webperf backends - running puppet on cache::misc servers (T158837)
  • 15:46 demon@tin: Pruned MediaWiki: 1.32.0-wmf.1 [keeping static files] (duration: 01m 47s)
  • 15:30 XioNoX: starting pybal on lvs2001 - T193677
  • 15:26 godog: (un)load edac kernel modules on thumbor1004 to test resetting counters - T183177
  • 15:09 XioNoX: stopping pybal on lvs2001 - T193677
  • 15:06 ottomata: beginning Kafka upgrade of main-codfw: T167039
  • 14:53 XioNoX: re-enable pybal on lvs2004 - T193677
  • 14:48 XioNoX: disabling pybal on lvs2004 - T193677
  • 14:37 mutante: LDAP: added 'sbailey' to group 'wmf' (T194091)
  • 14:19 ppchelko@tin: Started restart [changeprop/deploy@7e86531]: Restart changeprop to try forcing it rebalancing topics
  • 14:15 mutante: mw2215,mw2222,mw2223 - reinstalling with stretch
  • 13:43 zeljkof: EU SWAT finished
  • 13:42 zfilipin@tin: Synchronized php-1.32.0-wmf.2/extensions/Translate: SWAT: Refactor TranslationUpdateJob to use only primitive types for parameters (T192111) (duration: 01m 11s)
  • 13:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable maps i18n everywhere (T191655) (duration: 01m 00s)
  • 13:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AdvancedSearch BetaFeature on all wikis (T193182) (duration: 01m 00s)
  • 13:02 marostegui: Manually fail disk #9 on db1073 to get it replaced
  • 12:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1055 (duration: 00m 59s)
  • 12:19 moritzm: reimaging mw2159, mw2160, mw2161 (job runners) to stretch
  • 12:18 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1055 (duration: 00m 59s)
  • 12:17 moritzm: upgrading app servers in beta to wikidiff 1.6.0 (T190717)
  • 12:16 moritzm: upgrading app servers in beta to
  • 12:02 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1064 with low load (duration: 00m 59s)
  • 11:36 marostegui: Deploy schema change on db1103:3314 - T191519 T188299 T190148
  • 11:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 for alter table (duration: 00m 59s)
  • 11:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Really depool db2092 (duration: 00m 53s)
  • 10:29 moritzm: reimaging mw1347, mw1348 (API servers) to stretch (last two remaining API servers in eqiad)
  • 10:22 jynus: stop mariadb on db1055 to clone it to db1064
  • 10:15 moritzm: reimaging mw1310, mw1311 (job runners) to stretch
  • 09:58 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1055 (duration: 00m 54s)
  • 09:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1121 after alter table (duration: 01m 00s)
  • 09:20 elukey: forced a BBU re-learn cycle on analytics1032
  • 09:17 gehel: reducing replication factor on cassandra v3 (unused) keyspace for maps
  • 08:56 moritzm: reimaging mw1345, mw1346 (API servers) to stretch
  • 08:30 moritzm: reimaging mw2156, mw2157, mw2158 (job runners) to stretch
  • 08:27 moritzm: reimaging mw1308, mw1309 (job runners) to stretch
  • 08:03 marostegui: Stop MySQL on db1116 to transfer its content to db2092 - T190704
  • 07:59 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2092 T190704 (duration: 00m 57s)
  • 07:53 elukey: second attempt to remove the cassandra-metrics-collector (+ cleanup) from aqs*
  • 07:30 jynus: cleaning up maintenance hosts (terbium, etc.) from tendril maintenance files
  • 06:51 marostegui: Stop MySQL on db1060 as it will be decommissioned - T193732
  • 06:50 moritzm: reimaging mw1313, mw1343, mw1344 to stretch
  • 06:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1060 from config - T193732 (duration: 01m 01s)
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1060 from config - T193732 (duration: 00m 59s)
  • 06:05 marostegui: Read_only=off on db1069 to finish with the x1 failover
  • 06:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Promote db1069 as new x1 master (duration: 01m 00s)
  • 06:00 marostegui: Set db1055 read only
  • 06:00 marostegui: Start x1 failover
  • 05:41 marostegui: Move db2034 under db1069 for x1 failover - T186320
  • 05:36 marostegui: Move dbstore1002:x1 under db1069 for x1 failover - T186320
  • 05:29 marostegui: Disable puppet on db1055 and db1069 before x1 failover - T186320
  • 05:28 marostegui: Disable gtid on db1069 and db2034 before x1 failover - T186320
  • 05:26 marostegui: Deploy schema change on db1121 with replication (this will generate lag on labs on s4) - T191519 T188299 T190148
  • 05:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1121 for alter table (duration: 01m 00s)
  • 05:19 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011 - https://phabricator.wikimedia.org/T174047
  • 05:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 after alter table (duration: 01m 00s)
  • 04:27 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2221.codfw.wmnet
  • 04:26 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2220.codfw.wmnet
  • 04:24 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2219.codfw.wmnet
  • 02:33 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.2) (duration: 05m 45s)
  • 00:12 ejegg: updated CiviCRM from 9752607052 to 81e54c850d

2018-05-07

  • 23:41 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2218.codfw.wmnet
  • 23:39 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2217.codfw.wmnet
  • 23:39 mutante: mw2219,mw2220,mw2221 - reinstall with stretch
  • 23:37 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2216.codfw.wmnet
  • 22:31 bstorm_: labsdb1009,labsdb1010,labsdb1011 are now on up-to-date views per T174047
  • 22:23 ppchelko@tin: Started restart [changeprop/deploy@7e86531]: Restart changeprop to try forcing it rebalancing topics
  • 21:28 mutante: mw2216,mw2217,mw2218 - wmf-auto-reimage --conftool , reinstall with stretch
  • 21:25 XioNoX: re-pool eqsin - T193897
  • 20:48 imarlier@tin: Started restart [performance/coal@50fe0dd]: Restart coal-web service everywhere, hopefully
  • 20:48 arlolra: Updated Parsoid to 6e38948 (T192909)
  • 20:41 arlolra@tin: Finished deploy [parsoid/deploy@cd5e875]: Updating Parsoid to 6e38948 (duration: 12m 25s)
  • 20:30 otto@tin: Finished deploy [statsv/statsv@c186340]: Configure api.version via CLI opt -- prep for Kafka main upgrade T167039 (duration: 00m 05s)
  • 20:30 otto@tin: Started deploy [statsv/statsv@c186340]: Configure api.version via CLI opt -- prep for Kafka main upgrade T167039
  • 20:29 arlolra@tin: Started deploy [parsoid/deploy@cd5e875]: Updating Parsoid to 6e38948
  • 20:25 bsitzmann@tin: Finished deploy [mobileapps/deploy@e20f23d]: Update mobileapps to c1f4de6 (T191538) (duration: 06m 09s)
  • 20:19 bsitzmann@tin: Started deploy [mobileapps/deploy@e20f23d]: Update mobileapps to c1f4de6 (T191538)
  • 19:58 XioNoX: removing onboard ports license from cr1-eqsin config - T193897
  • 17:34 bawolff@tin: Synchronized php-1.32.0-wmf.2/extensions/LoginNotify/includes/Hooks.php: https://gerrit.wikimedia.org/r/#/c/431611/ Do not send loginnotify emails for throttled logins (duration: 01m 08s)
  • 17:29 cmjohnson1: updating f/w lvs1016
  • 17:08 gehel@tin: Finished deploy [wdqs/wdqs@bd4b3ed]: new wdqs GUI, updater and blazegraph (duration: 05m 18s)
  • 17:03 gehel@tin: Started deploy [wdqs/wdqs@bd4b3ed]: new wdqs GUI, updater and blazegraph
  • 16:57 elukey: executed sudo megacli -AdpBbuCmd -BbuLearn -aALL -NoLog on analytics1032 - BBU alerts flapping
  • 16:37 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - https://phabricator.wikimedia.org/T174047
  • 16:26 sbisson@tin: Finished deploy [kartotherian/deploy@9935fdb]: Kartotherian: remove temporary and unused osm-intl-i18n source (duration: 03m 28s)
  • 16:23 sbisson@tin: Started deploy [kartotherian/deploy@9935fdb]: Kartotherian: remove temporary and unused osm-intl-i18n source
  • 15:34 marostegui: Deploy schema change on db1097:3314 - T191519 T188299 T190148
  • 15:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 for alter table (duration: 01m 00s)
  • 15:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 after alter table (duration: 00m 57s)
  • 14:59 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010 - https://phabricator.wikimedia.org/T174047
  • 14:30 imarlier@tin: Finished deploy [performance/coal@50fe0dd]: Coal version that uses the graphite API to fetch data, instead of reading directly from whisper files (duration: 00m 13s)
  • 14:30 imarlier@tin: Started deploy [performance/coal@50fe0dd]: Coal version that uses the graphite API to fetch data, instead of reading directly from whisper files
  • 14:13 herron: upgraded prometheus-jmx-exporter to 0.3.0-1 on puppetdb servers
  • 14:12 herron: puppetdb updates complete — re-enabling puppet agents
  • 14:01 herron: temporarily disabling puppet agents for puppetdb security update
  • 13:36 zeljkof: EU SWAT finished
  • 13:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgKartographerEnableMapFrame to true for mrwiki (T193371) (duration: 01m 00s)
  • 13:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make test and test2 using maps i18n correctly (T191655) (duration: 01m 00s)
  • 12:55 sbisson@tin: Finished deploy [kartotherian/deploy@425f279]: Kartotherian: stop using tilerator_storage_id; Add babel upstream of osm-intl (duration: 05m 41s)
  • 12:50 sbisson@tin: Started deploy [kartotherian/deploy@425f279]: Kartotherian: stop using tilerator_storage_id; Add babel upstream of osm-intl
  • 12:27 moritzm: reimaging mw1333 to stretch (last app server in eqiad)
  • 11:28 moritzm: installing Java security updates on elastic* hosts
  • 10:59 moritzm: reimaging mw1330, mw1331, mw1332 (app servers) to stretch
  • 10:50 moritzm: depooled service nginx for mw1221-mw1231 (API servers)
  • 10:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 10:09 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 00s)
  • 09:52 moritzm: reimaging mw1305, mw1306 (job runners) to stretch
  • 09:52 elukey: stop graphite cassandra-metrics-collector on aqs* (touch /etc/cassandra-metrics-collector/disable)
  • 09:46 moritzm: rolling restart of Kibana logstash nodes to pick up Java security updates
  • 09:41 marostegui: Manually enable innodb_strict_mode on db1084 - T150949
  • 09:30 ema: cp-text/upload: start varnish upgrades to 5.1.3-1wm8 T192368
  • 09:30 marostegui: Deploy schema change on db1084 - T191519 T188299 T190148
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 for alter table (duration: 01m 03s)
  • 09:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 after alter table (duration: 01m 02s)
  • 09:02 moritzm: reimaging mw1275 (T192902)
  • 08:58 mobrovac@tin: Started restart [recommendation-api/deploy@ac66089]: Use the internal WDQS cluster LVS - T190266
  • 08:56 elukey: drain + reimage analytics103[7,8] to Debian Stretch
  • 08:30 godog: eqiad-prod: more weight to ms-be104[0-3] - T190081
  • 08:19 moritzm: rolling restart of logstash to pick up Java security updates
  • 08:19 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 - https://phabricator.wikimedia.org/T174047
  • 08:04 gehel: restart cassandra on maps* for JVM upgrade
  • 07:59 gehel: restart cassandra on maps-test* for JVM upgrade
  • 07:53 moritzm: installing libdatetime-timezone-perl stable update for jessie/stretch
  • 07:38 gehel: restart elasticsearch on relforge for JVM upgrade
  • 07:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: db1074 is now db1102's master - T193732 (duration: 00m 59s)
  • 07:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 - T193732 (duration: 00m 59s)
  • 07:28 marostegui: Change master for db1102:s2 from db1060 to db1074
  • 07:19 marostegui: Stop replication in sync on db1060 and db1074 - T193732
  • 07:19 moritzm: reimaging mw1303, mw1304 (job runners) to stretch
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 - T193732 (duration: 00m 59s)
  • 07:06 moritzm: reimaging mw1233, mw1234, mw1235 (API servers) to stretch
  • 07:02 moritzm: reimaging mw1327, mw1328, mw1329 (app servers) to stretch
  • 05:37 marostegui: Unused databases devwikiinternal and rel13testwiki from s3 - T118764
  • 05:27 marostegui: Deploy schema change on db1081 - T191519 T188299 T190148
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 for alter table (duration: 01m 01s)
  • 05:15 marostegui: Deploy schema change on s6 primary master db1061 - T191519 T188299 T190148
  • 03:06 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.2) (duration: 11m 22s)

2018-05-05

  • 03:36 XioNoX: commenting rigel out from smokeping
  • 02:16 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2191.codfw.wmnet
  • 02:03 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2193.codfw.wmnet
  • 01:37 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2187.codfw.wmnet
  • 00:35 Krinkle: Delete navtiming 'mediaWikiLoadComplete' metrics (T160315)
  • 00:26 Krinkle: Purging values for navtiming 'mediaWikiLoadEnd' sub-properties from 2018-04-21 to 2018-05-04T04:40:00 (T160315)

2018-05-04

  • 22:50 mutante: mwmaint1001 - now using mw-maintenance role, upcoming terbium replacement
  • 22:49 mutante: mw2187 - scap proxy - reinstalling with stretch
  • 22:46 mutante: mw2191, mw2193 - wmf-auto-reimage with --no-verify because puppet certs didn't exist
  • 22:02 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2199.codfw.wmnet
  • 22:00 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2198.codfw.wmnet
  • 21:55 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2197.codfw.wmnet
  • 19:50 bblack: rebooting lvs1016 (downtimed, also new and not in service!)
  • 19:02 sbisson@tin: Finished deploy [kartotherian/deploy@8e6b35b]: Use new keyspace (v4) for both i18n and non-i18n sources (duration: 03m 57s)
  • 18:58 sbisson@tin: Started deploy [kartotherian/deploy@8e6b35b]: Use new keyspace (v4) for both i18n and non-i18n sources
  • 18:38 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2196.codfw.wmnet
  • 18:36 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2194.codfw.wmnet
  • 18:35 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2195.codfw.wmnet
  • 18:16 mutante: mw2197,mw2198,mw2199 - reinstall with stretch
  • 18:12 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2192.codfw.wmnet
  • 18:05 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2190.codfw.wmnet
  • 18:03 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2189.codfw.wmnet
  • 17:16 XioNoX: depooled eqsin
  • 16:57 bawolff: adjusted login throttling code (T193762)
  • 16:50 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2196.codfw.wmnet
  • 16:50 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2194.codfw.wmnet
  • 16:50 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2195.codfw.wmnet
  • 16:50 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2194.codfw.wmnet
  • 16:49 dzahn@neodymium: conftool action : set/pooled=no; selector: name=mw2196.codfw.wmnet
  • 16:25 XioNoX: adding BGP graceful shutdown to routers - T190323
  • 16:19 bawolff: Logging adjustment in mediawiki for T193762
  • 15:57 arturo: enabled puppet in labcontrol1001, labnodepool100[1-2] and labtestvirt10[01-22] after patches deployed for T193657
  • 15:16 mutante: mw2194, mw2195, mw2196 - reinstall with stretch - mw2193 - puppet cert not found
  • 15:06 arturo: disabling puppet in labnodepool100[1-2] for T193657
  • 15:05 arturo: disabling puppet in labcontrol1001 for T193657
  • 14:52 arturo: disabling puppet in labvirt10[01-22] to deploy https://gerrit.wikimedia.org/r/#/c/430581/ and https://gerrit.wikimedia.org/r/#/c/430614/ T193657
  • 14:05 mutante: mw2189, mw2190, mw2192 - reinstall with stretch, mw2191 - puppet cert not found
  • 13:33 marostegui: Manually enable innodb_strict_mode on labsdb1009 - T150949
  • 12:09 hashar: restarted Jenkins on contint1001 (java update)
  • 12:04 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2081 (duration: 00m 59s)
  • 11:22 jynus: stopping db1056 and moving it to spare
  • 10:55 marostegui: Manually enable innodb_strict_mode just on dbstore2001:3315 - T150949
  • 10:51 moritzm: reimaging mw2153, mw2154, mw2155 (job runners) to stretch
  • 09:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1056 (duration: 01m 00s)
  • 09:39 moritzm: uploaded openjdk-8 8u171-b11 for jessie-wikimedia to apt.wikimedia.org
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully pool db1119 in s1 API (duration: 00m 59s)
  • 09:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase API traffic for db1119 (duration: 00m 59s)
  • 08:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1119 (duration: 01m 07s)
  • 08:45 moritzm: reimaging mw2249, mw2250, mw2253 (job runners) to stretch
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1119 in s1 API (duration: 00m 59s)
  • 08:30 mobrovac@tin: Finished deploy [cpjobqueue/deploy@193cf6f]: Config: Exclude refreshLinks from the RegEx rule (duration: 00m 47s)
  • 08:29 mobrovac@tin: Started deploy [cpjobqueue/deploy@193cf6f]: Config: Exclude refreshLinks from the RegEx rule
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool newly cloned db1119 (duration: 01m 00s)
  • 07:08 moritzm: reimaging mw2243, mw2247, mw2248 (job runners) to stretch
  • 07:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1119 to the config (duration: 01m 20s)
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1119 to the config (duration: 01m 06s)
  • 06:42 marostegui: Stop MySQL on db1066 to clone db1119 - T192979
  • 06:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T192979 (duration: 01m 11s)
  • 05:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify that db1060 will be decommissioned (duration: 00m 53s)
  • 05:21 marostegui: Deploy schema change on dbstore1002:s4 - T191519 T188299 T190148
  • 04:24 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.2
  • 04:17 demon@tin: Synchronized multiversion/getMWVersion: clean up getRealmSpecificFilename() (duration: 01m 07s)
  • 04:07 demon@tin: Synchronized wmf-config/InitialiseSettings.php: disabling LQT on a few closed/unloved testwikis (duration: 01m 11s)
  • 03:55 demon@tin: Synchronized scap/plugins: No-op plugin style fixes (duration: 01m 11s)
  • 00:55 niharika29@tin: Synchronized wmf-config/CommonSettings.php: Temporary low-level activation of Eventlogging impression data for testing T183978 (duration: 01m 16s)
  • 00:50 niharika29@tin: Synchronized wmf-config/CommonSettings.php: Temporarily add a higher threshold to trigger login attempt notices T193762 (duration: 01m 17s)
  • 00:01 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.2

2018-05-03

  • 23:56 demon@tin: Synchronized php: symlink bump (duration: 01m 16s)
  • 23:43 demon@tin: Synchronized scap/plugins/: cleanup, no-op (duration: 01m 17s)
  • 23:16 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2186.codfw.wmnet
  • 23:13 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2185.codfw.wmnet
  • 23:12 krinkle@tin: Synchronized php-1.32.0-wmf.2/extensions/NavigationTiming/: If293a156ca / T193570 (duration: 01m 16s)
  • 23:10 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2188.codfw.wmnet
  • 23:09 krinkle@tin: Synchronized php-1.32.0-wmf.1/extensions/NavigationTiming/: If293a156cac / T193570 (duration: 01m 17s)
  • 22:52 aaron@tin: Finished scap: Deploy db9acea (bug T193668) (duration: 103m 50s)
  • 22:21 ejegg: updated CiviCRM from 0fdef242a3 to 9752607052
  • 21:21 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2183.codfw.wmnet
  • 21:18 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2184.codfw.wmnet
  • 21:08 aaron@tin: Started scap: Deploy db9acea (bug T193668)
  • 20:34 XioNoX: started peering/transit with Deutsche Telekom on cr2-esams
  • 20:30 mutante: mw1297 - reinstalling as mwmaint1001
  • 20:09 krinkle@tin: Synchronized php-1.32.0-wmf.1/extensions/NavigationTiming: I1e7f091cba1 (duration: 01m 18s)
  • 20:00 mutante: mw2185,mw2186,mw2188 - reinstall with stretch
  • 19:53 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2254.codfw.wmnet
  • 19:45 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2240.codfw.wmnet
  • 19:34 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2231.codfw.wmnet
  • 19:32 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2229.codfw.wmnet
  • 19:18 mutante: mw1297 - puppet node clean, puppet node deactivate - renaming to mwmaint1001 (T192185)
  • 19:02 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: CentralNotice EventLogging banner impression data test T183978 (duration: 01m 04s)
  • 18:59 thcipriani@tin: Synchronized php-1.32.0-wmf.2/extensions/NavigationTiming/modules/ext.navigationTiming.js: SWAT: Emit SaveTiming without relying on getNavTiming() T193693 (duration: 01m 16s)
  • 18:49 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2240.codfw.wmnet
  • 18:48 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2229.codfw.wmnet
  • 18:48 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mw2231.codfw.wmnet
  • 18:44 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ULS webfonts by default at Bengali Wikisource T193367 (duration: 01m 18s)
  • 18:09 mutante: mw2229,mw2231,mw2240 - wmf-auto-reimage with --new switch because their puppet cert wasn't found on puppetmaster, treated as new hosts that didn't exist before
  • 17:59 mutante: mw2254,mw2183,mw2184 - wmf-auto-reimage with stretch and raid/lvm
  • 17:36 ejegg: updated CiviCRM from 401344fb30 to 0fdef242a3
  • 16:34 Jeff_Green: authdns-update to add frbast.wikimedia.org service alias
  • 15:04 imarlier@tin: Finished deploy [performance/coal@762d160]: verify coal is deploying properly after shutdown on graphite hosts (duration: 00m 14s)
  • 15:04 imarlier@tin: Started deploy [performance/coal@762d160]: verify coal is deploying properly after shutdown on graphite hosts
  • 14:47 marostegui: Manually set offline disk #1 on db1063 so it can be replaced
  • 14:44 imarlier@tin: Started deploy [performance/coal@bd7568a]: verify coal is deploying properly after shutdown on graphite hosts
  • 14:13 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1056 (duration: 01m 17s)
  • 14:00 ema: cp3030 (text): upgrade varnish to 5.1.3-1wm8 T192368
  • 13:21 addshore: swat done
  • 13:19 addshore@tin: Synchronized wmf-config/: Switch to extension.json for Wikidata.org (duration: 01m 19s)
  • 13:13 addshore@tin: Synchronized wmf-config/: Switch to extension.json for PropertySuggester (duration: 01m 35s)
  • 13:05 godog: stop and mask coal service on graphite hosts - T186774
  • 13:04 gehel: rolling restart of wdqs for jvm upgrade
  • 13:03 marostegui: Deploy schema change on s4 codfw master db2051 with replication (this will generate lag on codfw) - T191519 T188299 T190148
  • 12:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Revert: Clarify that db1060 is running an alter table (duration: 01m 17s)
  • 12:26 moritzm: installing openjdk-8 security updates on stretch-based Hadoop workers
  • 11:57 moritzm: reimaging mw1301, mw1302 (job runners) to stretch
  • 10:51 moritzm: reimaging mw1319, mw1325, mw1326 (app servers) to stretch
  • 09:29 moritzm: reimaging mw1227, mw1231, mw1232 (API servers) to stretch
  • 09:09 gehel: rolling restart of elasticsearch completed - T191543 / T191236
  • 09:06 moritzm: reimaging mw1300 (job runner) to stretch
  • 08:33 moritzm: installing Java security updates on wdqs*
  • 08:31 addshore@tin: Synchronized php-1.32.0-wmf.2/extensions/WikimediaEvents/WikimediaEventsHooks.php: T191500 Update campaign prefix for onBeforeInitializeWMDECampaign hook (duration: 01m 16s)
  • 08:29 addshore@tin: Synchronized php-1.32.0-wmf.1/extensions/WikimediaEvents/WikimediaEventsHooks.php: T191500 Update campaign prefix for onBeforeInitializeWMDECampaign hook (duration: 01m 17s)
  • 08:27 mobrovac@tin: Finished deploy [changeprop/deploy@7e86531]: Bug fix: Resubscribe to the proper list of topics on metadata change (duration: 01m 12s)
  • 08:26 mobrovac@tin: Started deploy [changeprop/deploy@7e86531]: Bug fix: Resubscribe to the proper list of topics on metadata change
  • 08:18 mobrovac@tin: Finished deploy [cpjobqueue/deploy@5c1dcb9]: Bug fix: Resubscribe to the proper list of topics on metadata change (duration: 00m 54s)
  • 08:17 mobrovac@tin: Started deploy [cpjobqueue/deploy@5c1dcb9]: Bug fix: Resubscribe to the proper list of topics on metadata change
  • 08:13 ema: cp-misc: upgrade varnish to 5.1.3-1wm8 T192368
  • 08:08 godog: eqiad-prod: more weight to ms-be104[0-3] - T190081
  • 07:54 moritzm: reimaging mw1284, mw1289, mw1290 (API servers) to stretch
  • 07:39 moritzm: reimaging mw1256, mw1257, mw1258 (app servers) to stretch
  • 07:26 marostegui: Drop table flaggedrevs from eswikibooks - T193676
  • 07:11 moritzm: reimaging mwdebug1002 to stretch
  • 05:59 marostegui: Drop mostly empty flagged* tables from metawiki (s7) - T193390
  • 05:57 elukey: reimage analytics10[39,40] to Debian Stretch
  • 05:31 marostegui: Drop empty flagged* tables from eswiki (s7) - T193678
  • 05:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify that db1060 is running an alter table (duration: 01m 15s)
  • 05:22 marostegui: Deploy schema change on db1060 with replication (this will generate lag on labs - s2) - T191519 T188299 T190148
  • 03:32 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2182.codfw.wmnet
  • 03:28 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2181.codfw.wmnet
  • 03:26 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2180.codfw.wmnet
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.1) (duration: 07m 20s)
  • 00:33 krinkle@tin: Synchronized php-1.32.0-wmf.1/extensions/NavigationTiming/modules/: Ie77e77de3b8 (duration: 01m 18s)
  • 00:20 mutante: mw2180,mw2181,mw2182 - reinstalling with stretch (in case there are alerts that's why)
  • 00:02 twentyafterfour: no phabricator upgrade tonight.

2018-05-02

  • 23:39 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2179.codfw.wmnet
  • 23:35 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2178.codfw.wmnet
  • 23:29 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2177.codfw.wmnet
  • 23:17 catrope@tin: Synchronized php-1.32.0-wmf.2/extensions/Kartographer/: Add maintenance script to purge pages with map tags (T193525) (duration: 01m 18s)
  • 22:24 XioNoX: Re-enabled the link between fasw-eqiad and pfw3b-eqiad (backup) - T192104
  • 22:18 ebernhardson: start reindex of viwiki on eqiad elasticsearch, failed on last run due to unrelated issues
  • 22:17 XioNoX: Disabling the link between fasw-eqiad and pfw3b-eqiad (backup) - T192104
  • 22:17 XioNoX: Re-enabled the link between fasw-codfw and pfw3b-codfw (backup) - T192104
  • 21:42 XioNoX: Disabling the link between fasw-codfw and pfw3b-codfw (backup) - T192104
  • 21:33 XioNoX: failing-over RG1 to node0 on pfw3-codfw - T192104
  • 19:54 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2173.codfw.wmnet
  • 19:38 mutante: mw2173 - scap pull (wasn't pooled but should have, bring up to date)
  • 19:25 thcipriani@tin: rebuilt and synchronized wikiversions files: revert group1 to 1.32.0-wmf.2
  • 19:23 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.32.0-wmf.2
  • 19:18 mutante: mw2177, mw2178, mw2179 - reinstalling with stretch
  • 19:14 ppchelko@tin: Finished deploy [restbase/deploy@1093d1d]: Sample log action api 4xx with 1% probability (duration: 16m 18s)
  • 19:10 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2176.codfw.wmnet
  • 19:08 dzahn@neodymium: conftool action : set/pooled=yes; selector: name=mw2175.codfw.wmnet
  • 18:58 ppchelko@tin: Started deploy [restbase/deploy@1093d1d]: Sample log action api 4xx with 1% probability
  • 17:59 mutante: mw2174 - repooled
  • 17:58 imarlier@tin: Started deploy [performance/coal@bd7568a]: deploy coal to webperf1001
  • 17:55 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on wikis with <100 high-prio issues (T192299) (duration: 01m 17s)
  • 17:48 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on metawiki (T192386) (duration: 01m 17s)
  • 17:40 imarlier@tin: Finished deploy [performance/coal@bd7568a]: deploy coal to webperf1001 (duration: 00m 06s)
  • 17:40 imarlier@tin: Started deploy [performance/coal@bd7568a]: deploy coal to webperf1001
  • 17:31 catrope@tin: Synchronized php-1.32.0-wmf.2/extensions/MobileFrontend/: T193564 (duration: 01m 20s)
  • 17:29 catrope@tin: Synchronized php-1.32.0-wmf.2/extensions/Kartographer/: Add missing util dependency (duration: 01m 14s)
  • 17:12 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgCiteResponsiveReferences on kowiki (T193491) (duration: 01m 17s)
  • 16:35 bawolff: run recountCategories.php on huwiki T169964
  • 16:29 gehel: restarting blazegraph to increase TasksMax
  • 15:18 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098 (duration: 01m 17s)
  • 15:07 joal@tin: Finished deploy [analytics/refinery@318d449]: Regular weekly deploy (duration: 08m 46s)
  • 14:58 joal@tin: Started deploy [analytics/refinery@318d449]: Regular weekly deploy
  • 14:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add and pool db1121 (duration: 01m 17s)
  • 14:33 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1121 (duration: 01m 16s)
  • 14:23 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: Update comment next to WMDE wmgMonologChannels entry T191500 (duration: 01m 17s)
  • 13:58 gilles: End of mid-day EU SWAT
  • 13:49 ottomata: beginning upgrade of kafka-jumbo brokers from 1.0.0 -> 1.1.0 : T193495
  • 13:47 vgutierrez: Update puppet compiler facts
  • 13:40 vgutierrez: Repool lvs2001 - T191897
  • 13:37 gilles@tin: Synchronized wmf-config/InitialiseSettings.php: T187299 Add performance perception QuickSurvey definition (duration: 01m 17s)
  • 13:29 elukey: upgrade zookeeper to 3.4.9 on druid100[4-6] (wikistats 2 backend) - T164008
  • 13:20 elukey: restart druid broker on druid100[1-3] to enable the 'druid.sql.enable' feature
  • 13:15 gilles: T193225 mwscript namespaceDupes.php --wiki=euwikisource --fix
  • 13:13 gilles@tin: Synchronized wmf-config/InitialiseSettings.php: T193225 Add Author namespace on eu.wikisource (duration: 01m 20s)
  • 13:05 gilles: Starting mid-day EU SWAT
  • 13:01 moritzm: re-attempting to reimage mw1250, mw1254, mw1255 (app servers) to stretch, those ran into a timeout earlier which is now fixed in the reimage script
  • 12:48 moritzm: reimaging mw1340, mw1341, mw1342 (API servers) to stretch
  • 12:17 vgutierrez: Depool and reimage lvs2001 as stretch - T191897
  • 11:41 vgutierrez: Repool lvs2002 - T191897
  • 11:12 jynus: stopping db1064 for cloning to db1121 (will create temporary lag on commons wikireplicas)
  • 11:02 kartik@tin: Finished deploy [cxserver/deploy@0aa3532]: Update cxserver to a20bf75 (duration: 06m 01s)
  • 10:56 kartik@tin: Started deploy [cxserver/deploy@0aa3532]: Update cxserver to a20bf75
  • 10:48 moritzm: reimaging mw2200 to stretch
  • 10:30 moritzm: installing openjdk-8 security updates on stat hosts
  • 10:25 vgutierrez: Depool and reimage lvs2002 as stretch - T191897
  • 10:21 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 (duration: 01m 17s)
  • 10:17 vgutierrez: Repool lvs2003 - T191897
  • 09:10 vgutierrez: Depool lvs2003 and reimage as stretch - T191897
  • 08:54 moritzm: reimaging mw1250, mw1254, mw1255 (app servers) to stretch
  • 08:36 moritzm: reimaging mw1228, mw1229, mw1230 to stretch (those were logged to SAL before, but failed with IPMI issues before)
  • 08:22 ema: varnish 5.1.3-1wm8 uploaded to apt.w.o T192368
  • 08:11 elukey: upgrading Druid to 0.10 on druid100[4-6] (wikistats 2 backend) - T164008
  • 07:42 elukey: remove openjdk-7 related packages from druid100[1-3] after zookeeper upgrade
  • 07:36 gehel: elasticsearch eqiad rolling restart for plugin update and NUMA config - T191543 / T191236
  • 07:31 elukey: upgrade zookeeper on druid100[1-3] to 3.4.9 - T164008
  • 07:27 jynus: restart db1098 for upgrade and validation
  • 02:44 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.1) (duration: 07m 11s)
  • 00:17 ebernhardson: start reindex for commonswiki, eqiad elasticsearch, commonswiki_general appears to have failed previous reindex

2018-05-01

  • 22:29 ejegg: updated CiviCRM from 46883844a3 to 401344fb30
  • 22:19 andrewbogott: rebooting labnet1002 for T193579
  • 22:04 krinkle@tin: Synchronized php-1.32.0-wmf.1/extensions/EducationProgram/resources/: I7ca59823ffbf2 (duration: 01m 16s)
  • 22:01 awight@tin: Finished deploy [ores/deploy@bf182e2]: Rollback ores1001 to master (duration: 01m 13s)
  • 22:00 awight@tin: Started deploy [ores/deploy@bf182e2]: Rollback ores1001 to master
  • 21:57 krinkle@tin: Synchronized php-1.32.0-wmf.1/includes/libs/rdbms/: Iba663c (duration: 01m 19s)
  • 21:54 awight@tin: Finished deploy [ores/deploy@52347e0]: Test LFS deployment for ORES; T180627 (duration: 03m 21s)
  • 21:50 awight@tin: Started deploy [ores/deploy@52347e0]: Test LFS deployment for ORES; T180627
  • 21:48 awight@tin: Finished deploy [ores/deploy@4601497]: Test LFS deployment for ORES; T180627 (duration: 00m 26s)
  • 21:48 awight@tin: Started deploy [ores/deploy@4601497]: Test LFS deployment for ORES; T180627
  • 21:48 awight@tin: Started deploy [ores/deploy@4601497]: Test LFS deployment for ORES; T180627
  • 21:44 demon@tin: Finished scap: group0 to wmf.2 (duration: 68m 18s)
  • 21:42 XioNoX: re-enabling eqsin-codfw link
  • 21:23 urandom: restbase: begin culling leaked revisions, enwiki_T_mobile__ng_{lead,remaining} - T192689
  • 20:43 urandom: restbase: begin culling leaked revisions, commons_T_mobile__ng_remaining - T192689
  • 20:35 demon@tin: Started scap: group0 to wmf.2
  • lunch: update stale apifeatureusage-search-svc-eqiad-wmnet template in eqiad elasticsearch and delete unused apifeatureusage template
  • 20:28 urandom: restbase: begin culling leaked revisions, commons_T_mobile__ng_lead - T192689
  • 20:06 demon@tin: Pruned MediaWiki: 1.31.0-wmf.27 (duration: 03m 08s)
  • 20:00 ottomata: rolling restart of eventbus to apply new logstash formatter version T193230
  • 19:59 demon@tin: Pruned MediaWiki: 1.31.0-wmf.30 [keeping static files] (duration: 01m 43s)
  • 19:49 urandom: restbase: begin culling leaked revisions, others_T_mobile__ng_remaining - T192689
  • 19:27 ottomata: rolling restart of eventbus to apply logstash tag https://phabricator.wikimedia.org/T193230
  • 19:17 mutante: mw2174,mw2175,mw2176 ff - reinstalling with wmf-auto-reimage to stretch
  • 19:00 ottomata: rolling restart of eventlogging-service-eventbus to apply logstash logging configs - T193230
  • 18:53 ariel@tin: Finished deploy [dumps/dumps@5438d41]: keep running even if file we want to report on is moved/gone (duration: 00m 04s)
  • 18:53 ariel@tin: Started deploy [dumps/dumps@5438d41]: keep running even if file we want to report on is moved/gone
  • 17:40 XioNoX: disabling eqsin<->codfw link for high packet loss on link
  • 17:39 otto@tin: Finished deploy [eventlogging/eventbus@c70e8c5]: remove occasional logging of request.body in prep for T193230 (duration: 02m 29s)
  • 17:36 otto@tin: Started deploy [eventlogging/eventbus@c70e8c5]: remove occasional logging of request.body in prep for T193230
  • 16:15 ebernhardson: T192972 change eqiad elasticsearch disk watermarks from 85/85 to 80/80 to match disk space alerts
  • 16:15 herron: manually kicked off mirror@sodium:~$ /usr/local/sbin/update-ubuntu-mirror to clear ubuntu mirror out of sync alert
  • 15:50 urandom: restbase: begin culling leaked revisions, others_T_mobile__ng_lead -- T192689
  • 15:37 ejegg: updated SmashPig standalone from a4de12d415 to 99585c8084
  • 13:35 herron: restarted hhvm on mw1233
  • 02:47 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.1) (duration: 07m 04s)
  • 01:20 ejegg: re-enabled fundraising jobs
  • 01:01 ejegg: updated CiviCRM from 47197006d5 to 46883844a3
  • 00:51 eileen: email sent advising of minor interruption
  • 00:50 ejegg: disabled fundraising jobs for db update
  • 00:21 ebernhardson: increase cluster.routing.allocation.balance.threshold from 1.0 to 1.5 for eqiad elasticsearch cluster to reduce rebalancing aggressiveness

2018-04-30

  • 23:38 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.DiffPage.init.js: T192755 (duration: 00m 59s)
  • 23:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Set $wgKartographerUsePageLanguage to false everywhere (T192955) (duration: 00m 59s)
  • 23:33 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/AbuseFilter/includes/AbuseFilter.php: Fix notices when disallowing edits (duration: 00m 59s)
  • 23:21 catrope@tin: Synchronized wmf-config/: Use internal cluster for SPARQL services (T192942) (duration: 01m 02s)
  • 23:14 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Config cleanup patches from SWAT (duration: 01m 00s)
  • 23:05 mutante: ores1001: rm -rf /srv/deployment/ores/venv/ (T193422)
  • 21:46 ebernhardson: T192972 increase eqiad elasticsearch disk watermarks from 75/80 to 85/85
  • 20:27 arlolra: Updated Parsoid to 50b0588 (T186358, T191700, T192909)
  • 20:22 awight@tin: Started deploy [ores/deploy@bf182e2]: ORES: Include bot edits in precaching wikidata itemquality; T187927
  • 20:21 arlolra@tin: Finished deploy [parsoid/deploy@d8d7b42]: Updating Parsoid to 50b0588 (duration: 09m 46s)
  • 20:20 awight@tin: Finished deploy [ores/deploy@5b27205]: Rollback ores1001 to master (duration: 02m 56s)
  • 20:19 bsitzmann@tin: Finished deploy [mobileapps/deploy@d3724d2]: Update mobileapps to cc00cae (T191869) (duration: 07m 32s)
  • 20:17 awight@tin: Started deploy [ores/deploy@5b27205]: Rollback ores1001 to master
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@d3724d2]: Update mobileapps to cc00cae (T191869)
  • 20:11 arlolra@tin: Started deploy [parsoid/deploy@d8d7b42]: Updating Parsoid to 50b0588
  • 20:09 awight@tin: Finished deploy [ores/deploy@4601497]: Trial LFS deployment to ORES canary; T181678 (take 2) (duration: 02m 10s)
  • 20:06 awight@tin: Started deploy [ores/deploy@4601497]: Trial LFS deployment to ORES canary; T181678 (take 2)
  • 20:06 ppchelko@tin: Finished deploy [changeprop/deploy@8cd45ed]: Don't filter bots from the ORES stream T187927 (duration: 01m 15s)
  • 20:05 ppchelko@tin: Started deploy [changeprop/deploy@8cd45ed]: Don't filter bots from the ORES stream T187927
  • 19:10 awight@tin: Finished deploy [ores/deploy@25579e7]: Trial LFS deployment to ORES canary; T181678 (duration: 02m 06s)
  • 19:08 awight@tin: Started deploy [ores/deploy@25579e7]: Trial LFS deployment to ORES canary; T181678
  • 19:01 mutante: hafnium - sudo service navtiming stop; sudo service statsv stop - downtimed in icinga, decom
  • 18:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: T191584 (duration: 01m 00s)
  • 18:23 awight@tin: Finished deploy [ores/deploy@5b27205]: Rollback ORES canary to master (duration: 00m 21s)
  • 18:22 awight@tin: Started deploy [ores/deploy@5b27205]: Rollback ORES canary to master
  • 18:17 catrope@tin: Synchronized php-1.32.0-wmf.1/extensions/CodeMirror/resources/modules/ve-cm/ve.ui.CodeMirrorAction.js: T191923 (duration: 01m 00s)
  • 18:16 ottomata: starting rolling reimage of kafka main-eqiad brokers kafka100[123] - T192832
  • 18:06 awight@tin: Finished deploy [ores/deploy@46824bb]: Canary-only test deployment for ORES + git-lfs, T181678 (take 2) (duration: 01m 58s)
  • 18:04 awight@tin: Started deploy [ores/deploy@46824bb]: Canary-only test deployment for ORES + git-lfs, T181678 (take 2)
  • 17:41 awight@tin: Finished deploy [ores/deploy@8c586ab]: Canary-only test deployment for ORES + git-lfs, T181678 (duration: 01m 59s)
  • 17:39 ariel@tin: Finished deploy [dumps/dumps@8398f53]: write checksums of dump files into separate hashfiles, reusing their contents as appropriate (duration: 00m 03s)
  • 17:39 ariel@tin: Started deploy [dumps/dumps@8398f53]: write checksums of dump files into separate hashfiles, reusing their contents as appropriate
  • 17:39 awight@tin: Started deploy [ores/deploy@8c586ab]: Canary-only test deployment for ORES + git-lfs, T181678
  • 17:26 gehel: restart blazegraph and updater on wdqs1003 to activate UseNUMA - T193365
  • 17:15 gehel@tin: Finished deploy [wdqs/wdqs@2579bfa]: deploying wdqs gui (duration: 04m 16s)
  • 17:11 gehel@tin: Started deploy [wdqs/wdqs@2579bfa]: deploying wdqs gui
  • 17:10 gehel: removing stale scap log for wdqs on tin.eqiad.wmnet
  • 16:50 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch LocalRenameUserJob to EventBus for all wikis - T193254 T190327 (duration: 00m 59s)
  • 16:50 ppchelko@tin: Finished deploy [cpjobqueue/deploy@01630f2]: Switch LocalRenameUserJob for all wikis. T193254 (duration: 00m 49s)
  • 16:49 ppchelko@tin: Started deploy [cpjobqueue/deploy@01630f2]: Switch LocalRenameUserJob for all wikis. T193254
  • 15:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 and db1069 with full weight (duration: 00m 59s)
  • 15:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1056 and db1069 with low load (duration: 00m 59s)
  • 14:31 jynus: shutting down db1056 for upgrade/maintenance and cloning
  • 14:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Move db1069 from s7 to x1, depool db1056 (duration: 00m 59s)
  • 14:27 elukey: upgrade druid on druid100[1-3] from 0.9.2 to 0.10
  • 14:26 marostegui: Power off db2081 for HW maintenance - T193325
  • 14:17 gehel: rolling restart blazegraph on all wdqs nodes for new configuration - T192759
  • 13:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1076 after alter table (duration: 00m 59s)
  • 13:40 zeljkof: EU SWAT finished
  • 13:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow bureaucrats to remove flood group for real, allow flooders to strip the group from them (T193350) (duration: 00m 59s)
  • 13:30 zfilipin@tin: Synchronized php-1.32.0-wmf.1/extensions/AbuseFilter: SWAT: Dont use an empty string for block parameters (T189681) (duration: 01m 02s)
  • 13:30 marostegui: Poweroff db1098 for HW maintenance - T193331
  • 13:26 marostegui: Stop MySQL on db1098 - T193331
  • 13:21 ottomata: beginning rolling reimage of kafka200[23] to stretch T192832
  • 13:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RCPatrol in cswiki (T193242) (duration: 00m 59s)
  • 13:16 marostegui: Drop unused _old tables from a few wikis - https://phabricator.wikimedia.org/T54932#4167221
  • 13:13 gehel: restarting elasticsearch codfw rolling restart for plugin update and NUMA config - T191543 / T191236
  • 13:11 elukey: reimage analytics1049 and 1050 to Debian Stretch
  • 13:09 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Datetime Selector on Special:Block on all wikis except Meta, MediaWiki, and German Wikipedia (T192962) (duration: 01m 00s)
  • 12:48 arturo: aborrero@labtestnet2001:~ $ sudo rm /var/log/upstart/nova-api.log.1 <--- disk full, logrotate refuses to work bc that
  • 10:34 vgutierrez: Updating puppet compiler facts
  • 10:30 vgutierrez: Repool (Re-enable BGP) lvs3001 - T191897
  • 10:06 elukey: restart hdfs namenode on analytics1002 to pick up new heap settings (last step of the maintenance)
  • 10:00 elukey: set analytics1001 as active HDFS Namenode using manual failover
  • 09:50 elukey: restart HDFS Namenode on analytics1001 (current standby) again with Xmx/Xms set to 8g
  • 09:47 elukey: restart HDFS Namenode on analytics1001 (current standby)
  • 09:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060, fully pool db1090 (duration: 00m 59s)
  • 09:15 ariel@tin: Finished deploy [dumps/dumps@a6baf69]: do not update existing rss feed file if the dump job it covers is more recent than the one for which a feed is requested (duration: 00m 04s)
  • 09:15 ariel@tin: Started deploy [dumps/dumps@a6baf69]: do not update existing rss feed file if the dump job it covers is more recent than the one for which a feed is requested
  • 09:03 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 59s)
  • 09:01 vgutierrez: Depool and reimage lvs3001 as stretch - T191897
  • 08:39 marostegui: Deploy schema change on db1076 - T191519 T188299 T190148
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 00m 59s)
  • 08:38 elukey: restart HDFS namenode on analytics1001 (standby master) to pick up new JVM settings - T193257
  • 08:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 00s)
  • 08:23 godog: swift eqiad-prod more weight to ms-be104[0-3] - T191896
  • 08:16 elukey: force a manual failover of the HDFS Namenode from analytics1001 to analytics1002 to test new GC Settings - T193257
  • 08:15 vgutierrez: Repool (Re-enable BGP) in lvs3002 - T191897
  • 08:02 jynus: stopping replication on both db1090 db instances to finish maintenance
  • 07:33 jynus: restarting dbstore1001@s1 to apply config change
  • 07:31 elukey: restart HDFS namenode on analytics1002 (standby master) to pick up new JVM settings - T193257
  • 07:06 marostegui: Restart replication on db1095:s3
  • 07:05 marostegui: Temporarily stop replication on db1095:s3
  • 06:48 vgutierrez: Depool and reimage lvs3002 - T191897
  • 06:11 marostegui: Drop table edit_page_tracking from s3 - T57385
  • 06:04 marostegui: Drop table edit_page_tracking from s2 - T57385
  • 05:59 marostegui: Drop table edit_page_tracking from s1 - T57385
  • 05:50 marostegui: Drop table edit_page_tracking from s4, s5 and s7 - T57385
  • 05:47 marostegui: Drop table edit_page_tracking from s6 - T57385
  • 05:28 marostegui: Deploy schema change on db1074 - T191519 T188299 T190148
  • 05:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 09s)
  • 02:57 l10nupdate@tin: scap sync-l10n completed (1.32.0-wmf.1) (duration: 08m 18s)

2018-04-29

  • 17:46 brion: rebuilding image metadata for PDFs on commons on terbium

2018-04-28

  • 23:42 volans@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098 (crashed) (duration: 01m 01s)
  • 15:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2081, crashed (duration: 01m 00s)
  • 05:19 apergos: reimaged snapshot1005 to stretch

2018-04-27

  • 22:45 mutante: mw2171,mw2172,mw2173 ff. - reinstalling with stretch and raid1-LVM
  • 22:07 hashar: Running quibble-vendor-mysql-php70-docker against ~ 900 MediaWiki extensions. Triggered with a custom gear-client.py script from contint1001. PID 29710
  • 19:58 tgr: T193254 ran fixStuckGlobalRename.php for: Aliya klein Hasselb Husseinzadeh02 Jswf845 Lorraine Fgr Mikeypugs0134 Ncanty STEEEPGlobal Sunlight me THOR Global Defense Group TPBox Zenas Gao אֲבִי גְדוֹר ぽっぽ大将軍
  • 18:16 mutante: mw2167,mw2168,mw2169 - reinstalling with stretch and raid1-lvm
  • 16:26 imarlier@tin: Finished deploy [performance/navtiming@c059a60]: Deploying navtiming.py with support for enable/disable via etcd (duration: 00m 05s)
  • 16:26 imarlier@tin: Started deploy [performance/navtiming@c059a60]: Deploying navtiming.py with support for enable/disable via etcd
  • 16:19 imarlier@tin: Finished deploy [statsv/statsv@d5108c4]: Update statsv to force the Kafka broker API version (duration: 00m 05s)
  • 16:19 imarlier@tin: Started deploy [statsv/statsv@d5108c4]: Update statsv to force the Kafka broker API version
  • 14:23 anomie: Running populateRevisionLength.php on group 2 for T192189
  • 13:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 after alter table (duration: 00m 59s)
  • 11:41 moritzm: reimaging mwdebug2002 to stretch
  • 11:21 Amir1: ladsgroup@terbium:/var/log/wikidata$ mwscript updateCollation.php --wiki=fawiki --previous-collation=xx-uca-fa
  • 11:13 moritzm: installing uwsgi/Django security updates on graphite hosts in eqiad
  • 10:39 moritzm: installing uwsgi/Django security updates on graphite2001
  • 09:53 moritzm: reimaging mwdebug1001 to stretch
  • 08:58 elukey: reimage analytics10[51,53] to Debian Stretch
  • 08:46 moritzm: installing mysql 5.5 security update (distro-packaged version) on trusty
  • 08:14 moritzm: reimaging mwdebug2001 to stretch
  • 07:32 godog: swift eqiad-prod more weight to ms-be104[0-3] - T190081
  • 05:31 marostegui: Deploy schema change on db1105:3312 - T191519 T188299 T190148
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 00m 59s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113 after alter table (duration: 01m 10s)
  • 05:08 cwd: killed some dedupe queries on staging that were causing alerts

2018-04-26

  • 23:31 reedy@tin: Synchronized php-1.32.0-wmf.1/extensions/PdfHandler/: (no justification provided) (duration: 01m 00s)
  • 23:16 reedy@tin: Synchronized php-1.32.0-wmf.1/extensions/UploadWizard/: (no justification provided) (duration: 01m 00s)
  • 23:10 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 00s)
  • 22:44 ebernhardson: start test measuring elasticsearch master mutation latency in codfw
  • 22:38 Jeff_Green: deployed DNS update for frbast1001.wikimedia.org
  • 22:21 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/429100/ (duration: 01m 00s)
  • 22:11 maxsem@tin: Finished scap: Deploy ACW to test wikis, https://gerrit.wikimedia.org/r/429017 / T192455 (duration: 57m 06s)
  • 21:14 maxsem@tin: Started scap: Deploy ACW to test wikis, https://gerrit.wikimedia.org/r/429017 / T192455
  • 21:13 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/429017 (duration: 00m 59s)
  • 21:05 maxsem@tin: Synchronized php-1.32.0-wmf.1/extensions/ArticleCreationWorkflow/: https://gerrit.wikimedia.org/r/#/c/429111/ (duration: 01m 00s)
  • 20:29 hashar: contint1001: cleaned up old Docker images produced by docker-pkg
  • 20:09 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.1
  • 18:12 ottomata: reimaging (some?) kafka200* codfw main kafka nodes to stretch T192832
  • 17:27 awight@tin: Finished deploy [ores/deploy@5b27205]: ORES: update to revscoring 2.2.2, T192917 (duration: 21m 20s)
  • 17:09 ottomata: applying compression_type=snappy to eventbus service kafka producer
  • 17:05 awight@tin: Started deploy [ores/deploy@5b27205]: ORES: update to revscoring 2.2.2, T192917
  • 17:00 moritzm: installing systemd SUA update for stretch
  • 16:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Fix comment, test scap (duration: 01m 12s)
  • 16:03 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Use EventBus for most jobs for test wikis - T190327 (duration: 01m 15s)
  • 16:03 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bf34e00]: Enable all jobs for test, test2, testwikidata and mediawiki. T190327 (duration: 00m 51s)
  • 16:02 ppchelko@tin: Started deploy [cpjobqueue/deploy@bf34e00]: Enable all jobs for test, test2, testwikidata and mediawiki. T190327
  • 15:37 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1090 as multiinstance (duration: 01m 16s)
  • 15:36 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1090 as multiinstance (duration: 01m 17s)
  • 15:18 mutante: added LDAP user tschumann to "nda" group (T192549)
  • 14:53 ppchelko@tin: Finished deploy [changeprop/deploy@f2f7a84]: Commit offsets for non matched messages from time to time. (duration: 01m 26s)
  • 14:51 ppchelko@tin: Started deploy [changeprop/deploy@f2f7a84]: Commit offsets for non matched messages from time to time.
  • 14:26 anomie: Running populateRevisionLength.php on group 1 for T192189
  • 14:25 jynus: stop db1069 for cloning it away
  • 13:58 marostegui: Compress enwiki on db1116:3311 - T190704
  • 13:55 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069, repool db1086 (duration: 01m 16s)
  • 13:35 zeljkof: EU SWAT finished
  • 13:31 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change chapcomwikis logo, add HD logo for chapcomwiki (T193024) (duration: 01m 16s)
  • 13:30 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change chapcomwikis logo, add HD logo for chapcomwiki (T193024) (duration: 01m 16s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add all Hindi projects plus meta as import sources for hiwikimedia (T188366) (duration: 01m 17s)
  • 13:09 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Fix pixelization of new wiki logos (T193028) (duration: 01m 17s)
  • 12:53 marostegui: Deploy schema change on db1113:3312 - T191519 T188299 T190148
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113 for alter table (duration: 01m 33s)
  • 12:51 gehel: reindexing lost updates on elasticsearch - T193112
  • 12:04 mobrovac@tin: Finished deploy [cpjobqueue/deploy@7fbb152]: Support the exclude_topics config stanza (duration: 01m 12s)
  • 12:03 mobrovac@tin: Started deploy [cpjobqueue/deploy@7fbb152]: Support the exclude_topics config stanza
  • 10:35 moritzm: reimaging mw1312 mw1317, mw1339 (API servers) to stretch
  • 10:29 moritzm: reimaging mw1269, mw1323, mw1324 (app servers) to stretch
  • 09:57 marostegui: Drop prefswitch_survey on s1 - T173439
  • 09:50 godog: eqiad-prod: more weight to ms-be104[0-3] for container/account - T190081
  • 09:45 marostegui: Drop prefswitch_survey on s3 - T173439
  • 09:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 with low load (duration: 01m 16s)
  • 09:30 marostegui: Drop prefswitch_survey on s7 - T173439
  • 09:16 marostegui: Drop prefswitch_survey on s2 - T173439
  • 09:15 mark: Temp disabling cr1-ulsfo:xe-1/2/0 (Chicago transport) due to stability issues
  • 09:13 marostegui: Drop prefswitch_survey on s4 - T173439
  • 09:02 marostegui: Drop prefswitch_survey on s5 and s6 - T173439
  • 09:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 (duration: 01m 16s)
  • 08:51 moritzm: reimaging mw1320, mw1321, mw1322 (app servers) to stretch
  • 08:32 moritzm: re-attempt reimage of mw1246 (failed yesterday with an error on the puppetmaster, testing whether this can be reproduced)
  • 08:24 jynus: stop and upgrade db1109
  • 07:58 marostegui: Deploy schema change on db1090 - T191519 T188299 T190148
  • 07:45 jynus: stopping db1090 mariadb instance to move its path, port and socket
  • 07:21 gehel: restarting redis masters in codfw - T193112
  • 07:20 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090, pool db1122 with full weight (duration: 01m 23s)
  • 07:16 gehel: re-enabling puppet on rdb2* - T193112
  • 06:19 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=elasticsearch
  • 05:18 marostegui: Deploy schema change on dbstore1002:s2 - T191519 T188299 T190148
  • 04:39 ebernhardson: unfreeze writes to elasticsearch codfw cluster
  • 03:54 _joe_: stopping redis replication from eqiad to codfw for the jobqueue cluster, we have an issue ongoing with CirrusSearch jobs and replication is broken
  • 03:41 ejegg: re-enabled ingenico recurring charge job
  • 02:05 mutante: mw2163 through mw2166: since the wmf-auto-reimage failed after OS but before puppet run due to "Failed to puppet_generate_certs" I manually logged in with install-console and signed puppet certs (T174431)

2018-04-25

  • 22:55 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Undeploy GlobalPreferences T184121 (duration: 01m 16s)
  • 22:21 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalPreferences T189806 (duration: 01m 18s)
  • 21:03 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.1
  • 21:01 demon@tin: Synchronized php: symlink bump (duration: 01m 16s)
  • 20:58 hasharAway: on tin: rebased php-1.31.0-wmf.30 for https://gerrit.wikimedia.org/r/#/c/429018/
  • 20:21 XioNoX: remove test VIP for eqiad ping offload server - T190090
  • 20:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@5a4a282]: Config: Start up to 4 workers in parallel during start-up (duration: 06m 48s)
  • 20:11 bsitzmann@tin: Started deploy [mobileapps/deploy@5a4a282]: Config: Start up to 4 workers in parallel during start-up
  • 19:39 otto@tin: Finished deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/ (duration: 01m 45s)
  • 19:37 otto@tin: Started deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 19:12 urandom: altering timeline tables for 6 month TTL -- T192689
  • 19:11 otto@tin: Finished deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/ (duration: 00m 11s)
  • 19:11 otto@tin: Started deploy [eventlogging/eventbus@f0783bb]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 19:09 otto@tin: Started deploy [eventlogging/eventbus@f562c1b]: Fix for logging error https://gerrit.wikimedia.org/r/#/c/428982/
  • 18:55 imarlier@tin: Finished deploy [performance/coal@1e79c79]: deploy fix for coal-web (duration: 00m 06s)
  • 18:55 imarlier@tin: Started deploy [performance/coal@1e79c79]: deploy fix for coal-web
  • 18:16 ejegg: updated CiviCRM from 219798b2c5 to 47197006d5
  • 17:35 urandom: starting cleanups on row 'a' Cassandra nodes -- T189822
  • 17:33 mepps: update civicrm from 6ddeb167ec to 219798b2c5
  • 17:20 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change fawiki uca to the right one (duration: 01m 17s)
  • 17:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on frwikiquote T192301 (duration: 01m 17s)
  • 17:00 mutante: powercycling wdqs1004
  • 16:09 mutante: re-imaging mw2258, mw2163, mw2164 ff.
  • 15:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db1122, db1090 with low load (duration: 01m 14s)
  • 15:22 anomie: Running populateRevisionLength.php on group 0 for T192189
  • 15:05 ottomata: temp disabling puppet, applying ipv6 mapped on kafka200*
  • 15:04 andrewbogott: adding labvirt1016 to the nova-compute scheduling pool
  • 14:37 elukey: restart hive-server2 on analytics1003 to pick up settings in https://gerrit.wikimedia.org/r/428919
  • 14:34 akosiaris: reboot bohrium T150532
  • 14:33 ema: cp3030: upgrade varnish to 5.1.3-1wm7 T192368
  • 14:12 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 T193025 (duration: 01m 16s)
  • 13:57 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: pool poolcounter1003 T187297 (duration: 01m 16s)
  • 13:53 Amir1: EU SWAT is done!
  • 13:53 ladsgroup@tin: Synchronized php-1.32.0-wmf.1/extensions/Wikibase/lib/includes/Changes: Make sure statements in EntityDiffChangedAspects are not passed around as stdClass (T192085) (duration: 01m 16s)
  • 13:49 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1001 T150532 (duration: 01m 16s)
  • 13:43 ladsgroup@tin: Synchronized php-1.31.0-wmf.30/extensions/Wikibase/lib/includes/Changes: Make sure statements in EntityDiffChangedAspects are not passed around as stdClass (T192085) (duration: 01m 17s)
  • 13:40 akosiaris: reboot poolcounter1001 for T150532
  • 13:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Mapframe for bgwiki (T192895) (duration: 01m 15s)
  • 13:23 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for cswiki Wikipedia event (T192898) (duration: 01m 16s)
  • 13:19 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1001 T150532 (duration: 01m 17s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for cswiki Wikipedia event (T192898) (duration: 01m 16s)
  • 13:12 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Remove xx-uca-fa for Persian Wikis except Wikipedia (duration: 01m 17s)
  • 13:06 marostegui: Deploy schema change on s2 codfw master (db2035) - this will generate lag on codfw - T191519 T188299 T190148
  • 12:55 gehel: starting elasticsearch codfw rolling restart for plugin update and NUMA config - T191543 / T191236
  • 12:47 akosiaris: reboot puppetdb1001 for T150532
  • 12:08 moritzm: reimaging mw1251, mw1252, mw1253 (app servers) to stretch
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Add db1122 (duration: 01m 16s)
  • 11:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Add db1122 (duration: 03m 24s)
  • 11:19 moritzm: reimaging mw1228, mw1229, mw1230 (API servers) to stretch
  • 10:29 jynus: stopping replication, running optimize table on dbstore2001:s8
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 (duration: 01m 16s)
  • 09:58 elukey: reimage analytics106[1,2] to Debian Stretch
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1085 after alter table (duration: 01m 30s)
  • 09:09 jynus: stopping db1090 for maintenance
  • 08:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 01m 17s)
  • 08:38 marostegui: Drop user_old and user_temp tables from s3 - T172664
  • 08:23 godog: eqiad-prod: add ms-be104[0-3] with minimal weight - T190081
  • 08:23 moritzm: reimaging mw1247, mw1248, mw1249 (app servers) to stretch
  • 07:35 marostegui: Deploy schema change on db1085 with replication (this will generate lag on labsdb hosts on s6) - T191519 T188299 T190148
  • 07:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 01m 16s)
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 after alter table (duration: 01m 16s)
  • 07:05 akosiaris: starting a very slow rolling reboot of all VMs on codfw ganeti cluster, row_C nodegroup, excluding poolcounter1001 and puppetdb1001. T150532
  • 06:53 moritzm: reimaging mw1314, mw1315, mw1316 (API servers) to stretch
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 21s)
  • afk: disabled ingenico recurring donation charge job
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.30) (duration: 07m 23s)
  • 02:52 ejegg: turned fundraising queue consumers back on
  • 01:31 ejegg: disabled fundraising queue consumer jobs
  • 00:31 demon@tin: Synchronized multiversion/defines.php: rm unused defines (duration: 01m 16s)

2018-04-24

  • 23:33 legoktm@tin: Synchronized php-1.32.0-wmf.1/extensions/Kartographer/includes/Tag/MapFrame.php: MapFrame: Allow lang="local" to be passed (duration: 01m 17s)
  • 23:29 urandom: starting Cassandra bootstrap, restbase1010-c -- T189822
  • 23:08 mutante: mw2242.codfw, mw2255.codfw et al.: more stretch reinstalls going on
  • 23:04 demon@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: unbreak multiversion loading for a totally useless script (duration: 01m 16s)
  • 22:55 legoktm@tin: Synchronized wmf-config/InitialiseSettings.php: touch (duration: 01m 18s)
  • 22:53 legoktm@tin: Synchronized wmf-config/CommonSettings.php: Fix wgTidyConfig and restore proper tidy & Remex config - T192855 (duration: 01m 16s)
  • 21:56 mutante: adding LDAP user 'bitpogo' to group 'wmde' (T191523)
  • 21:23 ejegg: re-enabled recurring donations queue consumer
  • 20:55 demon@tin: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.1
  • 20:27 urandom: starting Cassandra bootstrap, restbase1010-b -- T189822
  • 20:23 Dereckson: Run namespaceDupes on gorwiki
  • 20:03 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for all wikis but wikitech - T191464 (duration: 01m 26s)
  • 19:53 bblack: prometheus-fail switched to UNKNOWNs for now in https://gerrit.wikimedia.org/r/#/c/428725/ - may want to look at this further later, intent is to reduce odds of debilitating ops spam for the evening.
  • 19:49 elukey: re-enable ircecho
  • 19:40 demon@tin: Finished scap: bootstrap 1.32.0-wmf.1 (duration: 106m 55s)
  • 19:36 elukey: stop ircecho on einsteinium - icinga shower
  • 19:17 jgleeson: Updating civicrm from 142edbb90b to 6ddeb167ec
  • 18:54 ottomata: temp disabling puppet and applying profile::kafka::broker on kafka100* T192831
  • 17:53 demon@tin: Started scap: bootstrap 1.32.0-wmf.1
  • 17:52 gehel: restarting wdqs-updater on all nodes for prometheus jmx exporter update - T192768
  • 17:51 andrew@tin: Synchronized wmf-config/db-eqiad.php: Renaming 'm5' section to 'wikitech' for T189542, two of two (duration: 00m 59s)
  • 17:49 andrew@tin: Synchronized wmf-config/db-codfw.php: Renaming 'm5' section to 'wikitech' for T189542, one of two (duration: 00m 59s)
  • 17:42 ottomata: temp disabling puppet on kafka200* to apply profile::kafka::broker in main-codfw T192831
  • 17:39 demon@tin: Pruned MediaWiki: 1.31.0-wmf.29 [keeping static files] (duration: 06m 28s)
  • 17:35 XioNoX: removing firewall block on cr1/2-codfw - T175361
  • 17:35 XioNoX: removing firewall block on cr1-eqdfw - T175361
  • 17:29 bstorm_: added MCR tables to labsdb1009 (slots, slot_roles, content_models, content)
  • 17:04 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% [deploy to restbase1010] - T192689 (duration: 02m 04s)
  • 17:02 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% [deploy to restbase1010] - T192689
  • 17:01 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689 (duration: 05m 27s)
  • 16:57 urandom: starting Cassandra bootstrap, restbase1010-a -- T189822
  • 16:55 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689
  • 16:52 mobrovac@tin: Finished deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689 (duration: 11m 40s)
  • 16:45 marostegui: Deploy schema change on db1113:3316 - T191519 T188299 T190148
  • 16:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 for alter table (duration: 00m 58s)
  • 16:40 mobrovac@tin: Started deploy [restbase/deploy@fbce520]: Set the delete probability to 100% - T192689
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3316 after alter table (duration: 00m 58s)
  • 16:30 elukey: restart hadoop hdfs journalnode on analytics1035/52 to pick up prometheus jmx settings
  • 16:11 elukey: restart hadoop-hdfs-journalnode on analytics1028 to pick up prometheus monitoring
  • 16:10 bstorm_: Added views for new MCR tables on labsdb1011 (slots, slot_roles, content and content_models)
  • 16:08 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011 - https://phabricator.wikimedia.org/T184446
  • 15:59 godog: reimage restbase1010 after ssd swap - T189822
  • 15:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with full weight (duration: 00m 58s)
  • 14:41 elukey: restart hadoop hdfs journalnode on analytics1028 to pick up jmx settings
  • 14:40 sbisson@tin: Finished deploy [kartotherian/deploy@86da82d]: Deploy latest kartotherian with updated fallbacks and support lang=local (duration: 06m 29s)
  • 14:34 sbisson@tin: Started deploy [kartotherian/deploy@86da82d]: Deploy latest kartotherian with updated fallbacks and support lang=local
  • 14:02 Amir1: EU SWAT is done
  • 14:01 hoo@tin: Synchronized wmf-config/abusefilter.php: Grant Meta-Wiki sysops the ability to edit global abusefilter rules (T192722) (duration: 00m 59s)
  • 13:58 hoo@tin: Synchronized wmf-config/: Properly set default for $wmgWikibaseSiteGroup (T188456) (duration: 00m 58s)
  • 13:56 hoo@tin: Synchronized wmf-config/: Properly set default for $wmgWikibaseSiteGroup (T188456) (duration: 01m 00s)
  • 13:43 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Increase the timespan of rate limit in wikidata from 1m to 5m (T192690) (duration: 00m 58s)
  • 13:37 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up old config for logging autopatrol actions (T184485) (duration: 00m 58s)
  • 13:28 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Add badge for good lists (T190976) (duration: 00m 55s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for IndigenizeWikipedia event, clean obsolete rules (T192827) (duration: 00m 58s)
  • 13:06 hoo@tin: Synchronized wmf-config/InitialiseSettings.php: Set default for $wmgWikibaseSiteGroup (T188456) (duration: 00m 59s)
  • 12:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 58s)
  • 12:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1110 load (duration: 00m 58s)
  • 12:28 elukey: cleanup /home/elukey/zookeeper backup files (taken before the 3.4.9 migration) on conf*
  • 12:13 marostegui: Deploy schema change on db1098:3316 - T191519 T188299 T190148
  • 12:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 58s)
  • 12:10 elukey: reimage analytics106[34] to Debian Stretch
  • 12:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 58s)
  • 11:54 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low load (duration: 00m 59s)
  • 11:44 moritzm: reimaging mw1241, mw1242, mw1243 (app servers) to stretch
  • 10:58 moritzm: reimaging mw1224, mw1225, mw1226 (API servers) to stretch
  • 10:50 elukey: reimage analytics106[56] to Debian Stretch
  • 10:49 arturo: enable puppet in labtestcontrol2001 to sync with repo changes
  • 10:39 akosiaris: starting a very slow rolling reboot of all VMs on codfw ganeti cluster T150532
  • 10:39 akosiaris: upgrade to qemu 2.8 on codfw ganeti cluster. T150532
  • 10:31 jynus: stop and reimage db1110
  • 10:01 apergos: reimaged snapshot1001 for testing with php7/stretch
  • 09:39 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 58s)
  • 09:28 marostegui: Deploy schema change on db1088 - T191519 T188299 T190148
  • 09:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 58s)
  • 09:25 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #3 - T192689 T190846 (duration: 04m 30s)
  • 09:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 after alter table (duration: 03m 06s)
  • 09:21 moritzm: reimaging mw1221, mw1222, mw1223 (API servers) to stretch
  • 09:21 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #3 - T192689 T190846
  • 09:21 moritzm: reimaging mw1221, mw1222, mw1223 (app servers) to stretch
  • 09:21 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #2 - T192689 T190846 (duration: 03m 03s)
  • 09:18 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points, take #2 - T192689 T190846
  • 09:17 mobrovac@tin: Finished deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points - T192689 T190846 (duration: 13m 13s)
  • 09:12 moritzm: reimaging mw1273, mw1274, mw1275 (app servers) to stretch
  • 09:03 mobrovac@tin: Started deploy [restbase/deploy@1661f69]: Increase the deletion probability to 50% and expose the CSS end points - T192689 T190846
  • 08:17 hoo: Finished running populateSitesTable.php for all wikis (T192628, T192632, T192631, T192633)
  • 08:14 elukey: upload druid_0.10.0-3~jessie1 (collection of druid packages) to jessie-wikimedia - T164008
  • 08:05 godog: power off restbase1010 for ssd replacement - T189822
  • 07:50 hoo: Started running populateSitesTable.php for all wikis (T192628, T192632, T192631, T192633)
  • 07:39 marostegui: Rename user_old and user_temp tables on db1077 - T172664
  • 07:28 gehel: restarting blazegraph on wdqs1004 for jvm upgrade
  • 07:23 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - T184446
  • 07:16 vgutierrez: Update puppet compiler facts
  • 06:56 elukey: restart zookeeper on conf200[123] for openjdk upgrades
  • 06:41 moritzm: installing poppler security updates
  • 06:35 marostegui: Deploy schema change on db1093 - T191519 T188299 T190148
  • 06:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 59s)
  • 05:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 after alter table (duration: 00m 59s)
  • 05:03 _joe_: rebuilding the docker base images
  • 04:35 mutante: repooled mw2224, reinstalling mw2225 through mw2228
  • 03:08 mutante: reinstalling mw2224.codfw.wmnet with wmf-auto-reimage
  • 02:55 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.30) (duration: 10m 37s)
  • 01:55 cwd: payments, civi, and alerts re-enabled
  • 01:11 ejegg: re-enabled fundraising jobs
  • 01:09 ejegg: updated fundraising python tools from f3ed1d05b8 to 3754f32ab6
  • 00:18 mutante: removing travel@ and travelapproval@ exim aliases, moving to OIT/Google (T127549)

2018-04-23

  • 23:51 eileen: civicrm revision changed from 347e613aa5 to 142edbb90b, config revision is 07dee62bff
  • 23:35 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable non-static internationalized maps on test2wiki (duration: 00m 59s)
  • 23:32 catrope@tin: Synchronized php-1.31.0-wmf.30/extensions/Thanks/includes/EchoCoreThanksPresentationModel.php: Fix fatal error in Thanks notifications (T192711) (duration: 00m 58s)
  • 23:29 eileen: civicrm revision changed from b1e7ccfc4d to 347e613aa5, config revision is 07dee62bff
  • 23:15 XioNoX: changed AMS-IX peering mode to default (filter on radb+rpki)
  • 23:13 cwd: disabled most (all?) frack alerts
  • 23:11 ebernhardson: restart elasticsearch on elastic1031 to apply numa settings
  • 22:56 XioNoX: disabling flapping VCP on asw1-eqsin - T192125
  • 22:37 mutante: phab1001 - deleting duplicate cronjob for public_taskdump.py (the one that did not output to /dev/null) (T188149)
  • 22:21 ebernhardson: restart elasticsearch on elastic1030 to apply numa settings
  • 22:12 ebernhardson: restart elasticsearch on elastic1029 to apply numa settings
  • 21:49 ebernhardson: restart elasticsearch on elastic1028 to apply numa settings
  • 21:40 ejegg: updated fundraising python tools from 7c5c7a5f9e to f3ed1d05b8
  • 21:39 ejegg: updated SmashPig from 1ebee97a45 to a4de12d415
  • 21:36 ebernhardson: restart elasticsearch on elastic1024 to apply numa settings
  • 21:25 ebernhardson: restart elasticsearch on elastic1025 to apply numa settings
  • 20:53 XioNoX: redirect text-lb.eqiad pings to ping1001 on cr1/2-eqiad (24h tests) - T190090
  • 20:47 ppchelko@tin: Finished deploy [restbase/deploy@228caf8]: Log the lack of the index entries, take 2 (duration: 03m 55s)
  • 20:43 ppchelko@tin: Started deploy [restbase/deploy@228caf8]: Log the lack of the index entries, take 2
  • 20:43 ppchelko@tin: Finished deploy [restbase/deploy@228caf8]: Log the lack of the index entries (duration: 14m 19s)
  • 20:40 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.30
  • 20:32 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5650605]: Update mobileapps to b011b2a (duration: 05m 56s)
  • 20:29 ppchelko@tin: Started deploy [restbase/deploy@228caf8]: Log the lack of the index entries
  • 20:26 mholloway-shell@tin: Started deploy [mobileapps/deploy@5650605]: Update mobileapps to b011b2a
  • 20:15 Dereckson: Purged all languages messages from the cache, for gorwiki (rebuildmessages.php, T189127)
  • 19:49 vgutierrez: Repool (Re-enable BGP) in lvs5001 - T191897
  • 19:34 elukey@tin: Finished deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008 (duration: 00m 17s)
  • 19:34 elukey@tin: Started deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008
  • 18:48 catrope@tin: Synchronized dblists/wikidataclient.dblist: Add ruwikimedia to wikidataclient (T188456) (duration: 01m 15s)
  • 18:33 vgutierrez: Depool lvs5001 - T191897
  • 18:33 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Change timezone for napwiki (T192568) (duration: 01m 31s)
  • 18:28 vgutierrez: Repool (Re-enable BGP) lvs5002 - T191897
  • 18:18 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable WikiLove on sawiki (T192212) (duration: 01m 19s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable internationalized maps on testwiki (duration: 01m 17s)
  • 17:52 ariel@tin: Finished deploy [dumps/dumps@02a3e80]: fix up checks for truncated/binary output files (duration: 00m 04s)
  • 17:52 ariel@tin: Started deploy [dumps/dumps@02a3e80]: fix up checks for truncated/binary output files
  • 17:35 XioNoX: pushing firewall block on cr1-eqdfw - T175361
  • 17:24 XioNoX: pushing firewall block on cr1/2-codfw - T175361
  • 17:18 thcipriani@tin: Synchronized php: Group1 to 1.31.0-wmf.30 (duration: 01m 16s)
  • 17:15 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.30
  • 17:02 vgutierrez: Depool and reimage lvs5002 as stretch - T191897
  • 16:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.26 (duration: 03m 28s)
  • 16:07 marostegui: Deploy schema change on db1096:3316 - T191519 T188299 T190148
  • 16:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 01m 16s)
  • 16:03 gehel: restarting wdqs-updater on all nodes
  • 15:55 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 15:53 bstorm_: Added slots, slot_roles, content and content_models to views on labsdb1010
  • 15:36 dereckson@tin: Finished scap: Rebuild localisation cache to add Gorontalo (T189127) (duration: 08m 29s)
  • 15:28 dereckson@tin: Started scap: Rebuild localisation cache to add Gorontalo (T189127)
  • 15:28 dereckson@tin: scap aborted: Rebuild localisation cache to add Gorontalo (T189127)z (duration: 00m 01s)
  • 15:28 dereckson@tin: Started scap: Rebuild localisation cache to add Gorontalo (T189127)z
  • 15:23 dereckson@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 00m 46s)
  • 15:20 dereckson@tin: Synchronized php-1.31.0-wmf.30/languages/messages/MessagesGor.php: Localisation for MediaWiki in Gorontalo (T189127) (duration: 01m 16s)
  • 15:13 dereckson@tin: Synchronized php-1.31.0-wmf.29/languages/messages/MessagesGor.php: Localisation for MediaWiki in Gorontalo (T189127) (duration: 01m 18s)
  • 14:10 ottomata: switching main -> analytics MirrorMaker to --new.consumer (temporarily stopping puppet on kafka101[234]) https://phabricator.wikimedia.org/T192387
  • 14:02 zeljkof: EU SWAT finished
  • 13:57 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: lfnwiki: add logo path and missing namespace names (T183561) (duration: 01m 15s)
  • 13:55 elukey: reimage analytics1067 to Debian Stretch - T192557
  • 13:53 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 13:50 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 13:43 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: euwikisource: add missing $wgMetaNamespace (T189465) (duration: 01m 16s)
  • 13:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: gorwiki: add missing namespaces (T189109) (duration: 01m 17s)
  • 13:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add logos for gorwiki (T192669) (duration: 01m 14s)
  • 13:27 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Add logos for gorwiki (T192669) (duration: 01m 16s)
  • 13:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Temp rate limit for arwiki due to mass vandalism (T192668) (duration: 01m 17s)
  • 13:12 jynus: restarting es2003 to test gerrit:427902
  • 12:59 marostegui: Deploy schema change on dbstore1002 s6 - T191519 T188299 T190148
  • 12:58 jynus: disabling puppet on several mysql hosts before deploying gerrit:427902
  • 12:40 sbisson@tin: Finished deploy [kartotherian/deploy@2195dde]: Deploy kartotherian with new babel fallback rules (duration: 04m 52s)
  • 12:35 sbisson@tin: Started deploy [kartotherian/deploy@2195dde]: Deploy kartotherian with new babel fallback rules
  • 11:50 moritzm: reimaging mw1238,mw1239,mw1240 (app servers) to stretch
  • 11:46 moritzm: reimaging mw1285 (previous attempt had a hardware problem which failed to trigger the reboot via IPMI), mw1287, mw1288 (API servers) to stretch
  • 11:41 moritzm: installing poppler security updates
  • 11:25 mobrovac@tin: Finished deploy [citoid/deploy@b3c0818]: Add support for restful crossRef API and Wikidata QIDs - T108175 T176411 (duration: 03m 36s)
  • 11:22 mobrovac@tin: Started deploy [citoid/deploy@b3c0818]: Add support for restful crossRef API and Wikidata QIDs - T108175 T176411
  • 11:17 mobrovac@tin: Finished deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource, take #2 - T192678 (duration: 07m 21s)
  • 11:10 mobrovac@tin: Started deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource, take #2 - T192678
  • 11:09 mobrovac@tin: Finished deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource - T192678 (duration: 11m 47s)
  • 11:00 gehel: restarting wdqs updater on all wdqs nodes
  • 10:57 mobrovac@tin: Started deploy [restbase/deploy@3f3f989]: Add lfnwiki, inhwiki, gorwiki and euwikisource - T192678
  • 10:26 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 10:25 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 17s)
  • 09:56 _joe_: restarting memcached on mc1020-1036 at 1 hour intervals - T184854
  • 09:13 godog: Flashing Smart Array P840 in Slot 3 [ 4.52 -> 6.30 ] on ms-be2034 - T192721 T141756
  • 09:05 _joe_: AMEND: restart memcached on mc1019 (T184854)
  • 09:05 _joe_: restart memcached on mw1019 (Ttail -f /var/log/etcdmirror-conftool-eqiad-wmnet/syslog.log
  • 09:05 vgutierrez: restarting pybal on lvs1006
  • 09:02 _joe_: restarting etcdmirror on conf2002 after restarting nginx on conf1001
  • 08:59 moritzm: reimaging mw1283,mw1285,mw1286 (API servers) to stretch
  • 08:57 marostegui: Deploy schema change on s6 codfw master (db2039) - this will generate lag on codfw - T191519 T188299 T190148
  • 08:56 gehel: rolling restart of blazegraph on wdqs1004, 2004 and 2005 for JVM upgrade
  • 08:55 moritzm: reimaging mw1270,mw1271,mw1272 (app servers) to stretch
  • 08:52 vgutierrez: restarting pybal on esams cluster
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 (duration: 01m 16s)
  • 08:48 _joe_: upgrading nginx on the config cluster in eqiad (T164456)
  • 08:47 marostegui: Drop table logging_pre_1_10 in s5 - T118859
  • 08:47 marostegui: Dropped table logging_pre_1_10 in s3 - T118859
  • 08:42 vgutierrez: restarting pybal on lvs4006
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 (duration: 01m 18s)
  • 08:36 vgutierrez: restarting pybal on codfw (one at a time)
  • 08:33 vgutierrez: restart pybal on lvs4007
  • 08:31 vgutierrez: restarting pybal on lvs5002
  • 08:30 vgutierrez: restarting pybal on lvs5001
  • 08:30 marostegui: Drop table logging_pre_1_10 in s4 - T118859
  • 08:27 vgutierrez: restarting pybal on lvs4005
  • 08:27 _joe_: restarting pybal on lvs5003
  • 08:17 _joe_: upgrading nginx on the config cluster in codfw (T164456)
  • 08:13 marostegui: Drop table logging_pre_1_10 in s7 - T118859
  • 08:08 _joe_: restarting memcached in codfw (T184854)
  • 08:08 gehel: restarting blazegraph on wdqs1003 (crazy number of java threads)
  • 08:04 moritzm: upgrading terbium to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:58 ema: cp-misc: upgrade varnish to 5.1.3-1wm7
  • 07:55 marostegui: reload haproxy on dbproxy1010 to depool labsdb1010
  • 07:55 marostegui: Depool labsdb1010 - T184446
  • 07:50 marostegui: Drop table logging_pre_1_10 in s2 - T118859
  • 07:47 marostegui: Drop table logging_pre_1_10 in s6 - T118859
  • 07:36 moritzm: upgrading remaining API servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:35 elukey: reboot ms-be2034 - stuck in com2 console with "sd 0:1:0:1: rejecting I/O to offline device", not responsive to ssh
  • 07:00 moritzm: upgrading remaining app servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 06:26 marostegui: Remove logging_pre_1_10 from codfw - T118859
  • 05:28 marostegui: flow_subscription empty table from officewiki - T149936
  • 05:17 marostegui: Deploy schema change on db1070 (s5 primary master) - T191519 T188299 T190148
  • 02:40 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 56s)

2018-04-22

  • 16:29 ariel@tin: Finished deploy [dumps/dumps@bb7ae96]: creadtedirs date fixup, rerun only missing stub type (duration: 00m 03s)
  • 16:29 ariel@tin: Started deploy [dumps/dumps@bb7ae96]: creadtedirs date fixup, rerun only missing stub type

2018-04-21

2018-04-20

  • 20:45 andrewbogott: re-imaging labvirt1021 and 1022 as Jessie
  • 20:23 ejegg: updated fundraising python tools from 0c50f9e38f to 7c5c7a5f9e
  • 18:23 mutante: add LDAP user "tieu" to group "wmde" (T192256)
  • 17:42 imarlier@tin: Finished deploy [performance/coal@99db58f]: coal - update to submit via graphite. Not yet active, requires puppet changes (duration: 00m 04s)
  • 17:41 imarlier@tin: Started deploy [performance/coal@99db58f]: coal - update to submit via graphite. Not yet active, requires puppet changes
  • 17:35 no_justification: gerrit: update mysql-client and deps 5.5.59 -> 5.5.60
  • 17:28 mutante: phabricator - restarted apache
  • 17:26 mutante: phabricator (phab1001) - upgrading Apache, openssl, mysql-common
  • 17:17 mutante: phab2001 - upgrading apache, openssl, mysql-common
  • 17:04 andrewbogott: rebooting labvirt1021 and 1022
  • 16:44 dcausse@tin: Synchronized php-1.31.0-wmf.30/extensions/CirrusSearch/: T192609: Do not propagate Elastica doc modifications out of DataSender (duration: 01m 34s)
  • 15:07 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2087 (duration: 01m 16s)
  • 14:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2086, depool db2087 (duration: 01m 16s)
  • 14:16 andrew@tin: Synchronized dblists: Purging obsolete silver.dblist (duration: 01m 17s)
  • 14:02 moritzm: upgrading labweb* servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 14:00 jynus: upgrade and restart db2086
  • 13:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2086 (duration: 01m 13s)
  • 13:35 anomie: (re-)creating `slots` table on all wikis, following up T190153 and T184446#4143097
  • 13:25 moritzm: upgrading mysql (as shipped in Debian) on bohrium
  • 13:00 moritzm: installing zsh security updates on trusty servers
  • 12:25 moritzm: upgrading apache on auth* servers
  • 12:18 jynus: upgrading and restarting dbstore2002
  • 12:15 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 01m 17s)
  • 12:06 moritzm: installing apache security updates on video scalers
  • 12:05 moritzm: upgrading apache on einsteinium/icinga.wikimedia.org
  • 11:53 moritzm: installing apache security updates on netmon1002/2001
  • 11:27 elukey: reimage analytics1068 to Debian Stretch - T192557
  • 11:06 moritzm: installing tiff security updates on trusty
  • 09:58 godog: upload scap 3.8.0-2 - T192124
  • 09:51 moritzm: upgrading deployment servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 09:41 jynus: starting reimage of db2070
  • 09:41 moritzm: upgrading mwdebug servers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 09:33 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2071, depool db2070 (duration: 01m 16s)
  • 09:12 elukey: restart of mw apis showing ~50% cpu utilization as precaution before the weekend - mw[1224,1225,1228,1230,1231,1233-1235,1276-1283,1286,1312,1313,1315,1316,1341,1343,1344,1347,1348]*
  • 09:06 moritzm: upgrading video scalers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 08:41 moritzm: upgrading job runners in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 08:39 marostegui: Going to sanitize gorwiki euwikisource romdwikimedia inhwiki on db1095 - T189112 T189466 T187774 T184375
  • 08:39 elukey: restart hhvm on mw[1226,1232].eqiad.wmnet - high load
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 01m 16s)
  • 07:57 jynus: starting reimage of db2071
  • 07:52 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 01m 16s)
  • 07:48 moritzm: upgrading app servers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 01m 17s)
  • 07:38 ema: cp3041: restart varnish-be due to mbox lag
  • 07:37 akosiaris: upgrade qemu on ganeti2006 to 1:2.8+dfsg-3~bpo8+1 and migrate mwdebug2001 to it T150532
  • 07:32 ema: cp3030: restart varnish-be due to mbox lag
  • 07:30 _joe_: upgrading hhvm on all jobrunners in eqiad
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 01m 15s)
  • 07:09 ema: cp3032/cp3043: restart varnish-be due to mbox lag
  • 07:08 moritzm: upgrading API servers in codfw to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 after alter table (duration: 01m 16s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 01m 15s)
  • 06:26 ema: kafka::analytics remove strongswan leftovers T185136
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 01m 15s)
  • 06:07 marostegui: Stop mysql db1114 for a reboot
  • 06:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 01m 16s)
  • 05:55 _joe_: depooling mw1227 from live traffic for investigation
  • 05:31 marostegui: Start atop on db1114 with "-R" option enabled - T192551
  • 05:31 marostegui: Deploy schema change on db1110 - T191519 T188299 T190148
  • 05:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 for alter table (duration: 01m 17s)
  • 05:21 ariel@tin: Finished deploy [dumps/dumps@c2d3bb4]: keep completed stubs/abstracts/logs files around for retries (duration: 00m 04s)
  • 05:20 ariel@tin: Started deploy [dumps/dumps@c2d3bb4]: keep completed stubs/abstracts/logs files around for retries
  • 01:50 krinkle@tin: Synchronized wmf-config/CommonSettings.php: If8fdce707d (duration: 01m 17s)

2018-04-19

  • 23:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.29/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off cirrus ab test (duration: 01m 18s)
  • 23:13 ebernhardson@tin: Synchronized php-1.31.0-wmf.30/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off cirrus ab test (duration: 01m 17s)
  • 23:04 thcipriani@tin: Synchronized php: complete group1 and group2 wikis back to 1.31.0-wmf.29 (duration: 01m 16s)
  • 22:30 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 and group2 wikis back to 1.31.0-wmf.29
  • 21:41 urandom: Start cleanup, restbase10{07,11,16}-c -- T189822
  • 21:22 urandom: Start cleanup, restbase10{07,11,16}-b -- T189822
  • 21:15 urandom: Start cleanup, restbase10{07,11,16}-a -- T189822
  • 21:12 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter, restbase1010-c -- T189822, T192456
  • 21:00 ebernhardson: issue move of enwiki_content shard 2 from overloaded elastic1027 to elastic1017
  • 20:48 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter, restbase1010-a -- T189822, T192456
  • 20:48 urandom: restarting cassandra to (temporarily) rollback prometheus jmx exporter -- T189822, T192456
  • 20:32 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.30
  • 20:27 milimetric@tin: Finished deploy [analytics/refinery@c1c9885]: Correcting hql from last deployment (duration: 05m 09s)
  • 20:22 milimetric@tin: Started deploy [analytics/refinery@c1c9885]: Correcting hql from last deployment
  • 19:53 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.30 (duration: 01m 15s)
  • 19:45 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.30
  • 19:35 thcipriani@tin: Synchronized php-1.31.0-wmf.30/includes/page/Article.php: Do not pass USE INDEX to a $dbType parameter T192584 (duration: 01m 17s)
  • 19:33 ejegg: updated fundraising python tools from 626fe02a9f to 0c50f9e38f
  • 19:22 no_justification: gerrit: restarting services to pick up gc & indexing changes
  • 18:32 thcipriani@tin: Synchronized php-1.31.0-wmf.30/resources/src/jquery: jquery.makeCollapsible: Only add "[" "]" to autogenerated toggles T192140 (duration: 01m 17s)
  • 17:21 andrew@tin: Synchronized wmf-config/db-eqiad.php: Moving labtestwikitech to m5, step 3 (duration: 01m 16s)
  • 17:20 andrew@tin: Synchronized wmf-config/db-codfw.php: Moving labtestwikitech to m5, step 2 (duration: 01m 16s)
  • 17:18 andrew@tin: Synchronized docroot/noc/db.php: Moving labtestwikitech to m5, step 1 (duration: 01m 16s)
  • 16:56 ejegg: re-enabled banner impressions loader
  • 16:50 moritzm: uploaded tidy-0.99 to component/ci for apt.wikimedia.org/stretch-wikimedia (T191771)
  • 16:46 ejegg: disabled banner impressions loader in order to run backfill mode
  • 16:28 gehel: restarting tilerator on maps[12].* - T191655
  • 16:20 gehel: shutting down tilerator on maps[12].* for maintenance - T191655
  • 15:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 after alter table (duration: 01m 16s)
  • 15:50 fdans@tin: Finished deploy [analytics/refinery@5d0f63f]: deploying to launch page preview job (duration: 06m 34s)
  • 15:48 marostegui: Deploy schema change on dbstore1002 (s5) - T191519 T188299 T190148
  • 15:44 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2074 (duration: 01m 17s)
  • 15:44 fdans@tin: Started deploy [analytics/refinery@5d0f63f]: deploying to launch page preview job
  • 15:42 sbisson@tin: Finished deploy [kartotherian/deploy@0a5a3ef]: Deploy latest kartotherian with new i18n sources (take 3) (duration: 05m 22s)
  • 15:37 sbisson@tin: Started deploy [kartotherian/deploy@0a5a3ef]: Deploy latest kartotherian with new i18n sources (take 3)
  • 15:36 sbisson@tin: Finished deploy [kartotherian/deploy@89c4ca9]: Deploy latest kartotherian with new i18n sources (take 2) (duration: 03m 05s)
  • 15:33 sbisson@tin: Started deploy [kartotherian/deploy@89c4ca9]: Deploy latest kartotherian with new i18n sources (take 2)
  • 15:16 sbisson@tin: Finished deploy [kartotherian/deploy@74121d5]: Deploy latest kartotherian with new i18n sources (duration: 05m 19s)
  • 15:11 sbisson@tin: Started deploy [kartotherian/deploy@74121d5]: Deploy latest kartotherian with new i18n sources
  • 14:48 Dereckson: Erratum: read "User:Andrei Stroe" and not "User:Anderi Store" for the previous entry (T187184)
  • 14:47 Dereckson: Create bureaucrat account for User:Anderi Store on romd.wikimedia (T187184)
  • 14:30 marostegui: Start atop on db1114 without "-R" - T192551
  • 14:29 marostegui: Deploy schema change on db1082 (this will generate lag on s5 on labs hosts) - T191519 T188299 T190148
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 for alter table (duration: 01m 13s)
  • 14:19 ejegg: re-enabled queue jobs
  • 14:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 after alter table (duration: 01m 16s)
  • 14:12 jynus: starting reimage of db2074
  • 13:56 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2074 (duration: 01m 16s)
  • 13:39 marostegui: Stop atop on db1114 - T191996
  • 13:33 marostegui: Start atop on db1114 - T191996
  • 13:30 Trey314159: reindexing serbian wikis on elastic@eqiad (T189265)
  • 13:30 moritzm: upgrading mw1334-mw1337 (job runners) to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 13:14 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: T192427 T189277 (duration: 01m 17s)
  • 12:58 jynus: starting reimage of db2075
  • 12:48 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2075 (duration: 01m 16s)
  • 11:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2076 (duration: 01m 16s)
  • 11:39 moritzm: upgrading eqiad video scalers to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 11:24 marostegui: Run check_private_data on labsdb - T183566
  • 11:21 marostegui: Sanitize lfnwiki - T183566
  • 11:20 moritzm: upgrading app servers mw1238-mw1258 to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 11:14 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: T181121 (duration: 01m 16s)
  • 11:09 akosiaris@tin: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 01m 17s)
  • 11:05 marostegui: Deploy schema change on db1113:3315 - T191519 T188299 T190148
  • 11:03 jynus: starting reimage of db2076
  • 11:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 for alter table (duration: 01m 16s)
  • 11:01 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2076 (duration: 01m 18s)
  • 10:34 moritzm: upgrading API servers mw1221-mw1235 to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build
  • 10:27 vgutierrez: Repool (Re-enable BGP) lvs4005 - T191897
  • 09:59 elukey: complete migration of zookeeper on conf100[123]
  • 09:55 akosiaris: reboot ganeti VMs on row_B in codfw for cache=none setting. T181121
  • 09:54 vgutierrez: Updating puppet compiler facts
  • 09:51 moritzm: rolling restart of Cassandra on maps completed
  • 09:33 elukey: upgrade zookeeper on conf100[123] from 3.4.5 to 3.4.9 - T182924
  • 09:31 akosiaris: start a force puppet run in all of eqiad with a batch size of 30
  • 09:29 akosiaris: stop ircecho for a while, puppetdb1001 reboot was eventful
  • 09:17 akosiaris: reboot puppetdb1001 for cache=none setting apply. T181121
  • 09:14 moritzm: installing Java security updates on maps* plus rolling restart of Cassandra to pick up new JRE
  • 09:06 vgutierrez: Depool and reimage lvs4005 as stretch - T191897
  • 09:03 moritzm: upgrading API server canaries to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build (T184854)
  • 08:40 vgutierrez: Repool (Re-enable BGP) lvs4006 - T191897
  • 08:14 ema: reboot deploy1001 and arm keyholder T175288
  • 08:14 moritzm: upgrading app server canaries to MEMC_VAL_COMPRESSION_ZLIB enabled HHVM build (T184854)
  • 07:47 akosiaris: set cache=none for ganeti VMs in codfw cluster configuration. VM reboots to follow T181121
  • 07:32 vgutierrez: Depool and reimage lvs4006 - T191897
  • 07:24 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 07:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 after alter table (duration: 01m 17s)
  • 05:36 marostegui: Kill atop on db1114 - T191996
  • 05:33 marostegui: Revert RX buffer changes on db1114 - T191996
  • 05:27 marostegui: Deploy schema change on db1097:3315 - T191519 T188299 T190148
  • 05:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 for alter table (duration: 01m 33s)
  • 03:18 urandom: decommissioning Cassandra, restbase1010-c -- T189822
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 52s)
  • 01:15 eileen: civicrm revision changed from 0ac27e7c0d to b1e7ccfc4d, config revision is 49f5ba45e8
  • 00:12 Dereckson: Wikis creation done
  • 00:12 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Set project namespace for hi.wikimedia (T188366) (duration: 01m 16s)
  • 00:04 Dereckson: dereckson@tin Synchronized wmf-config/InitialiseSettings.php: Fix path to hi.wikimedia.org 1x logo (Gerrit:427567)

2018-04-18

  • 23:44 Dereckson: Created bureaucrat account for Suyash.dwivedi at hi.wikimedia (T188366)
  • 23:35 dereckson@tin: Synchronized wmf-config/interwiki.php: New interwiki map for the six newest wikis (duration: 01m 17s)
  • 23:22 Dereckson: HTCP purge for https://hi.wikimedia.org and https://hi.wikimedia.org/
  • 23:19 Dereckson: Create tables for Translate extension on hiwikimedia
  • 23:13 Dereckson: HTCP purge for eu.wikisource logos
  • 23:10 dereckson@tin: Synchronized multiversion/MWMultiVersion.php: +hi.wikimedia.org +romd.wikimedia.org (duration: 01m 15s)
  • 23:05 dereckson@tin: Synchronized langlist: New languages: gor, inh, lfn (duration: 01m 17s)
  • 23:04 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Initial configuration for six wikis (duration: 01m 16s)
  • 23:03 dereckson@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 23:02 dereckson@tin: Synchronized dblists: (no justification provided) (duration: 01m 15s)
  • 23:00 Dereckson: Starting syncing to production sequence for six wiki creation
  • 22:58 dereckson@tin: Synchronized static/images/project-logos/: Logos for eu.wikisource (T189465) (duration: 01m 12s)
  • 22:58 Dereckson: Created database and set initial stuff for hi.wikimedia.org (T188366)
  • 22:57 Dereckson: Created database and set initial stuff for romd.wikimedia.org
  • 22:31 Dereckson: Created database and set initial stuff for eu.wikisource.org (T189465)
  • 22:28 Dereckson: Created database and set initial stuff for gor.wikipedia.org (T189109)
  • 22:27 Dereckson: Created database and set initial stuff for inh.wikipedia.org (T184374)
  • 22:24 dereckson@tin: Synchronized php-1.31.0-wmf.29/extensions/WikimediaMaintenance/addWiki.php: Fix MassMessage fatal error (T192468) (duration: 01m 17s)
  • 22:17 Dereckson: Created database for lfn.wikipedia.org (T183561)
  • 21:57 eileen: civicrm revision changed from 00870af548 to 0ac27e7c0d, config revision is 853fcc9111
  • 21:53 ebernhardson: restart elasticsearch on elastic1022 with numa interleave
  • 21:17 eileen: civicrm revision changed from cddfe9416c to 00870af548, config revision is 853fcc9111
  • 20:52 ebernhardson: restart elasticsearch on elastic1020 with numa interleave
  • 20:13 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9328a7d]: Update mobileapps to fb161d7 (duration: 05m 56s)
  • 20:10 ebernhardson: restart elasticsearch on elastic1019 with numa interleave
  • 20:07 mholloway-shell@tin: Started deploy [mobileapps/deploy@9328a7d]: Update mobileapps to fb161d7
  • 19:55 thcipriani@tin: Finished scap: rebuild l10n cache (duration: 58m 57s)
  • 19:28 ppchelko@tin: Finished deploy [restbase/deploy@8d8f1df]: Test concurrent worker startups (duration: 15m 23s)
  • 19:15 ebernhardson: restart elasticsearch on elastic1018 with numa interleave
  • 19:13 ppchelko@tin: Started deploy [restbase/deploy@8d8f1df]: Test concurrent worker startups
  • 18:56 thcipriani@tin: Started scap: rebuild l10n cache
  • 18:35 dereckson@tin: Synchronized php-1.31.0-wmf.30/extensions/CentralNotice: Emit CSP headers on banner preview (duration: 01m 18s)
  • 18:33 ppchelko@tin: Finished deploy [changeprop/deploy@d83fad3]: Support multi-topic rules, rename metrics, update dependencies (duration: 01m 14s)
  • 18:32 ppchelko@tin: Started deploy [changeprop/deploy@d83fad3]: Support multi-topic rules, rename metrics, update dependencies
  • 18:25 imarlier@tin: Finished deploy [performance/coal@3c0ef36]: coal: typoed the run file (duration: 00m 04s)
  • 18:25 imarlier@tin: Started deploy [performance/coal@3c0ef36]: coal: typoed the run file
  • 18:17 imarlier@tin: Finished deploy [performance/coal@f1ca191]: Deploying coal version that includes a runner for service use (duration: 00m 04s)
  • 18:17 imarlier@tin: Started deploy [performance/coal@f1ca191]: Deploying coal version that includes a runner for service use
  • 17:53 ebernhardson: restart elasticsearch on elastic1017
  • 17:35 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Emit CSP headers on banner previews (T190100, no-op for now) (duration: 01m 16s)
  • 17:19 ejegg: updated CiviCRM from 64b26ad377 to cddfe9416c
  • 16:47 andrewbogott: deleted lots of log files (mostly nova-api logs) on labtestnet2001
  • 16:42 reedy@tin: Synchronized wmf-config/interwiki.php: sync! (duration: 01m 15s)
  • 16:32 reedy@tin: Synchronized php-1.31.0-wmf.30/extensions/WikimediaMaintenance: fix addwiki.php (duration: 01m 18s)
  • 16:30 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: translatew for advisorswiki (duration: 01m 16s)
  • 16:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: advisorswiki (duration: 01m 15s)
  • 16:24 reedy@tin: rebuilt and synchronized wikiversions files: advisorswiki
  • 16:21 reedy@tin: Synchronized dblists/: advisorswiki (duration: 01m 16s)
  • 16:11 ppchelko@tin: Finished deploy [cpjobqueue/deploy@749ae82]: Update dependencies and reduce dedupe logging rate (duration: 00m 43s)
  • 16:10 ppchelko@tin: Started deploy [cpjobqueue/deploy@749ae82]: Update dependencies and reduce dedupe logging rate
  • 15:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2077 (duration: 01m 16s)
  • 15:33 _joe_: depooling mw1227 for investigation of high load
  • 15:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 after alter table (duration: 01m 15s)
  • 15:09 urandom: decommissioning Cassandra, restbase1010-b -- T189822
  • 15:08 dcausse: reindexing serbian wikis on elastic@eqiad (T189265)
  • 14:55 urandom: restarting Cassandra, restbase1011-a to test v 0.8 of Prometheus JMX exporter -- T192456
  • 14:51 jynus: starting reimage of db2077
  • 14:37 urandom: restarting Cassandra, restbase1011-a -- T192456
  • 14:35 marostegui: Disable puppet on db1114 - T191996
  • 14:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2080, depool db2077 (duration: 01m 16s)
  • 14:04 gehel: powercycle unresponsive maps-test2001
  • 14:00 elukey: restart kafka on kafka1001 and kafka2001 (jobqueues,eventbus) for openjdk-7 upgrades
  • 13:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 for alter table (duration: 01m 16s)
  • 13:49 marostegui: Deploy schema change on db1100 - T191519 T188299 T190148
  • 13:44 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf7+icu57 to apt.wikimedia.org/jessie-wikimedia (includes a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854))
  • 13:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 after alter table (duration: 01m 15s)
  • 13:17 Amir1: EU SWAT is done
  • 13:17 moritzm: uploaded HHVM 3.18.5+dfsg-1+wmf7+deb9u1 to apt.wikimedia.org/stretch-wikimedia (includes a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854))
  • 13:16 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Limit page creation and edit rate on Wikidata (T184948) (duration: 01m 17s)
  • 13:00 jynus: starting reimage of db2080
  • 12:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2081, depool db2080 (duration: 01m 16s)
  • 11:20 vgutierrez: Repool (Re-enable BGP) lvs2004 - T191897
  • 11:02 jynus: starting reimage of db2081
  • 10:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2081, repool db2082, es2013 (duration: 01m 15s)
  • 10:45 vgutierrez: Depool and reimage lvs2004 - T191897
  • 10:27 vgutierrez: Repool (Re-enable BGP) in lvs2005 - T191897
  • 09:49 hoo: Ran scap pull on mwdebug1001 after checking https://gerrit.wikimedia.org/r/427156
  • 09:49 jynus: starting reimage of db2082
  • 09:46 Amir1: start of deleting auto patrol actions in small wikis (T184485)
  • 09:41 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2082 (duration: 01m 15s)
  • 09:37 moritzm: strip apache/nginx/nutcracker/hhvm from former image scaler (now spares)
  • 09:32 vgutierrez: Depool and reimage lvs2005 - T191897
  • 09:30 marostegui: Deploy schema change on db1096:3315 - T191519 T188299 T190148
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 for alter table (duration: 01m 22s)
  • 09:27 godog: reenable puppet fleetwide after https://gerrit.wikimedia.org/r/c/421860
  • 09:16 moritzm: imported lz4 0.0~r131-2~wmf1+trusty1 for trusty-wikimedia to apt.wikimedia.org (needed to build HHVM 3.18 for trusty)
  • 09:09 godog: stop puppet agent fleetwide before applying https://gerrit.wikimedia.org/r/c/421860/
  • 09:08 moritzm: reimaging mw1281 to stretch
  • 09:04 _joe_: restart HHVM on mw1223,mw1224, also repool them after investigation of crashes
  • 08:59 vgutierrez: Repool (Re-enable BGP) in lvs3003 - T191897
  • 08:44 elukey: execute cumin 'analytics10[28-69]*' 'rm /etc/apt/preferences.d/r_* && apt-get update' to clear jessie backports apt config - T192348
  • 07:39 vgutierrez: Depool and reimage lvs3003 as stretch - T191897
  • 06:49 marostegui: Deploy schema change on s5 codfw master (db2052) this will generate lag in codfw - T191519 T188299 T190148
  • 06:43 moritzm: installing ruby security updates for trusty
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after changing RX buffers - T191996 (duration: 01m 09s)
  • 05:20 marostegui: Change RX buffers on db1114 - T191996
  • 05:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 01m 15s)
  • 05:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092 after alter table (duration: 01m 16s)
  • 05:02 marostegui: Deploy schema change on db1071 (s8 primary master) - T185128 T153182
  • 02:51 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 55s)
  • 00:05 aaron@tin: Synchronized wmf-config/mc-labs.php: 8ad186728d: use mcrouter key prefixes (deployment-prep only) (duration: 01m 15s)

2018-04-17

  • 23:31 ebernhardson@tin: Synchronized wmf-config/CommonSettings-labs.php: labs config noop (duration: 01m 15s)
  • 23:17 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: T191236: Shift search traffic back to eqiad (duration: 01m 17s)
  • 23:08 gilles: Private wiki thumbnail traffic now going to eqiad T191643
  • 23:07 gilles@tin: Synchronized wmf-config/filebackend.php: Fix private wiki DC configuration: Serve private wiki thumbnails with Thumbor (T191643) (duration: 01m 18s)
  • 21:34 demon@tin: Synchronized wmf-config/CommonSettings.php: ext-dist config changes for rel1_31 (duration: 01m 16s)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.30
  • 19:59 smalyshev@tin: Started deploy [wdqs/wdqs@f08fbcc]: GUI update
  • 19:48 demon@tin: Finished scap: bootstrap wmf.30 (duration: 112m 35s)
  • 19:01 imarlier@tin: Finished deploy [performance/navtiming@22483a4]: Navtiming refactor for increased testability, and to add wrapper for easy service use (duration: 00m 02s)
  • 19:01 imarlier@tin: Started deploy [performance/navtiming@22483a4]: Navtiming refactor for increased testability, and to add wrapper for easy service use
  • 18:52 urandom: rebooting restbase-dev1006 (kernel oom killer misbehaving)
  • 18:45 urandom: rebooting restbase-dev1005 (kernel oom killer misbehaving)
  • 18:41 urandom: rebooting restbase-dev1004 (kernel oom killer misbehaving)
  • 17:56 demon@tin: Started scap: bootstrap wmf.30
  • 17:27 ejegg: updated payments-wiki from 320a6c2600 to 4a8aada491
  • 17:16 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 100% of anons for enwiki - T191101 (duration: 00m 59s)
  • 16:57 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1017 fully (duration: 01m 16s)
  • 16:37 elukey: incremental rollout of the new zookeeper jmx config to druid1* and conf*
  • 16:34 urandom: decommissioning Cassandra, restbase1010-a -- T189822
  • 16:02 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 75% of anons for enwiki - T191101 (duration: 00m 58s)
  • 15:50 arturo: enable puppet in labstore1004
  • 15:37 vgutierrez: Repool (Enable BGP) on lvs3004 - T191897
  • 15:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2048 IP - T191193 (duration: 00m 58s)
  • 15:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2048 IP - T191193 (duration: 00m 58s)
  • 15:23 marostegui: Stopping mysql on db2048 will break replication on codfw s1 slaves
  • 15:23 marostegui: Stop MySQL on db2048 for rack movement - T191193
  • 15:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067, es1017 with low load (duration: 01m 02s)
  • 14:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 after changing the network cable - T191996 (duration: 01m 02s)
  • 14:55 gehel: starting data reimport after re-image for wdqs2001 - T189192
  • 14:53 marostegui: Stop MySQL on db2042 to move it to another rack - https://phabricator.wikimedia.org/T191193
  • 14:36 ariel@tin: Finished deploy [dumps/dumps@1073d75]: more exception logging from xmlstream (duration: 00m 03s)
  • 14:36 ariel@tin: Started deploy [dumps/dumps@1073d75]: more exception logging from xmlstream
  • 14:30 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 50% of anons for enwiki - T191101 (duration: 00m 58s)
  • 14:25 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Support per-event dispatch of events, file 3/3 - T191464 (duration: 03m 07s)
  • 14:23 jynus: start es1017 reimage
  • 14:22 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Support per-event dispatch of events, file 2/3 - T191464 (duration: 03m 06s)
  • 14:16 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/extension.json: Support per-event dispatch of events, file 1/3 - T191464 (duration: 03m 00s)
  • 14:08 vgutierrez: Depool and reimage lvs3004 as stretch - T191897
  • 13:42 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/extension.json: Support per-event dispatch of events, file 1/3 - T191464 (duration: 03m 07s)
  • 13:33 moritzm: removed role::mediawiki::imagescaler from deployment-mediawiki05, per watroles the only use of that role in WMCS
  • 13:32 moritzm: removed role::mediawiki::imagescaler from deployment-prep, per watroles the only use of that role in WMCS
  • 13:30 jynus: starting backup from db1067, may generate some lag
  • 13:26 volans: updating puppet compiler facts
  • 13:25 elukey: completed migration of zookeeper on conf200[123]
  • 13:15 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 (duration: 00m 58s)
  • 13:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 to get it ready for a network cable change (duration: 00m 58s)
  • 13:00 elukey: upgrade zookeeper on conf200[123] to 3.4.9~jessie - T182924
  • 12:31 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 25% of anons on enwiki, take #2 - T191101 (duration: 00m 58s)
  • 12:04 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Enable page previews for 25% of anons on enwiki - T191101 (duration: 01m 03s)
  • 10:52 ema: lvs100[63] restart pybal to apply https://gerrit.wikimedia.org/r/424553 T188062
  • 10:39 ema: lvs200[63]: restart pybal to apply https://gerrit.wikimedia.org/r/424553 T188062
  • 10:03 mobrovac@tin: Finished deploy [restbase/deploy@e463fcf]: Use keep-alive for connections to AQS (duration: 20m 17s)
  • 09:43 mobrovac@tin: Started deploy [restbase/deploy@e463fcf]: Use keep-alive for connections to AQS
  • 09:37 moritzm: reimaging mw1280, mw1281, mw1282 (API servers) to stretch
  • 09:36 moritzm: reimaging mw1266, mw1267, mw1268 (app servers) to stretch
  • 09:17 godog: restart xenon-log on mwlog* - T169249
  • 08:46 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 08:19 elukey: restart nrpe-server on kafka2001 (kafka check not defined)
  • 08:01 moritzm: rolling restart of HHVM on video scalers to pick up ICU security update
  • 07:42 moritzm: installing ICU security updates
  • 07:27 jynus: restarting dbstore2001
  • 07:14 moritzm: installing perl security updates on trusty
  • 06:48 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=chromium.wikimedia.org,service=pdns_recursor
  • 06:47 vgutierrez: Depool and reimage chromium as stretch - T187090
  • 06:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 00m 58s)
  • 05:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 58s)
  • 05:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 00m 58s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 05:21 marostegui: Deploy schema change on db1092 - T187089 T185128 T153182
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1092 for alter table (duration: 00m 58s)
  • 05:11 marostegui: Stop MySQL and reboot db1114 to boot up with the new kernel
  • 05:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 05m 27s)
  • 01:09 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op Ib39022 (duration: 01m 00s)

2018-04-16

  • 23:57 eileen: update civicrm revision changed from b3326dbf70 to 64b26ad377, config revision is 853fcc9111
  • 21:03 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the correct way of calculating the domain from the wiki, file 2/2 - T192198 (duration: 00m 58s)
  • 21:02 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the correct way of calculating the domain from the wiki, file 1/2 - T192198 (duration: 00m 59s)
  • 20:34 imarlier@tin: Finished deploy [performance/navtiming@64d9c90]: null deploy (duration: 00m 02s)
  • 20:33 imarlier@tin: Started deploy [performance/navtiming@64d9c90]: null deploy
  • 20:13 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Revert using the wiki of the job runner, file 2/2 (duration: 00m 58s)
  • 20:12 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Revert using the wiki of the job runner, file 1/2 (duration: 00m 58s)
  • 19:47 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the wiki set in the JobQueue when creating the event, file 2/2 - T192198 (duration: 00m 59s)
  • 19:46 mobrovac@tin: Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the wiki set in the JobQueue when creating the event, file 1/2 - T192198 (duration: 01m 00s)
  • 18:28 ottomata: temporarily stopping puppet on kafka200[123] to apply MirrorMaker --new.consumer https://gerrit.wikimedia.org/r/#/c/424344/ T190940
  • 18:03 ottomata: restarting main <-> main DC kafka mirror maker instances to blacklist job and cp topics T190940 T167039
  • 17:11 moritzm: upgraded HHVM on mediawiki-jobrunner03 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 15:53 akosiaris: restart hhvm on mw2252
  • 15:29 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2a720fc]: Log HTML for PHP fatal errors from MW (duration: 01m 01s)
  • 15:28 ppchelko@tin: Started deploy [cpjobqueue/deploy@2a720fc]: Log HTML for PHP fatal errors from MW
  • 15:25 moritzm: upgraded HHVM on mediawiki-deployment-07 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 15:07 jynus: start reimage of es3-codfw master, es2017
  • 15:01 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 14:53 vgutierrez: restart pybal on lvs1003 - T187766
  • 14:49 vgutierrez: restart pybal on lvs2003 - T187766
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 in API - T191996 (duration: 00m 58s)
  • 14:42 vgutierrez: restart pybal on lvs1006 - T187766
  • 14:39 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: cluster=wdqs-internal
  • 14:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 57s)
  • 14:25 vgutierrez: restarting pybal on lvs2006 - T187766
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 14:12 moritzm: upgraded HHVM on mediawiki-deployment-09 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 14:06 jynus: start reimage of es2-codfw master, es2016
  • 14:05 hashar: restarted Jenkins for plugin upgrade T192261
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore main traffic original weight for db1114 (duration: 00m 58s)
  • 13:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 13:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 00m 58s)
  • 13:31 marostegui: Stop MySQL on db1114 to reboot with another kernel - T191996
  • 13:30 godog: roll-restart swift-proxy in codfw and eqiad - T188062
  • 13:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 54s)
  • 13:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 (duration: 00m 59s)
  • 12:12 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=hydrogen.wikimedia.org,service=pdns_recursor
  • 12:11 vgutierrez: Depool and reimage hydrogen as stretch - T187090
  • 11:50 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 11:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1114 original weight (duration: 00m 59s)
  • 10:50 moritzm: reimaging mw1299 (job runner) to stretch
  • 10:23 ariel@tin: Finished deploy [dumps/dumps@4706d30]: show full stacktrace for dump job errors (duration: 00m 04s)
  • 10:23 ariel@tin: Started deploy [dumps/dumps@4706d30]: show full stacktrace for dump job errors
  • 10:18 godog: upload prometheus-memcached-exporter to stretch-wikimedia - T189056
  • 10:17 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 10:16 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more API traffic to db1114 (duration: 00m 58s)
  • 09:50 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=acamar.wikimedia.org,service=pdns_recursor
  • 09:49 vgutierrez: Depool and reimage acamar as stretch - T187090
  • 09:43 gehel: rolling restart of wdqs100[35] and wdqs200[123] for kernel upgrade completed
  • 09:40 jynus: restarting dbstore2001:s8 to increase the number of purge threads
  • 09:23 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=achernar.wikimedia.org,service=pdns_recursor
  • 09:07 gehel: starting rolling restart of wdqs100[35] and wdqs200[123] for kernel upgrade
  • 09:05 moritzm: pooled mw1276-mw1278 (API app server canaries running stretch)
  • 08:49 gehel: first manual run of populate_admin() for maps[12]001 - T190605
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1114 original main traffic weight (duration: 00m 58s)
  • 08:41 moritzm: pooled mw1261-mw1264 (app server canaries running stretch)
  • 08:29 joal@tin: Finished deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy (duration: 05m 27s)
  • 08:25 _joe_: depooling mw1223 for investigation too
  • 08:23 joal@tin: Started deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy
  • 08:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)
  • 08:04 elukey: restart hhvm on mw[1228,1234,1281-1287,1289,1290,1312-1314,1317,1339,1343,1345,1346,1348] - more than 50% cpu usage, prevention scheme for current high load
  • 08:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)
  • 07:49 marostegui: Stop MySQL and reboot db1114 - T191996
  • 07:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)
  • 07:40 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=achernar.wikimedia.org,service=pdns_recursor
  • 07:39 vgutierrez: Depool and reimage achernar.wikimedia.org - T187090
  • 07:27 moritzm: installing perl security updates on Debian systems
  • 06:45 TimStarling: depooled mw1230
  • 06:38 _joe_: repooling mw1230
  • 06:20 marostegui: Drop table flow_subscription from x1 - T149936
  • 05:59 elukey: restart hhvm on mw[1221,1233,1280,1347] - high load
  • 05:55 elukey: repool mw1341 after investigation
  • 05:48 elukey: restart hhvm on mw1225, 1315, 1316, 1340, 1341, 1342, 1347 - high load
  • 05:42 marostegui: Reload haproxy on dbproxy1010
  • 05:36 elukey: restart hhvm on mw1226,27,32,88 - high load
  • 05:35 _joe_: depooling mw1341 to further debug the API issue
  • 05:33 marostegui: Deploy schema change on db1087 with replication (this will generate lag in labs) - T187089 T185128 T153182
  • 05:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1087 (duration: 00m 59s)
  • 03:02 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.29) (duration: 11m 09s)

2018-04-15

  • 22:09 ema: cp3037: restart varnish-be
  • 21:45 ema: cp3039: restart varnish-be
  • 21:42 elukey: restart hhvm on mw1286,1317,1339 - high load
  • 21:31 ema: cp3038: restart varnish-be
  • 21:30 ema: cp3036: restart varnish-be
  • 20:52 elukey: restart hhvm on mw13[43,45,46,48] - high load
  • 20:48 elukey: restart hhvm on mw13[12-14] - high load
  • 20:45 elukey: restart hhvm on mw[1285,1287,1289-1290] - high load
  • 20:40 _joe_: restart mw1344, high load
  • 20:38 elukey: restart hhvm on mw12[22,79,82] - high load
  • 20:32 elukey: restart hhvm on mw12[32-35] - high load
  • 20:24 elukey: restart hhvm on mw1229-31 - high load
  • 20:24 _joe_: restarted mw1280-4, high load
  • 20:17 elukey: restart hhvm on mw122[6-8] - high load
  • 20:05 elukey: restart hhvm on mw122[3,4] - high load
  • 13:42 elukey: restart hhvm on mw1227 due to high load (hhvm dump debug in /tmp/hhvm.44071.bt)
  • 10:53 elukey: powercycle mw1272 - not responsive to ssh, mgmt com2 console showing "[OK" and no tty

2018-04-13

  • 20:44 imarlier@tin: Finished deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active) (duration: 00m 02s)
  • 20:44 imarlier@tin: Started deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active)
  • 20:00 demon@tin: Pruned MediaWiki: 1.31.0-wmf.28 [keeping static files] (duration: 01m 34s)
  • 19:23 demon@tin: Pruned MediaWiki: 1.31.0-wmf.25 (duration: 05m 03s)
  • 17:17 andrewbogott: upgraded packages on all labvirts and restarted nova-compute
  • 16:55 arturo: enable puppet in labstore1005
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give db1104 original main traffic weight (duration: 01m 00s)
  • 16:34 andrewbogott: upgrading packages on labvirt1016 and rebooting (1016 is a spare server that won't affect VPS users)
  • 16:26 arturo: disable puppet in labstore1005 to hot-test https://gerrit.wikimedia.org/r/#/c/426103/
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give db1104 some main traffic - T191996 (duration: 01m 00s)
  • 16:04 hashar: cleaning up lost instances in nodepool (nodepool delete XXXXX)
  • 15:50 andrewbogott: upgrading lots of packages and rebooting labservices1002 and 1002
  • 15:43 andrewbogott: restarting nodepool on labnodepool1001
  • 15:27 andrewbogott: upgrading lots of packages on labnet1001 and labnet1002 and rebooting for T145919
  • 15:14 bd808: wiki replicas: added page_assessments views for frwiki & huwiki
  • 15:09 chasemp: labstore1004 stop nfs-exportd, cp export.bak to export.d, exportfs -ra (all exports were wiped out)
  • 14:59 andrewbogott: rebooting labcontrol1001
  • 14:42 andrewbogott: upgrading lots of packages on labcontrol1001 and 1002 and rebooting. T145919
  • 14:38 andrewbogott: stopping puppet and nodepool on labnodepool1001
  • 14:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 - T191996 (duration: 01m 07s)
  • 14:22 XioNoX: enable flow control on db1114's switch port - T191996
  • 14:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T191996 (duration: 00m 59s)
  • 14:13 andrewbogott: disabling puppet on labcontrol*, labnet*, labservices*, labvirt* before beginning T145919
  • 14:13 moritzm: installing apache security updates on contint1001
  • 14:09 andrewbogott: silencing alerts for labcontrol*, labnet*, labservices*, labvirt* before beginning T145919
  • 14:06 moritzm: uploaded ivy-debian-helper to apt.wikimedia.org/jessie (needed for zookeeper backport)
  • 13:52 elukey: roll restart druid + zookeeper daemons on druid100[123] for openjdk-7 updates
  • 13:49 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013 with full weight (duration: 01m 00s)
  • 13:32 elukey: restart druid and zookeeper daemons on druid100[456] for openjdk-7 updates
  • 13:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 after alter table (duration: 01m 02s)
  • 13:19 urandom: increasing heap size to 16G -- T186751
  • 12:37 moritzm: installing apache security updates on mendelevium (otrs)
  • 12:36 moritzm: installing apache security updates on bohrium (piwik)
  • 11:58 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 11:56 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1013 with low load (duration: 01m 04s)
  • 10:59 moritzm: reimaging mw1261-mw1264 to stretch (T174431)
  • 10:40 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org,service=pdns_recursor
  • 10:38 vgutierrez: Depool and reimage maerlant.wikimedia.org as stretch
  • 10:16 vgutierrez@neodymium: conftool action : set/pooled=yes; selector: name=nescio.wikimedia.org,service=pdns_recursor
  • 10:01 moritzm: installing java security updates on meitnerium/archive.wikimedia.org
  • 09:33 jynus: start reimage of es1013
  • 09:03 moritzm: reimaging mw1276-mw1278 to stretch (T174431)
  • 08:53 vgutierrez@neodymium: conftool action : set/pooled=no; selector: name=nescio.wikimedia.org,service=pdns_recursor
  • 08:52 vgutierrez: depool and reimage nescio.wikimedia.org as stretch
  • 08:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 in API - T191996 (duration: 01m 00s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully depool db1114 - T191996 (duration: 01m 00s)
  • 07:58 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Kick Electron, hanging, take 2 - T174916
  • 07:52 mobrovac@tin: Started restart [electron-render/deploy@94d27d7]: Kick Electron, hanging - T174916
  • 07:22 legoktm: restarting jenkins
  • 07:15 moritzm: pooling mw1265 and mw1279 for production traffic
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from main traffic - T191996 (duration: 01m 00s)
  • 05:37 marostegui: Deploy schema change on db1104 - T187089 T185128 T153182
  • 05:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 for alter table (duration: 01m 00s)
  • 05:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after alter table (duration: 01m 01s)
  • 05:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 after alter table (duration: 01m 01s)

2018-04-12

  • 23:33 awight@tin: Finished deploy [ores/deploy@543901a]: Restore ores1001 canary to master branch (duration: 03m 24s)
  • 23:30 awight@tin: Started deploy [ores/deploy@543901a]: Restore ores1001 canary to master branch
  • 23:25 awight@tin: Finished deploy [ores/deploy@a5cec53]: Canary ores1001 only: Limited test of git-lfs for ORES (duration: 02m 31s)
  • 23:22 awight@tin: Started deploy [ores/deploy@a5cec53]: Canary ores1001 only: Limited test of git-lfs for ORES
  • 23:09 dereckson@tin: Synchronized tests/: Update PHPUnit tests to use PHPUnit\Framework\TestCase (no-op) (duration: 01m 01s)
  • 22:07 urandom: restarting Cassandra, restbase2003 -- T192112
  • 21:07 urandom: restarting Cassandra, restbase1010 -- T192112
  • 21:03 urandom: temporarily disabling puppet to make (ephemeral) change to GC settings, restbase1010 -- T192112
  • 20:37 urandom: increase change-prop sample rate in dev env to 100% (from 80) -- T186751
  • 20:34 ppchelko@tin: Finished deploy [cpjobqueue/deploy@bd772eb]: Revert switching TranslationUpdateJob T192107 (duration: 00m 39s)
  • 20:33 ppchelko@tin: Started deploy [cpjobqueue/deploy@bd772eb]: Revert switching TranslationUpdateJob T192107
  • 20:32 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch TranslateUpdateJob back to the Redis-based queue as it is using PHP serialisation - T192107 (duration: 01m 00s)
  • 20:04 XioNoX: all good, revert routing ns1 to radon
  • 19:54 ema: reboot baham for kernel upgrade T188092
  • 19:51 XioNoX: routing ns1 to radon
  • 19:46 XioNoX: all good, revert routing ns0 to baham
  • 19:41 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to 1.31.0-wmf.29
  • 19:40 ema: reboot radon for kernel upgrade T188092
  • 19:37 XioNoX: routing ns0 to baham
  • 18:02 arlolra@tin: Finished deploy [parsoid/deploy@1807a38]: Updating Parsoid to 322b6e8 (duration: 15m 09s)
  • 17:47 arlolra@tin: Started deploy [parsoid/deploy@1807a38]: Updating Parsoid to 322b6e8
  • 17:38 herron: puppet master updates complete — re-enabling puppet agents
  • 17:35 moritzm: installing apache security updates on hafnium
  • 17:31 herron: temporarily disabling puppet agents for openssl updates and apache restarts on puppet masters
  • 17:27 moritzm: installing apache security updates on krypton
  • 17:17 moritzm: installing patch security updates on trusty
  • 16:59 urandom: increase change-prop sample rate in dev env to 80% (from 60) -- T186751
  • 16:21 marostegui: Deploy schema change on db1066 - T132416
  • 16:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 to main traffic and depool db1066 for alter table - T191996 (duration: 01m 17s)
  • 16:07 marostegui: Reboot es2013 - T191977
  • 15:27 gehel: rolling restart of elasticsearch cirrus / eqiad for jvm upgrade completed
  • 15:06 moritzm: installing django/apache security updates on labmon*
  • 15:03 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2013 (duration: 01m 17s)
  • 14:59 jynus: shutting down es2013's mariadb
  • 14:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: No-op: Clean up an unused global var for the EventBus-based JobQueue (duration: 01m 17s)
  • 14:44 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the second bulk of low-traffic jobs for all wikis - T190327 (duration: 01m 16s)
  • 14:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327 (duration: 00m 35s)
  • 14:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327
  • 14:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from main traffic - T191996 (duration: 01m 18s)
  • 14:21 vgutierrez: Reimage lvs2006 as stretch
  • 14:11 moritzm: pooling mw1265 (app server) temporarily for production traffic
  • 14:03 urandom: increase change-prop sample rate in dev env to 60% (from 40) -- T186751
  • 13:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 into API - T191996 (duration: 01m 17s)
  • 13:47 herron: updated puppet-run script to log using syslog and updated rsyslog config to direct puppet-agent logs to /var/log/puppet.log https://gerrit.wikimedia.org/r/425538
  • 13:44 sbisson@tin: Finished deploy [tilerator/deploy@46cc948]: Deploying tilerator@i18n everywhere (duration: 02m 04s)
  • 13:44 marostegui: Deploy schema change on db1101:3318 - T187089 T185128 T153182
  • 13:42 sbisson@tin: Started deploy [tilerator/deploy@46cc948]: Deploying tilerator@i18n everywhere
  • 13:40 gehel: dropping leftover keyspace v2 and v5 on maps / eqiad - T191655
  • 13:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 for alter table (duration: 01m 17s)
  • 13:31 moritzm: installing openssl updates
  • 13:31 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 (duration: 01m 17s)
  • 13:22 gehel: i18n maps will not be available yet, this is only preliminary work
  • 13:22 gehel: deploying maps internationalization, including new keyspace and generating new tiles - T191655
  • 13:18 zeljkof: EU SWAT finished
  • 13:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Page Previews for 10% enwiki anon users (T189906) (duration: 01m 18s)
  • 13:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 with full weight (duration: 01m 17s)
  • 12:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1109 after alter table (duration: 01m 17s)
  • 12:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 from API - T191996 (duration: 01m 17s)
  • 12:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool es1012 with low weight (duration: 01m 19s)
  • 12:13 marostegui: Deploy schema change on s8 dbstore1002 - T187089 T185128 T153182
  • 11:59 moritzm: pooling mw1279 for some brief test production traffic
  • 09:58 jynus: reimage es1012, take 2
  • 08:12 marostegui: Drop table linkscc from s3 codfw primary master
  • 08:11 marostegui: Drop table linkscc from s1
  • 07:55 marostegui: Drop table linkscc from s2 and s7
  • 07:50 marostegui: Drop table linkscc from s4,s5 and s6
  • 07:41 jynus: reimage es1012
  • 07:40 moritzm: enabling production traffic for mw1265
  • 07:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 after alter table - T190780 (duration: 01m 16s)
  • 07:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table - T190780 (duration: 01m 17s)
  • 06:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 after alter table - T190780 (duration: 01m 17s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table - T190780 (duration: 01m 17s)
  • 06:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 after alter table - T190780 (duration: 01m 16s)
  • 06:42 marostegui: Deploy schema change on db1072 (sanitarium master for s3) - this will generate lag on s3 labsdb - T190780
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 for alter table - T190780 (duration: 01m 18s)
  • 06:27 marostegui: Deploy schema change on s3 codfw master (db2043) - this will generate lag on s3 codfw - T190780
  • 06:24 marostegui: Deploy schema change on s1 primary master (db1052) - T190780
  • 06:11 marostegui: Deploy schema change on s7 primary master (db1062) - T190780
  • 06:08 elukey: force kill of fuse_dfs (handling /mnt/hdfs) on stat1004, apparently causing a huge load
  • 06:05 elukey: force kill of fuse_dfs (handling /mnt/hdfs) on stat1005, apparently causing a huge load
  • 05:52 marostegui: Deploy schema change on s2 primary master (db1054) - T190780
  • 05:49 marostegui: Deploy schema change on s8 primary master (db1071) - T190780
  • 05:45 marostegui: Deploy schema change on s4 primary master (db1068) - T190780
  • 05:39 marostegui: Deploy schema change on s6 primary master (db1061) - T190780
  • 05:34 marostegui: Deploy schema change on s5 primary master (db1070) - T190780
  • 05:27 marostegui: Deploy schema change on db1109 - T187089 T185128 T153182
  • 05:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 for alter table (duration: 01m 17s)
  • 05:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 after alter table (duration: 01m 18s)
  • 05:11 marostegui: Reload haproxy on dbproxy1011 to repool labsdb1009
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 07m 20s)
  • 01:34 eileen: civicrm revision changed from 07bade75a2 to b3326dbf70, config revision is 853fcc9111 (deploy wmffraud report)
  • 00:44 twentyafterfour: The hotfix that I deployed for phabricator: https://phabricator.wikimedia.org/rPHEX7801b519442eea2bfd47a272ba36959b487ae7d7
  • 00:33 twentyafterfour: phabricator: hotfixing DeadlineEditEngineSubtype.php
  • 00:23 twentyafterfour: phabricator is back
  • 00:18 twentyafterfour: phabricator will be offline for just a moment while I run the upgrade script.
  • 00:15 twentyafterfour: preparing to deploy phabricator rPHDEP/release/2018-04-12/1 https://phabricator.wikimedia.org/project/view/3335/
  • 00:09 mutante: jerkins-bot tests all return -1 due to operations-mw-config-php55lint failing which says it can't clone on integration-slave-jessie-1003, which is out of disk space in /srv as reported by shinken. it's mostly all /srv/pbuilder
  • 00:08 twentyafterfour: phabricator update will begin shortly, running a bit behind due to a massive upstream merge which will have to wait until a later date.
  • 00:08 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/425723/ (duration: 01m 18s)

2018-04-11

  • 23:48 ejegg: enabled new civicrm contact de-dupe job
  • 23:19 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Allow sysops to create Flow boards on euwiki (T190500) (duration: 01m 17s)
  • 23:09 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Stop logging autopatrol actions everywhere (T184485) (duration: 01m 18s)
  • 22:47 samwilson@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy GlobalPreferences T184121 (duration: 01m 17s)
  • 22:47 mutante: ores2* - puppet ran to change venv config, then 'rm -rf /srv/deployment/ores/venv/' via cumin to clean-up (T181071)
  • 22:41 mutante: ores1002-1009 - deleting old venv dir - rm -f /srv/deployment/ores/venv (T181071)
  • 22:37 mutante: ores1001 - rm -rf /srv/deployment/ores/venv/
  • 22:37 mutante: ores - same for codfw instances, change of venv path to /srv/deployment/ores/deploy/venv/
  • 22:30 mutante: ores - all eqiad instances are being restarted by puppet after config change
  • 22:28 mutante: ores - running puppet on all instances to apply venv path change for T181071
  • 22:24 musikanimal@tin: Synchronized wmf-config/InitialiseSettings.php: Enabling PageAssessments on huwiki (T191697) (duration: 01m 17s)
  • 22:23 bstorm_: views updated on labsdb1009
  • 22:13 musikanimal@tin: Synchronized wmf-config/InitialiseSettings.php: Enabling PageAssessments on frwiki (T153393) (duration: 01m 26s)
  • 20:36 urandom: increase change-prop sample rate in dev env to 40% (from 20) -- T186751
  • 20:20 awight@tin: Finished deploy [ores/deploy@b6deb5d]: Transitional virtualenv for ORES (take 2), T181071 (duration: 18m 34s)
  • 20:02 thcipriani@tin: Synchronized php: group1 to 1.31.0-wmf.29 (duration: 01m 16s)
  • 20:02 awight@tin: Started deploy [ores/deploy@b6deb5d]: Transitional virtualenv for ORES (take 2), T181071
  • 20:00 thcipriani@tin: rebuilt and synchronized wikiversions files: group1 to 1.31.0-wmf.29
  • 19:23 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.29
  • 19:11 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki back to 1.31.0-wmf.29
  • 19:09 thcipriani@tin: Synchronized php-1.31.0-wmf.29/includes/libs/rdbms/database: rdbms: fix transaction flushing in Database::close T191916 (duration: 01m 01s)
  • 18:47 urandom: restarting cassandra, dev environment (set -XX:+PerfDisableSharedMem) -- T186751
  • 18:11 mutante: deploy1001 is back on stretch once again - it has been removed from scap hosts though (T175288 T185275)
  • 17:40 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy page previews for anons on dewiki T191966 (duration: 00m 54s)
  • 17:30 sbisson@tin: Finished deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 everywhere (duration: 02m 27s)
  • 17:29 Krinkle: actually re-enabled puppet on graphite2001
  • 17:28 sbisson@tin: Started deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 everywhere
  • 17:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on wikis with <50 issues in high priority linter cats T190731 (duration: 00m 59s)
  • 16:53 sbisson@tin: Finished deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 to maps-test* (duration: 01m 16s)
  • 16:51 sbisson@tin: Started deploy [kartotherian/deploy@4cd5a19]: Deploying kartotherian v0.0.38 to maps-test*
  • 16:44 elukey: restart hadoop hdfs namenodes on analytics100[12] to pick up HDFS Trash settings - T189051
  • 16:35 robh: cp2018 returned to service
  • 16:33 foks: See T191887
  • 16:24 robh: cp2011 returned to service
  • 16:23 marostegui: Reload haproxy on dbproxy1011 to depool labsdb1009
  • 16:14 elukey: reboot notebook1001 for kernel updates
  • 16:11 urandom: restarting cassandra, dev environment (testing default GC settings) -- T186751
  • 15:58 Krinkle: Re-enabled puppet and coal on graphite2001
  • 15:43 robh: cp2008 repooled after memory swap
  • 15:20 Krinkle: disabling coal service on graphite2001 and disabling puppet – T191239
  • 15:19 jynus: fixing grant issue on db1114
  • 15:14 ema: restart pybal on lvs1003 for logstash-{json,syslog} UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425253/
  • 15:08 ema: restart pybal on lvs1006 for logstash-{json,syslog} UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425253/
  • 15:06 robh: shutting down cp2008, cp2011, and cp2018 for onsite work
  • 15:01 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 (duration: 01m 00s)
  • 15:01 marlier: Stopping coal on graphite2001.codfw.wmnet for data replay
  • 14:54 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 01m 00s)
  • 14:54 gehel: starting rolling restart of elasticsearch cirrus / eqiad for jvm upgrade
  • 14:39 moritzm: rolling restart of restbase in eqiad to pick up openssl update
  • 14:38 Krinkle: Turned regular coal back on (T191239)
  • 14:37 ppchelko@tin: Finished deploy [cpjobqueue/deploy@a090a3c]: Fix the low priority jobs topic names (duration: 00m 38s)
  • 14:36 ppchelko@tin: Started deploy [cpjobqueue/deploy@a090a3c]: Fix the low priority jobs topic names
  • 14:15 jynus: start reimage of es2013
  • 14:14 marostegui: Deploy schema change on db1099:3318 - T187089 T185128 T153182
  • 14:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 for alter table (duration: 01m 00s)
  • 14:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2013 (duration: 01m 00s)
  • 13:44 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3ba6580]: Enable second bulk of low-traffic jobs T190327 take 2 (duration: 00m 49s)
  • 13:44 ppchelko@tin: Started deploy [cpjobqueue/deploy@3ba6580]: Enable second bulk of low-traffic jobs T190327 take 2
  • 13:41 ppchelko@tin: Finished deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327 (duration: 08m 27s)
  • 13:37 moritzm: rolling restart of restbase in codfw to pick up openssl update
  • 13:33 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 2/2 - T190327 (duration: 01m 00s)
  • 13:32 ppchelko@tin: Started deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327
  • 13:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1072 (duration: 01m 07s)
  • 13:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@2b59313]: Enable second bulk of low-traffic jobs T190327
  • 13:27 marostegui: Drop prefstats table on s3 sanitarium master (db1072) this might cause lag on labs - T154490
  • 13:26 moritzm: installing java security updates on kafka/main cluster
  • 13:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 01m 00s)
  • 13:13 marostegui: Drop prefstats table on s1 codfw master - db2048 (this might generate lag on codfw) - T154490
  • 13:12 elukey: restart kafka brokers on kafka1012->23 for openjdk-7 upgrades
  • 13:09 marostegui: Drop prefstats table on s3 codfw master - db2043 (this might generate lag on codfw) - T154490
  • 13:01 vgutierrez: Reimage lvs4007 as stretch
  • 13:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2012 (duration: 01m 00s)
  • 12:39 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry #2) (duration: 01m 01s)
  • 12:32 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry) - T190327 (duration: 01m 00s)
  • 12:21 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 - T190327 (duration: 01m 01s)
  • 12:21 moritzm: enable production traffic for mw1265 (stretch app server) for a brief test period
  • 12:09 jynus: start reimage of es2012
  • 12:05 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2012 (duration: 01m 01s)
  • 11:47 jynus: start reimage of es2011
  • 11:09 ema: start pybal on lvs5001, test completed on lvs5003
  • 11:04 marostegui: Drop table prefstats in s7 - T154490
  • 10:59 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2011 (duration: 00m 59s)
  • 10:56 ema: stop pybal on lvs5001 to test requests through lvs5003, reimaged as stretch T191897
  • 10:50 moritzm: installing openssl updates
  • 10:43 marostegui: Drop table prefstats in s2 - T154490
  • 10:33 marostegui: Drop table prefstats in s4 - T154490
  • 10:31 marostegui: Drop table prefstats in s6 - T154490
  • 10:28 marostegui: Drop table prefstats in s5 - T154490
  • 10:04 jynus: start reimage of es2015
  • 10:00 moritzm: installing java security updates on kafka/jumbo cluster
  • 09:57 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2015 (duration: 01m 02s)
  • 09:52 moritzm: installing java security updates on kafka/analytics cluster
  • 09:29 arturo: doing some testing in labtestvirt2001 mounting instance's qcow2 files into /home/aborrero/mnt
  • 09:17 jynus: start reimage of es2014
  • 09:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 (duration: 01m 03s)
  • 09:03 ema: restart pybal on lvs1003 for UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425251/
  • 08:59 moritzm: reimaging mw1265 to stretch (T174431)
  • 08:18 jynus: rerunning eqiad misc backups
  • 08:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 as candidate master for x1 - T191275 (duration: 01m 03s)
  • 07:45 ema: cp2022: restart varnish-be due to child process crash https://phabricator.wikimedia.org/P6979 T191229
  • 07:27 marostegui: Stop MySQL on db2033 to copy its data away before reimaging - T191275
  • 07:08 vgutierrez: Reimaging lvs5003.eqsin as stretch (2nd attempt)
  • 06:49 elukey: restart Yarn Resource Manager daemons on analytics100[12] to pick up the new Prometheus configuration file
  • 06:20 marostegui: Stop MySQL on db2033 to clone db2069 - T191275
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 03s)
  • 06:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 01s)
  • 05:28 Krinkle: manual coal back-fill still running with the normal coal disabled via systemd. Will restore normal coal when I wake up.
  • 05:22 marostegui: Deploy schema change on codfw s8 master (db2045) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 05:17 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1011
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 41s)
  • 00:12 bstorm_: Updated views and indexes on labsdb1011

2018-04-10

  • 23:32 XioNoX: depooled eqsin due to router issue
  • 23:04 Krinkle: Seemingly from 22:53 - 23:03 global traffic dropped by 30-60%, presumably due to issues in eqiad where 10 Gbits dropped to 3 Gbits sharper than ever before.
  • 22:49 joal@tin: Finished deploy [analytics/refinery@33448cd]: Deploying fixes after today's deploy errors (duration: 04m 46s)
  • 22:45 joal@tin: Started deploy [analytics/refinery@33448cd]: Deploying fixes after today's deploy errors
  • 21:18 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35 (duration: 06m 27s)
  • 21:12 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35
  • 20:41 sbisson@tin: Finished deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot) (duration: 03m 45s)
  • 20:37 sbisson@tin: Started deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot)
  • 20:30 mutante: deploy1001 - reinstalled with stretch - re-adding to puppet (T175288)
  • 20:30 mutante: deploy1001 - reinstalled with jessie - re-adding to puppet (T175288)
  • 20:13 urandom: increasing change-prop sample rate to 20% (from 10%) in dev environment -- T186751
  • 20:06 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki back to 1.31.0-wmf.28
  • 20:02 sbisson@tin: Finished deploy [kartotherian/deploy@6e4d666]: Deploying kartotherian pre-i18n everywhere (duration: 04m 34s)
  • 19:58 sbisson@tin: Started deploy [kartotherian/deploy@6e4d666]: Deploying kartotherian pre-i18n everywhere
  • 19:57 sbisson@tin: Finished deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n everywhere (duration: 00m 48s)
  • 19:56 sbisson@tin: Started deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n everywhere
  • 19:48 sbisson@tin: Finished deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n to maps-test* (duration: 00m 27s)
  • 19:48 sbisson@tin: Started deploy [tilerator/deploy@3326c14]: Deploying tilerator pre-i18n to maps-test*
  • 19:16 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.29 and rebuild l10n cache (duration: 66m 28s)
  • 18:10 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.29 and rebuild l10n cache
  • 18:07 Krinkle: Stopping coal on graphite1001 to manually repopulate for T191239
  • 18:04 otto@tin: Finished deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 3 (duration: 04m 54s)
  • 17:59 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 3
  • 17:58 otto@tin: Finished deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2 (duration: 01m 50s)
  • 17:56 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2
  • 17:56 otto@tin: Started deploy [analytics/refinery@b8ea97f]: refinery 0.0.60 - take 2^
  • 17:49 joal@tin: Finished deploy [analytics/refinery@b8ea97f]: Analytics weekly deploy - Move to spark 2 (duration: 03m 55s)
  • 17:48 joal@tin: (no justification provided)
  • 17:47 joal@tin: (no justification provided)
  • 17:45 joal@tin: Started deploy [analytics/refinery@b8ea97f]: Analytics weekly deploy - Move to spark 2
  • 17:43 chasemp: add static route to neutron poc instance range for codfw 172.16.128.0/21
  • 17:22 papaul: shutting down cp2022 for main board replacement
  • 17:20 awight@tin: Finished deploy [ores/deploy@d35a1e6]: Test deploy virtualenv on ores1001, with logging and forced failure (duration: 02m 44s)
  • 17:17 awight@tin: Started deploy [ores/deploy@d35a1e6]: Test deploy virtualenv on ores1001, with logging and forced failure
  • 17:07 awight@tin: Finished deploy [ores/deploy@1e18fa6]: Test deploy virtualenv on ores1001, with logging (duration: 02m 28s)
  • 17:05 awight@tin: Started deploy [ores/deploy@1e18fa6]: Test deploy virtualenv on ores1001, with logging
  • 16:57 thcipriani: starting branch cut of 1.31.0-wmf.29
  • 16:45 andrew@tin: Synchronized wmf-config/CommonSettings.php: disable new accounts on labtestwikitech (duration: 01m 00s)
  • 16:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2045 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 16:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2045 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 16:21 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011
  • 16:11 marostegui: Stop MySQL on db2045 (s8 codfw master) to move it to another rack, this will break replication on codfw - T191193
  • 16:07 bstorm_: labsdb1010 now has the latest views available, including the comment table
  • 16:05 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010
  • 15:42 ottomata: disable puppet on analytics1003 and stop camus crons in preparation for spark 2 upgrade
  • 15:32 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010
  • 15:26 vgutierrez: Reimage lvs5003 as stretch
  • 15:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2040 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 15:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2040 IP as it is being moved to another rack - T191193 (duration: 00m 59s)
  • 15:08 volans: restarting Icinga on einsteinium, command file not working
  • 15:06 bd808: Wiki replicas: ran `sudo maintain-views --table page_assessments --database arwiki` on all 3 servers for T191455
  • 14:46 marostegui: Stop MySQL on db2040 for server move - this is s7 master, so replication will break in codfw T191193
  • 14:23 volans: restarted nsca server on einsteinium
  • 14:21 vgutierrez: re-enable puppet on primary LVS
  • 14:17 moritzm: installing python-crypto security updates on trusty
  • 13:55 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T188198 Enable TemplateStyles on ruwiki (duration: 01m 00s)
  • 13:51 vgutierrez: disable puppet on primary LVS to merge safely gerrit/425040 T177961
  • 13:47 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/: SWAT: Restore subtract method for backward compatibility (T191696) (duration: 01m 01s)
  • 13:41 moritzm: upgraded HHVM on mediawiki-deployment04/05/06 to a build with a patch for the MEMC_VAL_COMPRESSION_ZLIB flag in the memcached module (T184854)
  • 13:35 elukey: restart kafka on kafka-jumbo1001 for openjdk upgrades
  • 13:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Update wikis with consolidate editing feedback" (T168886) (duration: 00m 59s)
  • 13:29 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/: SWAT: Disable search for global filters (T191539) (duration: 01m 01s)
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update wikis with consolidate editing feedback (T168886) (duration: 01m 00s)
  • 13:19 ema: restart pybal on lvs1006 for config changes introduced by https://gerrit.wikimedia.org/r/#/c/425251/
  • 12:02 moritzm: upgrading naos and wasat to ICU57-enabled build of HHVM
  • 12:01 _joe_: uploading mcrouter 0.37.0 to stretch-wikimedia (T190979)
  • 11:59 _joe_: uploading mcrouter 0.37.0 to jessie-wikimedia (T190979)
  • 11:15 mobrovac@tin: Finished deploy [restbase/deploy@29df9db]: Use the MCS-provided content-type in the definition response - T191809 (duration: 24m 19s)
  • 11:07 moritzm: upgrading mwdebug servers in codfw to ICU57-enabled build of HHVM
  • 10:51 mobrovac@tin: Started deploy [restbase/deploy@29df9db]: Use the MCS-provided content-type in the definition response - T191809
  • 10:47 arturo: T188266 reimage labtestservices2002.wikimedia.org
  • 10:23 moritzm: upgrading job runners in codfw to ICU57-enabled build of HHVM
  • 09:29 moritzm: upgrading app servers in codfw to ICU57-enabled build of HHVM
  • 07:52 hoo: Updated operations/dumps/dcat (7ea4e75c..61154ca4) on snapshot1007
  • 07:37 moritzm: upgrading API servers in codfw to ICU57-enabled build of HHVM
  • 05:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2069 from config - T191275 (duration: 00m 58s)
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2069 from config - T191275 (duration: 00m 59s)
  • 05:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 after alter table (duration: 01m 11s)
  • 05:17 marostegui: Deploy alter table on s1 primary master (db1052) - T185128 T153182
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 39s)

2018-04-09

  • 21:11 XioNoX: cr1-eqsin 24h experiment on applying same local-pref to peers and transits - T186835
  • 20:48 arlolra: Updated Parsoid to edeeb60 (T191281, T187386, T185266)
  • 20:38 awight@tin: Finished deploy [ores/deploy@be69c1d]: Transitional virtualenv for ORES, T181071 (duration: 24m 14s)
  • 20:32 arlolra@tin: Finished deploy [parsoid/deploy@447fab2]: Updating Parsoid to edeeb60 (duration: 11m 03s)
  • 20:21 arlolra@tin: Started deploy [parsoid/deploy@447fab2]: Updating Parsoid to edeeb60
  • 20:14 awight@tin: Started deploy [ores/deploy@be69c1d]: Transitional virtualenv for ORES, T181071
  • 20:12 awight@tin: Finished deploy [ores/deploy@b61c338]: Transitional virtualenv for ORES, T181071 (duration: 00m 19s)
  • 20:12 awight@tin: Started deploy [ores/deploy@b61c338]: Transitional virtualenv for ORES, T181071
  • 20:01 herron: repooled rhodium (puppet master backend) https://gerrit.wikimedia.org/r/425078
  • 19:57 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2017.codfw.wmnet
  • 19:26 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Switch SET on frwiktionary to use wikitexteditor by default (T169741) (duration: 01m 00s)
  • 19:17 sbisson@tin: Finished deploy [kartotherian/deploy@a26712b]: Deploying kartotherian i18n to maps-test* (with updated source and style) (duration: 01m 46s)
  • 19:15 sbisson@tin: Started deploy [kartotherian/deploy@a26712b]: Deploying kartotherian i18n to maps-test* (with updated source and style)
  • 18:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 18:58 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2010.codfw.wmnet
  • 18:58 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Enable PageAssessments on arwiki (T185023) (duration: 01m 00s)
  • 18:50 papaul: shutting down cp2017 for memory replacement
  • 18:37 papaul: shutting down cp2010 for memory replacement
  • 18:21 papaul: shutting down cp2006 for memory replacement
  • 18:04 gehel@tin: Finished deploy [wdqs/wdqs@7116a56]: new GUI version (duration: 02m 11s)
  • 18:01 gehel@tin: Started deploy [wdqs/wdqs@7116a56]: new GUI version
  • 17:58 papaul: shutting down cp2022 for memory replacement
  • 16:53 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2017.codfw.wmnet
  • 16:53 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2010.codfw.wmnet
  • 16:52 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 15:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:28 dereckson@tin: Synchronized wmf-config/flaggedrevs.php: Always show latest revision even if not reviewed on hu.wikipedia (T121995) (duration: 00m 59s)
  • 14:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 14:11 marostegui: Deploy schema change on db1067 - T187089 T185128 T153182
  • 14:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 for alter table (duration: 00m 59s)
  • 14:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 after alter table, kernel and mariadb upgrade (duration: 00m 59s)
  • 13:52 marostegui@tin: Synchronized wmf-config/db-codfw.php: Pool db2092 in s1 T170662 (duration: 00m 59s)
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RelatedArticles for vector at hewiki (T191573) (duration: 00m 59s)
  • 13:43 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add adm.dp.gov.ua to wgCopyUploadDomains, change if.gov.ua to www.if.gov.ua (T191692) (duration: 00m 59s)
  • 13:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix broken line that includes a group into a group by mistake (T191719) (duration: 00m 59s)
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable <mapframe> on ku.wikipedia (T190944) (duration: 00m 57s)
  • 13:14 moritzm: upgrading Boost libraries on mwdebug with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 13:14 _joe_: started updateCollation.php maintenance script for the ICU 57 migration (T189295)
  • 13:03 marostegui: Stop MySQL on db1080 for mariadb and kernel upgrade
  • 13:03 _joe_: upgrading HHVM / libboost for ICU 57 upgrade (T189295)
  • 13:01 sbisson@tin: Finished deploy [tilerator/deploy@aef010b]: Deploying tilerator i18n to maps-test* (with updated source and style) (duration: 00m 33s)
  • 13:00 sbisson@tin: Started deploy [tilerator/deploy@aef010b]: Deploying tilerator i18n to maps-test* (with updated source and style)
  • 12:54 moritzm: upgrading Boost libraries on mwdebug with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 12:39 moritzm: upgrading Boost libraries on job runners with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 12:23 _joe_: preparing to run updateCollation from mw1338: stop videoscaler, disable puppet (T189295)
  • 12:05 _joe_: upgrading boost, hhvm on terbium for ICU 57 upgrade (T189295)
  • 12:01 elukey: upgrading Boost libraries on all mediawiki eqiad API servers with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 11:50 moritzm: upgrading Boost libraries on remaining app servers with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 11:42 moritzm: removed profile::beta::icu57 from deployment-prep Hiera config now that the component is part of the standard app server manifests
  • 11:04 moritzm: upgrading Boost libraries on API server canaries with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:41 moritzm: upgrading Boost libraries on mw1300 with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:31 moritzm: upgrading Boost libraries on app server canaries with an ICU 57-enabled HHVM build and restart HHVM (T189295)
  • 10:15 moritzm: upgrading tin/deploy1001 to an ICU 57-enabled HHVM build (T189295)
  • 10:13 elukey: completed upgrade of mw eqiad api appservers to ICU 57-enabled HHVM
  • 10:10 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 10:09 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 09:54 moritzm: upgrading mwdebug servers in eqiad to ICU 57-enabled HHVM build (T189295)
  • 09:33 _joe_: all eqiad jobrunners migrated to ICU 57 (T189295)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db2092 to the config - T170662 (duration: 00m 59s)
  • 09:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db2092 to the config - T170662 (duration: 00m 58s)
  • 08:45 elukey: upgrading eqiad api appservers to ICU 57-enabled HHVM build (T189295)
  • 08:37 marostegui: Deploy schema change on db1080 - T187089 T185128 T153182
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 for alter table (duration: 00m 59s)
  • 08:35 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2019 (duration: 00m 59s)
  • 08:32 moritzm: upgrading remaining app servers in eqiad to ICU 57-enabled HHVM build (T189295)
  • 08:32 _joe_: upgrading eqiad jobrunners to ICU 57-enabled HHVM build (T189295)
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 after alter table (duration: 00m 58s)
  • 07:56 marostegui: Remove /var/log/wikidata/rebuildTermSqlIndex.log* as per Amir1's request
  • 07:48 moritzm: upgrading mw1276-1279 (API canaries) to ICU 57-enabled HHVM build (T189295)
  • 07:42 _joe_: repooling mw1300 now with ICU 57-enabled HHVM build (T189295)
  • 07:38 _joe_: upgrading mw1300 to ICU 57-enabled HHVM build (T189295)
  • 07:32 moritzm: upgrading mw1262-1265 to ICU 57-enabled HHVM build (T189295)
  • 07:24 moritzm: repooling mw1261 after upgrade to ICU 57-enabled HHVM build (T189295)
  • 07:17 moritzm: upgrading mw1261 to ICU 57-enabled HHVM build (T189295)
  • 07:09 elukey: upgrade burrow to 1.0 on kafkamon[12]* - T188719
  • 06:58 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=zhwiktionary --check-old --before 20180223210426 --sleep 2 (T184485)
  • 06:43 marostegui: Reboot db2072 for kernel upgrade
  • 06:41 marostegui: Stop MySQL on db2072 to clone db2092 from it - T170662
  • 06:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2072 - T170662 (duration: 00m 59s)
  • 06:24 elukey: upload burrow 1.0.0 to stretch/jessie-wikimedia
  • 06:21 marostegui: Reboot db2092 for mariadb and kernel upgrade
  • 06:04 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2079 is now s8 candidate master (duration: 00m 59s)
  • 05:54 marostegui: Stop MySQL on db2079 to change its binlog format
  • 05:34 marostegui: Deploy schema change on db1106 with replication enabled (this will generate lag on labs replicas) - T187089 T185128 T153182
  • 05:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 for alter table (duration: 01m 00s)
  • 02:36 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 57s)

2018-04-07

  • 23:44 Dereckson: OATHAuth disabled for Wikimedia SUL global account Barek (T191708)
  • 07:28 legoktm: disabled and cleaned up spam from @Farjksn on Phabricator
  • 00:14 mutante: bromine - scheduled downtime, reboot for reinstall, upgrade to stretch, misc_static_services switched to codfw (T188163)

2018-04-06

  • 22:35 mutante: rsyncing bugzilla-static raw html from eqiad to codfw VM
  • 19:59 herron: moved rhodium:/var/lib/git/operations/puppet away and triggered puppet agent run to re-create
  • 19:43 ottomata: running puppet-merge on rhodium after clash between puppet-merge and new patch submitted
  • 19:23 demon@tin: Finished scap: Forcing full scap. Mostly no-op, consistency, paranoia, that sort of thing (duration: 11m 51s)
  • 19:13 bd808: wiki replicas: ran maintain-views --database mediawikiwiki --clean on labsdb10{09,10,11} for T191387
  • 19:11 demon@tin: Started scap: Forcing full scap. Mostly no-op, consistency, paranoia, that sort of thing
  • 19:02 demon@tin: scap aborted: Forcing full scap, removed clean plugin updates (duration: 11m 03s)
  • 19:00 herron: depooled rhodium (puppet master backend) again https://gerrit.wikimedia.org/r/#/c/424646/
  • 18:51 demon@tin: Started scap: Forcing full scap, removed clean plugin updates
  • 18:49 demon@tin: scap failed: average error rate on 5/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 18:47 demon@tin: Pruned MediaWiki: 1.31.0-wmf.26 [keeping static files] (duration: 01m 51s)
  • 14:37 herron: repooled rhodium (puppet master backend)
  • 14:08 herron: upgraded apache on fermium for security updates
  • 14:07 anomie: Running populateArchiveRevId.php for group2 for T191307
  • 14:03 herron: apache updated on puppet masters — re-enabling puppet agents
  • 13:55 herron: temporarily disabling puppet agents for apache security update on puppet masters
  • 13:14 moritzm: installing apache security updates on thorium (running several analytics web services)
  • 12:38 moritzm: installing apache security updates on the Kibana nodes of the logstash cluster
  • 11:50 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=fawiki --before 20180223210426 --sleep 2 (T184485)
  • 10:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1114 (duration: 01m 00s)
  • 09:45 moritzm: installing apache security updates on graphite hosts
  • 09:39 marostegui: Deploy test alter table on db2038 to test osc_host.py in core
  • 09:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 09:24 moritzm: installing apache security updates on planet1001/planet.wikimedia.org
  • 09:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:57 no_justification: gerrit: restarting services to pick up openjdk updates
  • 08:50 moritzm: installing apache security updates on prometheus hosts
  • 08:45 no_justification: installed apache updates to gerrit2001/cobalt
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:41 moritzm: installing apache security updates on mwlog*
  • 08:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 59s)
  • 08:28 moritzm: installing apache security updates on releases.wikimedia.org
  • 08:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 59s)
  • 08:07 elukey: upgrade prometheus-burrow-exporter on kafkamon1001/2001 - T188719
  • 08:07 elukey: upload prometheus-burrow-exporter 0.0.5 to jessie/stretch-wikimedia - T188719
  • 08:00 marostegui: Stop MySQL on db1114 for kernel and mariadb upgrade
  • 07:40 moritzm: removed mediawiki-deployment07 from deployment-prep (T191578)
  • 07:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2047 after changing binlog format, upgrade mariadb and kernel (duration: 00m 59s)
  • 06:33 marostegui: Stop MySQL on db2047 for binlog format change, upgrade kernel and mariadb
  • 06:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2047 to change binlog format, upgrade mariadb and kernel (duration: 00m 59s)
  • 06:05 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046 as candidate master (duration: 00m 59s)
  • 05:59 marostegui: Restart MySQL on db2046 to change its binlog format - T191275
  • 05:44 marostegui: Deploy schema change on db1114 - T187089 T185128 T153182
  • 05:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 for alter table (duration: 00m 53s)
  • 05:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 after alter table (duration: 00m 55s)

2018-04-05

  • 21:44 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2017.codfw.wmnet
  • 21:44 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet
  • 21:43 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2006.codfw.wmnet
  • 21:43 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2010.codfw.wmnet
  • 21:34 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet
  • 21:09 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet,service=varnish-be
  • 20:10 twentyafterfour@tin: Synchronized php-1.31.0-wmf.28/extensions/Echo/: Sync https://gerrit.wikimedia.org/r/#/c/424379/ refs T183967 (duration: 01m 05s)
  • 20:07 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/424379/ refs T191335
  • 19:59 herron: added rhodium puppet master backend in offline mode
  • 19:52 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.28 refs T183967
  • 19:51 robh@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2008.codfw.wmnet
  • 18:45 catrope@tin: Synchronized wmf-config/Wikibase-production.php: Disable writing wb_terms search fields on Wikidata (T189777) (duration: 01m 16s)
  • 18:25 catrope@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter/includes/Views/AbuseFilterViewList.php: Unbreak Special:AbuseFilter (T191512) (duration: 01m 17s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Disable logging autopatrol actions on commonswiki (T184485) (duration: 01m 17s)
  • 17:56 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --before 20180223210426 --from-id 156008475 (T184485)
  • 17:42 Amir1: finished the script
  • 17:33 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=commonswiki --before 20180223210426 (T184485)
  • 17:18 bsitzmann@tin: Finished deploy [mobileapps/deploy@eed7961]: Update mobileapps to dbc0687 (T187430) (duration: 09m 45s)
  • 17:09 bsitzmann@tin: Started deploy [mobileapps/deploy@eed7961]: Update mobileapps to dbc0687 (T187430)
  • 16:41 robh: cp2008 shutting down for firmware updates
  • 16:09 vgutierrez: updating librdkafka1 to 0.11.3 on cache text
  • 15:54 vgutierrez: updating librdkafka1 to 0.11.3 on cache upload
  • 15:44 vgutierrez: updating librdkafka1 to 0.11.3 on cache misc
  • 15:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2039 IP as it is being moved to a different rack - T191193 (duration: 01m 17s)
  • 15:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2039 IP as it is being moved to a different rack - T191193 (duration: 01m 17s)
  • 15:26 vgutierrez: uploaded pybal 1.15.3 for stretch on apt.w.o
  • 15:17 jynus: stopping mariadb on db2039 T191193
  • 14:59 moritzm: installing apache security updates
  • 14:54 marostegui: Deploy schema change on db1066 - T187089 T185128 T153182
  • 14:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 for alter table (duration: 01m 17s)
  • 14:43 moritzm: uploaded apache2 2.4.10-10+deb8u12+wmf1 to apt.wikimedia.org/jessie-wikimedia (rebase of our local patches against the latest DSA)
  • 14:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2053 is no longer a candidate master (duration: 01m 17s)
  • 14:03 andrew@tin: Finished deploy [horizon/deploy@cd1cda6]: Deploying potential fix for T191232 (duration: 03m 17s)
  • 14:00 andrew@tin: Started deploy [horizon/deploy@cd1cda6]: Deploying potential fix for T191232
  • 13:41 anomie: Running populateArchiveRevId.php on group 1 for T191307
  • 13:39 zeljkof: EU SWAT finished
  • 13:32 zfilipin@tin: Synchronized php-1.31.0-wmf.28/extensions/AbuseFilter: SWAT: Make $mode optional for checkAllFilters (T191468) (duration: 01m 20s)
  • 13:23 marostegui: Stop MySQL on db2053 for binlog format change
  • 13:09 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Stop logging autopatrol actions in wikidatawiki (T184485) (duration: 01m 16s)
  • 12:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1083 after alter table (duration: 01m 17s)
  • 12:52 Amir1: finished the script
  • 12:41 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=wikidatawiki --before 20180223210426 (T189596)
  • 12:30 moritzm: installing net-snmp security updates on jessie (stretch not affected)
  • 12:12 ariel@tin: Finished deploy [dumps/dumps@88ca17c]: fix monitor to import status module after refactor (duration: 00m 04s)
  • 12:12 ariel@tin: Started deploy [dumps/dumps@88ca17c]: fix monitor to import status module after refactor
  • 12:04 hoo: Manually back-filled hashes for the Wikidata JSON dumps in https://dumps.wikimedia.org/wikidatawiki/entities/20180402/wikidata-20180402-*sums.txt (T190457)
  • 11:58 vgutierrez: updating libssl1.1 to 1.1.0h on cache text cluster (and nginx restart)
  • 11:36 vgutierrez: updating libssl1.1 to 1.1.0h on cache upload cluster (and nginx restart)
  • 11:22 vgutierrez: updating libssl1.1 to 1.1.0h on cache misc cluster (and nginx restart)
  • 10:57 jynus: restart dbstore1001 for RAID re-setup and reimage
  • 10:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Specify that db1106 is sanitarium's master (duration: 01m 16s)
  • 10:33 marostegui: Deploy schema change on db1083 - T187089 T185128 T153182
  • 10:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083 for alter table (duration: 01m 17s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 after alter table (duration: 01m 16s)
  • 09:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1089 original weight (duration: 01m 17s)
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 17s)
  • 08:30 jynus: starting backup of es2019, it may create lag T153440
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 16s)
  • 08:23 moritzm: installing net-snmp security updates on jessie (stretch not affected)
  • 08:16 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2019 (duration: 01m 16s)
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 (duration: 01m 16s)
  • 07:52 moritzm: removed unused/defunct deployment-videoscaler01 from deployment-prep (T191293)
  • 07:51 moritzm: removed unused/defunct deployment-tmh01 from deployment-prep (T191293)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1089 after alter table, mariadb and kernel upgrade (duration: 01m 16s)
  • 07:44 moritzm: upgrading openjdk-7 on contint*
  • 07:36 marostegui: Stop MySQL on db1089 for kernel and mariadb upgrade
  • 07:33 marostegui: Deploy schema change on db1105:3311 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 for alter table (duration: 01m 16s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2053 as candidate master (duration: 01m 09s)
  • 07:05 marostegui: Restart MySQL on db2053 for binlog format change
  • 06:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 (duration: 01m 13s)
  • 06:43 marostegui: Stop MySQL on db2038 to change binlog format, upgrade mariadb and kernel
  • 06:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2038 (duration: 01m 17s)
  • 06:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2058 is now a candidate master for s4 - T191275 (duration: 01m 16s)
  • 05:58 marostegui: Restart MySQL on db2058 to change its binlog to STATEMENT - T191275
  • 05:52 marostegui: Deploy schema change on db1089 - T187089 T185128 T153182
  • 05:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 for alter table (duration: 01m 16s)
  • 05:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 after alter table (duration: 01m 18s)
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 07m 06s)

2018-04-04

  • 23:53 andrew@tin: Finished deploy [horizon/deploy@2c55bd5]: (no justification provided) (duration: 03m 10s)
  • 23:50 andrew@tin: Started deploy [horizon/deploy@2c55bd5]: (no justification provided)
  • 23:42 catrope@tin: Synchronized php-1.31.0-wmf.27/extensions/VisualEditor/lib/ve: Fix VE drag-and-drop bugs (T191103) (duration: 01m 17s)
  • 23:36 catrope@tin: Synchronized php-1.31.0-wmf.28/resources/src/mediawiki.rcfilters/: Fix missing bookmark icon (T191366) (duration: 01m 16s)
  • 23:12 catrope@tin: Synchronized wmf-config/CommonSettings.php: Set $wgVisualEditorSourceFeedbackTitle (no-op until later) (T157953) (duration: 01m 16s)
  • 23:09 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add Txikipedia namespace on euwiki (T191396) (duration: 01m 18s)
  • 22:54 akosiaris: increase the number of mathoid pods to 16 from 4
  • 21:53 bd808: Wiki replicas: ran `sudo maintain-views --table page_assessments --database trwiki` on all 3 servers for T191455
  • 20:27 arlolra: Updated Parsoid to d887aff (T177102, T189474)
  • 20:22 twentyafterfour@tin: Synchronized php-1.31.0-wmf.28/skins/MonoBook: sync https://gerrit.wikimedia.org/r/#/c/424041/ (duration: 01m 16s)
  • 20:22 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0460519]: Update mobileapps to 2d5ab5b (duration: 05m 58s)
  • 20:18 arlolra@tin: Finished deploy [parsoid/deploy@a8e759f]: Updating Parsoid to d887aff (duration: 11m 58s)
  • 20:16 mholloway-shell@tin: Started deploy [mobileapps/deploy@0460519]: Update mobileapps to 2d5ab5b
  • 20:15 mholloway-shell@tin: Started deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88
  • 20:06 arlolra@tin: Started deploy [parsoid/deploy@a8e759f]: Updating Parsoid to d887aff
  • 19:19 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.28 refs T183967 (duration: 01m 16s)
  • 19:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.28 refs T183967
  • food: re-enabled thank you mailer
  • 19:03 hasharAway: upgraded blubber 0.2.0-1 -> 0.3.0-1 on contint1001 and contint2001
  • 18:17 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0125bc4]: Fixed the new metrics names. Again (duration: 00m 37s)
  • 18:17 ppchelko@tin: Started deploy [cpjobqueue/deploy@0125bc4]: Fixed the new metrics names. Again
  • 18:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0185e74]: Fix the metric names and support multi-topic rules (duration: 00m 35s)
  • 18:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@0185e74]: Fix the metric names and support multi-topic rules
  • 17:54 madhuvishy: Reset ttl for dumps.wikimedia.org CNAME to 1H post switchover to labstore1007 T188646
  • 17:26 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:422414 Enable TemplateStyles on dewiki T190910 (duration: 01m 17s)
  • 17:19 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on all wikiquotes except frwikiquote T190726 (duration: 01m 17s)
  • 17:11 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on all wikimedia wikis T188881 (duration: 01m 18s)
  • 16:58 ppchelko@tin: Finished deploy [cpjobqueue/deploy@60a2292]: Revert: Support multi-topic rules (duration: 00m 21s)
  • 16:58 ppchelko@tin: Started deploy [cpjobqueue/deploy@60a2292]: Revert: Support multi-topic rules
  • 16:55 robh: dbstore1001 rebooting for bios firmware update
  • 16:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d4a84ae]: Support multi-topic rules T191238 (duration: 00m 42s)
  • 16:47 ppchelko@tin: Started deploy [cpjobqueue/deploy@d4a84ae]: Support multi-topic rules T191238
  • 16:26 madhuvishy: Move cert for dumps.wikimedia.org to labstore1007 (do_acme: true) T188646
  • 16:22 madhuvishy: Change CNAME for dumps.wikimedia.org to labstore1007 T188646
  • 15:44 jynus: starting backup from es2015 (will create lag)
  • 15:37 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 (duration: 01m 17s)
  • 15:20 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Clean up config for the rest of high-traffic jobs after the switch - T190327 (duration: 01m 16s)
  • 15:14 madhuvishy: Update ttl for dumps.wikimedia.org CNAME to 1M in prep for switchover to labstore1007 T188646
  • 15:07 mobrovac@tin: Started restart [restbase/deploy@f3a53b6]: Pick up the net.ipv4.tcp_tw_reuse flag change - T190213
  • 15:06 elukey: delete /srv/deployment/prometheus from restbase* as clean up step for T181728
  • 14:30 anomie: Running populateArchiveRevId.php on group0 wikis for T191307
  • 14:20 elukey: apply net.ipv4.tcp_tw_reuse=1 to restbase* via https://gerrit.wikimedia.org/r/#/c/421901 - T190213
  • 14:15 moritzm: updating deployment-prep to HHVM 3.18.5+wmf6
  • 14:11 godog: purge cron smart-data-dump from lvs100[1-6]
  • 14:09 marostegui: Deploy schema change on db1099:3311 - T187089 T185128 T153182
  • 14:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 for alter table (duration: 01m 16s)
  • 14:08 moritzm: uploaded HHVM 3.18.5+wmf6 to component/icu57 for jessie-wikimedia (updated build with the security fix for CVE-2018-6334)
  • 13:59 marostegui: Deploy schema change on dbstore1002:s1 - T187089 T185128 T153182
  • 13:56 godog: rollout https://gerrit.wikimedia.org/r/c/423852 across ms-fe machines - T183902
  • 13:32 zeljkof: EU SWAT finished
  • 13:29 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add namespace to euwiki" (T191396) (duration: 01m 14s)
  • 13:08 godog: upgrade smartmontools to -backports version after https://gerrit.wikimedia.org/r/c/423871/
  • 12:02 elukey: removing /srv/deployment/prometheus from restbase2001/1007 - T181728
  • 12:00 akosiaris: revert scb hosts to apertium-fra-cat_1.2.0~r78602-1+wmf2
  • 11:47 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2057 is now a candidate master for s3 - T191275 (duration: 01m 17s)
  • 11:13 akosiaris: upgrade apertium on all scb hosts. Rolling update in groups of 2 hosts with a 30-second delay
  • 11:06 marostegui: Stop MySQL on db2057 for binlog format change, mariadb and kernel upgrade
  • 11:02 akosiaris: upgrade apertium on scb1001
  • 09:46 marostegui: Deploy schema change on s1 codfw master db2048 (this will generate lag on codfw) - T187089 T185128 T153182
  • 09:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1077 (duration: 01m 16s)
  • 09:25 Amir1: end of the deleteAutoPatrolLogs.php script on mediawikiwiki (T184485)
  • 09:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2041 is now a candidate master for s2 - T191275 (duration: 01m 16s)
  • 09:16 elukey: executed systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka1020
  • 09:02 Amir1: start of mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --before 20180223210426 --sleep 2 (T184485)
  • 09:02 marostegui: Stop MySQL on db2041 for binlog format change and kernel upgrade
  • 09:01 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2041 (duration: 01m 17s)
  • 08:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1072 (duration: 01m 17s)
  • 08:19 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --check-old --before 20160423210426 (T184485)
  • 08:17 Amir1: start of ladsgroup@terbium:~$ mwscript deleteAutoPatrolLogs.php --wiki=mediawikiwiki --dry-run --check-old --before 20160423210426
  • 08:08 marostegui: Deploy schema change on s3 primary master (db1075) - T153182 T185128
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1072 (duration: 01m 17s)
  • 07:59 godog: depool ms-fe2005 to test rewrite.py - T183902
  • 07:53 marostegui: Drop flaggedrevs from s3 mediawikiwiki - T186865
  • 07:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: db2055 is now a candidate master - T191275 (duration: 01m 16s)
  • 07:37 moritzm: running some apache/stretch tests on mw2261
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083 - T188279 (duration: 01m 17s)
  • 07:30 ema: finish up cache@eqiad reboots for retpoline kernel updates T188092
  • 07:26 marostegui: Restart MySQL on db2055 to change its binlog to STATEMENT - T191275
  • 05:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db2083 - T188279 (duration: 01m 17s)
  • 05:48 marostegui: Deploy schema change on db1072 - s3 - with replication. This will generate lag on labs T187089 T185128 T153182
  • 05:43 marostegui: Drop click_tracking_events table from where it still exists - T115982
  • 05:21 marostegui: Stop mariadb for upgrade and kernel upgrade on db1072 - this will generate lag on s3 labs
  • 05:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1072 for alter table, kernel and mariadb upgrade (duration: 01m 17s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 05m 31s)
  • 01:02 eileen: update civicrm - civicrm revision changed from d6855cd281 to 7010f0f5d6, config revision is 3b900436c9

2018-04-03

  • 23:55 XioNoX: re-activating graceful-switchover on cr1-codfw - T189588
  • 23:16 dereckson@tin: Synchronized wmf-config/CommonSettings.php: Make a note about the loading order of GlobalPreferences and Echo (Gerrit:422642) (no-op) (duration: 01m 17s)
  • 23:10 dereckson@tin: Synchronized wmf-config/InitialiseSettings.php: Rollout VirtualPageViews (final stage) (T189906) (duration: 01m 19s)
  • 22:34 mutante: cobalt - puppet disabled temporarily to apply fix to "simplify directory structure" change .. on gerrit2001 first
  • 22:25 mutante: restarting Apache on phab1001 - T182832
  • 22:14 twentyafterfour: Finished MediaWiki Train for group0, 1.31.0-wmf.28 refs T183967
  • 22:12 twentyafterfour@tin: Pruned MediaWiki: 1.31.0-wmf.25 [keeping static files] (duration: 01m 55s)
  • 22:10 twentyafterfour@tin: Pruned MediaWiki: 1.31.0-wmf.24 [keeping static files] (duration: 04m 18s)
  • 21:30 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.28 refs T183967
  • 21:15 twentyafterfour@tin: Finished scap: testwikis wikis to 1.31.0-wmf.28 refs T183967 (duration: 46m 38s)
  • 21:13 urandom: (re)starting restbase-dev1004-{a,b} (ooms), and enabling alternately patched cassandra 3.11.2 build - T186751
  • 20:29 twentyafterfour@tin: Started scap: testwikis wikis to 1.31.0-wmf.28 refs T183967
  • 20:22 ejegg: disabled thank you mail sender
  • 19:46 urandom: restarting restbase-dev1004-{a,b} to enable patched cassandra 3.11.2 build - T186751
  • 19:07 twentyafterfour: Preparing to deploy 1.31.0-wmf.28 refs T183967
  • 18:25 urandom: upgrading restbase-dev1006-b to cassandra 3.11.2 - T186751
  • 18:23 urandom: upgrading restbase-dev1006-a to cassandra 3.11.2 - T186751
  • 18:20 urandom: upgrading restbase-dev1005-b to cassandra 3.11.2 - T186751
  • 18:18 urandom: upgrading restbase-dev1005-a to cassandra 3.11.2 - T186751
  • 18:15 urandom: upgrading restbase-dev1004-b to cassandra 3.11.2 - T186751
  • 18:13 urandom: upgrading restbase-dev1004-a to cassandra 3.11.2 - T186751
  • 18:05 mutante: rhodium - closing idle screen session from maintenance work on puppetmasters
  • 18:03 mutante: elnath - fixing and re-enabling Icinga alert about screens, none are running, spare hosts should not have these
  • 17:59 mutante: restarting ferm on bromine
  • 17:40 elukey: manually set net.ipv4.tcp_tw_reuse=1 on restbase1007 as test for T190213
  • 17:35 sbisson@tin: Finished deploy [tilerator/deploy@8e68cb8]: Deploying tilerator i18n to maps-test* (take 2) (duration: 00m 25s)
  • 17:35 sbisson@tin: Started deploy [tilerator/deploy@8e68cb8]: Deploying tilerator i18n to maps-test* (take 2)
  • 17:28 awight@tin: Finished deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071 (duration: 01m 27s)
  • 17:28 sbisson@tin: Finished deploy [tilerator/deploy@03add2d]: Deploying tilerator i18n to maps-test* (duration: 04m 09s)
  • 17:27 awight@tin: Started deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071
  • 17:25 awight@tin: Finished deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071 (duration: 00m 27s)
  • 17:25 awight@tin: Started deploy [ores/deploy@7701cee]: ORES versioned virtualenv, T181071
  • 17:24 sbisson@tin: Started deploy [tilerator/deploy@03add2d]: Deploying tilerator i18n to maps-test*
  • 17:24 moritzm: upgrading HHVM on labweb*
  • 17:18 jynus: reloading labsdb proxy configuration
  • 17:08 elukey: manually set net.ipv4.tcp_tw_reuse=1 on restbase2001 as test for T190213
  • 16:53 demon@tin: Finished deploy [gerrit/gerrit@aa1a1a0]: no-op, pushing empty motd.config file (duration: 00m 11s)
  • 16:53 demon@tin: Started deploy [gerrit/gerrit@aa1a1a0]: no-op, pushing empty motd.config file
  • 16:33 urandom: rebooting restbase-dev1006 - T186751
  • 16:10 urandom: rebooting restbase-dev1004 - T186751
  • 16:04 ariel@tin: Finished deploy [dumps/dumps@77dc467]: split up some large modules, prep work for prefetch changes (duration: 00m 04s)
  • 16:04 ariel@tin: Started deploy [dumps/dumps@77dc467]: split up some large modules, prep work for prefetch changes
  • 15:39 elukey: roll restart of zookeeper on conf100[123] to pick up prometheus monitoring
  • 15:09 godog: depool ms-fe2005 to test rewrite.py - T183902
  • 14:40 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch the remaining high-traffic jobs for all wikis, file 2/2 - T190327 (duration: 00m 59s)
  • 14:39 ppchelko@tin: Finished deploy [cpjobqueue/deploy@60a2292]: Switch all high traffic jobs to kafka T190327 (duration: 00m 44s)
  • 14:39 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the remaining high-traffic jobs for all wikis, file 1/2 - T190327 (duration: 00m 59s)
  • 14:38 ppchelko@tin: Started deploy [cpjobqueue/deploy@60a2292]: Switch all high traffic jobs to kafka T190327
  • 13:48 anomie@tin: Synchronized php-1.31.0-wmf.27/extensions/intersection/DynamicPageList.hooks.php: Backporting fix for T191116 (gerrit:423689) (duration: 00m 58s)
  • 13:47 anomie@tin: Synchronized php-1.31.0-wmf.27/includes/specials/SpecialWhatlinkshere.php: Backporting fix for T191116 (gerrit:423688) (duration: 00m 58s)
  • 13:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1077 after alter table (duration: 00m 58s)
  • 13:21 marostegui: Reimport s51541_sulwatcher.logging from master to slave - T191020
  • 13:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1077 after alter table (duration: 00m 58s)
  • 13:18 elukey: roll restart of zookeeper on conf200[123] to pick up prometheus monitoring settings
  • 12:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1077 after alter table (duration: 00m 59s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1077 after alter table (duration: 00m 58s)
  • 11:16 godog: deploy thumbor 1.16 in codfw
  • 11:06 moritzm: installing libdatetime-timezone-perl update from Debian SUA
  • 09:51 godog: deploy thumbor 1.16 in codfw and eqiad - T186528 T179200 T189647 T191028
  • 08:46 marostegui: Deploy schema change on db1077 - s3 - T187089 T185128 T153182
  • 08:41 moritzm: upgrading HHVM on video scalers
  • 08:40 volans: temporarily disabled puppet (and re-enabling it one-by-one) on all prod puppetmasters to deploy g/422907 - T190918
  • 08:36 marostegui: Stop MySQL on db1077 for mysql and kernel upgrade
  • 08:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for alter table (duration: 00m 59s)
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1078 (duration: 00m 58s)
  • 08:29 godog: codfw-prod: more weight to ms-be204[0-3] - T189633
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 08:01 elukey: restart of druid-(overlord|middlemanager) on druid1004[456] as precautionary measure after zk restart
  • 08:01 moritzm: uploaded HHVM 3.18.5-dfsg-1+wmf5+deb9u1 for stretch-security to apt.wikimedia.org
  • 08:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 (duration: 00m 58s)
  • 07:50 elukey: roll restart zookeeper on druid100[456] to enable prometheus monitoring
  • 07:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1078 with low weight (duration: 00m 58s)
  • 07:12 jynus: upgrade and restart of labsdb1010
  • 07:10 marostegui: Stop MySQL on db1078 for mariadb and kernel upgrade
  • 06:43 elukey: execute systemctl reset-failed kafka-mirror-main-eqiad_to_jumbo-eqiad.service on kafka102[23]
  • 06:18 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2035 rack comment (duration: 00m 58s)
  • 05:37 marostegui: Deploy schema change on db1078 - s3 - T187089 T185128 T153182
  • 05:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1078 for alter table (duration: 00m 59s)
  • 05:18 marostegui: Enable back gtid on db2035 - T191193
  • 02:44 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.27) (duration: 19m 03s)
  • 00:11 Amir1: Evening SWAT is done

2018-04-02

  • 23:56 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add several domains of Ukraine government to wgCopyUploadsDomains (T185399) (duration: 00m 59s)
  • 23:45 ladsgroup@tin: Synchronized tests/cirrusTest.php: Shift all search traffic to codfw, part II (T191236) (duration: 00m 58s)
  • 23:44 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Shift all search traffic to codfw (T191236) (duration: 00m 59s)
  • 23:29 Amir1: Persian Wikipedia logos have been purged using purgeList.php on terbium
  • 23:26 ladsgroup@tin: Synchronized static/images/project-logos: Update logo for the Persian Wikipedia (T191174) (duration: 00m 59s)
  • 22:41 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/CSSMin.php: sync https://gerrit.wikimedia.org/r/423574 (duration: 00m 58s)
  • 22:11 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.27
  • 21:59 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/CSSMin.php: (no justification provided) (duration: 01m 16s)
  • 21:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 14s)
  • 21:26 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 21:22 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/: Revert ceb7d61 refs T183966 T190960 (duration: 00m 59s)
  • 21:09 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.26
  • 20:59 twentyafterfour: MediaWiki Train: rolling back to 1.31.0-wmf.26 refs T183966, T190960
  • 20:38 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqlBase.php: sync 36c5235 refs T190960 (duration: 01m 16s)
  • 20:22 mholloway-shell@tin: Finished deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88 (duration: 05m 52s)
  • 20:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@940bd48]: Update mobileapps to 58a0a88
  • 19:56 herron: puppetdb postgres update complete — puppet agents re-enabled
  • 19:46 herron: temporarily disabling puppet agents for puppetdb postgres security update
  • 19:41 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqlBase.php: sync 779e7fd refs T190960 (duration: 01m 16s)
  • 19:17 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 15s)
  • 19:16 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/libs/rdbms/: sync I57dd8d refs T183966 T190960 (duration: 01m 19s)
  • 19:06 twentyafterfour: sync rdbms: avoid lag estimates in getLagFromPtHeartbeat ruined by snapshots Bug: T190960 Change-Id: I57dd8d
  • 19:04 twentyafterfour: Getting the train back on track: deploying 1.31.0-wmf.27 to Group0
  • 17:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the remaining high-traffic jobs to EventBus, test wikis only, file 2/2 - T190327 (duration: 01m 15s)
  • 17:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9e1b203]: Switch remaining high traffic jobs for test wikis. T190327 (duration: 00m 43s)
  • 17:47 ppchelko@tin: Started deploy [cpjobqueue/deploy@9e1b203]: Switch remaining high traffic jobs for test wikis. T190327
  • 17:47 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch the remaining high-traffic jobs to EventBus, test wikis only, file 1/2 - T190327 (duration: 01m 16s)
  • 17:36 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: Shift search traffic for enwiki to codfw (duration: 01m 17s)
  • 17:21 smalyshev@tin: Finished deploy [wdqs/wdqs@49f4eed]: GUI update (duration: 09m 49s)
  • 17:11 smalyshev@tin: Started deploy [wdqs/wdqs@49f4eed]: GUI update
  • 16:37 madhuvishy: Rolling out new symlinks to /public/dumps for labstore1006 dumps nfs mount T188643
  • 15:59 madhuvishy: Absenting /public/dumps mount from labstore1003 across the VPS fleet T188643
  • 15:56 ebernhardson: restart elasticsearch on elastic1024, been stuck at 100% cpu for 3+ hours
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Change db2035 IP - T191193 (duration: 01m 15s)
  • 15:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Change db2035 IP - T191193 (duration: 01m 15s)
  • 15:28 marostegui: Stop MySQL and power off db2035 (s2 codfw master - this will stop replication on s2 codfw slaves) for rack change - T191193
  • 15:06 madhuvishy: Reenabled puppet and rolled out mounting new dumps NFS shares from labstore1006|7 on VPS instances T188643
  • 14:40 cmjohnson1: disabling puppet on decom host db1020
  • 14:28 madhuvishy: Disabling puppet across VPS instances with dumps mounted (https://phabricator.wikimedia.org/P6921) T188643
  • 14:22 marostegui: Drop contest* tables from s3 - T186867
  • 14:12 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=recommendation-api,cluster=scb,name=scb1003.*
  • 14:12 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=recommendation-api,cluster=scb,name=scb1004.*
  • 14:10 akosiaris: lower weight for scb1001, scb1002 from 10 to 8 for all services. T191199. scb1003, scb1004 have a weight of 15 already
  • 14:09 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,cluster=scb,name=scb1002.*
  • 14:09 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,cluster=scb,name=scb1001.*
  • 13:54 ariel@tin: Finished deploy [dumps/dumps@0363d50]: add check that xml files don't have binary corruption (nulls) after the header (duration: 00m 04s)
  • 13:54 ariel@tin: Started deploy [dumps/dumps@0363d50]: add check that xml files don't have binary corruption (nulls) after the header
  • 13:48 twentyafterfour@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Sync initializesettings for T190445 (duration: 01m 16s)
  • 13:36 twentyafterfour@tin: Synchronized wmf-config/throttle.php: SWAT: Sync throttle rules for T191187 (duration: 01m 15s)
  • 13:30 twentyafterfour@tin: Synchronized wmf-config/throttle.php: SWAT: Sync throttle rules for T191168 (duration: 01m 16s)
  • 13:27 jynus: restarting pdfrender on scb1003 (Socket timeout)
  • 12:49 akosiaris: upgrade mediawiki servers for hhvm upgrade
  • 12:06 marostegui: Deploy schema change on dbstore1002 - s3 - T187089 T185128 T153182
  • 11:51 akosiaris: repool mediawiki canary servers after hhvm upgrade
  • 11:44 akosiaris: depool mediawiki canary servers for hhvm upgrade
  • 10:16 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 10:15 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 16s)
  • 09:13 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove references to virt1000 (duration: 01m 16s)
  • 09:12 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove references to virt1000 (duration: 01m 16s)
  • 08:50 marostegui: Deploy schema change on s3 codfw master db2043 (this will generate lag on codfw) - T187089 T185128 T153182
  • 08:21 jynus: stop mariadb at labsdb1009 and labsdb1010
  • 08:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Specify current m5 codfw master (duration: 01m 17s)
  • 08:11 jynus: depool labsdb1011 from web wikireplicas
  • 07:21 apergos: restarted pdfrender on scb1004 after poking around there a bit
  • 07:01 apergos: restarted pdfrender on scb1001,2, service paged and no jobs were being processed
  • 06:06 marostegui: Drop localisation table from the hosts where it still existed - T119811
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.26) (duration: 12m 53s)

2018-03-31

  • 21:15 mutante: bast1001 has been shut down and decommissioned as planned. If you have any issues with shell access, make sure you have replaced it with bast1002 or any other bast host
  • 11:26 urandom: removing corrupt commitlog segment, restbase1009-c
  • 11:25 urandom: removing corrupt commitlog segment, restbase1009-b
  • 11:19 urandom: starting restbase1009-c
  • 11:18 urandom: truncating hints, restbase1009-a
  • 11:14 urandom: restarting restbase1009-b
  • 11:13 urandom: stopping restbase1009-a (high hints storage)

2018-03-30

  • 14:16 akosiaris: T189076 upload apertium-fra-cat to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189076 upload apertium-cat to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189075 upload apertium-lex-tools to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189075 upload apertium-separable to apt.wikimedia.org/jessie-wikimedia/main
  • 12:47 akosiaris: T189076 upload apertium-fra to apt.wikimedia.org/jessie-wikimedia/main
  • 11:44 dcausse: running forceSearchIndex from terbium to cleanup elastic indices for (testwiki, mediawikiwiki, labswiki, labtestwiki, svwiki) (T189694)
  • 11:40 dcausse: elastic@codfw cluster restarts complete (T189239)
  • 10:55 dcausse: resuming elastic@codfw cluster restarts
  • 10:17 elukey: roll restart of zookeeper daemons on druid100[123] (Druid analytics cluster) to pick up the new prometheus jmx agent
  • 09:31 elukey: restart oozie/hive daemons on an1003 for openjdk-8 upgrades
  • 08:38 elukey: rolling restart of hadoop-hdfs-datanode on all the hadoop worker nodes after https://gerrit.wikimedia.org/r/423000
  • 07:39 elukey: rolling restart of yarn-hadoop-nodemanagers on all the hadoop worker nodes after https://gerrit.wikimedia.org/r/423000

2018-03-29

  • 23:47 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T189252: Enable perf oversampling for remaining countries in Asia (duration: 01m 16s)
  • 23:40 ebernhardson@tin: Synchronized php-1.31.0-wmf.27/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Start cirrus AB test (duration: 01m 16s)
  • 23:37 ebernhardson@tin: Synchronized php-1.31.0-wmf.26/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Start cirrus AB test (duration: 01m 16s)
  • 23:12 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Configure 5 buckets for cirrus AB test (duration: 01m 17s)
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@14d3e7d]: Updating Horizon with possible fix for T189706 (duration: 03m 16s)
  • 22:06 andrew@tin: Started deploy [horizon/deploy@14d3e7d]: Updating Horizon with possible fix for T189706
  • 20:07 robh: shut down cp2022 for hw testing
  • 18:49 maxsem@tin: Synchronized php-1.31.0-wmf.27/skins/MinervaNeue: https://gerrit.wikimedia.org/r/#/c/423012/ (duration: 01m 17s)
  • 18:27 maxsem@tin: Synchronized php-1.31.0-wmf.26/includes/: Shorten summary length to 500 (duration: 02m 06s)
  • 18:22 maxsem@tin: Synchronized php-1.31.0-wmf.27/includes/: Shorten summary length to 500 (duration: 02m 14s)
  • 17:55 dcausse: pausing restarts of elastic@codfw (6 nodes left)
  • 17:35 mobrovac@tin: Finished deploy [restbase/deploy@af592d6]: Add bawikibooks - T191033 (duration: 30m 35s)
  • 17:30 demon@tin: Synchronized docroot/wwwportal/w/search-redirect.php: removing symlink indirection (duration: 01m 16s)
  • 17:05 mobrovac@tin: Started deploy [restbase/deploy@af592d6]: Add bawikibooks - T191033
  • 14:54 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Cleanup: Use only EventBus for refreshLinks - T185052 (duration: 01m 18s)
  • 14:00 moritzm: restarting parsoid and related service on ruthenium to pick up openssl update
  • 13:52 dcausse: reverted and rebased tin for undeployed patch due to scap issues (https://gerrit.wikimedia.org/r/#/c/422906/ https://gerrit.wikimedia.org/r/#/c/422929/)
  • 13:34 dcausse: aborted scap sync-dir php-1.31.0-wmf.27/extensions/CirrusSearch/ (was taking too much time at: waiting on sync-masters, ok: 1, left: 1)
  • 12:54 moritzm: installing ICU security updates on trusty
  • 12:29 dcausse: recreating replicas for skwiki_content in elastic@codfw due to stalled shard recovery
  • 12:18 ariel@tin: Finished deploy [dumps/dumps@982cebd]: ability to configure production of recombined metacurrent page content file (duration: 00m 02s)
  • 12:18 ariel@tin: Started deploy [dumps/dumps@982cebd]: ability to configure production of recombined metacurrent page content file
  • 11:02 ariel@tin: Finished deploy [dumps/dumps@96ba844]: cleanup 'latest' links, rss files from old runs (duration: 00m 04s)
  • 11:02 ariel@tin: Started deploy [dumps/dumps@96ba844]: cleanup 'latest' links, rss files from old runs
  • 10:50 dcausse: restarting elastic@codfw for JVM and plugin upgrade (T189239)
  • 09:16 elukey: roll restart aqs on aqs100* for icu/openssl upgrades
  • 08:18 akosiaris: T189075 upload apertium_3.5.1-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 08:18 moritzm: installing OpenJDK security updates on elastic* hosts (along with current version of the search plugins package)
  • 08:07 elukey: roll restart of cassandra on aqs* for openjdk-8 upgrades
  • 07:20 moritzm: installing openssl security updates
  • 07:18 ema: reboot cache@eqiad for retpoline kernel updates: T188092
  • 04:35 twentyafterfour: ran scap pull on deploy1001
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-28

  • 23:50 eileen: update civicrm revision changed from 9478ca39f1 to d6855cd281 (further security module updates, engage import dedupe)
  • 23:38 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Configure next Cirrus AB test (duration: 01m 16s)
  • 23:18 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T184969: Enable PageAssessments on trwiki (duration: 01m 09s)
  • 23:13 MaxSem: created PageAssessments tables on trwiki
  • 22:17 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.26 (duration: 01m 18s)
  • 22:16 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.26
  • 22:13 twentyafterfour: deploy of 1.31.0-wmf.27 resulted in a lot of SlowTimer errors for SlowTimer [10000ms] at runtime/ext_mysql: slow query: SELECT MASTER_GTID_WAIT(...)
  • 22:12 eileen: civicrm revision changed from 3f6028b24f to 9478ca39f1 (drupal security update)
  • 22:09 twentyafterfour@tin: rebuilt and synchronized wikiversions files: sync https://gerrit.wikimedia.org/r/#/c/422563/ group1 wikis to 1.31.0-wmf.27 refs T183966 T190960
  • 22:08 twentyafterfour: rolling forward group1 to 1.31.0-wmf.27 refs T183966 T190960
  • 22:05 twentyafterfour@tin: Synchronized php-1.31.0-wmf.27/includes/: sync https://gerrit.wikimedia.org/r/#/c/422565/ (duration: 02m 15s)
  • 22:03 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/422565/ refs T190960 T183966
  • 21:53 mutante: deploy1001 - revoking old puppet certs and signing new ones
  • 21:42 twentyafterfour: getting the train back on track, group1 wikis to 1.31.0-wmf.27
  • 20:51 XenoRyet: updated civicrm from 85c89c7d0a to 3f6028b24f
  • 20:50 bsitzmann@tin: Finished deploy [mobileapps/deploy@6a0d877]: Update mobileapps to a5833a0 (duration: 05m 36s)
  • 20:44 bsitzmann@tin: Started deploy [mobileapps/deploy@6a0d877]: Update mobileapps to a5833a0
  • 20:12 mlitn@tin: Finished deploy [3d2png/deploy@c447488]: Updating 3d2png (duration: 02m 26s)
  • 20:09 mlitn@tin: Started deploy [3d2png/deploy@c447488]: Updating 3d2png
  • 19:54 mutante: deploy1001 - schedule downtime for reinstall with jessie, reinstalling (T175288)
  • 19:24 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.26 (duration: 01m 17s)
  • 19:22 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.26
  • 19:20 twentyafterfour: Rolling back to wmf.26 due to increase in fatals: "Replication wait failed: lost connection to MySQL server during query"
  • 19:12 milimetric@tin: Finished deploy [analytics/refinery@c22fd1e]: Fixing python import bug (duration: 02m 48s)
  • 19:09 milimetric@tin: Started deploy [analytics/refinery@c22fd1e]: Fixing python import bug
  • 19:09 milimetric@tin: Started deploy [analytics/refinery@c22fd1e]: (no justification provided)
  • 19:06 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.27 (duration: 01m 17s)
  • 19:05 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.27
  • 19:02 ebernhardson: restore elasticsearch eqiad disk high/low watermarks to 75/80% with all large reindexes complete
  • 18:52 urandom: upgrading restbase-dev1005-{a,b} to cassandra 3.11.2 -- T178905
  • 18:17 urandom: upgrading restbase-dev1004-b to cassandra 3.11.2 (canary) -- T178905
  • 18:12 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.27
  • 18:12 urandom: upgrading restbase-dev1004-a to cassandra 3.11.2 (canary) -- T178905
  • 18:03 twentyafterfour: deploying 1.31.0-wmf.27 to group0. group1 in an hour. See T183966 for blockers.
  • 17:38 joal@tin: Finished deploy [analytics/refinery@7135d44]: Regular weekly analytics deploy - Scheduled hadoop jobs updates (duration: 05m 21s)
  • 17:32 joal@tin: Started deploy [analytics/refinery@7135d44]: Regular weekly analytics deploy - Scheduled hadoop jobs updates
  • 16:37 akosiaris: T189075 upload lttoolbox_3.4.0~r84331-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 15:37 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable oversampling for IN, GU, MP in preparation for eqsin (T189252) (duration: 01m 18s)
  • 15:13 andrewbogott: restarting nodepool on labnodepool1001 (cleanup from T189115)
  • 15:08 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:07 andrewbogott: restarting nova-network on labnet1001 in case it's upset by the rabbit outage
  • 15:02 andrewbogott: rebooting labservices1001 and labcontrol1001 for T189115
  • 15:00 andrewbogott: stopping nova-fullstack on labnet1001 for T189115
  • 15:00 andrewbogott: stopping nodepool on labnodepool1001
  • 14:58 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Disable redis queue for cirrusSearch jobs for test wikis, file 2/2 - T189137 (duration: 01m 17s)
  • 14:56 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Disable redis queue for cirrusSearch jobs for test wikis, file 1/2 - T189137 (duration: 01m 17s)
  • 14:54 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c84880a]: Switch CirrusSearch jobs to kafka for test wikis (duration: 00m 44s)
  • 14:54 ppchelko@tin: Started deploy [cpjobqueue/deploy@c84880a]: Switch CirrusSearch jobs to kafka for test wikis
  • 13:51 elukey: reduced number of jobrunner runners on the videoscalers after the last burst of jobs that maxed out the cluster
  • 13:51 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on all Wikivoyages (T189838) (duration: 01m 17s)
  • 13:42 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata description override on enwiki (T184000) (duration: 01m 18s)
  • 13:36 catrope@tin: Synchronized php-1.31.0-wmf.27/extensions/Echo/modules/nojs/mw.echo.badge.less: Prevent FOUC when loading notification badges (duration: 01m 20s)
  • 13:35 jynus: upgrade mariadb client on sarin, neodymium, terbium and wasat
  • 13:18 catrope@tin: Synchronized dblists/flow.dblist: Enable Flow on euwiki (T190500) (duration: 01m 17s)
  • 13:07 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Translate extension on amwikimedia (T180879) (duration: 01m 22s)
  • 12:35 twentyafterfour@tin: Finished scap: test running full scap sync from tin (duration: 46m 05s)
  • 11:49 twentyafterfour@tin: Started scap: test running full scap sync from tin
  • 11:48 twentyafterfour@tin: Synchronized README: test deploy from tin.eqiad.wmnet (duration: 03m 35s)
  • 10:59 volans: performing a few minutes live test of reporting Puppet reports to puppetdb too on puppetmaster1001 - T190918
  • 10:27 godog: reload icinga on einsteinium after https://gerrit.wikimedia.org/r/c/413142
  • 10:05 jynus: upgrade and restart db2093
  • 09:25 godog: disable puppet on icinga servers before merging https://gerrit.wikimedia.org/r/c/413142/
  • 08:25 arturo: reboot labstore200[2,3,4] for T189115
  • 08:25 godog: add more weight to ms-be204[0-3] - T189633
  • 08:18 arturo: reboot labstore2001 for T189115
  • 08:17 arturo: reboot labstore1002 for T189115
  • 08:15 arturo: reboot labstore1001 for T189115
  • 07:49 moritzm: uploaded openssl 1.0.2o to apt.wikimedia.org/jessie-wikimedia
  • 06:51 moritzm: installing remaining ICU security updates
  • 02:28 l10nupdate@deploy1001: scap sync-l10n completed (1.31.0-wmf.26) (duration: 13m 33s)

2018-03-27

  • 23:18 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T189906: (duration: 00m 55s)
  • 23:08 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148: Update enwiki search ranking model (duration: 00m 54s)
  • 22:56 twentyafterfour@deploy1001: Finished scap: Deploy 1.31.0-wmf.27 to test wikis (duration: 41m 00s)
  • 22:28 mutante: DNS - switching deployment service name to deploy1001 (T175288)
  • 22:15 twentyafterfour@deploy1001: Started scap: Deploy 1.31.0-wmf.27 to test wikis
  • 22:14 demon@deploy1001: Synchronized wmf-config/abusefilter.php: beta-only sync (duration: 00m 53s)
  • 22:12 demon@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: beta-only sync (duration: 02m 32s)
  • 21:26 twentyafterfour@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.738JVwJRDN" ' returned non-zero exit status 127 (duration: 00m 43s)
  • 21:26 twentyafterfour@deploy1001: Started scap: Deploy 1.31.0-wmf.27 to test wikis
  • 21:25 mutante: deploy1001 - rm /var/lock/scap-global-lock to switch to active server, puppet code only adds lock file to inactive servers (T175288)
  • 21:22 twentyafterfour@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use tin.eqiad.wmnet" (duration: 00m 00s)
  • 21:22 mutante: deployment_server has been switched to deploy1001.eqiad.wmnet. tin is not the active server anymore as of right now
  • 20:55 twentyafterfour@deploy1001: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use tin.eqiad.wmnet" (duration: 00m 00s)
  • 20:47 twentyafterfour@tin: Finished scap: 2nd Sync to co-masters to initialize deploy1001.eqiad.wmnet (duration: 12m 50s)
  • 20:34 twentyafterfour@tin: Started scap: 2nd Sync to co-masters to initialize deploy1001.eqiad.wmnet
  • 20:32 twentyafterfour@tin: Finished scap: Sync to co-masters to initialize deploy1001.eqiad.wmnet (duration: 21m 30s)
  • 20:11 twentyafterfour@tin: Started scap: Sync to co-masters to initialize deploy1001.eqiad.wmnet
  • 20:09 twentyafterfour@tin: Synchronized README: (no justification provided) (duration: 00m 52s)
  • 19:41 mutante: deploy1001 - deleting /srv and letting puppet recreate it, so _not_ rsyncing manually from tin but just a clean version of what puppet pulls in (T175288)
  • 18:42 twentyafterfour: branching 1.31.0-wmf.27
  • 18:03 andrewbogott: rebooting labsdb1007 for T189115
  • 17:59 demon@tin: Finished deploy [gerrit/gerrit@4910e7c]: motd plugin (duration: 00m 11s)
  • 17:59 demon@tin: Started deploy [gerrit/gerrit@4910e7c]: motd plugin
  • 17:55 andrewbogott: rebooting labsdb1006 for T189115
  • 17:51 foks: disable 2FA from User:Céréales Killer
  • 16:51 madhuvishy: Running rsync catch up job for dumps from ms1001 to labstore1007
  • 16:43 moritzm: uploaded openssl 1.1.0h for jessie-wikimedia to apt.wikimedia.org
  • 16:18 godog: point eqiad puppet traffic to eqiad
  • 15:58 godog: point esams puppet agent traffic to eqiad
  • 15:35 hashar: Bumping operations-puppet-tests-docker job to docker-registry.wikimedia.org/releng/operations-puppet:0.3.1 | https://gerrit.wikimedia.org/r/#/c/422169/ | ping vgutierrez
  • 15:23 godog: reenable puppet fleetwide for CA failover - T189891
  • 15:10 godog: stop puppet fleetwide for CA failover - T189891
  • 14:45 andrewbogott: rebooting labpuppetmaster1001 for T189115
  • 14:36 andrewbogott: rebooting labpuppetmaster1002 for T189115
  • 14:12 ppchelko@tin: Finished deploy [restbase/deploy@e19bad9]: Deploy without feed check to verify that mysterious deploy timeouts still happen (duration: 10m 52s)
  • 14:04 zeljkof: EU SWAT finished
  • 14:01 ppchelko@tin: Started deploy [restbase/deploy@e19bad9]: Deploy without feed check to verify that mysterious deploy timeouts still happen
  • 13:54 ppchelko@tin: Started restart [restbase/deploy@e19bad9]: Restart to verify that mysterious deploy timeouts still happen
  • 13:37 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Change wording for AbuseFilter global block durations (T190602) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable $wgAbuseFilterProfile on itwiki (T190137) (duration: 00m 57s)
  • 13:30 godog: deactivate/clean iridium.eqiad.wmnet -- decom'd
  • 13:24 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AbuseFilter runtime profile on more Wikis (T175954) (duration: 00m 58s)
  • 11:36 moritzm: installing ICU security updates
  • 10:50 arturo: reboot labtestvirt2002 to test if it would boot or not
  • 09:44 elukey: reboot aqs1009 for kernel + cassandra upgrades
  • 09:28 elukey: reboot aqs1008 for kernel + cassandra upgrades
  • 09:25 vgutierrez: uploaded mtail-3.0.0~rc5-1 to apt.w.o for jessie-wikimedia
  • 09:09 elukey: reboot aqs1007 for kernel + cassandra upgrades
  • 08:36 kartik@tin: Finished deploy [cxserver/deploy@a6b029f]: Update cxserver to 9e8ebda (Fix etag parsing and T188403) (duration: 03m 09s)
  • 08:33 kartik@tin: Started deploy [cxserver/deploy@a6b029f]: Update cxserver to 9e8ebda (Fix etag parsing and T188403)
  • 08:33 elukey: reboot aqs1006 for kernel + openjdk-8 + cassandra upgrade
  • 08:29 godog: add more weight to ms-be204[0-3] - T189633
  • 08:15 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1005.eqiad.wmnet
  • 08:11 elukey: reboot aqs1005 for kernel + openjdk-8 + cassandra upgrade
  • 06:59 elukey: powercycle restbase2007 (no ssh, vsp not available via mgmt console)
  • 05:59 Krinkle: Manually purge https://en.wikipedia.org/static/images/project-logos/nds_nlwiki-2x.png T190051
  • 05:59 Krinkle: Manually purge https://en.wikipedia.org/static/images/project-logos/nds_nlwiki-1.5x.png T190051
  • 02:57 Krinkle: Fix retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/ve/*)
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-26

  • 23:41 niharika29@tin: Synchronized static/images/project-logos/: Correct high-density logos for the Dutch Low Saxon Wikipedia T190051 (duration: 00m 59s)
  • 22:38 mutante: syncing /srv from tin.eqiad to deploy1001.eqiad (T175288)
  • 22:09 demon@tin: Finished deploy [gerrit/gerrit@b14b43b]: wikimedia plugin (duration: 00m 10s)
  • 22:09 demon@tin: Started deploy [gerrit/gerrit@b14b43b]: wikimedia plugin
  • 21:43 urandom: rolling restart of restbase dev environment
  • 20:50 demon@tin: Pruned MediaWiki: 1.31.0-wmf.25 [keeping static files] (duration: 01m 26s)
  • 20:46 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e223f51]: Update mobileapps to 534f95d (duration: 05m 23s)
  • 20:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@e223f51]: Update mobileapps to 534f95d
  • 20:29 no_justification: gerrit: restarting services to pick up bugfix
  • 20:26 demon@tin: Finished deploy [gerrit/gerrit@f6c5350]: update to 2.14.7-9-g0f04397dbd (duration: 00m 10s)
  • 20:25 demon@tin: Started deploy [gerrit/gerrit@f6c5350]: update to 2.14.7-9-g0f04397dbd
  • 19:55 andrew@tin: Finished deploy [horizon/deploy@99153e4]: Rolling out fix for security groups, 421983 (duration: 03m 12s)
  • 19:52 andrew@tin: Started deploy [horizon/deploy@99153e4]: Rolling out fix for security groups, 421983
  • 19:44 ejegg: updated payments-wiki from 9e83e7f7a0 to 320a6c2600
  • 19:23 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 01m 22s)
  • 19:21 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:21 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 04m 16s)
  • 19:17 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:17 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 03m 29s)
  • 19:13 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:12 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 03m 49s)
  • 19:09 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 19:06 mobrovac@tin: Finished deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations (duration: 19m 28s)
  • 18:46 mobrovac@tin: Started deploy [restbase/deploy@908febb]: Add tag descriptions for citations and recommendations
  • 18:28 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mobile-only Mediawiki:MainPageCss styles for Hindi wiki T190101 (duration: 00m 58s)
  • 17:26 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2010.codfw.wmnet
  • 17:26 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp2006.codfw.wmnet
  • 16:38 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 (duration: 05m 03s)
  • 13:55 hashar: restarting CI Jenkins. Upgrades Mail plugin from 1.20 to 1.21 | T190393
  • 13:30 moritzm: restarting HHVM on app server canaries to pick up ICU security update (not rebooting as logged before)
  • 13:30 moritzm: rebooting app server canaries to pick up ICU security update
  • 13:27 zeljkof: EU SWAT finished
  • 13:26 zfilipin@tin: Synchronized php-1.31.0-wmf.26/extensions/MobileFrontend/: SWAT: Squash: Hygiene: Auto namespace ResourceLoader modules and Add $wgMFMobileMainPageCss config flag; Hygiene: Auto namespace ResourceLoader modules; Add $wgMFMobileMainPageCss config flag (T190101) (duration: 01m 01s)
  • 13:23 ottomata: temporarily stopping puppet on kafka102[023] to use --new.consumer mirrormaker consuming from end
  • 13:21 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Enable AbuseFilter profiler at zh.wikipedia (T190663) (duration: 01m 00s)
  • 13:13 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add tboverride to engineer at ruwiki (T190619) (duration: 01m 01s)
  • 12:47 godog: add ms-be204[0-3] with minimal weight - T189633
  • 12:40 arturo: reboot labservices1002 for T189115
  • 12:30 arturo: reboot labnet100[2,3,4]* for T189115
  • 12:30 arturo: reboot labbwr100[2,3,4] for T189115
  • 12:00 arturo: reboot labmon100[1,2] for T189115
  • 12:00 moritzm: restarting HHVM on mediawiki canaries to pick up ICU security update
  • 11:47 arturo: reboot labcontrol100[3,4] for T189115
  • 11:31 arturo: reboot labcontrol1002 for T189115
  • 11:16 akosiaris: depool scb hosts for mathoid service. T184919
  • 11:16 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: service=mathoid,cluster=scb,name=scb.*
  • 10:56 moritzm: installing ICU security updates for jessie/stretch
  • 10:39 arturo: reboot silver for T189115
  • 10:34 arturo: reboot californium for T189115
  • 10:26 moritzm: upgrading debdeploy across the fleet to 0.0.99.4
  • 10:23 moritzm: uploaded debdeploy 0.0.99.4 to apt.wikimedia (for trusty/jessie/stretch)
  • 08:17 moritzm: upgrading debdeploy across the fleet to latest release
  • 07:33 elukey: stop eventlogging zmq-forwarder on eventlog1001 as part of decom process - T189566
  • 05:39 _joe_: restarting pdfrenderer on scb1001,1003
  • 02:00 l10nupdate@tin: LocalisationUpdate failed: git pull of core failed

2018-03-24

  • 20:22 foks: rm 2fa from Awight@officewiki
  • 15:00 elukey: rm -rf /srv/mediawiki/core on stat100[456] and force puppet run (git pull returned fatal: protocol error: bad pack header)
  • 02:33 bblack: powercycle cp3048
  • 02:31 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3048.esams.wmnet
  • 01:27 Krinkle: Correct retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/VisualEditor/*)
  • 00:39 Krinkle: Correct retention rules for Whisper files on graphite2001 and graphite1001 per T179622 (/var/lib/carbon/whisper/mw/*)

2018-03-23

  • 21:35 ebernhardson: delete indices for deleted wikis (from deleted.dblist) in eqiad and codfw elasticsearch clusters: alswikiquote, alswiktionary, mowiki, mowiktionary, ukwikimedia
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 4) (duration: 06m 58s)
  • 19:17 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 4)
  • 19:11 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 3) (duration: 04m 19s)
  • 19:07 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 3)
  • 19:06 sbisson@tin: Finished deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 2) (duration: 06m 23s)
  • 18:59 sbisson@tin: Started deploy [kartotherian/deploy@f716cde]: Deploying i18n feature to maps-test* (take 2)
  • 18:28 sbisson@tin: Finished deploy [kartotherian/deploy@a66ff1d]: Deploying i18n feature to maps-test* (duration: 00m 29s)
  • 18:28 sbisson@tin: Started deploy [kartotherian/deploy@a66ff1d]: Deploying i18n feature to maps-test*
  • 15:43 moritzm: uploaded debdeploy 0.0.99.3 to apt.wikimedia.org (now based on Python 3 for the clients)
  • 15:08 ema: cache_codfw: begin reboots for retpoline kernel upgrades T188092
  • 15:02 bawolff@tin: Synchronized php-1.31.0-wmf.26/includes/api/ApiQueryUserContributions.php: T190507 (duration: 00m 59s)
  • 13:24 moritzm: installing postgres security updates on rhenium
  • 12:51 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver,service=apache2
  • 12:48 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=api_appserver,service=apache2
  • 11:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Increase db1072 weight (duration: 00m 59s)
  • 11:19 moritzm: installing libvorbis security updates on trusty (Debian already fixed)
  • 11:09 elukey: restarting jvm daemons on analytics100[12] (Hadoop Masters) for openjdk-8 upgrade
  • 10:59 jynus: deployed new replication filter for labsdb1004 on u2815__p.all_articles T190488
  • 10:49 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable reading wb_terms search fields on wikidata (T189777) (duration: 00m 59s)
  • 10:36 elukey: upload cassandra2.2.6-wmf3 to jessie/stretch-wikimedia -C component/cassandra22 - T189529
  • 10:22 moritzm: restarting apache on krypton to pick up curl security update
  • 10:00 moritzm: installing plexus-utils2 security updates
  • 09:49 moritzm: armed keyholder on deploy1001
  • 08:19 elukey: reboot eventlog1001 for kernel upgrades

2018-03-22

  • 23:40 Amir1: Evening SWAT is done
  • 23:40 Amir1: Just to note, if you are seeing any performance regression (especially database-wise) 421333 might be the reason
  • 23:39 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Disable reading wb_terms search fields on wikidata (T189777) (duration: 00m 58s)
  • 23:29 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 58s)
  • 23:27 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-2x.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 56s)
  • 23:26 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-1.5x.png: Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 58s)
  • 23:23 ladsgroup@tin: Synchronized static/images/project-logos/nds_nlwiki-1.5x.png: static/images/project-logos/nds_nlwiki-2x.png static/images/project-logos/nds_nlwiki.png Update logo for Dutch Low Saxon Wikipedia (T190051) (duration: 00m 59s)
  • 22:47 mutante: restarting Gerrit to apply config changes gerrit:406145 and gerrit:410474
  • 22:25 mutante: icinga - re-enabling notifications for a LOT of "systemd checks" that had all been OK for a long time but had not been re-enabled after some maintenance
  • 20:18 andrewbogott: reimaged labtestvirt2002
  • 19:52 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.26
  • 19:34 cmjohnson1: db1052 replacing disk slot 8
  • 18:52 XioNoX: done with the asw-a/b/c-eqiad switches uplink work
  • 18:43 Amir1: Morning SWAT is done
  • 18:41 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable VirtualPageViews on s6 (ja,ru,fr) wikis (T189906) (duration: 01m 16s)
  • 17:59 ppchelko@tin: Finished deploy [changeprop/deploy@4f9fbe4]: Purge page metadata and references on html change and page deletion. (duration: 01m 16s)
  • 17:57 ppchelko@tin: Started deploy [changeprop/deploy@4f9fbe4]: Purge page metadata and references on html change and page deletion.
  • 17:44 mutante: install1002 - restarted dhcp server to confirm there was no syntax error
  • 17:21 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, with increased timeout (duration: 03m 15s)
  • 17:18 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, with increased timeout
  • 17:14 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 3 (duration: 02m 54s)
  • 17:11 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 3
  • 17:10 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 2 (duration: 03m 00s)
  • 17:07 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints, take 2
  • 17:03 ppchelko@tin: Finished deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints (duration: 08m 39s)
  • 16:55 ppchelko@tin: Started deploy [restbase/deploy@93dadf7]: Release metadata and references endpoints
  • 16:42 moritzm: installing postgres security updates on netmon*
  • 16:28 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Redeploy GlobalPreferences to test wikis and mw.org" (T189806) (duration: 01m 14s)
  • 16:28 moritzm: restarting graphite on labmon1001 to pick up uwsgi security update
  • 16:04 XioNoX: starting the asw-a/b/c-eqiad switches uplink work
  • 15:43 sbisson@tin: Finished deploy [tilerator/deploy@e259530]: Weekly progress to production (duration: 00m 43s)
  • 15:42 sbisson@tin: Started deploy [tilerator/deploy@e259530]: Weekly progress to production
  • 15:37 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Weekly progress to production (duration: 02m 27s)
  • 15:35 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Weekly progress to production
  • 15:23 sbisson@tin: Finished deploy [tilerator/deploy@e259530]: Deploying weekly progress to maps-test* (duration: 00m 26s)
  • 15:23 ottomata: ran puppet-merge on puppetmaster2001, got ssh: connect to host puppetmaster1001.eqiad.wmnet port 22: Connection timed out, hope all is ok. T189891
  • 15:23 sbisson@tin: Started deploy [tilerator/deploy@e259530]: Deploying weekly progress to maps-test*
  • 15:17 moritzm: installing openssh updates from stretch point release
  • 15:14 cmjohnson1: db1054 replacing disk at slot 1
  • 15:10 cmjohnson1: replacing disk slot 11 db1061
  • 15:09 sbisson@tin: Finished deploy [kartotherian/deploy@8f3a903]: Deploying weekly progress to maps-test* (duration: 01m 59s)
  • 15:08 moritzm: installing java-atk-wrapper updates from stretch point release
  • 15:07 sbisson@tin: Started deploy [kartotherian/deploy@8f3a903]: Deploying weekly progress to maps-test*
  • 14:57 moritzm: installing cups update from stretch point release (we only install the client libs)
  • 14:24 jynus: killing ongoing truncate to investigate s3 issues
  • 14:16 elukey: rolling restart of the three hadoop hdfs journal nodes (an1028/35/52) for openjdk-8 upgrades
  • 14:00 godog: reimage puppetmaster1001 - T184562
  • 13:57 zeljkof: EU SWAT finished
  • 13:55 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Properly setup ProofreadPage namespaces for cywikisource (T181406) (duration: 01m 16s)
  • 13:38 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Make eswikibooks logo normal size (T190366) (duration: 01m 16s)
  • 13:29 mobrovac@tin: Finished deploy [zotero/translators@1c30955]: Update translators - T188893 (duration: 00m 08s)
  • 13:29 mobrovac@tin: Started deploy [zotero/translators@1c30955]: Update translators - T188893
  • 13:27 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change bewikibooks logo (T189218) (duration: 01m 15s)
  • 13:25 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change bewikibooks logo (T189218) (duration: 01m 16s)
  • 13:23 godog: reenabling puppet fleetwide to enable CA switch - T189891
  • 13:11 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Remove forceWriteTermsTableSearchFields from testwikidatawiki, part II (T189776) (duration: 01m 15s)
  • 13:09 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Remove forceWriteTermsTableSearchFields from testwikidatawiki, part I (T189776) (duration: 01m 16s)
  • 13:05 godog: stop rsync of ca/volatile on puppetmaster1001
  • 12:31 godog: chown puppet:puppet /var/lib/puppet/server/ssl/ca on puppetmaster2001
  • 12:20 godog: running puppet on puppetmaster[21]001 - T189891
  • 12:12 godog: stopping puppet fleetwide for ca migration - T189891
  • 11:20 elukey: rolling restart of the hadoop hdfs datanode daemons on all the analytics hadoop workers for openjdk-8 upgrade
  • 11:18 apergos: and a third time to try updating the puppet compiler facts, this time using puppetmaster2001
  • 11:09 arturo: T189722 reboot labtestvirt2002 to downgrade kernel
  • 11:02 moritzm: installing plexus-utils security updates
  • 11:01 arturo: T189722 reboot labtestvirt2001 to downgrade kernel
  • 10:53 apergos: due to miscommunication, second update of puppet compiler facts happening now. oh well
  • 10:42 elukey: update puppet compiler's facts
  • 10:28 ema: cp-upload_esams: carry on with reboots for retpoline kernel updates T188092
  • 10:10 ema: repool cp3010
  • 09:55 elukey: rolling restart of yarn nodemanagers on the analytics hadoop workers for openjdk-8 upgrade
  • 09:21 marostegui: Truncate updatelog on s3 - T174804
  • 09:19 marostegui: Truncate updatelog on s1 - T174804
  • 09:04 marostegui: Truncate updatelog on s7 - T174804
  • 08:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 (duration: 01m 15s)
  • 08:45 marostegui: Truncate updatelog on s2 - T174804
  • 08:30 marostegui: Truncate updatelog on s4,s5,s6,s8 - T174804
  • 08:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1006 after kernel, mariadb and socket location upgrade (duration: 01m 11s)
  • 08:21 jynus: upgrade and restart db1060
  • 08:17 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 01m 15s)
  • 08:06 marostegui: Restart pt-heartbeat on pc2006
  • 08:05 marostegui: Restart pt-heartbeat on pc2004 and pc2005
  • 08:04 marostegui: Restart pt-heartbeat on pc1004 and pc1005
  • 07:59 marostegui: Stop MySQL on pc1006 for kernel, mariadb and socket path upgrade
  • 07:58 elukey: depool cp3010 + powercycle (no ssh access, mgmt console frozen)
  • 07:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1006 for kernel, mariadb and socket location upgrade (duration: 01m 16s)
  • 06:25 marostegui: Remove db1001 from tendril - T190262
  • 06:25 marostegui: Stop MySQL on db1001 to get ready to decommission it - T190262
  • 06:16 marostegui: Reload dbproxy1006 to pick up the new standby host - T183469
  • 06:16 marostegui: Reload dbproxy1001 to pick up the new standby host - T183469
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 07m 46s)
  • 01:52 ebernhardson: increase cluster.routing.allocation.disk.watermark.low to 80% on eqiad elasticsearch due to shards not allocating during reindex
  • 01:10 ebernhardson: started in-place reindex of all wikis on both elasticsearch clusters
  • 00:02 andrewbogott: restarted nova-network on labnet1001 and nova-compute on labvirt1015 as part of debugging T190367
  • 00:00 Amir1: Evening SWAT is done
  • 00:00 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: guwiki: fix rollback -> rollbacker (group) (T190370) (duration: 01m 16s)

2018-03-21

  • 23:53 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Migrate $wgOresModels to the new config system (T189948) (duration: 01m 16s)
  • 23:41 ladsgroup@tin: Synchronized wmf-config/throttle.php: Add new throttle rule and add task for one in comment (duration: 01m 16s)
  • 23:36 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: guwiki: clean up $wg{Add,Remove}Groups configuration (duration: 01m 16s)
  • 23:21 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgAbuseFilterProfile & $wgAbuseFilterRuntimeProfile on eswikibooks, part II (T190264) (duration: 01m 15s)
  • 23:19 ladsgroup@tin: Synchronized wmf-config/abusefilter.php: Enable $wgAbuseFilterProfile & $wgAbuseFilterRuntimeProfile on eswikibooks, part I (T190264) (duration: 01m 15s)
  • 22:33 eileen: civicrm revision changed from 3291ad35c9 to 85c89c7d0a, config revision is 03511638ed
  • 22:32 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Revert global prefs (duration: 01m 15s)
  • 22:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/421185/ (duration: 01m 15s)
  • 21:55 andrew@tin: Synchronized wmf-config/CommonSettings.php: turning off wgReadOnly on labtestwikitech (duration: 01m 16s)
  • 20:34 mlitn@tin: Finished deploy [3d2png/deploy@812a68a]: Updating 3d2png (duration: 02m 57s)
  • 20:31 mlitn@tin: Started deploy [3d2png/deploy@812a68a]: Updating 3d2png
  • 20:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@675837f]: Update mobileapps to e6b50a0 (duration: 05m 33s)
  • 20:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@675837f]: Update mobileapps to e6b50a0
  • 19:12 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.26
  • 19:09 demon@tin: Synchronized php: symlink bump (duration: 01m 15s)
  • 19:05 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: rvv (duration: 01m 15s)
  • 19:03 anomie: Deleted some 12-year-old open proxy blocks to resolve T189840.
  • 18:36 demon@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only (duration: 01m 16s)
  • 18:34 demon@tin: Synchronized scap/plugins/prep.py: consistency (duration: 01m 17s)
  • 18:09 oblivian@puppetmaster2001: conftool action : set/pooled=yes; selector: name=mw22(59|[6-9][0-9])\.codfw\.wmnet
  • 18:08 _joe_: pooling all the new codfw appservers
  • 18:05 maxsem@tin: Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/420397/ (duration: 01m 15s)
  • 18:02 godog: delete obsolete metrics from prometheus following https://gerrit.wikimedia.org/r/c/421086
  • 17:46 maxsem@tin: Synchronized wmf-config/Wikibase.php: https://gerrit.wikimedia.org/r/#/c/420336/ (duration: 01m 15s)
  • 17:43 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/421046/ (duration: 01m 15s)
  • 17:35 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/420947/ (duration: 01m 15s)
  • 17:30 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/420910/ (duration: 01m 16s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/419528/ (duration: 01m 15s)
  • 17:22 volans@tin: Finished deploy [puppetboard/deploy@d6514d6]: Adjust wsgi config - T184563 (duration: 00m 06s)
  • 17:22 volans@tin: Started deploy [puppetboard/deploy@d6514d6]: Adjust wsgi config - T184563
  • 17:17 maxsem@tin: Synchronized dblists/flow.dblist: https://gerrit.wikimedia.org/r/#/c/420799/ (duration: 01m 12s)
  • 17:11 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/419611/ (duration: 01m 15s)
  • 17:07 ppchelko@tin: Finished deploy [cpjobqueue/deploy@545cb61]: Increase refreshLinks concurrency to 20 per partition (duration: 00m 37s)
  • 17:06 ppchelko@tin: Started deploy [cpjobqueue/deploy@545cb61]: Increase refreshLinks concurrency to 20 per partition
  • 16:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 fully, post-silver cleanup (duration: 01m 14s)
  • 16:53 _joe_: running systemd-tmpfiles --create on the new appservers
  • 16:51 jynus@tin: Synchronized wmf-config/db-codfw.php: Post-silver cleanup (duration: 01m 03s)
  • 16:48 andrew@tin: Synchronized wmf-config/CommonSettings.php: one of many wikitech cleanups (duration: 01m 38s)
  • 16:46 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: one of many wikitech cleanups (duration: 03m 12s)
  • 16:42 andrew@tin: Synchronized wmf-config/wikitech.php: first of many wikitech cleanups (duration: 03m 16s)
  • 16:12 andrew@tin: Synchronized wmf-config/filebackend.php: labtestwikitech -> swift (duration: 01m 14s)
  • 16:10 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: labtestwikitech -> swift (duration: 01m 15s)
  • 16:07 oblivian@puppetmaster2001: conftool action : set/pooled=inactive; selector: name=mw22(59|[6-9][0-9])\.codfw\.wmnet
  • 15:53 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2 (duration: 00m 40s)
  • 15:53 ppchelko@tin: Started deploy [cpjobqueue/deploy@b291728]: Partition the refreshLinks topic by DB shard T189738 take 2
  • 15:51 ppchelko@tin: Finished deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738 (duration: 03m 03s)
  • 15:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@0dcdc82]: Partition the refreshLinks topic by DB shard T189738
  • 15:28 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 15:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1019 after socket location upgrade (duration: 01m 12s)
  • 15:24 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 with low load (duration: 01m 15s)
  • 15:11 volans@tin: Finished deploy [puppetboard/deploy@81cd93a]: Adjust wsgi config - T184563 (duration: 00m 06s)
  • 15:11 volans@tin: Started deploy [puppetboard/deploy@81cd93a]: Adjust wsgi config - T184563
  • 15:05 jynus: stop, upgrade and restart db1079
  • 15:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 (duration: 01m 15s)
  • 13:39 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 13:23 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 13:20 zeljkof: EU SWAT finished
  • 13:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: config: Enable testwiki NavTiming oversample for a bunch more countries (T190229) (duration: 01m 15s)
  • 13:12 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T189778) (duration: 01m 16s)
  • 11:33 moritzm: rolling restart of Kibana/Logstash to pick up OpenJDK security update
  • 11:32 ema: cache_misc@esams: upgrade varnish to 5.1.3-1wm7
  • 11:32 jynus@tin: Synchronized wmf-config/db-eqiad.php: Cleanup old hosts (duration: 01m 18s)
  • 11:29 jynus@tin: Synchronized wmf-config/db-codfw.php: Cleanup old hosts (duration: 01m 13s)
  • 11:17 ema: varnish 5.1.3-1wm7 uploaded to apt.w.o
  • 10:51 marostegui: Stop MySQL on db1016 to clone db1065 - T183469
  • 10:47 moritzm: rolling restart of elasticsearch on logstash to pick up OpenJDK security update
  • 10:40 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1065 from config - T183469 (duration: 01m 15s)
  • 10:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1065 from config - T183469 (duration: 01m 15s)
  • 10:37 moritzm: rolling restart of elasticsearch on relforge to pick up OpenJDK security update
  • 10:16 volans: re-enabling puppet on einsteinium (icinga host) see T177253#4067901
  • 09:57 moritzm: installing php5 security updates on trusty (jessie already fixed)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T183469 (duration: 01m 15s)
  • 09:47 moritzm: installing tiff security updates on trusty
  • 09:40 marostegui: Stop db1065 and db1106 in sync - this will generate lag on labs
  • 09:23 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3032.esams.wmnet,service=varnish-be
  • 09:11 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 09:11 marostegui: Stop mysql on db2078 for new socket config
  • 08:56 marostegui: Stop mysql on db2037 for new socket config
  • 08:46 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:46 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3030.esams.wmnet,service=varnish-be
  • 08:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 01m 14s)
  • 08:35 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3030.esams.wmnet,service=varnish-be
  • 08:35 ema@neodymium: conftool action : set/pooled=yes; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1005 after kernel, mariadb and socket location upgrade (duration: 01m 15s)
  • 08:20 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet,service=varnish-be
  • 08:19 ema@neodymium: conftool action : set/pooled=no; selector: name=cp3033,service=varnish-be
  • 08:10 hashar: contint1001: deleting some old docker images
  • 08:09 hashar: contint1001: docker image prune ; docker container prune # T178663
  • 08:09 hashar: contint1001: docker image prune ; docker container prune
  • 08:08 marostegui: Stop MySQL on pc1005 for kernel, mariadb and socket path upgrade
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1005 for kernel, mariadb and socket location upgrade (duration: 01m 15s)
  • 07:07 marostegui: Remove db1020 from tendril - T189773
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1020 from config - T189773 (duration: 01m 15s)
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1020 from config - T189773 (duration: 01m 13s)
  • 06:50 marostegui: Stop MySQL on db1020 - T189773
  • 06:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1019 after socket location upgrade (duration: 01m 14s)
  • 06:29 marostegui: Stop MySQL on es1019 to upgrade socket path
  • 06:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 - socket location upgrade (duration: 01m 21s)
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 06m 18s)
  • 01:51 herron: codfw puppetdb upgrade complete. eqiad puppetmaster remains depooled T177253
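
The conftool entries at 16:07 and 18:09 select the new codfw appservers with the pattern `name=mw22(59|[6-9][0-9])\.codfw\.wmnet`. A minimal sketch of which hostnames that pattern covers, assuming the selector is applied as a fully anchored regular expression (the candidate host list and the `matching_hosts` helper are hypothetical, for illustration only):

```python
import re

# Pattern copied from the conftool log entries. Treating it as a fully
# anchored regular expression is an assumption of this sketch.
SELECTOR = re.compile(r"mw22(59|[6-9][0-9])\.codfw\.wmnet")


def matching_hosts(candidates):
    """Return the candidate hostnames that the selector matches in full."""
    return [host for host in candidates if SELECTOR.fullmatch(host)]


# Hypothetical candidate list: mw2250 through mw2299 in codfw.
candidates = [f"mw{n}.codfw.wmnet" for n in range(2250, 2300)]
matched = matching_hosts(candidates)

# First match, last match, and number of matches.
print(matched[0], matched[-1], len(matched))
# prints: mw2259.codfw.wmnet mw2299.codfw.wmnet 41
```

Under that assumption the pattern covers mw2259 plus mw2260 through mw2299, which is a wider range than the mw2259-mw2290 named in the OS install entry of 2018-03-20.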

2018-03-20

  • 23:41 Krinkle: Mass no-op resizing of Whisper files on graphite2001 and graphite1001 for T179622 (webpagetest.* namespace)
  • 23:01 MaxSem: Cleaned centralauth.global_preferences after testing
  • 22:58 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: Revert GlobalPreferences (duration: 01m 17s)
  • 22:55 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 05s)
  • 22:55 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 22:54 maxsem@tin: Finished scap: Test deployment of GlobalPreferences (duration: 39m 31s)
  • 22:41 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 07s)
  • 22:41 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 22:14 maxsem@tin: Started scap: Test deployment of GlobalPreferences
  • 21:02 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 26s)
  • 21:02 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:55 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 00m 09s)
  • 20:55 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:44 volans@tin: Finished deploy [puppetboard/deploy@0975558]: Initial sync (duration: 02m 19s)
  • 20:42 volans@tin: Started deploy [puppetboard/deploy@0975558]: Initial sync
  • 20:28 papaul: OS install on mw2259-mw2290
  • 19:36 herron: temporarily disabling puppet agents for puppetdb upgrade
  • 19:28 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.26
  • 19:23 ejegg: updated payments-wiki from 30f5f3edfb to 9e83e7f7a0
  • 18:54 demon@tin: Finished scap: bootstrap wmf.26 (duration: 42m 16s)
  • 18:33 ema: varnish 5.1.3-1wm6 uploaded to apt.w.o
  • 18:30 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751) (duration: 02m 43s)
  • 18:27 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751)
  • 18:24 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751) (duration: 05m 58s)
  • 18:18 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): Update Dev environment to current production (T186751)
  • 18:12 demon@tin: Started scap: bootstrap wmf.26
  • 18:10 demon@tin: Synchronized wmf-config/CommonSettings.php: instantcommons for labstestwiki (duration: 01m 58s)
  • 17:30 mholloway-shell@tin: Finished deploy [mobileapps/deploy@fad1009]: Update mobileapps to 634a15f (duration: 05m 34s)
  • 17:29 elukey: test a depool/repool action for kafka1001 (eventbus/jobqueue) - part of an investigation to figure out where timeouts come from
  • 17:24 mholloway-shell@tin: Started deploy [mobileapps/deploy@fad1009]: Update mobileapps to 634a15f
  • 17:06 demon@tin: Pruned MediaWiki: 1.31.0-wmf.24 [keeping static files] (duration: 01m 23s)
  • 17:04 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 (duration: 02m 57s)
  • 16:38 jynus: running reset slave all on db1063 T189655
  • 16:16 akosiaris: restart bacula-dir T189655
  • 16:14 akosiaris: restart etherpad T189655
  • 16:13 jynus: db1063 in read-write (m1) again
  • 16:10 jynus: set m1 in read only
  • 16:09 jynus: heartbeat killed on m1-master
  • 16:02 herron: restarted apache2 on puppetmaster1001
  • 16:00 jynus: disable puppet on db1063, db1016
  • 15:57 jynus: changing replication topology of m1
  • 15:51 no_justification: gerrit: restarting services to pick up 2.14.6 -> 2.14.7 upgrade
  • 15:49 demon@tin: Finished deploy [gerrit/gerrit@09534cb]: gerrit 2.14.7 (duration: 00m 12s)
  • 15:49 demon@tin: Started deploy [gerrit/gerrit@09534cb]: gerrit 2.14.7
  • 15:20 marostegui: Drop empty (confirmed) table slots from s3 - T190153
  • 14:59 herron: codfw puppet masters upgraded to puppetdb4. placing puppet agents into icinga downtime and beginning puppet --noop runs (to send facts to new puppetdb) T177253
  • 14:58 marostegui: Drop empty (confirmed) table slots from s7 - T190153
  • 14:55 marostegui: Drop empty (confirmed) table slots from s6 - T190153
  • 14:53 twentyafterfour@tin: testing scap on tin
  • 14:53 marostegui: Drop empty (confirmed) table slots from s8 - T190153
  • 14:52 marostegui: Drop empty (confirmed) table slots from s5 - T190153
  • 14:52 marostegui: Drop empty (confirmed) table slots from s4 - T190153
  • 14:47 godog: upload scap 3.7.7-1 - T189306
  • 14:42 marostegui: Drop empty (confirmed) table slots from s2 - T190153
  • 14:40 marostegui: Drop empty (confirmed) table slots from s1 - T190153
  • 14:14 moritzm: rolling restart of elasticsearch in deployment-prep for new Java update
  • 14:03 ema: cp3007: upgrade varnish to 5.1.3-1wm5
  • 14:00 ema: upload varnish_5.1.3-1wm5 to apt.w.o
  • 13:59 ayounsi@tin: Finished deploy [netbox/deploy@7e29963]: Fixing netbox typo LDAP (duration: 00m 28s)
  • 13:59 ayounsi@tin: Started deploy [netbox/deploy@7e29963]: Fixing netbox typo LDAP
  • 13:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065, give main traffic to db1106 - T183469 (duration: 00m 58s)
  • 13:29 herron: depooling codfw puppet masters via dns T177253
  • 12:59 moritzm: restarting apache on bohrium/piwik to pick up curl security update
  • 12:53 jynus: applying schema change to wikishared.cx_translations T190133
  • 12:50 arturo: reboot labtestservices2003 for T189722
  • 12:33 arturo: reboot labtestservices2002 for T189722
  • 12:04 arturo: reboot labtestservices2001 for T189722
  • 11:28 godog: run compiler-update-facts
  • 11:07 arturo: reboot labtestnet2002 for T189722
  • 11:03 jynus: upgrade and reboot db1095 - this can create temp. lag on wikireplicas
  • 10:50 arturo: reboot again labtestnet2001 for T189722. Now with a proper grub menu
  • 10:44 jynus: upgrade and reboot db1102 - this can create temporary lag on wikireplicas
  • 10:44 arturo: reboot labtestnet2001 for T189722
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1012 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 09:28 jynus: repool labsdb1009 after upgrade
  • 09:11 moritzm: restarting apache on netmon* to pick up curl security updates
  • 09:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1012 after kernel, mariadb and socket location upgrade (duration: 00m 57s)
  • 09:01 hashar: restarting Jenkins for java update
  • 08:50 marostegui: Stop MySQL on es1012 for mariadb, kernel and socket location upgrade
  • 08:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1012 for kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool pc1004 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:41 jynus: upgrade and restart labsdb1009
  • 08:34 jynus: depool labsdb1009
  • 08:25 moritzm: installing curl security updates
  • 08:23 marostegui: Stop MySQL on pc1004 for mariadb, kernel and socket location upgrade
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool pc1004 for kernel, mariadb and socket location upgrade (duration: 00m 57s)
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1106 in s1 - T183469 (duration: 00m 58s)
  • 06:18 marostegui: Deploy schema change on s4 primary master db1068 - T187089 T185128 T153182
  • 04:18 krinkle@tin: Synchronized wmf-config/throttle-analyze.php: (no justification provided) (duration: 00m 58s)
  • 04:17 krinkle@tin: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 00m 58s)
  • 03:56 Krinkle: Deleting stale webpagetest.* metrics on graphite1001 and graphite2001 (any wsp file last modified 600+ days ago) – T179622
  • 02:32 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 06m 40s)
  • 00:00 reedy@tin: Synchronized wmf-config/CommonSettings.php: Allow protocol-relative URLs in TemplateStyles (duration: 00m 59s)
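
The 03:56 entry describes removing Whisper files whose last modification is 600 or more days old. The log does not record the command that was run. The following is only a sketch of how such files could be listed by modification time; the `stale_files` helper and its parameters are hypothetical, and it lists files without deleting anything:

```python
import os
import time


def stale_files(root, max_age_days=600, suffix=".wsp", now=None):
    """List files under root with the given suffix whose mtime is older
    than max_age_days. Nothing is deleted; this only reports paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

Reviewing the returned list before removing anything keeps the destructive step separate from the selection step.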

2018-03-19

  • 23:43 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata description override on testwiki (duration: 00m 58s)
  • 23:39 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Log ReadingLists warning (duration: 00m 58s)
  • 23:36 ayounsi@tin: Finished deploy [netbox/deploy@f7faa04]: Fixing netbox deploy issue (duration: 00m 38s)
  • 23:35 reedy@tin: Synchronized multiversion/MWRealm.php: T45956 (duration: 00m 57s)
  • 23:35 ayounsi@tin: Started deploy [netbox/deploy@f7faa04]: Fixing netbox deploy issue
  • 23:27 ayounsi@tin: Finished deploy [netbox/deploy@bed8da1]: Fixing netbox deploy issue (duration: 00m 37s)
  • 23:27 ayounsi@tin: Started deploy [netbox/deploy@bed8da1]: Fixing netbox deploy issue
  • 20:32 mutante: signing puppet certs for new host bast1002. initial puppet run, will replace bast1001 soon (T186623)
  • 20:19 bblack: discarding unused vcl on all cp frontends, 1-at-a-time
  • 20:14 bblack: discarding unused vcl on all cp backends, 1-at-a-time
  • 19:53 andrew@tin: Synchronized wmf-config/wikitech.php: fix for T189347 take 2 (duration: 00m 57s)
  • 19:42 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751) (duration: 10m 28s)
  • 19:38 andrew@tin: Synchronized wmf-config/wikitech.php: fix for T189347 (duration: 00m 57s)
  • 19:31 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751)
  • 19:25 mobrovac@tin: (no justification provided)
  • 19:24 herron: upgraded compiler03.puppet3-diffs.eqiad.wmflabs (depooled) to puppetdb4/postgres backend
  • 19:14 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751) (duration: 08m 30s)
  • 19:05 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): update dev environment to current production (T186751)
  • 19:01 mutante: DNS - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones on ns servers to recreate zone files to add new language "gor" to langs.tmpl (T189109)
  • 19:00 mutante: adding gor.wikipedia.org - new language Gorontalo https://www.ethnologue.com/language/gor | https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikipedia_Gorontalo
  • 18:44 smalyshev@tin: Finished deploy [wdqs/wdqs@d6bc746]: GUI update (duration: 02m 24s)
  • 18:43 eevans@tin: Finished deploy [restbase/deploy@8dbc93c] (dev-cluster): bring dev environment current w/ production (T186751) (duration: 10m 16s)
  • 18:42 smalyshev@tin: Started deploy [wdqs/wdqs@d6bc746]: GUI update
  • 18:33 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable $wgFlowReadOnly on commonswiki (T186463) (duration: 00m 57s)
  • 18:33 eevans@tin: Started deploy [restbase/deploy@8dbc93c] (dev-cluster): bring dev environment current w/ production (T186751)
  • 18:27 catrope@tin: Synchronized dblists/: Uninstall Flow from wikis where it was never used (T188812) (duration: 00m 57s)
  • 18:14 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable mapframe on knwiki (T189883) (duration: 00m 58s)
  • 18:08 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add enwiki and commons as import sources to mrwikisource (T188486) (duration: 00m 58s)
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1091 (duration: 00m 59s)
  • 15:23 elukey: reboot kafka1003 for kernel upgrades (jobqueues/eventbus)
  • 15:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1091 (duration: 01m 01s)
  • 15:05 hashar: upgrading java on contint1001 / contint2001
  • 14:42 akosiaris: T184919 pool all kubernetes for service mathoid.
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1003.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1002.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2004.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2003.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:40 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2002.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:34 elukey: reboot kafka1002 (eventbus/jobqueue) for kernel upgrades
  • 14:28 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes2001.codfw.wmnet (tags: ['dc=codfw', 'cluster=scb', 'service=mathoid'])
  • 14:28 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: kubernetes1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=scb', 'service=mathoid'])
  • 14:18 ema: cp3040: discard old VCL T189892
  • 14:09 moritzm: restarting apache on contint1001 to pick up curl security update
  • 13:48 anomie: Cleaning up orphaned image_comment_temp rows on all wikis for T189985
  • 13:44 anomie@tin: Synchronized php-1.31.0-wmf.25/includes/filerepo/file/LocalFile.php: Applying fix for T189985 (duration: 00m 58s)
  • 13:22 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: Revert "Restrict FlaggedRevs to only operated on NS_MAIN on arwiki" (T148603 T189224) (duration: 00m 58s)
  • 13:10 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollbacker user right at arwikiquote (T189732) (duration: 00m 57s)
  • 13:09 moritzm: reimage mw1294-1296 as video scalers
  • 13:02 arturo: labtestcontrol2001: set GRUB_TIMEOUT=30 in /etc/default/grub, the previous value (10) wasn't enough to display the menu via mgmt
  • 12:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1091 (duration: 00m 57s)
  • 12:40 arturo: T189722 reboot labtestcontrol2001
  • 12:37 moritzm: installing curl security updates
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1016 original weight after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 11:55 _joe_: stopping hhvm on terbium for a test.
  • 11:44 moritzm: reimage mw1293 as video scaler
  • 11:29 godog: point codfw puppet to puppetmaster2001
  • 11:27 hashar@tin: Synchronized docroot/wwwportal/portal: (no justification provided) (duration: 00m 57s)
  • 11:17 ema: cache_misc@esams: upgrade to varnish 5.1.3-1wm4
  • 11:14 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Switching portals submodule to portals-deploy (T180777) (duration: 00m 58s)
  • 11:13 jdrewniak@tin: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Switching portals submodule to portals-deploy (T180777) (duration: 00m 58s)
  • 11:06 moritzm: uploaded openjdk-8 8u162-b12-1~bpo8+1 for jessie-wikimedia to apt.wikimedia.org
  • 10:58 godog: point eqsin puppet to puppetmaster2001
  • 10:53 moritzm: restarting jenkins on releases1001 to pick up Java security update
  • 10:47 godog: point ulsfo puppet to puppetmaster2001
  • 10:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T183469 (duration: 00m 58s)
  • 10:25 marostegui: Remove db1009 from tendril - T189216
  • 10:14 ema: cp3008: upgrade to varnish 5.1.3-1wm4
  • 09:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1009 from config - T189216 (duration: 00m 57s)
  • 09:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1009 from config - T189216 (duration: 00m 58s)
  • 09:45 marostegui: Stop MySQL on db1009 - T189216
  • 09:37 elukey: restart hadoop daemons on analytics1070 for openjdk upgrades (canary)
  • 09:27 godog: reimage puppetmaster2001 with stretch - T184562
  • 09:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1016 after kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 09:10 godog: depool codfw puppetmaster - T184562
  • 09:08 marostegui: Stop MySQL on es1016 for kernel, mariadb and socket location upgrade
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1016 for kernel, mariadb and socket location upgrade (duration: 00m 58s)
  • 08:57 moritzm: installing openjdk-8 security updates
  • 08:41 elukey: reboot thorium for kernel security upgrades (hosts all analytics websites, they will go down temporarily)
  • 08:26 moritzm: installing libvorbis security updates
  • 08:22 elukey: revert previous state on aqs1004, the new pkg might need some more work - T189529
  • 08:19 marostegui: Reset slave on db1106 to get it ready for s1 - https://phabricator.wikimedia.org/T183469
  • 08:11 marostegui: Reboot db1106 for kernel upgrade
  • 07:58 elukey: manually installed cassandra-2.2.6-wmf3 on aqs1004 - T189529
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T183469 (duration: 00m 57s)
  • 07:47 elukey: drain cassandra instances and reboot aqs1004 for kernel upgrades
  • 07:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Move db1106 from s5 to s1 - T183469 (duration: 01m 00s)
  • 07:27 marostegui: Reload dbproxy1002 and dbproxy1007 to get the new config - T189773
  • 06:20 marostegui: Deploy schema change on db1091 - T187089 T185128 T153182
  • 06:13 marostegui: Stop MySQL on db1091 for kernel and mariadb upgrade
  • 06:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1091 for schema change, kernel upgrade and mariadb upgrade (duration: 00m 58s)
  • 02:39 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.25) (duration: 10m 54s)
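
Many entries above end with a `(duration: MMm SSs)` suffix. A small sketch of converting that suffix to seconds, for example to compare sync times across days; the `duration_seconds` helper is hypothetical and is not part of scap or any logging tool:

```python
import re

# Matches the "(duration: 10m 54s)" suffix seen on sync and deploy entries.
DURATION = re.compile(r"\(duration: (\d+)m (\d+)s\)")


def duration_seconds(entry):
    """Return the duration of a log entry in seconds, or None if the
    entry carries no duration suffix."""
    match = DURATION.search(entry)
    if match is None:
        return None
    minutes, seconds = int(match.group(1)), int(match.group(2))
    return minutes * 60 + seconds


entry = ("02:39 l10nupdate@tin: scap sync-l10n completed "
         "(1.31.0-wmf.25) (duration: 10m 54s)")
print(duration_seconds(entry))  # prints: 654
```

Entries without the suffix, such as manual notes, yield `None` rather than raising an error.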

2018-03-17

  • 18:41 elukey: executed apt-get clean on scb1004 to free some space (root partition disk space warning)
  • 03:09 krinkle@tin: Synchronized docroot/noc/db.php: noc: I410a56431a (duration: 00m 59s)
  • 00:13 mutante: running puppet on all cache::misc to rename director bromine to webserver_misc_static (T188163)

2018-03-16

  • 23:32 mutante: signing puppet cert for vega.codfw.wmnet, initial puppet run after fresh stretch install (T188163)
  • 18:43 mutante: creating new ganeti VM vega.codfw.wmnet to be equivalent of bromine, 1G RAM, 30G disk, 1vCPU (T189899)
  • 18:13 jynus: switching back wikireplica cloud dns to the original config
  • 17:32 jynus: reimage dbproxy1010
  • 16:29 jynus: updating wikireplica_dns 2/3
  • 16:22 moritzm: installing curl security updates
  • 16:09 marostegui: Stop MySQL on db1020 - T189773
  • 14:48 andrewbogott: reset contintcloud quotas as per https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Troubleshooting#incorrect_quota_violations
  • 14:48 jynus: reimage dbproxy1011
  • 14:27 andrewbogott: restarting nodepool on nodepool1001
  • 14:25 elukey: reboot druid1002 for kernel updates
  • 14:14 andrewbogott: restarting rabbitmq on labcontrol1001
  • 13:57 andrewbogott: stopping nodepool temporarily during changes to nova.conf
  • 13:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2050 (duration: 00m 58s)
  • 13:15 chasemp: disable puppet across cloud things for safe rollout
  • 12:52 moritzm: uploaded libsodium23/php-acpu/php-mailparse to thirdparty/php72 (deps/extensions needed by Phabricator)
  • 12:51 ema: text-esams: reboot for kernel upgrades T188092 and to mitigate https://grafana.wikimedia.org/dashboard/db/varnish-failed-fetches?panelId=7&fullscreen&orgId=1&from=1518746284946&to=1521204628041
  • 12:12 marostegui: Reboot dbproxy1005 for kernel upgrade
  • 12:02 marostegui: Run pt-table-checksum on m2
  • 12:00 marostegui: Run pt-table-checksum on m5
  • 11:11 hashar: zuul: reenqueue all coverage jobs lost when restarting Zuul
  • 10:53 hashar: Upgrading zuul to zuul_2.5.1-wmf4 to resolve a mutex deadlock T189859
  • 10:45 jynus: disable puppet and load balance between 3 wikireplicas on dbproxy1010
  • 10:19 jynus: upgrade and restart of dbproxy1009 (passive)
  • 10:01 elukey: restart eventlogging_sync on db1108 (eventlogging db slave) as a precaution after the change of m4-master.eqiad.wmnet's CNAME
  • 10:00 moritzm: reverting the HHVM/ICU 57 setup on mwdebug2001 which was used for the dry run tests
  • 09:57 elukey: restart eventlogging-consumer@mysql-eventbus on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:56 hashar: Zuul coverage pipeline is deadlocked on an unreleased mutex. Will need a new Zuul version.
  • 09:51 elukey: restart eventlogging-consumer@mysql-m4 on eventlog1002 to force the DNS resolution of m4-master (changed from dbproxy1009 -> dbproxy1004)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1015 after kernel, mariadb and socket upgrade (duration: 00m 57s)
  • 09:27 oblivian@tin: Finished deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2 (duration: 00m 29s)
  • 09:26 oblivian@tin: Started deploy [netbox/deploy@ccc342a]: Re-deploying with the newly built artifacts/2
  • 09:17 oblivian@tin: (no justification provided)
  • 09:17 oblivian@tin: Finished deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts (duration: 00m 47s)
  • 09:16 oblivian@tin: Started deploy [netbox/deploy@f3e0159]: Re-deploying with the newly built artifacts
  • 09:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T183469 (duration: 00m 57s)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1015 after kernel, mariadb and socket upgrade (duration: 00m 56s)
  • 08:49 jynus: upgrade and restart of dbproxy1004 (passive)
  • 08:41 marostegui: Stop MySQL on es1015 for maintenance
  • 08:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1015 for kernel, mariadb and socket upgrade (duration: 00m 58s)
  • 08:40 elukey: reboot druid1006 for kernel updates
  • 08:29 elukey: reboot druid1005 for kernel updates
  • 07:53 moritzm: reimage mc2036 after mainboard replacement (T185587)
  • 07:15 marostegui: Stop MySQL on es2017 (es3 codfw master) for maintenance
  • 07:06 marostegui: Stop MySQL on es2016 (es2 codfw master) for maintenance
  • 06:52 marostegui: Stop MySQL on db2048 (s1 codfw master) for maintenance
  • 06:41 marostegui: Stop MySQL on db2051 (s4 codfw master) for maintenance
  • 06:28 marostegui: Stop MySQL on db2045 (s8 codfw master) for maintenance
  • 06:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1084 (duration: 00m 58s)
  • 01:46 XioNoX: librenms IRC bot moved to -operations channel. Doc on how to turn it off is on https://wikitech.wikimedia.org/wiki/LibreNMS#IRC_Alerting
  • 01:00 reedy@tin: Synchronized php-1.31.0-wmf.25/includes/specials/pagers/NewFilesPager.php: Fix T189846 (duration: 00m 58s)

2018-03-15

  • 23:25 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: Fix display issues (duration: 00m 59s)
  • 23:20 ebernhardson@tin: Synchronized php-1.31.0-wmf.25/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT: T187148: Turn off Cirrus AB test (duration: 00m 58s)
  • 22:58 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/AbuseFilter/: add some missing globals (duration: 00m 58s)
  • 20:38 demon@tin: Synchronized robots.txt: minor tidying (duration: 00m 58s)
  • 20:05 chasemp: disable puppet for cloud things for a safe rollout
  • 19:50 XenoRyet: updated civicrm from 9e79d63426 to 3291ad35c9
  • 19:14 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.25
  • 18:51 niharika29@tin: Synchronized php-1.31.0-wmf.25/extensions/MobileApp/: https://gerrit.wikimedia.org/r/#/c/419785/; https://gerrit.wikimedia.org/r/#/c/419784/; https://gerrit.wikimedia.org/r/#/c/419776/ (duration: 01m 14s)
  • 18:25 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/417329/ (duration: 01m 15s)
  • 18:11 maxsem@tin: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 16s)
  • 18:09 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/419492/ (duration: 01m 15s)
  • 17:27 ppchelko@tin: Finished deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources (duration: 01m 23s)
  • 17:26 bsitzmann@tin: Finished deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327) (duration: 05m 38s)
  • 17:25 ppchelko@tin: Started deploy [changeprop/deploy@9f4f380]: Purge media endpoint and update sources
  • 17:20 bsitzmann@tin: Started deploy [mobileapps/deploy@97d9085]: Update mobileapps to c5e1522 (T184327)
  • 17:18 moritzm: installing dbus updates from stretch 9.4 point release
  • 16:43 ppchelko@tin: Finished deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints (duration: 15m 22s)
  • 16:28 ppchelko@tin: Started deploy [restbase/deploy@8dbc93c]: Release lint and media endpoints
  • 16:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2050 for data checks (duration: 01m 15s)
  • 15:58 volans: updated facts on both CI puppet-compilers
  • 15:56 moritzm: pruning obsolete packages from jessie-wikimedia/experimental
  • 15:56 marostegui: Stop MySQL on s5 codfw master (db2052) this will break replication on s5 codfw
  • 15:51 godog: repool puppetmaster1002
  • 15:47 moritzm: installing libvirt security updates
  • 15:20 elukey: reboot druid1003 for kernel updates
  • 15:13 marostegui: Stop MySQL on s6 codfw master (db2039) this will break replication on s6 codfw
  • 15:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after socket path location update (duration: 01m 15s)
  • 15:05 _joe_: restarted jobrunner, jobchron on the eqiad jobrunners
  • 14:30 elukey: reboot druid1004 for kernel updates
  • 13:51 elukey: reboot kafka1001 (eventbus/job-queues eqiad) for kernel updates
  • 13:49 zeljkof: EU SWAT finished
  • 13:48 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 14s)
  • 13:33 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout, again. Last time didn't pick the right partman config
  • 13:14 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change autoconfirmed settings and Enable flood group at zhwikiquote (T189289) (duration: 01m 15s)
  • 13:09 moritzm: restarting HHVM on canaries to pick up curl security update
  • 13:09 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule, clean expired rules (T189442) (duration: 01m 15s)
  • 12:54 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 12:36 moritzm: installing curl security updates on jessie/stretch
  • 12:26 arturo: T189682 reimage labtestmetal2001 with jessie and a new partition layout
  • 12:08 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1007 after kernel security update (duration: 01m 14s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after socket path location update (duration: 01m 14s)
  • 11:59 moritzm: rebooting rdb1007 for kernel security update
  • 11:56 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1007 for kernel security update (duration: 01m 14s)
  • 11:52 marostegui: Stop MySQL on es1013 for socket path upgrade
  • 11:51 moritzm: rebooted rdb1005 for kernel security update
  • 11:49 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1005 after kernel security update (duration: 01m 14s)
  • 11:48 godog: reimage puppetmaster1002 with stretch
  • 11:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for socket path location update (duration: 01m 14s)
  • 11:42 godog: depool puppetmaster1002 for stretch reimage
  • 11:29 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1005 for kernel security update (duration: 01m 10s)
  • 11:16 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1003 after kernel security update (duration: 01m 14s)
  • 11:04 moritzm: rebooting rdb1003 for kernel security update
  • 11:01 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1003 for kernel security update (duration: 01m 14s)
  • 10:48 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling rdb1001 after kernel security update (duration: 01m 14s)
  • 10:32 moritzm: rebooting rdb1001 for kernel security update
  • 10:24 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling rdb1001 for kernel security update (duration: 01m 14s)
  • 10:22 ema: apt.w.o: upload varnish=5.1.3-1wm4 to jessie-wikimedia/main (upstream "extrachance" fixes) T174932
  • 10:12 gehel@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=elastic1021.eqiad.wmnet
  • 09:56 ema: apt.w.o: move varnish=5.1.3-1wm3, varnish-modules=0.12.1-1+wmf1, libvmod-netmapper=1.6-1 from jessie-wikimedia/experimental to jessie-wikimedia/main T188545
  • 09:56 moritzm: installing curl security updates on Debian
  • 09:30 godog: repool puppetmaster2002
  • 09:16 jynus: reset slave all @db1051
  • 08:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore normal weight for es1017 (duration: 01m 14s)
  • 08:44 godog: roll-restart thumbor in eqiad/codfw to enable access to swift private container
  • 08:42 jynus: end of maintenance for m2
  • 08:31 jynus: setting m2 as read only
  • 08:29 gilles: setZoneAccess done
  • 08:28 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 08:18 jynus: disable puppet on db1051, db1020 for switchover preparation
  • 08:06 ayounsi@tin: Finished deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1 (duration: 01m 02s)
  • 08:05 ayounsi@tin: Started deploy [netbox/deploy@278aec4]: Upgrading Netbox to v2.3.1
  • 08:01 jynus: switching db2044 to be a direct replica of db1051
  • 07:49 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 01m 07s)
  • 07:48 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 05s)
  • 07:30 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:30 ayounsi@tin: Finished deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1 (duration: 00m 39s)
  • 07:29 ayounsi@tin: Started deploy [netbox/deploy@7310860]: Upgrading Netbox to v2.3.1
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1017 (duration: 01m 14s)
  • 07:21 moritzm: reimaging mc2036 after hardware replacement T185587
  • 07:07 marostegui: Stop mariadb on es1017 for kernel, mariadb and socket location upgrade
  • 07:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1017 (duration: 01m 14s)
  • 07:01 marostegui: Deploy schema change on db1084 - T187089 T185128 T153182
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 01m 15s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1064 (duration: 01m 15s)
  • 06:29 marostegui: Stop MySQL on db1064 for mariadb upgrade
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 10m 10s)
  • 00:25 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/client/includes/RecentChanges/ExternalChangeFactory.php: T189320 Use only local part of username when building the RC line (duration: 01m 18s)
  • 00:22 tgr@tin: Synchronized php-1.31.0-wmf.24/includes/user/ExternalUserNames.php: T189320 Add ExternalUserNames::getLocal() to get local part of username (duration: 01m 15s)
  • 00:20 ejegg: updated payments-wiki from 9068692c32 to 30f5f3edfb
  • 00:08 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/: VE fixes followup (duration: 01m 15s)
  • 00:03 tgr@tin: Synchronized php-1.31.0-wmf.25/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 15s)
  • 00:02 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw: VE fixes: T189267, T189381 (duration: 01m 16s)

2018-03-14

  • 23:45 XenoRyet: updated payments-wiki from 86715f6e9e to 9068692c32
  • 23:45 tgr@tin: Synchronized wmf-config/Wikibase.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 14s)
  • 23:43 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:41 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T184000 Enable Wikidata description override on beta cluster (duration: 01m 15s)
  • 23:21 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 20s)
  • 23:18 tgr@tin: Synchronized wmf-config/InitialiseSettings-labs.php: T181159 Update ORES threshold config to the new syntax (duration: 01m 15s)
  • 22:13 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/Thanks: T189752 (duration: 01m 16s)
  • 21:27 hoo: Ran scap pull on mwdebug1001 after testing https://gerrit.wikimedia.org/r/417180
  • 21:26 andrewbogott: rebuilding labtestweb2001 with Debian Stretch
  • 20:34 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.25
  • 20:32 demon@tin: Synchronized php: symlink bump to wmf.25 (duration: 01m 14s)
  • 20:27 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c (duration: 05m 37s)
  • 20:24 demon@tin: Finished scap: trying a php5/hhvm theory (duration: 06m 37s)
  • 20:21 mholloway-shell@tin: Started deploy [mobileapps/deploy@0f9625a]: Update mobileapps to 9f4a80c
  • 20:17 demon@tin: Started scap: trying a php5/hhvm theory
  • 20:16 demon@tin: Finished scap: scapping, pt. 2. prior one failed because i tested something (duration: 69m 43s)
  • 19:06 demon@tin: Started scap: scapping, pt. 2. prior one failed because i tested something
  • 19:06 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "demon"; reason is "rebuilding l10n" (duration: 00m 00s)
  • 18:20 jynus: running pt-table-checksum on all m2, some lag will happen on passive replicas
  • 18:16 jynus: running pt-table-checksum on all m1, some lag will happen on passive replicas
  • 17:56 demon@tin: Started scap: rebuilding l10n
  • 17:55 reedy@tin: Synchronized php-1.31.0-wmf.25/extensions/CentralNotice: updates! (duration: 01m 16s)
  • 17:54 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "reedy"; reason is "updates!" (duration: 00m 00s)
  • 17:54 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralNotice: updates! (duration: 01m 18s)
  • 17:26 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/416489 (duration: 01m 14s)
  • 17:18 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/419077 (duration: 01m 15s)
  • 16:58 hoo: Manually running extensions/Wikibase/repo/maintenance/dispatchChanges.php on terbium, so that dispatching can catch up
  • 16:56 jynus: deploying new firewall rules to dbproxy1001 and 7
  • 16:40 moritzm: installing cron updates from stretch 9.4 point release
  • 16:35 demon@tin: Synchronized .gitignore: ignore scap logs (duration: 01m 15s)
  • 16:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore db1074 original weight (duration: 01m 13s)
  • 16:12 godog: temporarily add back puppetmaster2002 as a low-weight backend
  • 15:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 15:47 andrew@tin: Synchronized multiversion/MWMultiVersion.php: wikitech cleanup (duration: 01m 14s)
  • 15:25 XioNoX: Re-enabling BGP on cr2-codfw Zayo transit - T189452
  • 15:12 XioNoX: Disabling BGP on cr2-codfw Zayo transit - T189452
  • 15:02 jynus: disabling puppet in preparation for reimage of dbproxy1002 and 6
  • 14:59 moritzm: installing virt-what updates from stretch point release
  • 14:58 paravoid: rebooting furud
  • 14:44 ottomata: beginning migration of eventlogging analytics from Kafka analytics to Kafka jumbo: T183297
  • 14:33 godog: depool puppetmaster2002 for reimage
  • 14:06 Reedy: created wbc_entity_usage on ruwikimedia T188456
  • 13:50 zeljkof: EU SWAT finished
  • 13:49 zfilipin@tin: Synchronized dblists/wikidataclient.dblist: SWAT: Revert "Add ruwikimedia to wikidataclient" (T188456) (duration: 01m 14s)
  • 13:42 zfilipin@tin: Synchronized docroot/noc/conf/: SWAT: Revert "Publish throttle-analyze at noc" (T187894) (duration: 01m 15s)
  • 13:21 ppchelko@tin: Finished deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull (duration: 00m 33s)
  • 13:21 ppchelko@tin: Started deploy [cpjobqueue/deploy@c879056]: Deduplicate based on the root job dt and sha1 combination. Forgot to pull
  • 13:21 zfilipin@tin: Synchronized docroot/noc/conf/throttle-analyze.php.txt: SWAT: Publish throttle-analyze at noc (T187894) (duration: 01m 13s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination (duration: 00m 38s)
  • 13:20 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Deduplicate based on the root job dt and sha1 combination
  • 13:12 zfilipin@tin: Synchronized dblists/commonsuploads.dblist: SWAT: Disable upload for non-admins on kowikiversity (T189021) (duration: 01m 14s)
  • 13:06 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Remove obsolete throttle rules, add one new (T189241) (duration: 01m 15s)
  • 12:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 14s)
  • 12:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1074 (duration: 01m 15s)
  • 12:22 kartik@tin: Finished deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c (duration: 03m 12s)
  • 12:19 kartik@tin: Started deploy [cxserver/deploy@c204d9c]: Update cxserver to c355d0c
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1074 (duration: 01m 14s)
  • 11:45 marostegui: Stop db1074 for kernel upgrade
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for data checks and kernel upgrade (duration: 01m 14s)
  • 11:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore original weight for es1018 after kernel and mariadb upgrade (duration: 01m 15s)
  • 11:02 moritzm: rebooting einsteinium / icinga.wikimedia.org for kernel security update
  • 10:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1018 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:37 marostegui: Stop mariadb on es1018 for kernel and mariadb upgrade + change socket location
  • 10:35 moritzm: rebooting hydrogen for kernel security update
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1018 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2006 after kernel and mariadb upgrade (duration: 01m 14s)
  • 10:22 jynus: dropping testotrs from m2
  • 10:16 jynus: archiving and dropping bugzilla_testing from m2
  • 10:10 marostegui: Stop mariadb on pc2006 for kernel and mariadb upgrade + change socket location
  • 10:09 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2006 for kernel and mariadb upgrade (duration: 01m 14s)
  • 10:07 jynus: archiving and dropping testblog from m2
  • 10:03 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2005 after kernel and mariadb upgrade (duration: 01m 15s)
  • 09:50 marostegui: Stop mariadb on pc2005 for kernel and mariadb upgrade + change socket location
  • 09:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2005 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:44 moritzm: installing samba security update (just the client side libraries)
  • 09:40 marostegui: Stop mysql on es2015 to upgrade socket path
  • 09:37 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool pc2004 after kernel and mariadb upgrade (duration: 01m 14s)
  • 09:34 marostegui: Stop mysql on es2014 to upgrade socket path
  • 09:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool pc2004 for kernel and mariadb upgrade (duration: 01m 14s)
  • 09:23 marostegui: Stop mariadb on pc2004 for kernel upgrade
  • 09:13 marostegui: Stop mysql on es2013 to upgrade socket path
  • 09:08 marostegui: Stop mysql on es2012 to upgrade socket path
  • 08:57 ema: cp3041: restart varnish-be
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool es1013 after kernel and mariadb upgrade (duration: 01m 15s)
  • 08:28 ema: cp3040: restart varnish-be
  • 08:21 hashar: Restarting the CI Jenkins
  • 07:45 marostegui: Reboot es2004 for kernel upgrade
  • 07:45 marostegui: Reboot es2003 for kernel upgrade
  • 07:34 marostegui: Reboot es2002 for kernel upgrade
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool es1013 after kernel and mariadb upgrade (duration: 01m 14s)
  • 07:03 marostegui: Stop mariadb on es1013 for mariadb and kernel upgrade
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1013 for kernel and mariadb upgrade (duration: 01m 14s)
  • 06:45 marostegui: Deploy schema change on db1064 with replication (this will generate lag on s4 on labs hosts) - T187089 T185128 T153182
  • 06:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064 for alter table (duration: 01m 14s)
  • 06:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 after alter table (duration: 01m 15s)
  • 03:13 mutante: bacula is working again - restored missing file set (https://gerrit.wikimedia.org/r/419341 )
  • 02:49 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 40s)
  • 02:44 Jamesofur: deleted 46 archived files
  • 02:18 mutante: helium - running bacula-dir with -f in foreground revealed: ERROR TERMINATION at parse_conf.c:485 - Config error: Could not find config Resource mysql-srv-backups - line 7, col 33 of file /etc/bacula/jobs.d/bohrium.eqiad.wmnet-mysql-predump-piwik-Weekly-Wed-production.conf
  • 02:17 mutante: helium - bacula director process failed (Bacula interrupted by signal 11: Segmentation violation), icinga alerted. attempted to restart it. then: bacula-dir - the configtest failed!
  • 00:01 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: crwiki logo (duration: 01m 15s)
  • 00:00 reedy@tin: Synchronized static/images/project-logos/crwiki.png: (no justification provided) (duration: 01m 14s)

2018-03-13

  • 23:46 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/MobileFrontend/: T188825 (duration: 01m 18s)
  • 23:43 mutante: tin: chmod -R g+w /srv/mediawiki-staging/.git/objects/* ; chmod -R g+w /srv/mediawiki-staging/php-1.31.0-wmf.24/.git/objects/*
  • 23:35 Reedy: that was Enable VirtualPageViews on Hungarian Wikipedia T184793
  • 23:35 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 01m 15s)
  • 23:26 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: moar logos (duration: 01m 15s)
  • 23:24 reedy@tin: Synchronized static/images/project-logos/: YOU GET A LOGO, YOU GET A LOGO. YOU ALL GET LOGOS (duration: 01m 16s)
  • 23:11 reedy@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHTML on 96 wikis T188010 (duration: 01m 16s)
  • 23:10 mutante: restbase-dev1006 - reinstalling, manually skipping " Volume group name already in use" (T185494)
  • 22:52 eileen: civicrm revision changed from c8458c4a2f to 9e79d63426, config revision is 08b7e6216e (Benevity comma fix)
  • 20:40 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.25
  • 20:09 demon@tin: Finished scap: bootstrap wmf.25 (duration: 67m 17s)
  • 19:02 demon@tin: Started scap: bootstrap wmf.25
  • 18:47 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:46 demon@tin: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "awight"; reason is "Beta: Fix ORES thresholds and enable JADE, T181159, T176333" (duration: 00m 00s)
  • 18:42 gehel: repool wdqs1004 & wdqs2001 now that data reload is completed T189548
  • 18:39 XenoRyet: updated civicrm from 8652db05f5 to c8458c4a2f
  • 18:37 moritzm: installing reportbug updates from stretch point release
  • 18:32 moritzm: installing w3m updates from stretch point release
  • 17:55 moritzm: installing ncurses updates from stretch point release
  • 17:53 moritzm: installing ncurses updates from stretch point release
  • 17:19 awight@tin: Started scap: Beta: Fix ORES thresholds and enable JADE, T181159, T176333
  • 17:06 godog: cleanup integration-slave-jessie-1001:/srv/pbuilder/build - T189587
  • 16:45 marostegui: Clean iptables rules on dbproxy1001 to leave it as dbproxy1006
  • 16:33 marostegui: Retroactive: cleared iptables rules on dbproxy1007
  • 16:32 jynus: restarting gerrit on cobalt, stalled
  • 16:26 jynus: restarting gerrit on cobalt, stalled
  • 16:18 jynus: update CNAME for m1-master and m2-master
  • 15:50 marostegui: Deploy schema change on db1097:3314 - T187089 T185128 T153182
  • 15:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 for alter table (duration: 00m 56s)
  • 15:39 jynus: upgrade and restart dbproxy1007
  • 15:33 vgutierrez: upgrading eqiad LVSs to pybal 1.15.2
  • 15:32 jynus: upgrade and restart dbproxy1001
  • 14:55 vgutierrez: upgrading codfw LVSs to pybal 1.15.2
  • 14:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1081 after alter table (duration: 00m 57s)
  • 14:51 jynus: stopping db2044 (this will make proxies complain about redundancy)
  • 14:42 moritzm: rebooting chromium for kernel security update
  • 14:11 chasemp: add chico to wmf-nda (verified nda things with moritz and all the goodness)
  • 13:29 jynus: stop db1001 for maintenance (proxies will temporarily complain about lack of redundancy)
  • 13:20 zeljkof: EU SWAT finished
  • 13:20 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: wmf-config: enable Singapore oversample as default on all wikis (T188652) (duration: 00m 57s)
  • 12:32 akosiaris: reboot ganeti VMs on row_A in eqiad for cache=none setting. T181121
  • 12:26 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 12:04 reedy@tin: Synchronized wmf-config/interwiki.php: T188537 (duration: 00m 57s)
  • 11:59 moritzm: rebooting DNS recursors in codfw for kernel security update
  • 11:43 _joe_: include our own etcd package (3.2.16) on stretch
  • 11:37 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: name=argon.eqiad.wmnet,service=kubemaster
  • 11:33 kartik@tin: Finished deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc (duration: 03m 30s)
  • 11:30 kartik@tin: Started deploy [cxserver/deploy@30ff3b1]: Update cxserver to bd2ccfc
  • 11:23 jynus: ran update-netboot-stretch.sh
  • 11:21 moritzm: rebooting DNS recursors in esams for kernel security update
  • 10:22 moritzm: rebooting DNS recursors in ulsfo and eqsin for kernel security update
  • 10:17 vgutierrez: upgrading esams LVSs to pybal 1.15.2
  • 10:08 jynus: stopping mysql on db1063 and db1051 to validate the depool before full reimage
  • 10:07 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1001 after kernel security update (duration: 00m 57s)
  • 10:00 gehel: shutting down blazegraph on wdqs2001 for data transfer to wdqs1004 - T189548
  • 09:48 vgutierrez: upgrading ulsfo LVSs to pybal 1.15.2
  • 09:37 moritzm: rebooting poolcounter1001 for kernel security update
  • 09:15 jmm@tin: Synchronized wmf-config/ProductionServices.php: Depooling poolcounter1001 for kernel security update (duration: 00m 56s)
  • 09:05 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 09:02 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1051 and db1063 (duration: 00m 56s)
  • 08:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 (duration: 00m 57s)
  • 06:58 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:56 marostegui: Deploy schema change on db1081 - T187089 T185128 T153182
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 for alter table (duration: 00m 56s)
  • 06:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 after alter table (duration: 01m 19s)
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 05m 30s)

2018-03-12

  • 22:52 eileen: update civicrm revision changed from a819d64d98 to 8652db05f5, config revision is 08b7e6216e - update civicrm.settings.php
  • 20:44 arlolra: Updated Parsoid to 16ced34 (T188670, T90902)
  • 20:37 arlolra@tin: Finished deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34 (duration: 10m 16s)
  • 20:36 andrewbogott: updated wikitech-static as detailed in https://wikitech.wikimedia.org/wiki/Wikitech-static#Manual_updates
  • 20:27 arlolra@tin: Started deploy [parsoid/deploy@174c87d]: Updating Parsoid to 16ced34
  • 20:26 andrewbogott: apt-get upgrade and reboot on wikitech-static
  • 20:25 andrewbogott: stopping apache2 on Silver in anticipation of it being decommissioned
  • 20:16 mholloway-shell@tin: Finished deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7 (duration: 05m 29s)
  • 20:11 mholloway-shell@tin: Started deploy [mobileapps/deploy@c764714]: Update mobileapps to 5c90db7
  • 19:53 MaxSem: disabled 2FA for User:Ctac (T189520)
  • 19:48 chasemp: labstore1003:~# service nfs-kernel-server restart
  • 19:44 chasemp: labstore1003:~# exportfs -ra
  • 18:53 Krinkle: Clean up left-over .wsp.bak files under frontend.navtiming* on graphite1001 (following T179622)
  • 18:44 mutante: added to DNS: romd.wikimedia.org (and romd.m) for Wikimedians of Romania and Moldova User Group
  • 18:43 mutante: added to DNS: hi.wikimedia.org (and hi.m) for Hindi Wikimedian User Group
  • 18:25 ppchelko@tin: Finished deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries (duration: 15m 25s)
  • 18:09 ppchelko@tin: Started deploy [restbase/deploy@754aa8c]: Enable ensure_content_type filter for summaries
  • 17:48 ottomata: removed kafka.protocol.version setting for varnishkafka webrequest instances; version should now be properly negotiated
  • 17:29 gehel@tin: Finished deploy [wdqs/wdqs@ce72538]: new wdqs updater (duration: 04m 47s)
  • 17:27 _joe_: poweroff mw2097-2134, T189111
  • 17:24 gehel@tin: Started deploy [wdqs/wdqs@ce72538]: new wdqs updater
  • 16:34 joal@tin: Finished deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug (duration: 08m 50s)
  • 16:25 joal@tin: Started deploy [analytics/refinery@1ef2e27]: Deploy patch over regular deploy bug
  • 15:56 mepps: updated payments-wiki from ce68e8e80b to 86715f6e9e
  • 15:51 gehel: restart blazegraph on wdqs2001 to validate new config - T175919
  • 15:43 vgutierrez: eqsin LVSs: upgrade pybal to 1.15.2
  • 15:39 ottomata: bouncing kafka main-eqiad -> jumbo-eqiad mirror maker instances
  • 15:37 ottomata: disabling puppet on kafka1020,1022,1023 to test partition.assignment.strategy change for mirror maker
  • 15:28 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 54s)
  • 15:26 demon@tin: Pruned MediaWiki: 1.31.0-wmf.23 [keeping static files] (duration: 01m 19s)
  • 15:24 vgutierrez: lvs1007,lvs1010 upgraded pybal to 1.15.2
  • 15:17 demon@tin: Pruned MediaWiki: 1.31.0-wmf.22 [keeping static files] (duration: 01m 22s)
  • 15:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 (duration: 02m 35s)
  • 15:12 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120 (duration: 00m 31s)
  • 15:11 ppchelko@tin: Started deploy [cpjobqueue/deploy@5686f16]: Decrease refreshLinks concurrency to 120
  • 15:08 joal: Provide correct log message for analytics/refinery scap deploy: Regular deploy of analytics-hadoop code
  • 15:07 joal@tin: Finished deploy [analytics/refinery@fd0a90f]: Regular a (duration: 04m 54s)
  • 15:07 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 (duration: 03m 58s)
  • 15:02 joal@tin: Started deploy [analytics/refinery@fd0a90f]: Regular a
  • 14:42 jynus: upgrade and restart es2001
  • 14:09 sbisson@tin: Finished deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test* (duration: 00m 34s)
  • 14:09 sbisson@tin: Started deploy [tilerator/deploy@4bcae95]: Deploying tilerator#update-deps for testing on maps-test*
  • 14:02 zeljkof: EU SWAT finished
  • 13:59 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 57s)
  • 13:31 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 08s)
  • 13:24 moritzm: synchronised PHP 7.2.3 to thirdparty/php72 for stretch-wikimedia
  • 13:17 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 03m 09s)
  • 12:44 godog: start a catalog compilation on elnath to check for puppetdb4 diffs - T177253
  • 11:26 jmm@tin: Synchronized wmf-config/ProductionServices.php: Repooling poolcounter1002 after kernel security update (duration: 03m 09s)
  • 11:14 moritzm: reboot poolcounter1002 for kernel security update
  • 11:10 jmm@tin: Synchronized wmf-config/ProductionServices.php: depooling poolcounter1002 for kernel security update (duration: 03m 09s)
  • 10:39 _joe_: running decommission_appserver on mw2097-2134 T189111
  • 10:23 XioNoX: labs->cloud vlan rename in eqiad - T187933
  • 09:56 elukey: restart kafka mirror maker (main eqiad -> jumbo) on kafka1020 (all consumers not assigned to any partition on kafka102*)
  • 09:53 moritzm: installing util-linux security updates
  • 09:31 _joe_: decommission mw2097-mw2134 from conftool T189111
  • 08:40 moritzm: rebooting iron for kernel security update
  • 08:32 ema: cp3033/cp3031: restart varnish-be
  • 08:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2015 after kernel upgrade (duration: 00m 58s)
  • 08:20 ema: cp3033/cp3031: set transaction_timeout to 60s
  • 08:14 marostegui: Stop MySQL on es2015 for kernel upgrade
  • 08:06 ema: cp3042: restart varnish-be
  • 08:03 ema: cp3042: set transaction_timeout to 30s
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2015 for kernel upgrade (duration: 00m 58s)
  • 07:38 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2014 after kernel upgrade (duration: 01m 01s)
  • 07:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 59s)
  • 07:26 marostegui: Stop MySQL on es2014 for kernel upgrade
  • 07:24 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2014 for kernel upgrade (duration: 00m 58s)
  • 07:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3316 as vslow,dump in s6 - T184161 (duration: 00m 58s)
  • 06:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Pool db1113:3315 as vslow,dump in s5 - T184161 (duration: 00m 58s)
  • 06:27 marostegui: Deploy schema change on db1103:3314 - T187089 T185128 T153182
  • 06:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 for alter table (duration: 01m 06s)
  • 02:52 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.24) (duration: 11m 56s)

2018-03-11

  • 08:50 elukey: executed sudo rm /etc/logrotate.d/kafkatee-webrequest-analytics on oxygen/rhenium to stop daily cronspam

2018-03-10

  • 14:56 ema: cp1053: restart varnish-be
  • 13:29 ema: cp1068/cp1055: restart varnish-be

2018-03-09

  • 23:29 tgr@tin: Synchronized php-1.31.0-wmf.24/extensions/ReadingLists/src/Api/ApiQueryReadingListEntries.php: T189272 fix stupid ReadingLists typo breaking production (duration: 00m 54s)
  • 19:43 foks: changed global email for User:Mathmensch
  • 19:19 MaxSem: restarted my script on tin, now with more aggressive writes
  • 18:26 reedy@tin: Synchronized php-1.31.0-wmf.24/extensions/AbuseFilter/includes/AbuseFilter.class.php: Unbreak AbuseFilter tagging T189299 (duration: 00m 59s)
  • 17:35 andrew@tin: Finished deploy [horizon/deploy@9c234d6]: Another try at fixing T188458 (duration: 03m 00s)
  • 17:32 andrew@tin: Started deploy [horizon/deploy@9c234d6]: Another try at fixing T188458
  • 16:14 andrewbogott: test log
  • 16:07 bblack@neodymium: conftool action : set/pooled=no; selector: name=cp3034.esams.wmnet
  • 15:59 ppchelko@tin: Finished deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303 (duration: 00m 38s)
  • 15:59 andrewbogott: moving wikitech dns record to point to misc-web and the new labweb cluster, https://gerrit.wikimedia.org/r/#/c/417926/
  • 15:59 ppchelko@tin: Started deploy [cpjobqueue/deploy@5795526]: Enable root_claim_ttl for refreshLinks T189303
  • 15:54 andrew@tin: Finished deploy [horizon/deploy@f59f568]: rolling out a fix for T188458 (duration: 03m 11s)
  • 15:51 andrew@tin: Started deploy [horizon/deploy@f59f568]: rolling out a fix for T188458
  • 15:30 moritzm: installing zsh security update on trusty
  • 15:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 after cloning db1113:3316 - T184161 (duration: 00m 58s)
  • 15:15 moritzm: installing sensible-utils security update on trusty (Debian already fixed)
  • 15:11 ema: cp-upload_esams: reboot for retpoline kernel updates T188092
  • 13:12 marostegui: Compress s6 on db1113:3316 - T184161
  • 12:41 elukey: manually executed systemctl reset-failed to some old (not present anymore) units on kafka analytics hosts
  • 12:26 marostegui: Compress s5 on db1113:3315 - T184161
  • 12:16 marostegui: Stop mysql on db1063 to clone db1113:3316 - T184161
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 to clone db1113:3316 - T184161 (duration: 00m 58s)
  • 12:11 jynus: dropping test databases on dbstore2* instances
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 after cloning db1113:3315 - T184161 (duration: 00m 58s)
  • 12:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db1051 after cloning db1113:3315 - T184161 (duration: 00m 58s)
  • 11:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add initial config for db1113 multiinstance - T184161 (duration: 00m 58s)
  • 11:15 marostegui: Stop MySQL on db1051 to clone db1113 - https://phabricator.wikimedia.org/T184161
  • 11:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 to clone db1113 - T184161 (duration: 00m 58s)
  • 09:51 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with normal load (duration: 00m 58s)
  • 09:22 ema: cp-misc_esams: reboot for retpoline kernel updates T188092
  • 08:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2058 and db2084 (duration: 00m 58s)
  • 08:27 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1114 with low load (duration: 00m 58s)
  • 07:35 marostegui: Stop mariadb on db2058 and db2084 for mariadb+kernel upgrade
  • 07:34 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2058 and db2084 (duration: 00m 58s)
  • 07:33 marostegui: Logging for the record: es2013 was stopped and rebooted for mariadb and kernel upgrade
  • 07:22 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2013 (duration: 00m 58s)
  • 07:07 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool es2012, depool es2013 (duration: 00m 58s)
  • 06:52 marostegui: Stop MariaDB on es2012 to upgrade mariadb and kernel
  • 06:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool es2012 for kernel and mariadb upgrade (duration: 00m 58s)
  • 06:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Restore es1019 normal weight (duration: 00m 59s)
  • 05:00 andrew@tin: Finished deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278 (duration: 02m 59s)
  • 04:57 andrew@tin: Started deploy [horizon/deploy@930009e]: rebuilding venvs to avoid rogue configs, as was causing T189278
  • 00:40 thcipriani@tin: Synchronized static/images/project-logos: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART IV (duration: 00m 58s)
  • 00:38 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART III (duration: 00m 58s)
  • 00:36 thcipriani@tin: Synchronized static/images/project-logos/urwiki-2x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART II (duration: 00m 58s)
  • 00:33 thcipriani@tin: Synchronized static/images/project-logos/urwiki-1.5x.png: SWAT: Update logos for Banyumasan and Urdu Wikipedias T189155 PART I (duration: 00m 59s)
  • 00:03 urandom: set compression chunk length to 32, parsoid tables (group "enwiki") - T189057

2018-03-08

  • 23:10 urandom: set compression chunk length to 32, parsoid tables (group "wikipedia") - T189057
  • 22:31 urandom: set compression chunk length to 32, parsoid tables (group "commons") - T189057
  • 22:16 reedy@tin: Synchronized php-1.31.0-wmf.24/includes/specials/pagers/BlockListPager.php: T189251 (duration: 00m 59s)
  • 22:07 MaxSem: guess what? trying T187516 again
  • 21:41 urandom: set compression chunk length to 32, parsoid tables (group "others") - T189057
  • 21:15 otto@tin: Synchronized wmf-config/ProductionServices.php: Revert: point monolog avro producer back at Kafka analytics. Too many TCP connections? T188136 (duration: 00m 58s)
  • 21:09 sbisson@tin: Finished deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3) (duration: 04m 42s)
  • 21:04 sbisson@tin: Started deploy [kartotherian/deploy@6dcacbc]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (take 3)
  • 20:40 urandom: set compression chunk length to 32, mobile tables - T189057
  • 20:34 urandom: set compression chunk length to 32, page_summary tables - T189057
  • 20:30 thcipriani@tin: rebuilt and synchronized wikiversions files: All wikis to php-1.31.0-wmf.24
  • 20:26 thcipriani@tin: Synchronized php: Ensure symlink for 1.31.0-wmf.24 is up-to-date (duration: 01m 15s)
  • 19:52 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/Echo/: https://gerrit.wikimedia.org/r/#/c/417330/ and https://gerrit.wikimedia.org/r/#/c/417340/ (duration: 01m 21s)
  • 19:33 anomie: Running `cleanupUsersWithNoId.php --table recentchanges --prefix wikidata --force` on wikidata client wikis for T181731. This shouldn't create any local SUL accounts.
  • 19:29 niharika29@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/: Hooks: Don't register beta features if they're enabled for all https://gerrit.wikimedia.org/r/#/c/417277/ (duration: 01m 14s)
  • 19:24 sbisson@tin: Finished deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test* (duration: 02m 40s)
  • 19:23 niharika29@tin: Synchronized wmf-config/CommonSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 15s)
  • 19:22 sbisson@tin: Started deploy [kartotherian/deploy@a839a16]: Deploying kartotherian with updated dependencies and zoom level 19 to maps-test*
  • 19:21 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: NavigationTiming: Enable oversampling for Singapore T188652 (duration: 01m 16s)
  • 18:43 bsitzmann@tin: Finished deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167 (duration: 06m 14s)
  • 18:37 bsitzmann@tin: Started deploy [mobileapps/deploy@d6819a0]: Update mobileapps to afb0167
  • 17:19 andrew@tin: Synchronized wmf-config/wikitech.php: wikitech varnish updates (duration: 01m 15s)
  • 17:05 jynus: stop and reboot db1114 for kernel regression
  • 16:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool es1019 with less weight after HW maintenance (duration: 01m 15s)
  • 16:32 bd808: Running wikireplica_dns from labcontrol1001
  • 16:14 cmjohnson: wdqs1004 down for systemboard replacement
  • 15:56 andrewbogott: restarting nova-fullstack on labnet1001
  • 15:54 andrewbogott: restarting nodepool again
  • 15:42 andrewbogott: stopping nodepool again because something isn't quite right
  • 15:41 marostegui: Power off es1019 - T187530
  • 15:32 otto@tin: Synchronized wmf-config/ProductionServices.php: Point Mediawiki Monolog at new Kafka jumbo-eqiad cluster: T188136 (duration: 01m 16s)
  • 15:29 ottomata: merging and then deploying mediawiki-config to point monolog avro kafka producer at new kafka jumbo cluster: https://phabricator.wikimedia.org/T188136
  • 15:29 andrewbogott: disabling puppet on labnodepool1001
  • 15:17 andrewbogott: silencing nova and other openstack alerts in anticipation of service interruptions for https://phabricator.wikimedia.org/T189005
  • 15:01 marostegui: Disable puppet on db1073 - T189005
  • 15:00 marostegui: Change topology in m5, db2037 to become a slave of db1073 - T189005
  • 14:56 oblivian@tin: Synchronized wmf-config/CommonSettings.php: Use EtcdConfig everywhere (duration: 01m 15s)
  • 14:38 zeljkof: EU SWAT finished
  • 14:38 marostegui: Stop mysql on es1019 - T187530
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: SWAT: Blacklist Web of Trust junk from being added to pages (T189148) (duration: 01m 15s)
  • 14:35 zfilipin@tin: Synchronized php-1.31.0-wmf.24/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ArticleTarget.js: SWAT: Follow-up I5357a909: Fix logic for autosave from edited state (T189071) (duration: 01m 16s)
  • 14:28 mobrovac@tin: Finished deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052 (duration: 00m 33s)
  • 14:27 mobrovac@tin: Started deploy [cpjobqueue/deploy@4fa1cf0]: Lower the refreshLinks concurrency to 175 - T185052
  • 14:26 vgutierrez: uploaded pybal_1.15.2_all.deb to apt.wikimedia.org jessie-wikimedia
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: 2017 wikitext editor: Enable by default on officewiki (T188028) (duration: 01m 16s)
  • 14:15 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create the rollbacker group at ar.wikinews (T189206) (duration: 01m 16s)
  • 13:56 gehel: restart wdqs-updater on wdqs1005 to validate new config option - T188716
  • 13:52 sbisson@tin: Finished deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers (duration: 08m 31s)
  • 13:44 moritzm: depooling mwdebug2001, the host will temporarily be using an HHVM build linked against libicu57 to perform some tests
  • 13:43 sbisson@tin: Started deploy [kartotherian/deploy@42b3280]: Deploying kartotherian with updated dependencies and zoom level 19 to test servers
  • 13:40 elukey: eventlogging analytics migrated from eventlog1001 to eventlog1002
  • 13:35 ariel@tin: Finished deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly (duration: 00m 03s)
  • 13:35 ariel@tin: Started deploy [dumps/dumps@f26c114]: fix prefetch stubs; retrieval globals more robustly
  • 13:29 ema: cp-ulsfo: reboot for retpoline kernel updates T188092
  • 12:50 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:47 oblivian@puppetmaster1001: conftool action : edit; selector: scope=codfw,name=ReadOnly
  • 12:06 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 fully (duration: 01m 16s)
  • 11:32 moritzm: installing isc-dhcp security updates
  • 10:43 moritzm: installing libvpx security updates
  • 10:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Change db1114 load (duration: 01m 16s)
  • 10:14 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on T181121
  • 10:13 akosiaris: conduct IO stresstests on ganeti1005 (sca1004 VM) with cache=none KVM flag on
  • 09:57 dcausse: restarting mjolnir-kafka-daemon.service on relforge1002 to switch to kafka jumbo
  • 09:56 dcausse: restarting mjolnir-kafka-daemon.service on relforge1001 to switch to kafka jumbo
  • 09:56 _joe_: decommissioning mw2017-2099 T187467
  • 09:46 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1114 partially (duration: 01m 16s)
  • 09:44 moritzm: rearming keyholder on neodymium after reboot
  • 09:40 moritzm: rebooting neodymium for kernel security update
  • 09:22 ema: cp-eqsin: reboot for retpoline kernel updates T188092
  • 09:12 ema: cp3043: varnish-be-restart T189085
  • 09:08 moritzm: rebooting bast1001 for kernel security update
  • 08:58 elukey: restart varnish backend on cp3041 (failed fetches)
  • 08:58 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2046, db2053 and db2060 after kernel upgrade (duration: 01m 15s)
  • 08:58 moritzm: reset RAC on bast1001, serial console was stuck
  • 08:50 elukey: rebooting analytics1003 (Hadoop Hive, Oozie, etc..) for kernel updates
  • 08:32 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2046, db2053 and db2060 for kernel upgrade (duration: 01m 17s)
  • 08:31 elukey: reboot analytics1002 (Hadoop master standby) for kernel upgrades
  • 08:28 marostegui: Stop MySQL on db2046, db2053 and db2060 for kernel upgrade
  • 08:19 elukey: reboot analytics1001 (Hadoop master) for kernel upgrade (temp failover to analytics1002)
  • 08:09 ema: cp3040: varnish-be-restart T189085
  • 08:00 ema: cp3032: varnish-be-restart T189085
  • 07:44 elukey: reboot kafka2003 (eventbus codfw) for kernel updates
  • 07:24 elukey: reboot kafka2002 (eventbus codfw) for kernel updates
  • 07:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool es1019 for maintenance - T187530 (duration: 01m 16s)
  • 07:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Revert: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 31s)
  • 04:27 Krinkle: Running whisper-mass-resize for ResourceLoader.* metrics on graphite1001 and graphite2001 (T179622)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 07m 37s)
  • 02:15 tgr@tin: Synchronized wmf-config/throttle.php: T189161 Temporarily remove account creation limit for event on Portuguese Wikipedia on March 08, 2018 (duration: 01m 10s)
  • 01:17 twentyafterfour: phabricator update completed
  • 01:13 twentyafterfour: preparing for phabricator update 2018-03-07/1
  • 00:37 thcipriani@tin: Synchronized wmf-config/db-eqiad.php: SWAT: wikitech: use FQDNs for m5 cluster members (duration: 01m 16s)
  • 00:28 thcipriani@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Add configuration for CirrusSearch to instantly index new Wikidata items T183053 (duration: 01m 15s)
  • 00:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable loginOnly mode for local auth provider on group 2 T57420 (duration: 01m 16s)

2018-03-07

  • 23:36 MaxSem: aborted due to growing DB lag
  • 23:08 MaxSem: running script for T187516
  • 23:00 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/AntiSpoof/: https://gerrit.wikimedia.org/r/#/c/417013/ (duration: 01m 16s)
  • 22:52 maxsem@tin: Synchronized php-1.31.0-wmf.24/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/417014/ (duration: 01m 20s)
  • 22:44 MaxSem: dumping centralauth.spoofuser from db1079
  • 22:27 ejegg: deployed patch for T171987 to 1.31.0-wmf.23
  • 22:23 ejegg: deployed patch for T171987 to 1.31.0-wmf.24
  • 21:51 herron: puppetdb server reboots complete — re-enabling puppet agents
  • 21:45 herron: temporarily disabling puppet agents while puppetdb servers nitrogen and nihal are rebooted for kernel updates
  • 21:24 thcipriani@tin: Synchronized wmf-config: Improve load-order documentation for CommonSettings and InitialiseSettings noop doc change (duration: 01m 18s)
  • 21:05 andrew@tin: Synchronized wmf-config/InitialiseSettings.php: Switch wikitech to swift (duration: 01m 15s)
  • 20:58 andrew@tin: Synchronized wmf-config/filebackend.php: Preparing wikitech to use swift for images, step two (duration: 01m 12s)
  • 20:56 andrew@tin: Synchronized wmf-config/CommonSettings.php: Preparing wikitech to use swift for images, step one (duration: 01m 16s)
  • 20:45 andrew@tin: Synchronized multiversion/MWMultiVersion.php: (no justification provided) (duration: 01m 16s)
  • 20:27 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to php-1.31.0-wmf.24
  • 19:43 Amir1: ladsgroup@terbium:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T183019)
  • 19:35 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https on fawiki and hewiki (T183019)
  • 19:18 Amir1: ladsgroup@terbium:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=mediawikiwiki --force-protocol https (T183019)
  • 18:56 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: retry (duration: 01m 15s)
  • 18:42 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 16s)
  • 18:40 tgr@tin: Synchronized static/images/project-logos: T189116 Update logos for Limburgish and Picardic Wikipedias (duration: 01m 17s)
  • 18:30 tgr@tin: Synchronized debug.json: T187468 Switch to mwdebug hosts in codfw too (duration: 01m 15s)
  • 18:26 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: T57420 Enable loginOnly mode for local auth provider on group 1 (duration: 01m 20s)
  • 17:41 moritzm: rebooting restbase-test* for kernel security update
  • 16:55 ema: cp5001: reboot for retpoline kernel updates T188092
  • 16:46 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052 (duration: 00m 33s)
  • 16:46 ppchelko@tin: Started deploy [cpjobqueue/deploy@ff41710]: Increase refreshLinks concurrency to 250 T185052
  • 16:08 elukey: updating pcc facts for new hosts
  • 15:54 moritzm: rebooting rdb* fallback hosts in eqiad for kernel security update
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1064, it is not performing well with 2 failed disks - T188685 (duration: 01m 16s)
  • 15:26 marostegui: Set disk 32:2 on db1064 as offline
  • 15:20 moritzm: rebooting krypton (running grafana among others) for kernel security update
  • 15:17 reedy@tin: Synchronized wmf-config/throttle.php: T189121 (duration: 01m 15s)
  • 14:45 Amir1: EU SWAT is done
  • 14:42 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052 (duration: 00m 36s)
  • 14:41 ppchelko@tin: Started deploy [cpjobqueue/deploy@aee2eb1]: Increase refreshLinks concurrency to 150 T185052
  • 14:37 moritzm: rebooting rdb* hosts in codfw for kernel security update
  • 14:37 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 16s)
  • 14:35 ladsgroup@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/tests/phpunit/Sites/SiteMatrixParserTest.php: Add code of special wikis as interwiki when populating sites table, part II (T183019) (duration: 01m 15s)
  • 14:27 ladsgroup@tin: Synchronized php-1.31.0-wmf.24/extensions/Wikibase/lib/includes/Sites/SiteMatrixParser.php: Add code of special wikis as interwiki when populating sites table (T183019) (duration: 01m 16s)
  • 14:19 _joe_: adding mwdebug200{1,2} to ganeti in codfw, T187468
  • 14:17 urandom: reducing compression chunk length to 32kb on "wikipedia_T_page__summary".data - T189057
  • 14:10 zfilipin@tin: Synchronized wmf-config/: SWAT: Load Wikibase Quality extensions using extension registration (T106104) (duration: 01m 17s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T188626) (duration: 01m 18s)
  • 14:01 urandom: setting trace probability to 0.0, restbase eqiad cassandra cluster - T189057
  • 13:22 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all refreshLinks jobs to EventBus, file #2 - T185052 (duration: 01m 15s)
  • 13:22 moritzm: rebooting tungsten for kernel security update
  • 13:21 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all refreshLinks jobs to EventBus - T185052 (duration: 01m 15s)
  • 13:20 ppchelko@tin: Finished deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052 (duration: 00m 43s)
  • 13:20 moritzm: rebooting install2002 for kernel security update
  • 13:19 ppchelko@tin: Started deploy [cpjobqueue/deploy@d84286a]: Switch all refreshLinks jobs to kafka T185052
  • 10:55 marostegui: Deploy schema change on codfw s4 master (db2051) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182
  • 10:54 moritzm: rearmed keyholders on netmon1002 and netmon2001
  • 10:50 elukey: reboot stat100[56] for kernel upgrades
  • 10:49 moritzm: reboot memcached hosts in codfw for kernel security update
  • 10:34 moritzm: rebooting netmon2001 for kernel security update
  • 10:29 moritzm: rebooting netmon1002 for kernel security update
  • 10:26 moritzm: rebooting boron for kernel security update
  • 10:11 moritzm: rebooting openldap/WMCS servers for kernel security update
  • 10:05 moritzm: rebooting openldap/corp servers for kernel security update
  • 10:03 elukey: reboot analytics10[35,52] for kernel updates - hadoop hdfs journal nodes (didn't manage to complete the work yesterday)
  • 10:03 moritzm: rebooting pool counters in codfw for kernel security update
  • 10:02 akosiaris: upload apertium-rus-ukr_0.2.0~r82706-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:56 moritzm: rebooting tureis/roentgenium for kernel security update
  • 09:53 akosiaris: upload apertium-rus_0.2.0~r82706-1+wmf1 and apertium-ukr_0.1.0~r82563-1+wmf1 on apt.wikimedia.org/jessie-wikimedia/main. T184901
  • 09:46 moritzm: rebooting etherpad1001 (etherpad.wikimedia.org) for kernel security update
  • 09:31 moritzm: rebooting darmstadtium (docker registry) for kernel security update
  • 09:24 moritzm: rearming keyholder on sarin after reboot
  • 09:16 moritzm: rebooting sarin for kernel security update
  • 08:57 ema: cp3033: restart varnish-be, backend connections piling up (~12k)
  • 08:40 marostegui: Deploy schema change on s7 primary master db1062 - T153182 T185128
  • 08:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1079 after alter table (duration: 01m 16s)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2089,db2079 and db2065 after mariadb and kernel upgrade (duration: 01m 16s)
  • 07:30 marostegui: Stop mariadb on db2089,db2079 and db2065 for kernel upgrade
  • 07:28 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2089,db2079 and db2065 (duration: 01m 15s)
  • 06:49 marostegui: Deploy schema change on db1079 with replication enabled (this will generate lag on labs) - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1079 for alter table (duration: 01m 16s)
  • 02:27 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.23) (duration: 06m 03s)
  • 00:57 Amir1: Evening SWAT is done
  • 00:32 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Re-enable Wikidata descriptions (T188182) (duration: 01m 16s)

2018-03-06

  • 23:10 MaxSem: cancelled
  • 23:05 MaxSem: refreshing spoofuser
  • 23:00 MaxSem: dumping centralauth.spoofuser from db1094
  • 21:22 mutante: restbase-dev1006 powercycled via console (T185494)
  • 20:49 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.24
  • 20:44 ottomata: reverted change to point mediawiki monolog kafka producers at kafka jumbo-eqiad until deployment train is done T188136
  • 20:36 mutante: phab1001 (phabricator) - rebooting for maintenance
  • 20:35 ottomata: pointing mediawiki monolog kafka producers at kafka jumbo-eqiad cluster: T188136
  • 20:08 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache (duration: 29m 13s)
  • 19:39 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.24 and rebuild l10n cache
  • 18:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af (duration: 05m 28s)
  • 18:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@5986ab7]: Update mobileapps to afbe9af
  • 18:22 godog: puppet-merge Revert: Use hiera3 role/nuyaml backends on >= stretch
  • 17:58 marostegui: Reload haproxy on dbproxy1004 and dbproxy1009
  • 17:53 thcipriani: starting branch cut for 1.31.0-wmf.24
  • 17:53 andrewbogott: disabling puppet and apache on labpuppetmaster1001 and 1002
  • 17:47 moritzm: rebooting dbmonitor1001 for kernel security update
  • 17:42 moritzm: rebooting dbmonitor2001 for kernel security update
  • 17:38 moritzm: rebooting hassaleh for kernel security update
  • 17:34 vgutierrez: update pybal to 1.15.1 on lvs5003
  • 17:32 vgutierrez: update pybal to 1.15.1 on lvs1010
  • 17:28 vgutierrez: uploaded pybal_1.15.1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 after alter table (duration: 00m 58s)
  • 16:58 cmjohnson1: powering off rhenium to reset the idrac
  • 16:44 sbisson@tin: Finished deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch (duration: 05m 47s)
  • 16:38 sbisson@tin: Started deploy [kartotherian/deploy@255401a]: Testing update-deps2 branch
  • 16:11 oblivian@tin: Synchronized wmf-config: Fetch data from etcd on all appservers (duration: 01m 01s)
  • 16:01 marostegui: Deploy schema change on db1069 - T187089 T185128 T153182
  • 16:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 for alter table (duration: 00m 57s)
  • 15:54 jynus: deploying new query killer logic to all wikidata (s8) db replicas T188505
  • 15:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1094 after alter table (duration: 00m 57s)
  • 15:51 moritzm: installing libvpx security updates
  • 15:50 oblivian@tin: Synchronized wmf-config: Expose etcd last modified index (duration: 01m 00s)
  • 15:45 moritzm: rebooting ununpentium for kernel security update
  • 15:39 oblivian@tin: Finished scap: Deploying Expose the latest modified index seen by EtcdConfig (duration: 09m 49s)
  • 15:29 oblivian@tin: Started scap: Deploying Expose the latest modified index seen by EtcdConfig
  • 15:28 moritzm: rebooting bromine for kernel security update
  • 15:19 mobrovac@tin: Synchronized php-1.31.0-wmf.23/includes/jobqueue/JobQueueSecondTestQueue.php: [JobQueueSecondTestQueue] Support read-only mode - T185052 (duration: 00m 58s)
  • 15:09 vgutierrez: update to pybal 1.15.0 on lvs5003
  • 15:02 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Article counts: Change 'comma' method to 'any' - T188472 (duration: 01m 00s)
  • 14:50 vgutierrez: update pybal to 1.15.0 on lvs1010
  • 14:46 hashar: tin: /srv/mediawiki-staging/php-1.31.0-wmf.23 rebased on tip of https://gerrit.wikimedia.org/r/#/c/416686/ (that reverts a merge of master branch)
  • 14:42 gehel: rebooting maps1* (eqiad) for kernel security update completed
  • 14:36 ottomata: beginning migration of webrequest text varnishkafka logs from Kafka analytics to Kafka jumbo-eqiad T185136
  • 14:21 moritzm: rebooting labweb* for kernel security update
  • 14:13 moritzm: rebooting sca* for kernel security update
  • 14:07 gehel: rebooting maps1* (eqiad) for kernel security update
  • 14:07 moritzm: rebooting pybal-test for kernel security update
  • 14:00 _joe_: SWAT is suspended for investigation on tin's git status
  • 14:00 moritzm: rebooting oxygen for kernel security update
  • 13:16 moritzm: powercycling ms-be1038, stuck after reboot
  • 13:10 marostegui: Deploy schema change on db1094 - T187089 T185128 T153182
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 for alter table (duration: 00m 58s)
  • 12:55 moritzm: rebooting URL downloaders for kernel security update
  • 12:51 mobrovac@tin: Finished deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052 (duration: 00m 34s)
  • 12:50 mobrovac@tin: Started deploy [cpjobqueue/deploy@9b0b947]: refreshLinks: Increase concurrency to 100 - T185052
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1086 after alter table (duration: 00m 58s)
  • 12:33 moritzm: rebooting mwlog* for kernel security update
  • 12:04 moritzm: rebooting graphite hosts in eqiad for kernel security update
  • 11:29 moritzm: rebooting k8s masters for kernel security update
  • 11:05 elukey: reboot analytics10[28,35,52] for kernel updates (one at a time, hadoop hdfs journal nodes)
  • 10:46 moritzm: powercycling ms-be1021, stuck after reboot
  • 10:45 akosiaris@tin: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 01m 22s)
  • 10:43 moritzm: rearming keyholder on naos after reboot
  • 10:39 akosiaris: emergency add a captcha in metawiki contact pages like https://meta.wikimedia.org/wiki/Special:Contact/Stewards to stop bot abuse. phab Task to be filed later on
  • 10:39 godog: reboot ms-be1013 to try fix disk ordering
  • 10:35 moritzm: rebooting naos for kernel security update
  • 10:32 moritzm: rearming keyholder on tin after reboot
  • 10:30 gehel: kafka poller active on all production wdqs nodes - T188252
  • 10:28 moritzm: rebooting tin for kernel security update
  • 10:20 gehel: reboot completed for maps2* and maps-test*
  • 09:51 moritzm: rebooting graphite hosts in codfw for kernel security update
  • 09:42 marostegui: Stop MySQL on db1107 for mariadb and kernel upgrade
  • 09:41 vgutierrez: uploaded pybal_1.15.0_all.deb to apt.wikimedia.org jessie-wikimedia
  • 09:40 marostegui: Start proxysql on wasat
  • 09:38 moritzm: rebooting wezen for kernel security update
  • 09:27 elukey: reboot kafka2001 (eventbus codfw) for kernel updates
  • 09:24 marostegui: Deploy schema change on db1086 - T187089 T185128 T153182
  • 09:18 marostegui: Stop and reboot db1086 for kernel and mariadb upgrade
  • 09:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1086 for alter table (duration: 00m 57s)
  • 09:17 moritzm: rebooting swift backend servers in eqiad for kernel security update
  • 09:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 after alter table (duration: 00m 57s)
  • 09:05 gehel: rolling restart of maps* for kernel upgrade
  • 08:50 elukey: reboot meitnerium (archiva) for kernel updates
  • 08:38 paravoid: rebooting furud
  • 08:35 moritzm: rebooting wasat for kernel security update
  • 08:30 elukey: drain+reboot analytics[1065-1067] for kernel updates
  • 08:26 marostegui@tin: Synchronized wmf-config/db-codfw.php: Update db1069 IP (duration: 00m 57s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Update db1069 IP (duration: 00m 57s)
  • 08:15 moritzm: rebooting ruthenium for kernel security update
  • 08:14 marostegui@tin: Synchronized wmf-config/db-codfw.php: Revert depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 57s)
  • 08:10 moritzm: rebooting bast5001 for kernel security update
  • 08:01 elukey: drain+reboot analytics[61,63,64] for kernel updates
  • 07:59 moritzm: rebooting tegmen for kernel security update
  • 07:43 marostegui: Stop mysql on db2090 db2080 db2076 db2073 db2067 for mariadb and kernel upgrade
  • 07:43 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool some codfw hosts for kernel and mariadb upgrade (duration: 00m 58s)
  • 07:36 moritzm: rebooting remaining swift backend servers in codfw for kernel security update
  • 07:18 marostegui: Stop MySQL on db2093 to get some data from the event scheduler
  • 06:56 marostegui: Deploy schema change on db1101:3317 - T187089 T185128 T153182
  • 06:51 marostegui: Stop mysql on db2037 to upgrade it
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 for alter table (duration: 00m 58s)
  • 05:00 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend.php: T180183: I6d72873b9d3 (duration: 00m 56s)
  • 04:59 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 - Ie5a164a9e2b (duration: 00m 57s)
  • 04:58 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta: no-op (duration: 00m 54s)
  • 04:57 krinkle@tin: Synchronized wmf-config/PhpAutoPrepend-labs.php: beta: no-op (duration: 00m 57s)
  • 04:29 bblack: eqsin router maintenance starting soon-ish. all of eqsin will be offline and isn't in production service to begin with. We've tried to downtime all the things, but don't be shocked at spurious alerts! - T187807
  • 04:08 krinkle@tin: Synchronized multiversion/MWMultiVersion.php: Ia2acf57c6 (duration: 00m 57s)
  • 04:01 krinkle@tin: Synchronized wmf-config/profiler.php: T180183 (duration: 01m 33s)
  • 02:26 tgr@tin: Synchronized wmf-config/CommonSettings.php: T186296 Increase ReadingLists list size limit to 5k (duration: 01m 06s)
  • 02:07 tgr@tin: Finished scap: T187226#4025352 update ReadingLists (duration: 18m 49s)
  • 01:48 tgr@tin: Started scap: T187226#4025352 update ReadingLists
  • 01:00 tgr@tin: Synchronized wmf-config/InitialiseSettings.php: refresh wmf-config/InitialiseSettings, seems to have stuck in old state on some servers after doing the initial sync in the wrong order (duration: 00m 57s)
  • 00:54 tgr@tin: Synchronized wmf-config: T57420 Enable loginOnly mode for local auth provider on group 0 (duration: 01m 00s)
  • 00:41 krinkle@tin: Synchronized wmf-config/CommonSettings.php: no-op I33f09b164e7 (duration: 00m 58s)
  • 00:38 krinkle@tin: Synchronized wmf-config/CommonSettings-labs.php: beta-only: I02a4d4 (duration: 00m 57s)

2018-03-05

  • 22:44 bawolff@tin: Synchronized php-1.31.0-wmf.23/includes/logging/LogPager.php: T188145 (duration: 00m 58s)
  • 21:32 arlolra: Updated Parsoid to d115592 (T188591)
  • 21:25 arlolra@tin: Finished deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592 (duration: 12m 12s)
  • 21:13 arlolra@tin: Started deploy [parsoid/deploy@232631f]: Updating Parsoid to d115592
  • 20:04 gehel@tin: Finished deploy [wdqs/wdqs@1983ddf]: wdqs GUI update (duration: 01m 36s)
  • 20:03 gehel@tin: Started deploy [wdqs/wdqs@1983ddf]: wdqs GUI update
  • 20:02 hashar@tin: Synchronized php-1.31.0-wmf.23/extensions/Wikibase: Fix empty condition list in metadata lookup - T188313 (duration: 01m 58s)
  • 19:51 maxsem@tin: Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/416219/ (duration: 00m 57s)
  • 19:43 maxsem@tin: Synchronized php-1.31.0-wmf.23/extensions/Cite: https://gerrit.wikimedia.org/r/#/c/416467/ (duration: 00m 58s)
  • 19:30 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update (duration: 02m 36s)
  • 19:28 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back previous GUI update
  • 19:23 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416456/ (duration: 00m 58s)
  • 19:21 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 01m 23s)
  • 19:20 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 19:14 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/416457/ (duration: 00m 58s)
  • 18:54 jynus: stop slave on db2044
  • 18:24 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken (duration: 00m 54s)
  • 18:23 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: rolling back to previous state, UI is broken
  • 18:20 gehel@tin: Finished deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version (duration: 03m 08s)
  • 18:16 gehel@tin: Started deploy [wdqs/wdqs@11c73f0]: new WDQS GUI and updater version
  • 17:34 elukey: drain + reboot analytics10[58-60] for kernel updates
  • 17:32 bd808: Added zhuyifei1999_ and chicocvenancio to the "toollabs-trusted" gerrit group
  • 16:32 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186699 (duration: 00m 57s)
  • 16:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1098:3317 after alter table (duration: 00m 57s)
  • 16:00 elukey: test
  • 15:56 akosiaris: upload tiller on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:56 akosiaris: upload helm on apt.wikimedia.org Component: main distros: jessie-wikimedia, stretch-wikimedia T189919
  • 15:55 urandom: setting trace probability to 0.001 (.1%), eqiad datacenter, restbase cassandra cluster
  • 15:52 urandom: updating `system_traces` keyspace replication strategy, restbase cassandra cluster
  • 15:51 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch all of the cdnPurge to EventBus, file 2/2 - T188540 (duration: 00m 57s)
  • 15:50 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 15:49 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch all of the cdnPurge to EventBus, file 1/2 - T188540 (duration: 00m 57s)
  • 15:45 ppchelko@tin: Finished deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka (duration: 00m 35s)
  • 15:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@346a2b6]: Switch all cdnPurge jobs to kafka
  • 15:42 marostegui: stop and poweroff db1069 for rack change - T186699
  • 15:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186699 (duration: 00m 57s)
  • 15:41 elukey: drain + reboot analytics 1055->57 for kernel updates
  • 15:38 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Switch 50% for refreshLinks to EventBus - T185052 (duration: 00m 57s)
  • 15:31 ppchelko@tin: Finished deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs (duration: 00m 39s)
  • 15:31 ppchelko@tin: Started deploy [cpjobqueue/deploy@fe5b1f3]: Enable refreshLinks for 50% of the jobs
  • 15:28 marostegui: Mark as failed disk 32:9 on db1068 (s4 primary master) - T188187
  • 15:20 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobExecutor.php: [JobExecutor] Wait for the replicas if the transaction takes too long (duration: 00m 57s)
  • 15:14 moritzm: rebooting webperf2001 for kernel security update
  • 14:57 hashar: European SWAT completed
  • 14:57 hashar@tin: Finished scap: 2017 wikitext editor: Simplify config part 2 (duration: 02m 57s)
  • 14:54 hashar@tin: Started scap: 2017 wikitext editor: Simplify config part 2
  • 14:52 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:48 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable translate extension in bdwikimedia - T188853 (duration: 00m 57s)
  • 14:44 hashar@tin: Synchronized wmf-config/InitialiseSettings.php: Enable rollbacker user right at arwikiversity - T188633 (duration: 00m 57s)
  • 14:41 hashar@tin: Finished scap: core + Flow, master/replicate race condition - T182358 T184670 (duration: 04m 24s)
  • 14:36 hashar@tin: Started scap: core + Flow, master/replicate race condition - T182358 T184670
  • 14:34 elukey: graphite metrics mw.error.* deprecated in T188749
  • 14:31 hashar@tin: Finished scap: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 23m 08s)
  • 14:11 hashar: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=bdwikimedia translate # T188853
  • 14:08 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 14:06 hashar@tin: scap aborted: Popups: Remove client side formatters in the REST formatter - T183833 (duration: 00m 16s)
  • 14:06 hashar@tin: Started scap: Popups: Remove client side formatters in the REST formatter - T183833
  • 13:55 moritzm: rolling reboot of swift backends in codfw for kernel security update
  • 13:49 moritzm: rebooting releases2001 for kernel security update
  • 13:37 moritzm: rebooting neon for kernel security update
  • 13:37 mobrovac@tin: Started restart [cpjobqueue/deploy@b5255f0]: Force RecordLintJob rebalance in Kafka - T188870
  • 13:04 moritzm: rebooting bast4002 for kernel security update
  • 13:00 marostegui: Deploy schema change on db1098:3317 - T187089 T185128 T153182
  • 13:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for alter table (duration: 00m 57s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:45 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1011 from config - T184703 (duration: 01m 02s)
  • 12:40 moritzm: rebooting bast4001 for kernel security update
  • 12:30 marostegui: Remove db1011 from tendril as it will be decommissioned - T184703
  • 12:19 moritzm: installing libvpx security updates
  • 12:13 moritzm: installing wavpack security updates
  • 12:08 moritzm: installing freexl security updates
  • 11:59 moritzm: upgrading tor on radium
  • 11:40 moritzm: updating tor packages to 0.3.2.10
  • 11:19 moritzm: running "racadm racreset" on rhenium, mgmt inaccessible
  • 11:09 elukey: drain + reboot analytics10[50,51,53,54] for kernel updates
  • 10:53 moritzm: rebooting bast2001 for kernel security update
  • 10:46 moritzm: rebooting lithium for kernel security update
  • 10:24 elukey: drain + reboot analytics10[46-49] for kernel updates
  • 10:23 moritzm: rolling reboot of logstash* for kernel security update
  • 09:33 godog: roll restart swift in codfw to add thumbor private user
  • 09:15 marostegui: Deploy schema change on s7 codfw master (db2040), this will generate lag on codfw - T187089 T185128 T153182
  • 09:01 godog: roll-restart thumbor to apply https://gerrit.wikimedia.org/r/416240
  • 08:54 marostegui: Stop mariadb on db2037 to copy it to db1073
  • 08:25 marostegui: Stop MySQL on db2078 for mariadb and kernel upgrade
  • 07:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db1073 from config (duration: 00m 58s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db1073 from config (duration: 00m 59s)
  • 07:06 marostegui: Deploy schema change on s2 primary master db1054 - T185128 T153182
  • 02:08 l10nupdate@tin: LocalisationUpdate failed: git pull of extensions failed

2018-03-04

  • 20:16 tgr: T188721 ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --ignorestatus --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 18:05 musikanimal: T188721 Ran mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=wikidatawiki --logwiki=metawiki 'Erik Fastman' 'Glorious Engine'
  • 15:59 elukey: powercycle stat1004 - available via mgmt, root login freezes while trying

2018-03-03

  • 14:16 akosiaris: 13:56:20 ema: powercycle ganeti1005 T181121
  • 13:56 ema: powercycle ganeti1005
  • 13:25 andrewbogott: forced quota update in admin-monitoring as well; the reserved fixed_ip value was incorrect
  • 13:23 andrewbogott: forcing quota update in nova with update quota_usages set reserved='-1' where project_id='contintcloud';
  • 13:10 andrewbogott: restarting rabbitmq-server on labcontrol1001
  • 13:08 andrewbogott: restarting nodepool
  • 13:05 andrewbogott: restarting nova-conductor
  • 13:02 andrewbogott: stopping nodepool for a bit while investigating openstack issues
  • 02:14 chasemp: labnodepool1001:~# service nodepool start
  • 01:30 chasemp: root@labnet1001:~# service nova-fullstack restart
  • 01:21 chasemp: labnodepool1001:~# service nodepool stop

2018-03-02

  • 19:44 jynus: restarting labsdb1010
  • 17:22 mepps: updated payments-wiki 498f49a758 to ce68e8e80b
  • 15:19 elukey: drain + reboot analytics10[41-45] for kernel updates
  • 15:15 moritzm: rebooting auth* for kernel security updates
  • 13:46 elukey: drain + reboot analytics10[38,39,40,41] for kernel updates
  • 13:22 elukey: drain + reboot analytics10[33,34,36,37] for kernel updates
  • 13:17 moritzm: upgrading labtest trusty hosts to latest 4.4 kernel
  • 12:23 moritzm: rebooting kubetcd/kubestagetcd for kernel security update
  • 12:00 moritzm: rebooting etcd* for kernel security updates
  • 11:58 elukey: drain + reboot analytics10[29,31,32] for kernel updates
  • 11:33 moritzm: draining restbase1018 for eventual reboot for kernel security update
  • 11:28 akosiaris: upload to apt.wikimedia.org component thirdparty/ci distro jessie-wikimedia docker-ce_17.12.1~ce-0~debian_amd64 T177499
  • 11:07 moritzm: rebooting mwdebug* for kernel security update
  • 10:54 ema: spare LVSs lvs[1011-1012], lvs[4001-4004]: reboot for retpoline kernel updates T188092
  • 10:53 moritzm: draining restbase1017 for eventual reboot for kernel security update
  • 10:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1114 (duration: 00m 57s)
  • 10:18 moritzm: draining restbase1016 for eventual reboot for kernel security update
  • 10:18 jynus: shutting down labsdb1010
  • 10:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 56s)
  • 10:01 elukey: deleted /etc/burrow/* from zookeeper main eqiad/codfw after https://gerrit.wikimedia.org/r/415818 (garbage to cleanup)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1114 (duration: 00m 57s)
  • 09:40 moritzm: draining restbase1015 for eventual reboot for kernel security update
  • 09:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly pool db1114 in s1 after cloning it from db1073 - T183469 (duration: 01m 01s)
  • 08:57 moritzm: rebooting scb1004 for kernel security update (was omitted from earlier reboots due to hardware issues on scb1003)
  • 08:51 moritzm: repooling scb1003 after memory module was replaced (T188385)
  • 07:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 57s)
  • 07:21 marostegui@tin: Synchronized wmf-config/db-codfw.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Add db1114 to the config - T183469 (duration: 00m 57s)
  • 07:11 moritzm: rebooting xenon/praseodymium/cerium for kernel security update
  • 06:52 marostegui: Stop MySQL on db1073 to clone db1114 - T183469
  • 06:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 to clone db1114 - T183469 (duration: 00m 58s)
  • 02:48 legoktm: manually purged ExtensionDistributor cache (T188692)
  • 01:54 mutante: cobalt (gerrit) - rebooting for kernel upgrade
  • 01:46 mutante: LDAP: added lucaswerkmeister-wmde to 'wmde' and 'nda' groups (T188105)
  • 00:49 ebernhardson@tin: Synchronized wmf-config/flaggedrevs.php: SWAT: T148603: (duration: 00m 57s)
  • 00:48 herron: fermium (lists) and mx systems rebooted for kernel update
  • 00:46 ebernhardson@tin: Synchronized php-1.31.0-wmf.23/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.searchSatisfaction.js: SWAT T187148: Start cirrus query explorer AB test (duration: 00m 57s)
  • 00:25 ebernhardson@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: T187148 Configure Cirrus AB test (step 2) (second try) (duration: 00m 57s)
  • 00:23 ebernhardson@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: T187148 Configure Cirrus AB test (step 1) (second try) (duration: 00m 57s)
  • 00:12 ebernhardson@tin: Synchronized wmf-config/: REVERT SWAT: T187148 Configure Cirrus AB test (duration: 00m 59s)
  • 00:09 ebernhardson@tin: Synchronized wmf-config/: SWAT: T187148 Configure Cirrus AB test (duration: 01m 00s)

2018-03-01

  • 22:35 gehel: rolling restart of elasticsearch / cirrus - eqiad complete, cluster is green
  • 21:45 thcipriani@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.23
  • 21:33 bsitzmann@tin: Finished deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833) (duration: 05m 15s)
  • 21:28 bsitzmann@tin: Started deploy [mobileapps/deploy@bd9924e]: Update mobileapps to 1056fde (T183833)
  • 21:17 thcipriani@tin: Synchronized php-1.31.0-wmf.23/extensions/GeoData/includes/api/ApiQueryGeoSearchElastic.php: Fix undefined property error in ApiQueryGeoSearchElastic T188659 (duration: 01m 15s)
  • 20:30 thcipriani@tin: Synchronized php: php link to 1.31.0-wmf.23 (duration: 01m 12s)
  • 20:29 andrewbogott: restarting labweb1002
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.23
  • 20:15 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/specials/pagers/NewPagesPager.php: SWAT: NewPagesPages: Use array_merge rather than + for RC query info fields T188555 (duration: 01m 14s)
  • 20:15 andrewbogott: rebooting labweb1001
  • 19:56 thcipriani@tin: Synchronized langlist-labs: SWAT: beta: add nlwiki to langlist T188582 (beta-only change) (duration: 01m 13s)
  • 19:50 gehel: new kafka based poller for wdqs now enabled on wdqs2001 - T188252
  • 19:48 thcipriani@tin: Synchronized wmf-config/throttle-analyze.php: SWAT: Revert "Automatically include commons and wikidata in $wmgThrottlingExceptions" (duration: 01m 14s)
  • 19:36 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollback for editors at zh_classicalwiki T188064 (duration: 01m 14s)
  • 19:31 gehel@tin: Finished deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues (duration: 02m 12s)
  • 19:29 gehel@tin: Started deploy [wdqs/wdqs@86da751]: new updater to fix kafka poller issues
  • 19:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable responsive references by default on rowiki T187997 (duration: 01m 15s)
  • 19:21 mutante: depooled scb1003 from all services on scb because it went down, including mgmt
  • 19:20 dzahn@neodymium: conftool action : set/pooled=no; selector: name=scb1003.eqiad.wmnet
  • 19:17 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Make last throttle limit raise work across all wikis T188630 (duration: 01m 13s)
  • 19:15 mutante: powercycling crashed scb1003
  • 19:13 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Fix throttle date for outreach dashboard T188630 (duration: 01m 13s)
  • 18:47 demon@tin: Synchronized wmf-config/: killing extension-list-labs (duration: 01m 17s)
  • 18:45 demon@tin: Synchronized wmf-config/InitialiseSettings.php: disable performance inspector in prod explicitly (duration: 01m 14s)
  • 18:43 demon@tin: Synchronized docroot/noc/: killing extension-list-labs (duration: 01m 14s)
  • 18:13 bsitzmann@tin: Finished deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833) (duration: 06m 01s)
  • 18:07 bsitzmann@tin: Started deploy [mobileapps/deploy@ada38aa]: Update mobileapps to 0db4a60 (T183833)
  • 17:51 gehel: depooling wdqs2001 and switching to kafka poller - T188252
  • 17:47 gehel: restarting wdqs-updater on wdqs1004 -T188045
  • 17:46 mutante: re-enabling icinga notifications for wdqs1004 services, ethernet cable has been replaced (T188045)
  • 17:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 01m 14s)
  • 17:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 28s)
  • 17:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 01m 13s)
  • 16:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 01m 13s)
  • 16:41 jynus: reimporting database testreduce_0715 from db1009 to db2037
  • 16:36 marostegui: Restart mariadb on db1093 for binlog format change - T186321
  • 16:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 - T186321 (duration: 01m 13s)
  • 16:14 moritzm: rebooting hafnium for kernel security update
  • 16:06 marostegui: Fix s7 replication on labsdb1010 - T186579
  • 16:00 moritzm: rebooting radium (tor relay) for kernel security update
  • 15:52 moritzm: draining restbase1014 for eventual reboot for kernel security update
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1060 as API (duration: 01m 13s)
  • 15:32 bblack: disabling puppet on A:cp for deploy of https://gerrit.wikimedia.org/r/#/c/415204/ and friends
  • 15:30 mobrovac@tin: Synchronized php-1.31.0-wmf.23/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.23) - T188540 (duration: 01m 14s)
  • 15:26 mobrovac@tin: Synchronized php-1.31.0-wmf.22/extensions/EventBus/includes/JobQueueEventBus.php: EventBus: Specify that EventBus queue supports delayed jobs (wmf/1.31.0-wmf.22) - T188540 (duration: 01m 13s)
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1074 after alter table (duration: 01m 13s)
  • 15:22 moritzm: draining restbase1013 for eventual reboot for kernel security update
  • 15:19 zeljkof: EU SWAT finished
  • 15:18 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/Popups: SWAT: Fix: dont assume thumbnail URLs contain pixel size (T187955) (duration: 01m 14s)
  • 15:17 moritzm: rolling restart of swift frontends in eqiad for kernel security update
  • 15:12 godog: upload puppetdb 4.4.0-1~wmf1 to component/puppetdb4 - T177253
  • 15:00 ema: eqiad LVSs: reboot for retpoline kernel updates T188092
  • 14:36 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Import sources on maiwikimedia (T188374) (duration: 01m 13s)
  • 14:28 moritzm: rolling restart of swift frontends in codfw for kernel security update
  • 14:26 moritzm: draining restbase1012 for eventual reboot for kernel security update
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Quiz Extension at zhwikibooks (T188213) (duration: 01m 14s)
  • 14:12 mobrovac@tin: Synchronized wmf-config/jobqueue.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 2/2 - T188540 (duration: 01m 13s)
  • 14:10 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: JobQueue: Enable EventBus for cdnPurge for all but wikipedia, commons and wikidata, file 1/2 - T188540 (duration: 01m 14s)
  • 14:02 ppchelko@tin: Finished deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540 (duration: 00m 44s)
  • 14:01 ppchelko@tin: Started deploy [cpjobqueue/deploy@b5255f0]: Enable kafka queue for cdnPurge for all but wikipedia, commons and wikidata. T188540
  • 13:54 moritzm: draining restbase1011 for eventual reboot for kernel security update
  • 13:50 ema: codfw LVSs: reboot for retpoline kernel updates T188092
  • 13:33 gehel: force merging enwiki_general index on codfw to reclaim space
  • 13:18 moritzm: draining restbase1010 for eventual reboot for kernel security update
  • 13:17 elukey: reboot kafka-jumbo100[5,6] for kernel updates
  • 13:16 ema: esams LVSs: reboot for retpoline kernel updates T188092
  • 12:44 moritzm: draining restbase1009 for eventual reboot for kernel security update
  • 12:39 moritzm: rolling reboot of parsoid in eqiad for kernel security update
  • 12:27 elukey: reboot kafka-jumbo1004 for kernel updates
  • 12:21 elukey: reboot kafka1023 for kernel updates
  • 11:59 moritzm: draining restbase1008 for eventual reboot for kernel security update
  • 11:48 moritzm: powercycling wtp2013, stuck in reboot
  • 11:36 elukey: reboot kafka-jumbo1003 for kernel updates
  • 11:33 jynus: restarting labsdb1011
  • 11:32 elukey: reboot kafka1022 for kernel updates
  • 11:20 elukey: reboot kafka-jumbo1002 for kernel security updates
  • 11:15 moritzm: draining restbase1007 for eventual reboot for kernel security update
  • 11:13 ema: ulsfo LVSs: reboot for retpoline kernel updates T188092
  • 11:08 elukey: reboot kafka1020 for kernel updates
  • 10:38 ema: eqsin LVSs: reboot for retpoline kernel updates T188092
  • 10:32 moritzm: rolling reboot of parsoid in codfw for kernel security update
  • 10:27 moritzm: draining restbase2012 for eventual reboot for kernel security update
  • 10:20 moritzm: rebooting labnodepool1001 for kernel security update
  • 10:02 moritzm: rebooting contint1001 for kernel security update
  • 09:59 elukey: reboot kafka1014 for kernel security updates
  • 09:57 moritzm: draining restbase2011 for eventual reboot for kernel security update
  • 09:43 elukey: reboot kafka1013 for kernel security updates
  • 09:29 elukey: rebooting analytics1030 for kernel updates
  • 09:17 moritzm: draining restbase2010 for eventual reboot for kernel security update
  • 08:52 moritzm: rebooting prometheus servers in eqiad for kernel security update
  • 08:41 moritzm: draining restbase2009 for eventual reboot for kernel security update
  • 08:34 elukey: reboot kafka1012 for kernel updates - T188594
  • 08:20 gehel: banning elastic1021 from cluster (failed memory) - T188595
  • 07:55 elukey: reboot kafka-jumbo1001 for kernel updates - T188594
  • 07:52 elukey: run kafka preferred-replica-election on kafka1012 to force broker 18 to get back among Kafka topic leaders
  • 07:26 gehel: starting rolling reboot of elasticsearch / cirrus - eqiad (kernel upgrade and config changes)
  • 07:24 demon@tin: Synchronized php-1.31.0-wmf.22/maintenance/sql.php: adding --json output mode (duration: 01m 15s)
  • 06:59 chasemp: restart nova-api on labnet1001
  • 06:57 madhuvishy: Restart nova-conductor on labcontrol1001
  • 06:26 marostegui: Deploy schema change on db1074 - T187089 T185128 T153182
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1074 for alter table (duration: 01m 14s)
  • 06:09 marostegui: Reload haproxy on dbproxy1005
  • 02:28 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 23s)
  • 02:05 demon@tin: Synchronized wmf-config/: removing extension-list-wikitech (duration: 01m 13s)
  • 02:03 demon@tin: Synchronized docroot/noc/: cleanup extension-list-wikitech removal (duration: 01m 12s)
  • 01:49 demon@tin: Synchronized wmf-config/: Undeploying EmailAuth from beta, no-op (duration: 01m 16s)
  • 01:32 eileen: update civicrm revision changed from 341c734a79 to a819d64d98, config revision is 62631813fc (add geocoder extension)
  • 00:43 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Clean up $wgEchoPerUserBlacklist setting (duration: 01m 14s)
  • 00:36 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Remove $wgUsejQueryThree (duration: 01m 14s)
  • 00:27 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswikibooks (T145394) (duration: 01m 13s)
  • 00:17 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on eswiki (T130279) (duration: 01m 14s)

2018-02-28

  • 23:27 eileen: civicrm revision changed from a47eafcbad to 341c734a79, config revision is 62631813fc (update civicrm submodule & vendor but not geocoder extension as yet)
  • 22:11 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 back to 1.31.0-wmf.22 T188555
  • 22:00 ejegg: updated payments-wiki from 1acfc4a9a0 to 498f49a758
  • 21:57 milimetric@tin: Finished deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment (duration: 04m 19s)
  • 21:56 thcipriani@tin: rebuilt and synchronized wikiversions files: Group1 to 1.31.0-wmf.23
  • 21:53 milimetric@tin: Started deploy [analytics/refinery@fdd6c25]: Fix error due to invalid docopts comment
  • 21:46 arlolra: Updated Parsoid to 1415a2a (T58756, T169006)
  • 21:26 arlolra@tin: Finished deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a (duration: 08m 46s)
  • 21:17 arlolra@tin: Started deploy [parsoid/deploy@d376a3c]: Updating Parsoid to 1415a2a
  • 20:53 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 (back) to 1.31.0-wmf.23
  • 20:28 thcipriani@tin: rebuilt and synchronized wikiversions files: testwiki to 1.31.0-wmf.23
  • 20:20 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/page/WikiPage.php: WikiPage: Avoid $user variable reuse in doDeleteArticleReal() T188479 (duration: 00m 57s)
  • 19:52 demon@tin: Synchronized README: no-op, forcing co-master sync (duration: 00m 57s)
  • 19:29 gehel: rolling reboot of elasticsearch / cirrus - codfw completed
  • 18:56 demon@tin: Finished deploy [gerrit/gerrit@f16f4a4]: GO plugin (duration: 00m 10s)
  • 18:55 demon@tin: Started deploy [gerrit/gerrit@f16f4a4]: GO plugin
  • 18:53 niharika29@tin: Synchronized wmf-config/throttle.php: Clean obsolete rules and add a new one - T188529 (duration: 00m 56s)
  • 18:44 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:42 niharika29@tin: Synchronized wmf-config/Wikibase.php: Reduce the batch size of statement usage tracking to 33 T151717 (duration: 00m 57s)
  • 18:32 godog: puppet reenable on einsteinium
  • 18:30 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading from full term entity id everywhere T114903 (duration: 00m 57s)
  • 18:23 niharika29@tin: Synchronized wmf-config/Wikibase-production.php: Enable Wikibase RC injection for ruwiki [mediawiki-config] - https://gerrit.wikimedia.org/r/415078 (duration: 00m 57s)
  • 18:19 niharika29@tin: Synchronized wmf-config/InitialiseSettings.php: Deploy Compact Language Links out of Beta on English Wikipedia T187677 (duration: 00m 58s)
  • 18:17 mutante: gerrit2001 - reboot for kernel upgrade
  • 18:12 godog: force a puppet run on failed hosts in eqiad for recovery
  • 18:09 apergos: rebooting dataset1001 (dumps.wm.o) for new kernel
  • 18:06 godog: stop and restart apache2 on puppetmaster1002
  • 17:58 godog: restart apache2 on puppetmaster1002
  • 17:46 milimetric@tin: Finished deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact (duration: 06m 45s)
  • 17:46 kart_: Finished running CLL preference migration script on terbium (T187677)
  • 17:39 milimetric@tin: Started deploy [analytics/refinery@e551744]: Update sqoop job and orm artifact
  • 17:38 mutante: phab2001 - downtimed, rebooting for kernel upgrade
  • 16:44 moritzm: draining restbase2008 for eventual reboot for kernel security update
  • 16:10 moritzm: rebooting prometheus servers in codfw for kernel security update
  • 16:10 ppchelko@tin: Finished deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons (duration: 00m 41s)
  • 16:09 ppchelko@tin: Started deploy [cpjobqueue/deploy@3622e38]: Enable refreshLinks for all but wikipedia, wiktionary and commons
  • 16:02 moritzm: draining restbase2007 for eventual reboot for kernel security update
  • 15:45 godog: repool rhodium as puppet master backend
  • 15:22 moritzm: rebooting ores in eqiad for kernel security update
  • 15:22 ema: upgrade cache_text@eqiad to varnish 5
  • 15:20 moritzm: draining restbase2006 for eventual reboot for kernel security update
  • 15:16 zeljkof: EU SWAT finished
  • 15:15 zfilipin@tin: Synchronized php-1.31.0-wmf.23/extensions/WikibaseQualityConstraints/: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) Bump cache key for check results (T188384) (duration: 01m 02s)
  • 15:11 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Bump cache key for check results (T188384) (duration: 01m 02s)
  • 14:54 moritzm: rebooting ores in codfw for kernel security update
  • 14:53 jynus: stopping labsdb1011 to clone it to labsdb1010 T186579
  • 14:50 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Drop the medlem user group and editallpages user right (T184981) (duration: 00m 57s)
  • 14:48 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints: SWAT: Don’t query WikiPageEntityMetaDataAccessor with empty list (T188311) (duration: 01m 02s)
  • 14:47 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: SWAT: Only filter statuses after collecting metadata (T188384) (duration: 01m 03s)
  • 14:38 jynus: dropping sqldata on dbstore1001
  • 14:32 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable HTML Previews on all wikipedias (T182319) (duration: 00m 57s)
  • 14:28 moritzm: rebooting kubestage* for kernel security update
  • 14:25 gehel@tin: Finished deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator (duration: 04m 27s)
  • 14:22 moritzm: draining restbase2005 for eventual reboot for kernel security update
  • 14:21 gehel@tin: Started deploy [tilerator/deploy@455a31a]: adding Brighmed, Meddo and ClearTables to tilerator
  • 14:17 zfilipin@tin: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: beta: enable VirtualPagePreviews events on beta cluster (T184793 T186728) (duration: 00m 57s)
  • 13:13 moritzm: draining restbase2004 for eventual reboot for kernel security update
  • 12:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2011 - T187886 (duration: 00m 59s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2011 - T187886 (duration: 00m 58s)
  • 12:35 moritzm: draining restbase2003 for eventual reboot for kernel security update
  • 12:00 marostegui: Reboot db1115 tendril master to pick up new my.cnf options - T184704
  • 11:49 moritzm: draining restbase2002 for eventual reboot for kernel security update
  • 11:37 marostegui: Reset slave all on db2093 - T184704
  • 11:35 moritzm: rebooting eqiad job runners for kernel security update
  • 11:18 moritzm: powercycling restbase2001, stuck in reboot
  • 11:10 godog: rollout thumbor 1.15 to codfw/eqiad
  • 10:59 godog: upload python-thumbor-wikimedia 1.15 - T187822 T187350
  • 10:59 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw1261.eqiad.wmnet
  • 10:54 moritzm: draining restbase2001 for eventual reboot for kernel security update
  • 10:43 moritzm: rebooting remaining mediawiki app servers in eqiad
  • 09:27 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2083, db2082 and db2081 after kernel upgrade (duration: 00m 57s)
  • 09:25 ema: upgrade cache_text@codfw to varnish 5
  • 09:06 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2083, db2082 and db2081 for kernel upgrade (duration: 00m 56s)
  • 09:06 marostegui: Reboot db2083, db2082 and db2081 for kernel and mariadb upgrade
  • 08:55 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2069 - T162807 (duration: 00m 57s)
  • 08:42 filippo@neodymium: conftool action : set/pooled=yes; selector: name=neodymium.eqiad.wmnet
  • 08:42 filippo@neodymium: conftool action : set/pooled=no; selector: name=neodymium.eqiad.wmnet
  • 08:34 marostegui: Reboot db2069 for kernel upgrade
  • 08:33 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2069 - T162807 (duration: 00m 57s)
  • 08:23 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2062 - T162807 (duration: 00m 57s)
  • 08:10 moritzm: rebooting remaining mediawiki API servers in eqiad
  • 07:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2062 - T162807 (duration: 00m 57s)
  • 07:51 marostegui: Reboot db2062 for mariadb and kernel upgrade
  • 07:36 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2085 (duration: 00m 57s)
  • 07:15 marostegui: Upgrade kernel and mariadb on db2085
  • 07:15 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2085 for mariadb and kernel upgrade (duration: 01m 00s)
  • 06:32 marostegui: Deploy schema change on db1060 (with replication) - this will cause lag on labs servers - T187089 T185128 T153182
  • 06:31 kart_: (Re)Starting CLL preference migration script on terbium (T187677)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 06:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1060 for alter table (duration: 00m 57s)
  • 05:43 demon@tin: rebuilt and synchronized wikiversions files: (no justification provided)
  • 04:55 krinkle@tin: Synchronized wmf-config/profiler.php: Iba417de75a and Ied984d (duration: 01m 06s)
  • 03:01 kart_: Starting CLL preference migration script on terbium (T187677)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 21s)
  • 00:55 demon@tin: Synchronized scap/plugins/wmfbetaautoupdate.py: no-op (duration: 01m 14s)
  • 00:24 papaul: OS install on wdqs200[4-6]
  • 00:03 thcipriani@tin: Synchronized php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameUserJob.php: LocalRenameUserJob: escape backreferences in replacement title T188171 (duration: 01m 13s)

2018-02-27

  • 23:38 krinkle@tin: Synchronized dblists/: remove pp_stage1_raw.dblist (duration: 01m 14s)
  • 21:23 thcipriani@tin: Synchronized php-1.31.0-wmf.23/includes/user/User.php: Add a missing check of $wgActorTableSchemaMigrationStage T188437 (duration: 01m 14s)
  • 20:42 ppchelko@tin: Finished deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull (duration: 02m 29s)
  • 20:39 ppchelko@tin: Started deploy [eventstreams/deploy@14e0b03]: Set correct CSP headers, forgot to git pull
  • 20:37 ppchelko@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers (duration: 00m 25s)
  • 20:36 ppchelko@tin: Started deploy [eventstreams/deploy@8f2eec4]: Set correct CSP headers
  • 20:31 thcipriani@tin: rebuilt and synchronized wikiversions files: Group0 to 1.31.0-wmf.23
  • 20:08 herron: eqiad puppet master reboots finished -- re-enabling puppet agents
  • 20:02 herron: temporarily disabling puppet agents and rebooting eqiad puppet masters for kernel update
  • 20:02 thcipriani@tin: Finished scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache (duration: 32m 10s)
  • 19:30 thcipriani@tin: Started scap: testwiki to php-1.31.0-wmf.23 and rebuild l10n cache
  • 19:08 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (duration: 04m 16s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241
  • 19:03 otto@tin: Finished deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only) (duration: 00m 22s)
  • 19:03 otto@tin: Started deploy [eventstreams/deploy@8f2eec4]: Publish page change related streams: T187241 (scb2002 only)
  • 18:32 otto@tin: Started restart [eventstreams/deploy@7629e16]: service restart to publish page change related streams: T187241 (scb2001 only)
  • 18:32 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only) (duration: 00m 03s)
  • 18:32 otto@tin: Started deploy [eventstreams/deploy@7629e16]: Config deploy to publish page change related streams: T187241 (scb2001 only)
  • 18:02 moritzm: rebooting kubernetes workers in eqiad for kernel security update
  • 17:46 moritzm: rebooting kubernetes workers in codfw for kernel security update
  • 17:41 jynus: restarting ferm on db2049, seems to have failed one day ago
  • 17:38 gehel: restarting wdqs-updater on wdqs1004 - T188045
  • 17:32 thcipriani: starting branch cut for 1.31.0-wmf.23 T183962
  • 17:14 godog: upload puppetdb 2.3.8-1~wmf1+stretch to stretch-wikimedia - T184562
  • 17:10 urandom: restarting Cassandra, restbase1007-a to test jmx_exporter
  • 16:53 elukey: restart cassandra-a on aqs1004 to test the prometheus jmx agent before complete rollout - T184795
  • 16:52 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH everywhere (duration: 00m 56s)
  • 16:50 ema: lvs1010: retpoline kernel/libs upgrade T188092
  • 16:46 ema: cp1008: retpoline kernel/libs upgrade T188092
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1081 (duration: 02m 04s)
  • 16:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 55s)
  • 16:26 moritzm: rebooting mw1293-mw1298 for kernel security update
  • 16:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:10 thcipriani: restarting jenkins for plugin update
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1081 (duration: 00m 56s)
  • 16:06 moritzm: rebooting restbase-dev for kernel security update
  • 15:49 awight: Restarting ORES celery workers, changing from 35 -> 45 workers per node.
  • 15:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1081 - T186321 (duration: 00m 56s)
  • 15:37 marostegui: Stop MySQL and reboot db1081 for kernel upgrade, mariadb upgrade and binlog format change - T186321
  • 15:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1081 - T186321 (duration: 00m 55s)
  • 15:33 moritzm: installing squid security updates
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 57s)
  • 15:20 moritzm: powercycling thumbor1004, stuck during reboot
  • 15:19 ottomata: beginning migration of varnishkafka webrequest upload from Kafka analytics to kafka jumbo
  • 15:11 ema: upgrade cache_text@esams to varnish 5 T184448
  • 15:02 gilles: EU SWAT finished
  • 15:02 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Set up separate Thumbor Swift user for private containers (T187822) (duration: 00m 55s)
  • 15:00 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: (T187822) (duration: 00m 56s)
  • 14:52 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix: Add missed line in wgLogo (T185977) (duration: 00m 56s)
  • 14:44 moritzm: rebooting thumbor in eqiad for kernel security update
  • 14:31 bblack: puppet disable on RPS-using hosts to be careful with RPS hosts https://gerrit.wikimedia.org/r/#/c/414676/ - cp*, lvs*, labstore
  • 14:27 chasemp: silence labvirt1019/1020 in icinga
  • 14:24 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation (duration: 00m 04s)
  • 14:23 ariel@tin: Started deploy [dumps/dumps@9b7841f]: fix off-by-one error in prefetch stubs generation
  • 14:15 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T188292) New throttle rule for cswiki (T187990) New throttle rule (T188034) (duration: 00m 57s)
  • 14:05 marostegui: Update tendril shard table for the "tendril" replication topology - T184704
  • 13:33 gehel: starting rolling restart of elasticsearch / cirrus codfw (config changes + kernel upgrade)
  • 13:25 moritzm: rebooting thumbor in codfw for kernel security update
  • 13:22 godog: upload ruby-mysql 2.9.1-1~bpo9+1 to stretch-wikimedia - T184562
  • 13:00 Amir1: inserting wikidata-related interwikis to site_identifiers table using eval.php in enwiki (T183019)
  • 12:35 marostegui: Remove /srv/tmp/dbstore1001 files from es1017 to free up space - T186596
  • 12:16 Hauskatze: The global rename: Darkweasel94 → Tokfo has FINISHED - T187629
  • 11:56 moritzm: rebooting mw1221-mw1235 (API servers) for kernel security update
  • 11:08 moritzm: rebooting mw1240-mw1258 (app servers) for kernel security update
  • 11:00 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=scb1003.eqiad.wmnet
  • 10:57 moritzm: keeping scb1003 depooled for T188385
  • 10:51 _joe_: updating python-conftool everywhere to 1.0.0
  • 10:51 _joe_: uploaded python-conftool 1.0.0 to stretch-wikimedia
  • 10:49 moritzm: powercycling scb1003, stuck during reboot
  • 10:29 Hauskatze: Starting big global rename: Darkweasel94 → Tokfo - with DBA/OPS green light - T187629
  • 10:07 akosiaris: poweroff sca1004 for T181121 tests
  • 10:05 moritzm: reboot scb in eqiad for kernel security updates
  • 10:03 _joe_: uploading conftool-1.0.0-1 to jessie-wikimedia
  • 09:16 godog: reimage rhodium - T184562
  • 08:42 gehel: powercycling wdqs1004 - T188045
  • 08:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1084 (duration: 00m 56s)
  • 08:24 gilles@tin: Synchronized private/PrivateSettings.php: Separate Thumbor Swift user for private containers (duration: 00m 56s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 and db1103:3312 (duration: 00m 56s)
  • 07:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 56s)
  • 07:04 marostegui: Stop MySQL on db1084 for kernel and mariadb upgrade
  • 07:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 56s)
  • 07:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1084 (duration: 00m 56s)
  • 06:59 demon@tin: Synchronized README: no-op (duration: 00m 56s)
  • 06:51 marostegui@tin: Synchronized wmf-config/db-codfw.php: Increase traffic for db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Slowly repool db1103:3312 (duration: 00m 56s)
  • 06:33 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 06:21 marostegui: Stop MySQL on db1115 to copy it to db2093 - tendril (dbtree) service will be down for this maintenance - T184704
  • 06:20 marostegui: Reload haproxy on dbproxy1005
  • 05:26 krinkle@tin: Synchronized wmf-config/profiler.php: I1e7dc263b43 (duration: 00m 56s)
  • 05:00 krinkle@tin: Synchronized wmf-config/profiler.php: I34687c0569af (duration: 00m 57s)
  • 03:28 krinkle@tin: Synchronized wmf-config/profiler.php: various refactor and clean up for T180183 (no-op) (duration: 00m 54s)
  • 03:12 krinkle@tin: Synchronized wmf-config/profiler-labs.php: beta only (no-op) (duration: 00m 56s)
  • 02:58 demon@tin: Pruned MediaWiki: 1.31.0-wmf.21 [keeping static files] (duration: 01m 24s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 06m 11s)
  • 01:39 mutante: install1002 - re-enabling disabled puppet
  • 00:55 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: add very likely bad faith filter on svwiki (T174560) (duration: 00m 57s)
  • 00:49 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on svwiki (T174560) (duration: 00m 56s)
  • 00:40 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ORES filters on simplewiki (T182012) (duration: 00m 56s)
  • 00:39 demon@tin: Synchronized wmf-config/CommonSettings.php: beta-only change: lsctorestaticarray (duration: 00m 56s)
  • 00:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on all wikinews wikis (T188000), all private wikis (T188009), test2wiki, loginwiki, votewiki and wikimania2017wiki (T188008) (duration: 00m 56s)

2018-02-26

  • 23:37 bd808@tin: Finished scap: wikitech: use 'labswiki' database on m5-master (T188029) (duration: 03m 21s)
  • 23:34 bd808@tin: Started scap: wikitech: use 'labswiki' database on m5-master (T188029)
  • 23:31 bd808: Pulled T188029 change to silver
  • 22:57 demon@tin: Synchronized wmf-config/: fileimporter/fileexporter improvements (duration: 00m 58s)
  • 22:56 demon@tin: Synchronized wmf-config/InitialiseSettings.php: fileimporter/fileexporter improvements (duration: 00m 57s)
  • 22:09 andrewbogott: hotfixed mediawiki on silver to use m5-master for wikitech. This will be finalized with the merge of https://gerrit.wikimedia.org/r/#/c/414733/
  • 22:07 andrewbogott: made mysql on silver read-only, hopefully for good. T188029
  • 22:05 andrewbogott: logging a log to test logging a log
  • 22:03 andrewbogott: testing the log by logging a test
  • 19:46 catrope@tin: Synchronized php-1.31.0-wmf.22/extensions/WikibaseQualityConstraints/: T184937 (duration: 01m 03s)
  • 19:46 mutante: running puppet on cache::misc servers to add new director for design.wm
  • 19:29 catrope@tin: Synchronized wmf-config/CommonSettings.php: Simplify 2017 wikitext editor config (part 1) (duration: 00m 54s)
  • 19:26 catrope@tin: Synchronized wmf-config/throttle.php: Add throttle rule (T188129) (duration: 00m 56s)
  • 19:22 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Add mushroomobserver.org to wgCopyUploadsDomains (T188203) (duration: 00m 57s)
  • 19:08 herron: codfw puppet master kernel updates complete -- re-enabling puppet agents
  • 18:31 gehel@tin: Finished deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh (duration: 06m 28s)
  • 18:24 gehel@tin: Started deploy [wdqs/wdqs@f74cbd1]: new forAllCategoryWikis.sh
  • 18:13 demon@tin: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor: Ignore empty repositories (duration: 00m 56s)
  • 17:34 jynus: deploying new query killer to db1109
  • 17:32 akosiaris: shutdown sca1004 on ganeti1005 for T181121
  • 16:39 andrewbogott: making wikitech read-only (via a local patch) while I migrate the database to m5
  • 16:33 marostegui: Reboot db1111 storage crashed - T187526
  • 16:31 papaul: Maintenance: removing Msw-d4-codfw for replacement: T187534
  • 16:29 mutante: restarted stashbot on toolforge because it didn't react to !log
  • 16:26 mutante: test !log
  • 16:08 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2049 - T187534 (duration: 00m 56s)
  • 15:45 andrewbogott: made wikitech read/write again pending a bit more preliminary work
  • 15:43 cmjohnson1: swapping failed disk db1068
  • 15:42 andrewbogott: marking wikitech read-only (via a local edit to CommonSettings.php) for https://phabricator.wikimedia.org/T188029
  • 15:32 addshore: EU SWAT done
  • 15:31 addshore@tin: Finished scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations (duration: 11m 29s)
  • 15:19 addshore@tin: Started scap: Updated mediawiki/extensions/AdvancedSearch i18n files for some translations
  • 15:12 Amir1: This might have performance implications; roll it back if it affects these wikis too much
  • 15:12 gehel: reboot of relforge completed, cluster is green again
  • 15:11 ladsgroup@tin: Synchronized wmf-config/Wikibase-production.php: Enable reading full entity id from wb_terms table in three wikis (T114903) (duration: 00m 56s)
  • 14:54 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Add patrol rights/groups to fawikisource (T187662) (duration: 00m 56s)
  • 14:52 gehel: rebooting relforge for kernel upgrade
  • 14:50 godog: upload puppetdb 4.4.0-1~wmf1 to stretch-wikimedia - T177253
  • 14:48 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable statement usage tracking in several wikis (T151717) (duration: 00m 57s)
  • 14:40 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespaces to urwiktionary (T186393) (duration: 00m 56s)
  • 14:28 zfilipin@tin: Synchronized wmf-config/Wikibase-production.php: SWAT: Enable caching of constraint check results (T184812) (duration: 00m 55s)
  • 14:15 moritzm: rebooting scb in codfw for kernel security updates
  • 14:10 zfilipin@tin: Synchronized php-1.31.0-wmf.22/extensions/UniversalLanguageSelector/maintenance/ULSCompactLinksDisablePref.php: SWAT: Added option to continue script from particular User ID Use a replica dedicated to slow queries (if available) (T187880) (duration: 00m 58s)
  • 13:09 moritzm: rebooting video scalers in eqiad for kernel security update
  • 11:12 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:11 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:01 moritzm: powercycling mw1264 (stuck after reboot)
  • 10:10 moritzm: rebooting mw canaries for kernel security update
  • 09:29 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2055 and db2070 (duration: 00m 55s)
  • 09:23 elukey: copied burrow 0.1 from jessie-wikimedia to stretch-wikimedia
  • 08:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1103:3314 (duration: 00m 56s)
  • 07:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1103:3314 (duration: 00m 56s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1103:3314 after mariadb and kernel upgrade (duration: 00m 56s)
  • 07:08 marostegui: Deploy schema change on db1103:3312 - T187089 T185128 T153182
  • 06:59 marostegui: Stop MySQL on db1103:3312 and 3314 to upgrade it and kernel
  • 06:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 (duration: 00m 54s)
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 (duration: 00m 56s)
  • 06:35 marostegui: Stop MySQL db2070 and db2055 to copy data to db2055 (and upgrade kernel and mariadb)
  • 06:35 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2055 and db2070 (duration: 01m 07s)
  • 06:15 marostegui: Stop MySQL on db1115 tendril database to copy it to db2093. Tendril (dbtree) service will be down for maintenance - T184704
  • 02:55 XioNoX: labs->cloud vlan rename in codfw - T187933
  • 02:43 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.22) (duration: 07m 12s)
  • 02:15 XioNoX: disabling ALGs on MR routers

2018-02-25

  • 07:35 marostegui: Fix s7 replication on labsdb1010 - T186579

2018-02-24

  • 06:11 marostegui: Reload haproxy on dbproxy1005
  • 01:42 demon@tin: Synchronized docroot/noc/conf/highlight.php: one last time (duration: 00m 57s)
  • 01:18 demon@tin: Synchronized docroot/noc/conf/index.php: fix dblist links from listing (duration: 00m 56s)
  • 01:13 Reedy: added eqsin ipv6 range to botpasswords ip range restriction T188111
  • 01:08 demon@tin: Synchronized docroot/noc/: dblists cleanup (duration: 00m 57s)
  • 01:07 demon@tin: Synchronized tests/: no-op (duration: 00m 59s)

2018-02-23

  • 22:36 demon@tin: Finished deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file (duration: 00m 10s)
  • 22:35 demon@tin: Started deploy [gerrit/gerrit@010ad50]: no-op, removing permission from file
  • 21:27 demon@tin: Finished scap: pos mysql code (duration: 23m 09s)
  • 21:04 demon@tin: Started scap: pos mysql code
  • 20:48 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.22
  • 20:39 no_justification: wmf.21, that is
  • 20:38 demon@tin: rebuilt and synchronized wikiversions files: roll wikidatawiki back to wmf.11, busted
  • 20:35 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.22
  • 19:10 ebernhardson: restart relforge elasticsearch cluster to test entity extraction on larger dataset
  • 18:28 Amir1: mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=enwiki --force-protocol https (T183019)
  • 17:22 ema: libvmod-netmapper 1.6-1 uploaded to apt.w.o/experimental T188089
  • 16:37 moritzm: rebooting image scalers in codfw for kernel security updates
  • 16:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1083 (duration: 01m 14s)
  • 15:58 moritzm: rebooting job runners in codfw for kernel security updates
  • 15:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 02m 21s)
  • 15:15 jynus: about to deploy gerrit:413375 disabling puppet on affected hosts
  • 14:59 elukey: update facts on puppet compiler
  • 14:40 moritzm: installing kernel updates on API servers in codfw
  • 14:09 jynus: restarting tendril database - will cause unavailability of dbtree for a while
  • 13:44 moritzm: reboot ocg1003 for tests
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 and fully repool db1076 (duration: 01m 13s)
  • 12:28 hashar@tin: Synchronized wmf-config/throttle.php: Define new throttle rule - T188090 (duration: 01m 11s)
  • 12:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1083 (duration: 01m 21s)
  • 12:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 - T186321 (duration: 01m 12s)
  • 11:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1076 - T186321 (duration: 01m 13s)
  • 11:29 marostegui: Restart mariadb on db1076 for binlog format change - T186321
  • 11:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for binlog format change - T186321 (duration: 01m 08s)
  • 11:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1090 after alter table (duration: 01m 12s)
  • 11:02 moritzm: installing kernel updates on mw* in codfw
  • 10:30 hashar: releases1001: sudo -u jenkins rm -fR /var/lib/jenkins/jobs/mediawiki-private-nightlies/workspace/BRANCH/REL1_??/mediawiki-snapshot-REL1_??-2018???? # T188080
  • 10:19 jynus@tin: Synchronized wmf-config/db-eqiad.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:08 jynus@tin: Synchronized wmf-config/db-codfw.php: Pool db2090 for the first time (duration: 01m 12s)
  • 10:01 elukey: restart hhvm on mw1230
  • 09:54 elukey: restart hhvm on mw1286
  • 09:50 elukey: restart hhvm on mw1227
  • 08:05 marostegui: MariaDB and kernel upgrade on db1083
  • 07:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1083, fully repool db1089 - T162807 (duration: 01m 12s)
  • 06:55 marostegui: Reboot db2093 to test /srv auto-mounting
  • 06:40 marostegui: Deploy schema change on db1090 - T187089 T185128 T153182
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 for alter table (duration: 01m 13s)
  • 05:58 mutante: puppetmaster1001 - signing puppet certs for kafkamon1001/kafkamon2001 - initial puppet runs, adding as role spare (T187901)
  • 05:40 mutante: ganeti1004 - initial startup of kafkamon1001 - booting to PXE, installing stretch (T187901)
  • 04:56 mutante: ganeti: ganeti2004 - creating new VM kafkamon2001 - vcpus=2,memory=8g,disk=60G, row_A codfw (T187901)
  • 04:53 mutante: ganeti: creating new VM kafkamon1001 - vcpus=2,memory=8g,disk=60G, row_A eqiad (T187901)
  • 02:46 demon@tin: Finished deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin (duration: 00m 10s)
  • 02:46 demon@tin: Started deploy [gerrit/gerrit@23ebf75]: deploying webhooks plugin
  • 02:10 demon@tin: Synchronized docroot/: mw.org docroot moving (duration: 01m 13s)
  • 01:45 eileen: update process control process-control config revision is 1605238b2e
  • 01:20 eileen: update civicrm revision changed from aa251f1a93 to a47eafcbad, config revision is c1787646bc
  • 01:19 demon@tin: Synchronized static/favicon/: smaller favicons (duration: 01m 12s)
  • 01:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: point mkwikt favicon to en version, dupe (duration: 01m 15s)
  • 01:08 demon@tin: Synchronized wmf-config/InitialiseSettings.php: rtl wikibooks logo (duration: 01m 13s)
  • 01:06 demon@tin: Synchronized static/favicon/wikibooks-rtl.ico: rtl wikibooks logo (duration: 01m 12s)
  • 00:52 demon@tin: Synchronized static/images/project-logos/: new project logos for urdu wikt (duration: 01m 13s)
  • 00:37 krinkle@tin: Synchronized docroot/m.wikipedia.org/w/mobilelanding.php: Ia54cd7 - rm use of MW_LANG (duration: 01m 13s)

2018-02-22

  • 22:33 demon@tin: Synchronized php-1.31.0-wmf.22/includes/filerepo/file/LocalFile.php: Id5cdd8ec (duration: 01m 12s)
  • 22:32 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: Id5cdd8ec (duration: 01m 12s)
  • 22:30 demon@tin: Synchronized php-1.31.0-wmf.22/includes/Storage/: Id5cdd8ec (duration: 01m 13s)
  • 22:16 maxsem@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 12s)
  • 22:14 maxsem@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi/: T188019 (duration: 01m 14s)
  • 21:51 demon@tin: Synchronized php-1.31.0-wmf.22/includes/externalstore/: I9334d36e (duration: 01m 15s)
  • 21:37 dzahn@puppetmaster1001: conftool action : set/pooled=no; selector: name=wdqs1004.eqiad.wmnet
  • 21:11 gehel: powercycling wdqs1004 (complete loss of network)
  • 20:39 demon@tin: Synchronized php-1.31.0-wmf.22/includes/libs/objectcache/WANObjectCache.php: betterer logging for cache ttl reduction, Iea029e78 (duration: 01m 13s)
  • 19:33 XioNoX: redirecting Facebook bots large source of traffic to codfw ( https://gerrit.wikimedia.org/r/#/c/413446/ )
  • 19:14 akosiaris: rolling restart of eqiad appservers. sudo cumin -b3 -s 30 'A:mw-eqiad' 'restart-hhvm' T188019
  • 19:12 twentyafterfour@tin: Synchronized php-1.31.0-wmf.22/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/SyntaxHighlight_GeSHi: deploy https://gerrit.wikimedia.org/r/#/c/413437/ (duration: 01m 13s)
  • 19:09 twentyafterfour: syncing https://gerrit.wikimedia.org/r/#/c/413437/
  • 19:03 chasemp: baham:~# authdns-update
  • 19:00 jynus@tin: Synchronized wmf-config/db-codfw.php: Repool db2073 (duration: 01m 12s)
  • 17:23 elukey: installed linux-perf-4.9 on phab1001 to experiment with perf tracing
  • 17:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1076 (duration: 01m 12s)
  • 17:05 XioNoX: rolling back "redirecting ns2 traffic to radon"
  • 17:02 ema: reboot eeden with new kernel 4.9.0-0.bpo.6
  • 16:58 XioNoX: redirecting ns2 traffic to radon
  • 16:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1076 (duration: 01m 12s)
  • 16:28 ejegg: updated CiviCRM from b27e6a5019 to aa251f1a93
  • 16:26 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Use EventBus for refreshLinks in test wikis, file 2/2 - T185052 (duration: 01m 12s)
  • 16:25 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: Use EventBus for refreshLinks in test wikis, file 1/2 - T185052 (duration: 01m 12s)
  • 16:23 ppchelko@tin: Finished deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052 (duration: 00m 36s)
  • 16:23 ppchelko@tin: Started deploy [cpjobqueue/deploy@ab3d002]: Enable refreshLinks for group0 wikis T185052
  • 16:22 mobrovac@tin: scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 16:13 jynus: tendril and dbtree database currently under maintenance
  • 16:04 ejegg: updated payments-wiki from fe311c2d26 to 1acfc4a9a0
  • 15:26 ema: finished upgrading cache_text@ulsfo to varnish 5
  • 15:24 elukey: manually removing from cp1008 and cache::misc old files related to the varnishkafka jumbo testing instance (after https://gerrit.wikimedia.org/r/413370)
  • 14:58 matthiasmullie: EU SWAT finished
  • 14:52 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable 3D file display (duration: 01m 12s)
  • 14:50 mlitn@tin: Synchronized php-1.31.0-wmf.21/extensions/3D/extension.json: Remove MMV dependency for 3D (duration: 01m 12s)
  • 14:41 ottomata: beginning migration of webrequest_misc from Kafka analytics to jumbo: T185136
  • 14:40 mlitn@tin: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 14:38 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable 3D file display (duration: 01m 13s)
  • 14:32 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2171.codfw.wmnet
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Show HTML summaries on cswiki (T182321) (duration: 01m 13s)
  • 13:41 ema: bounce pybal on lvs1003 to try establish missing etcd connections (zotero, thumbor, wdqs) https://phabricator.wikimedia.org/P6730
  • 13:30 moritzm: rebooting kubernetes1001
  • 13:21 ema: upgrade pybal on lvs1003 to 1.14.4
  • 12:42 _joe_: ended live-hacking on mwdebug1001 (T185078)
  • 12:24 _joe_: live-hacking ProductionServices.php on mwdebug1001 for testing (T185078)
  • 11:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and slowly repool db1076 (duration: 01m 12s)
  • 11:40 kartik@tin: Finished deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1 (duration: 03m 37s)
  • 11:39 akosiaris: purge ORES from scb hosts T168073 T171851
  • 11:37 kartik@tin: Started deploy [cxserver/deploy@300f728]: Update cxserver to b0404d1
  • 11:19 _joe_: upgrading python-conftool on all cache hosts
  • 10:55 ema: upgrading python-conftool on cp5007
  • 10:51 _joe_: upgrading python-conftool on cp1008
  • 10:42 jynus: stop db2073 for maintenance
  • 10:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1089 and fully repool db1104 (duration: 01m 13s)
  • 10:37 _joe_: benchmarking EtcdConfig failure scenarios on mwdebug1001, T185078
  • 10:22 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 14s)
  • 10:18 ema: upgrade cache_text @ ulsfo to varnish 5
  • 10:13 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2073 for maintenance (duration: 01m 12s)
  • 10:08 moritzm: uploaded Linux 4.9.82-1~wmf1 for jessie-wikimedia to apt.wikimedia.org (retpoline-enabled kernel)
  • 10:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low traffic and depool db1067 - T162807 (duration: 01m 12s)
  • 09:59 akosiaris: reboot kraz.wikimedia.org (irc.wikimedia.org)
  • 09:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 - T186321 (duration: 01m 12s)
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 - T186321 (duration: 01m 12s)
  • 09:20 marostegui: Stop MySQL on db1104 to switch its binlog to statement - T186321
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T186321 (duration: 01m 13s)
  • 09:19 moritzm: rebooting multatuli
  • 09:03 ema: eqiad LVSs: upgrade pybal to 1.14.4
  • 08:48 jynus: tendril and dbtree database currently under maintenance
  • 08:47 ema: codfw LVSs: upgrade pybal to 1.14.4
  • 08:35 marostegui: Stop tendril database (db1011) to copy it to db1115 - tendril will be offline while the copy is in progress - T184704
  • 08:32 ema: esams LVSs: upgrade pybal to 1.14.4
  • 08:24 ema: ulsfo LVSs: upgrade pybal to 1.14.4
  • 08:05 marostegui: Disable puppet on db1011 - T184704
  • 07:48 krinkle@tin: Synchronized wmf-config/FeaturedFeedsWMF.php: I73945d7d - minor clean-up (duration: 01m 13s)
  • 07:32 _joe_: starting tests on mwdebug1001 again
  • 07:32 marostegui: Deploy schema change on db1076 - T187089 T185128 T153182
  • 07:24 marostegui: Stop MySQL on db1076 for mariadb and kernel upgrade + alter table
  • 07:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1076 for alter table (duration: 01m 14s)
  • 06:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 (duration: 01m 13s)
  • 06:21 marostegui: Stop puppet and mysql on db1011 to get ready to copy its data to db1115 - T184704
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 53s)
  • 01:05 anomie: Running cleanupBlocks.php on more wikis for T187834: alswiki bgwiki bhwiki cawiki dewiki elwiki eswiki frwiki hewiki hiwiki huwiki hywiki jawiki jawikibooks jawikinews jawikiquote jawikisource jawiktionary kawiki kowiki mswiki mswiktionary rowiki sourceswiki
  • 01:01 anomie: Running cleanupBlocks.php on mediawikiwiki for T187834
  • 00:46 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 03m 07s)
  • 00:43 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:41 smalyshev@tin: Finished deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace (duration: 00m 27s)
  • 00:40 smalyshev@tin: Started deploy [wdqs/wdqs@5131080]: update whitelist to include categories namespace
  • 00:25 tgr@tin: Synchronized wmf-config/CommonSettings-labs.php: T57420 enable loginOnly flag in beta (duration: 01m 12s)
  • 00:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9 (duration: 06m 05s)
  • 00:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@8ffb03b]: Update mobileapps to a1339a9
  • 00:13 demon@tin: Synchronized php-1.31.0-wmf.22/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 13s)
  • 00:12 demon@tin: Synchronized php-1.31.0-wmf.21/includes/media/JpegMetadataExtractor.php: T184048 (duration: 01m 21s)
  • 00:00 mutante: LDAP - added uid 'raz-shuty' to group 'wmde' (T187442)

2018-02-21

  • 21:50 elukey: restart hhvm on mw1224 - high load alarms
  • 21:46 elukey: restart hhvm on mw1235 - high load alarms
  • 21:44 elukey: restart hhvm on mw1233 - high load alarms
  • 21:39 awight@tin: Finished deploy [ores/deploy@addba9c]: T187914 on the scb* cluster (duration: 10m 02s)
  • 21:34 elukey: restart hhvm on mw1232 - high load alarms
  • 21:30 ppchelko@tin: Finished deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636 (duration: 15m 59s)
  • 21:30 elukey: restart hhvm on mw1229 - high load alarms
  • 21:29 awight@tin: Started deploy [ores/deploy@addba9c]: T187914 on the scb* cluster
  • 21:28 awight@tin: Finished deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster (duration: 13m 03s)
  • 21:27 elukey: restart hhvm on mw1227 - high load alarms
  • 21:23 elukey: restart hhvm on mw1221 - high load alarms
  • 21:15 awight@tin: Started deploy [ores/deploy@7bbf21f]: T187914 on the ores* cluster
  • 21:14 ppchelko@tin: Started deploy [restbase/deploy@56fffcf]: Do not check for article deletion for update requests T181636
  • 20:53 twentyafterfour: MediaWiki Train for 1.31.0-wmf.22 is blocked by T187942
  • 20:39 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:38 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:34 twentyafterfour: rolling back group1 to wmf.21
  • 20:28 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.22 (duration: 01m 08s)
  • 20:27 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.22
  • 20:10 mutante: phab2001 - testing phab restart cron
  • 19:34 ebernhardson@tin: Synchronized wmf-config/PoolCounterSettings.php: Increase pool counter workers for cirrus namespace lookup (duration: 01m 13s)
  • 19:24 ottomata: applying changes to kafkatee module, first rhenium then oxygen. will require manual config fixings
  • 18:59 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for Burmese Wiktionary T187882 (duration: 01m 06s)
  • 18:48 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add namespace localization for sdwiki T186943 (duration: 01m 13s)
  • 18:39 thcipriani@tin: Synchronized wmf-config/throttle.php: SWAT: Added new throttle rule for Wikipedia Women in Red editathon T187803 (duration: 01m 12s)
  • 18:37 chasemp: labsdb rm -fR /usr/local/lib/mediawiki-config && puppet agent --test
  • 18:24 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Topic namespace alias of zhwiki T187546 (duration: 01m 13s)
  • 18:12 _joe_: stopped testing on mwdebug1001 for SWAT window
  • 17:43 ema: eqsin LVSs: upgrade pybal to 1.14.4
  • 17:34 _joe_: resuming tests on mwdebug1001
  • 17:17 ema: eqiad LVSs: bounce pybal for labweb proxfetch config changes
  • 17:12 ppchelko@tin: Finished deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437 (duration: 01m 23s)
  • 17:11 ppchelko@tin: Started deploy [changeprop/deploy@e9a6bb0]: Use post for ORES precache rules T158437
  • 17:07 _joe_: finished testing on mwdebug1001 for swat
  • 16:56 oblivian@puppetmaster1001: conftool action : edit; selector: name=ReadOnly,scope=eqiad
  • 16:40 _joe_: testing various etcd failure scenarios on mwdebug1001, T185078
  • 16:39 ppchelko@tin: Finished deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437 (duration: 01m 33s)
  • 16:37 ppchelko@tin: Started deploy [changeprop/deploy@1be63aa]: Simplify ORES precaching by using the new endpoint T158437
  • 16:27 ema: lvs1010: restart pybal
  • 16:00 godog: restart rsyslogd on lithium and wezen - T136312
  • 15:50 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve private wiki thumbnails with Thumbor (T169144) (duration: 01m 12s)
  • 15:44 no_justification: pruned old 1.29.x and 1.30.x versions that somehow stuck around. Also 1.31.0-wmf.* cache/ directories for unused branches. T157030
  • 15:37 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Serve officewiki thumbnails with Thumbor (T169144) (duration: 01m 11s)
  • 15:27 gilles@tin: Synchronized private/PrivateSettings.php.example: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 11s)
  • 15:24 chasemp: reboot labtestservices2002
  • 15:24 gilles@tin: Synchronized wmf-config/filebackend.php: Thumbor private wiki support deployment: Add Thumbor/Mediawiki shared secret (T169144) (duration: 01m 12s)
  • 15:19 gilles: Thumbor private wiki support deployment
  • 15:08 zeljkof: EU SWAT finished
  • 15:08 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Removing Mobile beta feedback link (T187712) (duration: 01m 12s)
  • 15:03 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Page Previews EventLogging instrumentation (T185973) (duration: 01m 13s)
  • 14:52 _joe_: rolling restart another 4 api appservers
  • 14:49 oblivian@tin: Synchronized wmf-config: Serve configuration to mwdebug hosts via etcd (duration: 01m 16s)
  • 14:42 _joe_: restarted hhvm on mwdebug1001 too
  • 14:38 _joe_: restarting hhvm on mwdebug1002
  • 14:06 _joe_: restarting hhvm on misbehaving api appservers
  • 14:02 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule (T187870) (duration: 01m 13s)
  • 13:28 marostegui: Reboot db2092 for a kernel upgrade
  • 13:26 moritzm: powercycling ganeti1007
  • 12:43 _joe_: rolling restart of hhvm on api servers under high load
  • 12:38 elukey: restart hhvm on mw1234 - high load
  • 12:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Clarify that db1067 is now s1 candidate master - T186321 (duration: 01m 13s)
  • 12:26 elukey: restart hhvm on mw1231 - high load, hhvm-dump-debug in /home/elukey/hhvm.6759.bt
  • 12:21 elukey: restart hhvm on mw1227 - high load, hhvm-dump-debug in /home/elukey/hhvm.23382.bt
  • 12:10 moritzm: uploading retpoline-enabled gcc-4.9 to apt.wikimedia.org / jessie-wikimedia to be able to use it on boron for building Linux (trying to adapt our pbuilder setup to also include security.debian.org ran into a few proxy-related problems and this is really a rare corner case anyway)
  • 12:02 ema: lvs5003: pybal upgraded to 1.14.4
  • 12:01 ema: pybal 1.14.4 uploaded to apt.w.o
  • 11:17 moritzm: installing db5.3 security updates
  • 11:12 jynus: cloning db2011 to db2044
  • 10:40 kart_: Finished running CLL preference migration script dry-run on terbium (T187677)
  • 10:33 marostegui: Reload haproxy on dbproxy1005 - T187722
  • 10:26 marostegui: Remove db2030 from tendril - T187768
  • 10:09 moritzm: installing openssh bugfix updates from jessie/stretch point releases
  • 10:01 kart_: Running CLL preference migration script dry-run on terbium (T187677)
  • 09:46 moritzm: installing dbus updates from stretch point release
  • 09:23 moritzm: installing sqlite security updates on stretch
  • 08:35 godog: roll-restart thumbor in codfw and eqiad to apply https://gerrit.wikimedia.org/r/c/412980
  • 08:20 gilles: foreachwikiindblist "% private.dblist" extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --backend=local-multiwrite --private
  • 07:20 marostegui: Stop Mariadb on db1108 for kernel upgrade
  • 06:36 marostegui: Deploy schema change on db1105:3312 - T187089 T185128 T153182
  • 06:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105 for alter table (duration: 01m 17s)
  • 05:00 eileen: enable major gifts address job
  • 04:41 eileen: update civicrm revision changed from 43a7641597 to b27e6a5019, config revision is ef884a2c5d
  • 04:13 andrew@tin: Finished deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more (duration: 02m 45s)
  • 04:10 andrew@tin: Started deploy [horizon/deploy@0e7783d]: updating branded graphics slightly more
  • 03:34 andrew@tin: Finished deploy [horizon/deploy@0e28f49]: updating branded graphics (duration: 02m 49s)
  • 03:31 andrew@tin: Started deploy [horizon/deploy@0e28f49]: updating branded graphics
  • 02:31 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 06m 18s)
  • 02:15 no_justification: running `initSiteStats.php --update` for all wikis in medium.dblist. T187845
  • 02:01 no_justification: running `initSiteStats.php --update` for all wikis in small.dblist. T187845
  • 01:54 no_justification: WikipediaMobileFirefoxOS submodule references caused labsdb* (and related) puppet failures. They should recover now (self reverted my docroot changes). Filed T187850
  • 01:51 demon@tin: Synchronized docroot/: revert docroot improvements. some servers don't like improvements (duration: 01m 12s)
  • 01:36 demon@tin: Synchronized docroot/: Swapping wikimedia.org docroot for symlink (second try, old WPFirefoxMobileOS cleanup was still needed) (duration: 01m 12s)
  • 01:16 eileen: update civicrm revision changed from efba904b06 to 43a7641597, config revision is ef884a2c5d
  • 01:10 cwd: disabled process-control
  • 01:08 eileen: start outage to upgrade civicrm to 4.7.31
  • 00:56 mutante: gerrit2001 - restarted gerrit to test that gerrit:411397 and gerrit:411394 don't break anything - didn't touch cobalt right now to minimize affecting users and their logins
  • 00:43 thcipriani@tin: Synchronized wmf-config/abusefilter.php: SWAT: Allow CheckUsers and Stewards to access private data from the AbuseLog T160357 (duration: 01m 12s)
  • 00:29 thcipriani@tin: Synchronized php-1.31.0-wmf.21/includes/page/WikiPage.php: SWAT: site_stats: Unbreak counting newly created pages (duration: 01m 12s)
  • 00:26 thcipriani@tin: Synchronized php-1.31.0-wmf.21/resources/src/mediawiki/mediawiki.ForeignStructuredUpload.js: SWAT: Follow-up I0bb4ed7f7: Use correct "this" T187523 (duration: 01m 13s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable x-kill feature everywhere T186714 T184322 (duration: 01m 13s)

2018-02-20

  • 22:58 ejegg: restarted donations queue consumer
  • 22:26 ejegg: turned off donations queue consumer for timing test
  • 22:25 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/Thanks/modules/ext.thanks.revthank.js: T187757 (duration: 01m 14s)
  • 22:20 chasemp: T184209 create labs-instance-transport1-b-codfw
  • 22:06 eileen: update civicrm revision changed from 915a4419c8 to efba904b06, config revision is 8c7ce87207 (extended report update for regex)
  • 21:44 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.22
  • 21:39 no_justification: ran `namespaceDupes.php --wiki=enwikiversity` for T187660
  • 21:18 twentyafterfour@tin: Finished scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961 (duration: 46m 59s)
  • 20:34 ejegg: updated CiviCRM from 31115684f6 to 915a4419c8
  • 20:31 twentyafterfour@tin: Started scap: Sync 1.31.0-wmf.22 and promote test wikis - refs T183961
  • 20:20 chasemp: labtestmetal2001:~# aptitude install linux-image-4.4.0-109-generic && aptitude install linux-image-extra-4.4.0-109-generic
  • 20:17 chasemp: labtestmetal mkfs -t xfs -i size=512 /dev/mapper/labtestmetal2001--vg-data
  • 20:16 andrew@tin: Finished deploy [horizon/deploy@b02c819]: trying to get a clean deploy (duration: 01m 54s)
  • 20:14 andrew@tin: Started deploy [horizon/deploy@b02c819]: trying to get a clean deploy
  • 20:10 andrew@tin: Finished deploy [horizon/deploy@b02c819]: a couple of bug fixes (duration: 02m 55s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@b02c819]: a couple of bug fixes
  • 20:07 andrew@tin: Started deploy [horizon/deploy@6a40f84]: a couple of bug fixes
  • 19:57 twentyafterfour: Cutting new branch wmf/1.31.0-wmf.22 - Deployment blockers: T183961
  • 19:45 demon@tin: Synchronized docroot/mediawiki/keys/: symlink magic (duration: 00m 56s)
  • 19:26 mobrovac@tin: Started restart [changeprop/deploy@5fdc03a]: (no justification provided)
  • 19:00 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2 (duration: 02m 47s)
  • 18:57 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 take 2
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875 (duration: 14m 02s)
  • 18:43 ppchelko@tin: Started deploy [restbase/deploy@e9bef90]: Do not return the response for summary right away, store first T179875
  • 18:34 arlolra@tin: Finished deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113 (duration: 10m 37s)
  • 18:23 arlolra@tin: Started deploy [parsoid/deploy@5fbabfc]: Updating Parsoid to e5e8113
  • 18:03 ppchelko@tin: Finished deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875 (duration: 16m 01s)
  • 17:52 moritzm: installing cups updates from jessie point release
  • 17:50 gilles: mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --wiki=officewiki --backend=local-multiwrite --private
  • 17:47 ppchelko@tin: Started deploy [restbase/deploy@dca0290]: Switch summary implementation to MCS T179875
  • 17:41 andrew@tin: Finished deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts (duration: 00m 55s)
  • 17:40 andrew@tin: Started deploy [striker/deploy@3684a73]: rolling stretch-ready striker out to labweb hosts
  • 17:11 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 1 (duration: 00m 56s)
  • 16:33 godog: roll-restart thumbor in codfw/eqiad to apply https://gerrit.wikimedia.org/r/412935
  • 16:25 moritzm: installing initramfs-tools update from jessie point release
  • 16:17 jynus: drop s3 from dbstore2001
  • 16:14 gilles@tin: Synchronized private/PrivateSettings.php: Add Thumbor secret to Swift configuration (duration: 00m 56s)
  • 15:37 oblivian@puppetmaster1001: conftool action : edit; selector: dc=esams,name=cp3033.esams.wmnet
  • 15:36 bblack: eqsin: restarting all varnish backends for storage changes (not in prod traffic flow, yet!)
  • 15:27 _joe_: upgrading conftool on swift proxies, thumbor
  • 15:25 _joe_: upgrading conftool on parsoid,wdqs
  • 15:23 _joe_: upgrading conftool on aqs, restbase, ores clusters
  • 15:19 _joe_: upgrading conftool on the mediawiki appservers
  • 15:15 _joe_: upgrading conftool on the maps cluster
  • 15:10 _joe_: installing python-conftool on puppetmasters, cumin masters
  • 14:53 godog: roll-restart thumbor after rollback
  • 14:50 volans: running puppet on thumbor1002 (was already logged in)
  • 14:40 zeljkof: EU SWAT finished
  • 14:39 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update the sitename of newiki (T186952) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Draft namespace to hiwikiversity. (T187535) (duration: 00m 56s)
  • 14:16 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add suppressredirect to autoconfirmed at zhwikt (T187018) (duration: 00m 55s)
  • 14:10 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T187171) (duration: 00m 55s)
  • 14:03 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: throttle: add new rule for Wikidata edit-a-thon (T187655) (duration: 00m 56s)
  • 13:29 marostegui: Upgrade kernel and reboot db1113 and db1114
  • 13:23 marostegui: Stop MySQL and reboot db1111 for kernel and mariadb upgrade
  • 13:17 marostegui: Stop MySQL and reboot db1112 for kernel and mariadb upgrade
  • 13:03 moritzm: installing libav security updates
  • 12:11 _joe_: upgrading conftool to 1.0.0~beta2 on scb*
  • 11:24 jynus: upgrading mariadb-client on neodymium and sarin
  • 11:09 marostegui: Deploy schema change on labtestweb2001 - T153182 T185128 T187089
  • 11:00 marostegui: Deploy schema change on s2 codfw master (db2035) with replication, this will generate lag on codfw - T187089 T185128 T153182
  • 11:00 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2037 and db2044 (duration: 00m 55s)
  • 10:58 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2037 and db2044 (duration: 00m 53s)
  • 10:49 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2037 and db2044 (duration: 00m 55s)
  • 10:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Remove db2030 from config - T187768 (duration: 00m 55s)
  • 10:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Remove db2030 from config - T187768 (duration: 00m 56s)
  • 10:13 volans: unified python-requests-mock packages in apt.wikimedia.org jessie-wikimedia to be 1.3.0-3~wmf1, removed binaries for 1.3.0-3
  • 09:49 marostegui: Deploy schema change on s6 primary master db1061 - T185128 T153182
  • 09:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 after alter table (duration: 00m 55s)
  • 09:16 marostegui: Data checks for db2037 before removing it from s4 - T187722
  • 09:14 elukey: restart zookeeper on druid1001 (follower) to verify that the last changes are no-op
  • 09:12 marostegui: Deploy schema change on db1088 - T187089 T185128 T153182
  • 09:12 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 for alter table (duration: 00m 55s)
  • 09:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316 and db1085 (duration: 00m 55s)
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:03 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:02 oblivian@puppetmaster2001: conftool action : set/val=false; selector: scope=eqiad,name=ReadOnly
  • 09:01 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:56 oblivian@puppetmaster2001: conftool action : edit; selector: scope=codfw
  • 08:51 oblivian@puppetmaster2001: conftool action : edit; selector: scope=common
  • 08:32 _joe_: uploading conftool 1.0.0~beta1 on stretch
  • 08:26 _joe_: uploading conftool 1.0.0~beta1 to jessie
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 08:09 godog: powercycle ganeti1006
  • 08:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 01m 10s)
  • 07:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1085 (duration: 00m 55s)
  • 07:27 marostegui: Deploy schema change on db1096:3316 - T187089 T185128 T153182
  • 07:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 for alter table (duration: 00m 56s)
  • 07:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1085 (duration: 00m 55s)
  • 06:58 marostegui: Upgrade mariadb and kernel on db1085
  • 06:26 marostegui: Deploy schema change on db1085 (with replication - this will generate lag on labs hosts) - T187089 T185128 T153182
  • 06:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1085 for alter table (duration: 00m 56s)
  • 04:56 krinkle@tin: Synchronized docroot/mediawiki/keys/: Ie26638ed0c - rm old 2009 keys file (duration: 00m 56s)
  • 04:27 krinkle@tin: Synchronized w/extract2.php: Ib6d77e863b - clean up MW_LANG indirection (duration: 00m 55s)
  • 03:40 krinkle@tin: Synchronized wmf-config/CommonSettings.php: Ie4c7879f8ac - Clean up TemplateSandboxEditNamespaces config (duration: 00m 57s)
  • 03:37 Krinkle: It seems 'scap pull' on mwdebug1002 is acting weird (prompt doesn't return until 3-5 minutes after last line of "Finished rsync common")
  • 02:24 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 05m 50s)

2018-02-19

  • 23:21 eileen: re-enable omnirecipient jobs - process-control config revision is 8c7ce87207
  • 22:03 volans: uploaded cumin_3.0.1-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 20:03 volans: uploaded cumin_3.0.0-1_amd64.deb to apt.wikimedia.org jessie-wikimedia
  • 19:29 volans: uploaded python3-requests-mock, python-requests-mock and python-requests-mock-doc for version 1.3.0-3~wmf1 to apt.wikimedia.org jessie-wikimedia
  • 18:53 volans: disabled all notifications on Icinga for db2030
  • 18:04 volans: uploaded clustershell_1.8-1~wmf1_all.deb, python-clustershell_1.8-1~wmf1_all.deb and python3-clustershell_1.8-1~wmf1_all.deb to apt.wikimedia.org jessie-wikimedia
  • 17:04 elukey@tin: Finished deploy [eventlogging/analytics@8bebdf7]: (no justification provided) (duration: 00m 05s)
  • 17:04 elukey@tin: Started deploy [eventlogging/analytics@8bebdf7]: (no justification provided)
  • 16:29 _joe_: uploading conftool 1.0.0beta1 to reprepro for jessie
  • 16:22 andrew@tin: Finished deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002 (duration: 00m 10s)
  • 16:22 andrew@tin: Started deploy [striker/deploy@8a79195]: further attempt to cram striker onto labweb1001 and 1002
  • 16:11 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 22s)
  • 16:10 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 16:10 andrew@tin: Finished deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002 (duration: 00m 17s)
  • 16:09 andrew@tin: Started deploy [striker/deploy@8a79195]: deploying striker to labweb1001 and 1002
  • 14:59 jynus: testing new dbproxy1010 configuration locally to pool labsdb1010 for analytics
  • 13:44 godog: roll-restart prometheus after retention period bump
  • 13:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1063 (duration: 00m 55s)
  • 13:19 marostegui: Deploy schema change on dbstore1002 - T187089 T185128 T153182
  • 13:16 ema: upgrade cache_text@eqsin to varnish 5
  • 12:48 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1098 s6 and s7 (duration: 00m 55s)
  • 12:27 marostegui: Deploy schema change on db1063 - T187089 T185128 T153182
  • 12:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1063 for alter table (duration: 00m 55s)
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1093 (duration: 00m 55s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 56s)
  • 11:07 jdrewniak@tin: Synchronized portals: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 11:06 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia portals Update: Bumping portals to master (T128546) (duration: 00m 56s)
  • 10:45 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1098 s6 and s7 (duration: 00m 55s)
  • 10:35 marostegui: Deploy schema change on db1093 - T187089 T185128 T153182
  • 10:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 for alter table (duration: 00m 56s)
  • 10:27 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1098 s6 and s7 (duration: 00m 56s)
  • 10:10 marostegui: Upgrade mariadb and kernel on db1098
  • 09:59 marostegui: Enable GTID on dbstore2002:3313 and dbstore2001:3316
  • 09:57 marostegui: Enable GTID on dbstore2002 and dbstore2001 for x1
  • 09:55 jynus: reenable gtid replication on db1053 and db2042
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1260.eqiad.wmnet
  • 09:53 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: mw1259.eqiad.wmnet
  • 09:43 marostegui: Upgrade mariadb and kernel on db2033
  • 09:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1090 (duration: 00m 55s)
  • 09:18 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105 - T162807 (duration: 00m 55s)
  • 08:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 for mariadb and kernel upgrade (duration: 00m 55s)
  • 08:49 marostegui: Deploy schema change on db1098:3316 - T187089 T185128 T153182
  • 08:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 for alter table (duration: 00m 55s)
  • 08:24 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1090 (duration: 00m 55s)
  • 08:11 godog: repool mw1227 - T149287
  • 08:02 marostegui@tin: Synchronized wmf-config/db-codfw.php: Promote db2034 to x1 codfw master - T184888 (duration: 00m 56s)
  • 07:58 moritzm: installing werkzeug security updates on trusty
  • 07:42 marostegui: Change topology on x1 codfw - T184888
  • 07:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1090 (duration: 00m 55s)
  • 07:01 marostegui: Reboot db1090 for kernel upgrade, mariadb upgrade, socket path location upgrade
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1090 (duration: 00m 55s)
  • 06:44 marostegui: Stop MySQL on db1089 to update its socket path
  • 06:42 marostegui: Deploy schema change on s6 codfw master (db2039), this will generate lag on codfw - T187089 T185128 T153182
  • 06:40 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1105 - T162807 (duration: 00m 56s)
  • 05:29 andrew@tin: Finished deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes (duration: 03m 14s)
  • 05:26 andrew@tin: Started deploy [horizon/deploy@6a40f84]: rolling out several horizon bugfixes
  • 02:50 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.21) (duration: 10m 59s)

2018-02-18

  • 15:49 _joe_: rolling restart (1 at a time, staggered by 2 minutes) of 18 api appservers in eqiad

2018-02-17

  • 17:33 twentyafterfour: restarting apache on phab1001 to clear deadlocked workers. refs T182832
  • 03:15 demon@tin: Pruned MediaWiki: 1.31.0-wmf.20 [keeping static files] (duration: 01m 17s)
  • 03:12 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 (duration: 04m 32s)

2018-02-16

  • 21:12 hashar: Upgraded Zuul to https://gerrit.wikimedia.org/r/#/c/411322/3 | T187567
  • 20:40 andrew@tin: Finished deploy [horizon/deploy@efcba2b]: sudo dashboard update (duration: 01m 16s)
  • 20:39 andrew@tin: Started deploy [horizon/deploy@efcba2b]: sudo dashboard update
  • 20:11 andrew@tin: Finished deploy [horizon/deploy@1fdd122]: two more small fixes (duration: 01m 21s)
  • 20:10 andrew@tin: Started deploy [horizon/deploy@1fdd122]: two more small fixes
  • 19:54 andrew@tin: Finished deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix (duration: 03m 12s)
  • 19:51 andrew@tin: Started deploy [horizon/deploy@bdcc12b]: ocata branch with sidebar fix
  • 18:34 hashar: upgraded zuul
  • 16:21 andrew@tin: Finished deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements (duration: 08m 00s)
  • 16:13 andrew@tin: Started deploy [horizon/deploy@16f3d8e]: ocata branch with upper new requirements
  • 16:06 cmjohnson1: labstore1006 and labstore1007 down for rack relocation
  • 16:03 andrew@tin: Finished deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints (duration: 02m 18s)
  • 16:00 andrew@tin: Started deploy [horizon/deploy@16d0b17]: ocata branch with upper constraints
  • 15:40 andrew@tin: Finished deploy [horizon/deploy@29f9afb]: second attempt at ocata branch (duration: 03m 22s)
  • 15:37 andrew@tin: Started deploy [horizon/deploy@29f9afb]: second attempt at ocata branch
  • 15:29 andrew@tin: Finished deploy [horizon/deploy@58d2718]: first attempt at ocata branch (duration: 01m 28s)
  • 15:28 andrew@tin: Started deploy [horizon/deploy@58d2718]: first attempt at ocata branch
  • 15:27 godog: shut ms-be1018 for bbu swap - T186988
  • 15:16 akosiaris: run T181121#3978654 oneliner once more on sca1004, this time the VM has no DRBD
  • 15:14 akosiaris: poweroff sca1004, switch from DRBD to plain disk template T181121
  • 14:15 akosiaris: doing more IO stress tests on ganeti1005. T181121. Seems like we can reproduce
  • 14:06 chasemp: T184209 initial setup of labs-instances2-b-codfw and hosts
  • 13:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1094 (duration: 00m 56s)
  • 13:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 and db1067 - T162807 (duration: 00m 55s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1094 (duration: 00m 56s)
  • 12:46 jynus: reload dbproxy1008 configuration
  • 12:44 jynus: reload dbproxy1003 configuration
  • 12:37 ema: cp3049: restart varnish-fe to clear 'child restarted' alert
  • 12:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1094 (duration: 00m 56s)
  • 12:23 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 56s)
  • 12:17 marostegui: Stop MySQL on db1094 for mariadb upgrade, kernel upgrade and socket location upgrade
  • 12:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1094 (duration: 00m 56s)
  • 12:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1093 (duration: 00m 56s)
  • 11:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:35 jynus: stopping mysql on db1043, db2012 for cloning data away
  • 11:33 jynus: changing socket location on phabricator db hosts T148507
  • 11:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1093 (duration: 00m 56s)
  • 11:28 ema: cp3036: restart varnish-fe to clear 'child restarted' alert
  • 11:28 hashar: Switching operations/mediawiki-config job for composer to Docker | https://gerrit.wikimedia.org/r/#/c/411206/
  • 11:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1093 (duration: 00m 56s)
  • 11:09 elukey: restart nfacctd on rhenium to see if it picks up the new kafka topic config (3 partitions)
  • 11:06 marostegui: Stop MySQL on db1093 for mariadb and kernel upgrade, also update socket path
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1093 (duration: 00m 56s)
  • 09:56 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099:3311 - T162807 (duration: 00m 56s)
  • 09:55 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db1053 (duration: 00m 56s)
  • 09:53 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db1053 (duration: 00m 56s)
  • 08:48 akosiaris: doing IO stress tests on ganeti1005. T181121
  • 08:34 akosiaris: manually allocate logstash1008 on ganeti1005 to undo the manual override of sensible allocation rules by ganeti
  • 08:30 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1053 (duration: 00m 57s)
  • 08:14 akosiaris: powercycle ganeti1006 T181121
  • 08:13 akosiaris: powercycle ganeti1006
  • 06:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 - T162807 (duration: 00m 59s)
  • 06:41 moritzm: installing quagga security updates
  • 06:35 marostegui: Deploy schema change on s5 primary master db1070 - T185128 T153182
  • 00:16 ebernhardson@tin: Synchronized php-1.31.0-wmf.21/extensions/ProofreadPage/modules/page/ext.proofreadpage.page.edit.js: SWAT: T187454 fix text selection on #wpTextbox1 (duration: 00m 58s)

2018-02-15

  • 23:43 demon@tin: Synchronized scap/plugins/clean.py: no-op (duration: 00m 56s)
  • 22:54 demon@tin: Synchronized php-1.31.0-wmf.21/extensions/MassMessage/includes/MassMessage.php: fix use statement, T187510 (duration: 00m 57s)
  • 21:50 ejegg: updated CiviCRM from 61acc9175e to 31115684f6
  • 20:22 twentyafterfour: 1.31.0-wmf.21 deployed: no apparent change in fatalmonitor error rate. refs T183960
  • 20:18 twentyafterfour@tin: rebuilt and synchronized wikiversions files: all wikis to 1.31.0-wmf.21
  • 20:11 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/TwoColConflict/includes/TwoColConflictHooks.php: sync https://gerrit.wikimedia.org/r/#/c/410809/ (duration: 01m 13s)
  • 20:09 twentyafterfour: syncing a patch before deploying 1.31.0-wmf.21 to all wikis.
  • 19:55 thcipriani@tin: Synchronized wmf-config/CommonSettings.php: SWAT: Follow-up 77be427a1: Enable the Beta Feature on all wikis T185708 (duration: 01m 12s)
  • 19:34 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set Portal and Portal talk namespace alias of zhwiki T184866 (duration: 01m 13s)
  • 19:13 thcipriani@tin: Synchronized wmf-config/CirrusSearch-common.php: SWAT: Set SPARQL endpoint for category search T184840 (duration: 01m 12s)
  • 18:42 arlolra@tin: Finished deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195 (duration: 08m 34s)
  • 18:33 arlolra@tin: Started deploy [parsoid/deploy@6da4591]: Updating Parsoid to 0650195
  • 18:11 bsitzmann@tin: Finished deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475) (duration: 05m 54s)
  • 18:06 bsitzmann@tin: Started deploy [mobileapps/deploy@0bfafa9]: Update mobileapps to d219d1b (T187475)
  • 17:24 foks: removed 2FA from User:Lea Lacroix (WMDE)
  • 17:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 (duration: 01m 12s)
  • 16:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1097:3315, db1089, db1066 (duration: 01m 12s)
  • 16:32 andrew@tin: Finished deploy [horizon/deploy@4e7ccc5]: lots of updates (duration: 03m 13s)
  • 16:29 andrew@tin: Started deploy [horizon/deploy@4e7ccc5]: lots of updates
  • 16:10 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3315 (duration: 01m 12s)
  • 15:34 ema: upgrade upload @ eqsin to varnish 5
  • 15:27 marostegui: Deploy schema change on db1051 - T187089 T185128 T153182
  • 15:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051, fully repool db1097:3314, increase weight for db1097:3315 (duration: 01m 13s)
  • 15:15 zeljkof: EU SWAT finished
  • 15:14 zfilipin@tin: Synchronized wmf-config/abusefilter.php: SWAT: Log accessing private abusefilter details (T160357) (duration: 01m 12s)
  • 14:58 moritzm: installing erlang security updates on labcontrol1001
  • 14:53 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable the visual diff beta feature (T185708) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized php-1.31.0-wmf.21/includes/Revision.php: SWAT: Log the reason why revision->getContent() returns null (T184670) (duration: 01m 12s)
  • 14:35 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable log channel T184670 (T184670) (duration: 01m 12s)
  • 14:22 jynus@tin: Synchronized wmf-config/db-eqiad.php: Remove db2042 (duration: 01m 11s)
  • 14:20 jynus@tin: Synchronized wmf-config/db-codfw.php: Remove db2042 (duration: 01m 12s)
  • 14:09 addshore@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add RevisionStore to wmgMonologChannels: (duration: 01m 13s)
  • 12:01 addshore: script run for T185738 done
  • 11:59 milimetric@tin: Finished deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars (duration: 09m 33s)
  • 11:58 addshore: addshore@terbium:~$ mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki elwiktionary --batchsize 1000 # T185738
  • 11:49 milimetric@tin: Started deploy [analytics/refinery@26d4e50]: Deploying Refinery jobs with new 0.0.58 jars
  • 10:58 marostegui: Stop replication in sync db1089 and db1066
  • 10:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 and slowly repool db1097:3315 (duration: 01m 12s)
  • 10:38 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 fully (duration: 01m 12s)
  • 10:28 marostegui: Upgrade mariadb on db1066
  • 10:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1097:3314 (duration: 01m 12s)
  • 09:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1097:3314 (duration: 01m 12s)
  • 09:48 marostegui: Deploy schema change on db1097:3315 - T187089 T185128 T153182
  • 09:39 marostegui: Upgrade kernel and mariadb on db1097
  • 09:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097 for s4 and s5 (duration: 01m 12s)
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1082 (duration: 01m 12s)
  • 09:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic db1082 (duration: 01m 12s)
  • 08:54 moritzm: installing erlang security updates on labtestcontrol*
  • 08:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1082 (duration: 01m 13s)
  • 08:18 marostegui: Upgrade kernel + mariadb on db1082 (sanitarium master in s5)
  • 07:55 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 07:55 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1066 - T162807 (duration: 01m 12s)
  • 07:39 marostegui: Deploy schema change on db1082 (sanitarium master) with replication, this will generate lag on labs - T187089 T185128 T153182
  • 07:39 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 13s)
  • 07:35 moritzm: installing libvorbis security updates on stretch
  • 07:30 twentyafterfour: phabricator upgrade finished. phd is back online.
  • 07:27 twentyafterfour: phabricator database migration finished
  • 07:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1110 (duration: 01m 12s)
  • 07:09 jynus: reimage dbproxy1003 to stretch
  • 07:04 twentyafterfour: Applying patch "phabricator:20180215.maniphest.02.populate.php" to host "m3-master.eqiad.wmnet"...
  • 07:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase weight for db1110 (duration: 01m 13s)
  • 06:57 twentyafterfour: apache restarted, update appears to be successful
  • 06:57 twentyafterfour: phabricator database migrations applied
  • 06:50 twentyafterfour: shutting down apache on phab1001 to deploy update, downtime should be only a couple of minutes
  • 06:49 twentyafterfour: starting phabricator upgrade tagged release/2018-02-15/1
  • 06:45 twentyafterfour: restarted apache on phab1001 and reset cluster.read-only to false
  • 06:44 jynus: set db1059 in read-write
  • 06:38 jynus: merging dns update for phabricator db
  • 06:35 jynus: set db1043 as read only
  • 06:34 twentyafterfour: set cluster.read-only in phabricator
  • 06:33 jynus: about to set phabricator.wikimedia.org as read only
  • 06:28 jynus: scheduling downtime for phabricator on phab1001
  • 06:17 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1110 (duration: 01m 13s)
  • 06:06 marostegui: Upgrade mysql on db1110
  • 05:57 jynus: restarting dbproxy1008 for kernel upgrade
  • 05:43 andrew@tin: Finished deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon (duration: 03m 06s)
  • 05:40 andrew@tin: Started deploy [horizon/deploy@c355366]: testing a couple of cherry-picks in horizon
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 07m 25s)
  • 02:01 mutante: phab1001 - restarted apache to fix server status page
  • 01:27 twentyafterfour: restarting apache2 on phab1001 to free deadlocked php processes.
  • 01:03 twentyafterfour: using the current phabricator maintenance window to deploy https://gerrit.wikimedia.org/r/#/c/410626/
  • 01:03 twentyafterfour: the scheduled phabricator upgrade is delayed until 06:00 UTC Thursday because of large database migrations. Doing the upgrade at a time when DBAs are available to assist.
  • 00:52 maxsem@tin: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 14s)
  • 00:49 maxsem@tin: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/410267/ (duration: 01m 13s)

2018-02-14

  • 23:39 AaronSchulz: Running initSiteStats.php on s3 for T186947
  • 22:04 aaron@tin: Synchronized php-1.31.0-wmf.20/includes/SiteStats.php: f549559dc0 (duration: 01m 13s)
  • 21:52 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with full weight (duration: 01m 13s)
  • 21:38 mholloway-shell@tin: Finished deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f (duration: 06m 01s)
  • 21:32 mholloway-shell@tin: Started deploy [mobileapps/deploy@9bad612]: Update mobileapps to f23519f
  • 21:30 arlolra@tin: Finished deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed (duration: 15m 12s)
  • 21:15 arlolra@tin: Started deploy [parsoid/deploy@7961b3f]: Updating Parsoid to caee2ed
  • 21:00 ema: upgrade cp1099 to varnish 5 (last upload@eqiad host)
  • 20:54 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice: Sync CentralNotice again after proper rebase (duration: 01m 14s)
  • 20:43 ema: upgrade cp1074 to varnish 5
  • 20:42 twentyafterfour@tin: Synchronized php-1.31.0-wmf.21/extensions/CentralNotice/: sync https://gerrit.wikimedia.org/r/#/c/410346/ for Ejegg (duration: 01m 15s)
  • 20:40 twentyafterfour: Group1 wikis are now running MediaWiki 1.31.0-wmf.21 - still no blockers on T183960
  • 20:38 twentyafterfour@tin: Synchronized php: group1 wikis to 1.31.0-wmf.21 (duration: 01m 12s)
  • 20:37 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group1 wikis to 1.31.0-wmf.21
  • 20:33 ema: upgrade cp1073 to varnish 5
  • 20:05 ema: upgrade cp1072 to varnish 5
  • 19:44 ema: upgrade cp1071 to varnish 5
  • 19:25 XioNoX: enabling netflow on cr1-eqiad
  • 19:24 no_justification: ran namespaceDupes.php --fix for hiwiki
  • 19:24 demon@tin: Synchronized wmf-config/InitialiseSettings.php: portal aliases for hiwiki (duration: 01m 13s)
  • 19:22 ema: upgrade cp1064 to varnish 5
  • 19:20 no_justification: running updateCollation.php on nowikimedia
  • 19:19 demon@tin: Synchronized wmf-config/InitialiseSettings.php: nowikimedia collation, T185630 (duration: 01m 13s)
  • 19:16 andrewbogott: rebooting labvirt1019 so I can have a look at the raid setup, for T172538
  • 19:14 no_justification: ran namespaceDupes.php --fix on wawiktionary
  • 19:13 demon@tin: Synchronized wmf-config/InitialiseSettings.php: wawiktionary namespaces, T185289 (duration: 01m 13s)
  • 19:11 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Revert prior, busted the canaries (duration: 01m 15s)
  • 19:08 demon@tin: scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 19:06 demon@tin: rebuilt and synchronized wikiversions files: namespace aliases for zhwiki, T184866
  • 19:00 ema: upgrade cp1063 to varnish 5
  • 17:43 ema: upgrade cp1062 to varnish 5
  • 17:42 moritzm: updated jenkins packages on apt.wikimedia.org for stretch (thirdparty/ci) and jessie (thirdparty) to 2.89.4
  • 17:39 hashar: CI Jenkins seems all happy following the upgrade ^o^
  • 17:34 moritzm: updating remaining python-cryptography updates from jessie point release
  • 17:32 hashar: Upgrading Jenkins on contint1001 / contint2001
  • 17:30 godog: roll-restart ms-fe to pick up https://gerrit.wikimedia.org/r/c/410199/
  • 17:22 moritzm: installing uwsgi jessie update on graphite*
  • 17:20 godog: roll-upgrade thumbor 1.14 in eqiad/codfw
  • 16:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 09s)
  • 16:56 ema: upgrade cp1050 to varnish 5
  • 16:50 marostegui: Deploy schema change on db1110 - T187089 T185128 T153182
  • 16:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 01m 12s)
  • 16:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:35 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1088 with low weight (duration: 01m 12s)
  • 16:19 ema: upgrade cp1049 to varnish 5
  • 15:59 jynus: upgrade and restart db1088
  • 15:52 moritzm: rolling out debdeploy 0.0.99.2 (cumin masters already upgraded for a while, just synching the clients)
  • 15:51 andrewbogott: powering down labvirt1008 so chris can re-apply thermal paste
  • 15:45 moritzm: installing libgcrypt security updates on trusty
  • 15:31 zeljkof: EU SWAT finished
  • 15:24 godog: roll-upgrade thumbor to 1.13 - T187159 T179954 T187088
  • 15:19 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Add suppressredirect to autoconfirmed at zhwikt" (T187018) (duration: 01m 13s)
  • 15:18 ema: upgrade cp1048 to varnish 5
  • 14:47 moritzm: installing PHP security updates
  • 14:44 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable flood flag at zhwikt (T187018) (duration: 01m 12s)
  • 14:37 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Require 7 days & 10 edits for autoconfirmed at zhwiktionary (T187018) (duration: 01m 13s)
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make alias from old NS_PROJECT to new NS_PROJECT at hiwikiversity (T185347) (duration: 01m 12s)
  • 14:21 akosiaris: reboot ganeti1008 for kernel upgrade T181121
  • 14:14 zfilipin@tin: Synchronized wmf-config/reverse-proxy.php: SWAT: wgSquidServersNoPurge: add eqsin, remove dead IP (T156027) (duration: 01m 12s)
  • 14:11 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/mmv.3d.head.js: Fix 3D badge (duration: 01m 12s)
  • 14:10 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge and Webkit thumb load detection (duration: 01m 13s)
  • 13:44 elukey: rollback java 8 upgrade for archiva - issues with Analytics builds
  • 13:34 elukey: installed openjdk-8 on meitnerium, manually upgraded java-update-alternatives to java8, restarted archiva
  • 13:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104 original weight (duration: 01m 12s)
  • 13:16 jynus: stop slave and rolling schema change on db1059 m3 replica
  • 13:14 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1088 (duration: 01m 12s)
  • 13:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1106 (duration: 01m 12s)
  • 12:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Give more traffic to db1106 (duration: 01m 12s)
  • 12:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1106 (duration: 01m 12s)
  • 11:25 marostegui: Deploy schema change on db1106 - T187089 T185128 T153182
  • 11:16 marostegui: Stop MySQL and reboot db1106 for mysql and kernel upgrade
  • 11:16 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 (duration: 01m 12s)
  • 11:14 filippo@tin: Synchronized wmf-config/ProductionServices.php: repool poolcounter1002 after disk replacement (duration: 01m 12s)
  • 11:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 (duration: 01m 12s)
  • 10:46 jynus: dropping test databases from m5 T186585
  • 10:42 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 10:28 moritzm: installing libvorbis security updates on trusty systems
  • 10:13 marostegui: Deploy schema change on db1100 - T187089 T185128 T153182
  • 10:11 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 (duration: 01m 12s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3316,3315 (duration: 01m 12s)
  • 09:50 akosiaris: set standard weight for all ores* hosts
  • 09:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 09:49 akosiaris@puppetmaster1001: conftool action : set/weight=10; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 09:31 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1096:3316,3315 (duration: 01m 12s)
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool slowly db1096:3316,3315 (duration: 01m 13s)
  • 09:08 marostegui: Deploy schema change on s5 dbstore1002 https://phabricator.wikimedia.org/T187089 https://phabricator.wikimedia.org/T185128 https://phabricator.wikimedia.org/T153182
  • 09:02 marostegui: Stop MySQL on db1096:3315 and 3316 for mysql+kernel upgrade
  • 08:45 jynus@tin: Synchronized wmf-config/db-eqiad.php: Rebalance s8 (duration: 01m 13s)
  • 08:38 akosiaris: pybal restart on lvs1003 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:29 akosiaris: pybal restart on lvs1006, lvs1009, lvs1012 to pickup https://gerrit.wikimedia.org/r/410398
  • 08:08 _joe_: powercycled ganeti1008
  • 07:04 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 (duration: 01m 12s)
  • 06:44 marostegui: Stop replication in sync db1089 and db1067 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 01m 12s)
  • 06:31 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db1096:3315 for alter table (duration: 01m 13s)
  • 06:30 marostegui: Deploy schema change on db1096:3315 - T187089 T185128 T153182
  • 05:55 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 03m 13s)
  • 05:52 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 05:52 andrew@tin: Finished deploy [horizon/deploy@c355366]: updating sudo-dashboard (duration: 00m 20s)
  • 05:51 andrew@tin: Started deploy [horizon/deploy@c355366]: updating sudo-dashboard
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 39s)
  • 02:02 demon@tin: Synchronized fonts/: removing executable bits, no-op (duration: 01m 15s)
  • 01:33 demon@tin: Finished deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now) (duration: 00m 11s)
  • 01:32 demon@tin: Started deploy [gerrit/gerrit@b234c85]: rm reviewers plugin (for now)
  • 00:25 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add uploader user group to mznwiki and make it automagically added T187187 (duration: 01m 12s)
  • 00:12 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable xkill on top wikis that use x aspect T187265 (duration: 01m 14s)

2018-02-13

  • 21:19 twentyafterfour@tin: rebuilt and synchronized wikiversions files: group0 wikis to 1.31.0-wmf.21
  • 21:07 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 49s)
  • 21:07 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@c355366]: another try with static content (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@c355366]: another try with static content
  • 20:43 twentyafterfour@tin: Finished scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis (duration: 31m 01s)
  • 20:41 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 00m 21s)
  • 20:41 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:26 jynus: upgrading labsdb1010 database - proxies will complain for some time
  • 20:18 andrew@tin: Finished deploy [horizon/deploy@c355366]: updated static content collection process (duration: 01m 17s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c355366]: updated static content collection process
  • 20:12 twentyafterfour@tin: Started scap: T183960 Build l10n cache & Deploy wmf/1.31.0-wmf.21 to test wikis
  • 20:11 twentyafterfour: Currently there are no blockers listed on T183960 and the train is leaving the station.
  • 20:05 twentyafterfour: MediaWiki Train 1.31.0-wmf.21 branched, prepped and patched | Changelog uploaded to https://www.mediawiki.org/wiki/MediaWiki_1.31/wmf.21/Changelog | Blockers: T183960
  • 19:03 jynus: upgrade and restart db2042
  • 18:53 jynus@tin: Synchronized wmf-config/db-codfw.php: Depool db2042 (duration: 01m 58s)
  • 18:25 elukey: Analytics Hadoop cluster upgrade to Java 8 about to start - complete cluster shutdown is needed - T166248
  • 18:23 mholloway-shell@tin: Finished deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc (duration: 05m 28s)
  • 18:17 mholloway-shell@tin: Started deploy [mobileapps/deploy@e488cee]: Update mobileapps to 5851dfc
  • 18:00 twentyafterfour: Preparing to cut new MediaWiki branch wmf/1.31.0-wmf.21 - report deployment blockers for this branch in phabricator: T183960
  • 17:54 godog: repool mw1256 after disk swap - T186535
  • 17:20 demon@tin: Synchronized README: forcing git config sync, setting core.sharedRepository=group, T187076 (duration: 01m 12s)
  • 17:13 cmjohnson1: sorry snapshot1001 is going down for rack relocation
  • 17:12 cmjohnson1: stat1001 going down for rack relocation
  • 17:04 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=codfw', 'cluster=ores', 'service=ores'])
  • 17:03 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: all (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 16:36 demon@tin: Synchronized scap/plugins/clean.py: no-op, consistency (duration: 00m 55s)
  • 16:23 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on group 0 (duration: 00m 56s)
  • 16:17 cmjohnson1: replacing disk poolcounter1002
  • 15:35 marostegui: Deploy schema change on s5 codfw master (db2052), this will generate lag on codfw - T187089 T185128 T153182
  • 15:30 bblack: deploying changes to URL-encoding normalization on caches - https://gerrit.wikimedia.org/r/407488
  • 15:20 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 55s)
  • 15:01 zeljkof: EU SWAT finished
  • 14:59 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 55s)
  • 14:58 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Update logo for urwikibooks, add hd logo (T185977) (duration: 00m 54s)
  • 14:37 zfilipin@tin: Synchronized static/images/project-logos/: SWAT: Change logos for sdwiki (T185865) (duration: 00m 55s)
  • 14:25 zfilipin@tin: Synchronized php-1.31.0-wmf.20/extensions/ContentTranslation/extension.json: SWAT: Add ext.cx.widgets.overlay dependency to template editor (T187119) (duration: 00m 55s)
  • 14:22 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sitename for sdwiki (T184521) (duration: 00m 57s)
  • 13:51 marostegui: Reboot db2066 to pick up new kernel
  • 13:50 marostegui: Deploy schema change on dbstore2001 - T187089 T185128 T153182
  • 12:51 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 56s)
  • 12:20 mlitn@tin: Synchronized wmf-config/InitialiseSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:19 mlitn@tin: Synchronized wmf-config/CommonSettings.php: Enable STL uploads on Commons (duration: 00m 55s)
  • 12:07 mlitn@tin: Synchronized php-1.31.0-wmf.20/extensions/3D/modules/ext.3d.js: Fix 3D badge (duration: 00m 56s)
  • 11:57 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 55s)
  • 11:56 marostegui: Deploy schema change on db2066 - T187089 T185128 T153182
  • 11:50 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2038 and db2059 (duration: 00m 55s)
  • 11:47 jynus: reenabling puppet on all eqiad databases
  • 11:41 jynus@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099 (duration: 00m 56s)
  • 11:37 marostegui: Stop MySQL on db2059 and db2038 for kernel upgrade
  • 11:29 ema: lvs1003: restart pybal to reconnect to etcd
  • 11:27 ema: lvs1006/1010: restart pybal to reconnect to etcd
  • 11:26 ema: lvs4005: restart pybal to reconnect to etcd
  • 11:23 ema: esams primary LVSs: restart pybal to reconnect to etcd
  • 11:21 ema: esams secondary LVSs: restart pybal to properly reconnect to etcd
  • 11:14 ema: repool cp3007
  • 11:13 ema: depool cp3007 to test pybal's behavior on lvs3002
  • 10:51 filippo@tin: Synchronized wmf-config/ProductionServices.php: depool poolcounter1002 for disk replacement (duration: 00m 56s)
  • 10:28 jynus@tin: Synchronized wmf-config/db-eqiad.php: Depool db1099 (duration: 00m 54s)
  • 10:08 godog: roll-restart ms-fe in codfw/eqiad after applying https://gerrit.wikimedia.org/r/c/409942/
  • 10:03 ema: restart pybal on lvs2003
  • 09:58 ema: restart pybal on lvs2006
  • 09:52 filippo@neodymium: conftool action : set/pooled=no; selector: name=ms-fe2005.codfw.wmnet
  • 09:49 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2075, depool db2038 and db2059 (duration: 00m 55s)
  • 09:32 marostegui: Stop mysql on db2075 for mysql and kernel upgrade
  • 09:30 marostegui: Stop replication in sync on db1089 and dbstore1002 - T162807
  • 09:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1065 - T162807 (duration: 00m 54s)
  • 09:22 elukey: powercycle analytics1062 - not reachable via ssh, frozen via serial console
  • 09:22 jynus: disabling puppet on all eqiad databases
  • 09:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1065 - T162807 (duration: 00m 55s)
  • 09:20 marostegui: Stop replication in sync on db1089 and db1065 - T162807
  • 09:12 marostegui@tin: Synchronized wmf-config/db-codfw.php: Repool db2084:3315, depool db2075 (duration: 00m 55s)
  • 09:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3311 - T162807 (duration: 00m 55s)
  • 08:52 marostegui: Stop replication in sync on db1089 and db1099:3311 - T162807
  • 08:51 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089, db1099 - T162807 (duration: 00m 55s)
  • 08:41 marostegui@tin: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 (duration: 00m 56s)
  • 08:37 hashar: tin.eqiad.wmnet: removing live hack in /srv/mediawiki-staging/scap/plugins/clean.py | T187160
  • 08:32 moritzm: installing wavpack security updates
  • 08:09 moritzm: installing exim security updates on trusty hosts
  • 07:02 marostegui: Deploy schema change on s5 db2089 db2084 db2075 db2039 db2059 - T187089
  • 06:28 marostegui: reload haproxy on dbproxy1005
  • 05:10 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp50(0[12345789]|1[12]).eqsin.wmnet
  • 02:34 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 05m 29s)
  • 00:24 cwd: re-enabled p-c
  • 00:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/modules/ve-mw/ui/pages/: T187112 (duration: 00m 56s)
  • 00:10 cwd: disabled p-c jobs for reboot
  • 00:04 demon@tin: Synchronized wmf-config/: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 57s)
  • 00:03 demon@tin: Synchronized wmf-config/InitialiseSettings.php: cleanup Sentry inclusion for labs, should be no-op (duration: 00m 56s)

2018-02-12

  • 23:47 demon@tin: Finished deploy [gerrit/gerrit@6adde70]: reviewers plugin (duration: 00m 12s)
  • 23:46 demon@tin: Started deploy [gerrit/gerrit@6adde70]: reviewers plugin
  • 23:32 mutante: terbium,wasat: touch /var/log/mediawwiki/purge_abusefilter.log ; set owner/permissions like other logfiles
  • 23:13 elukey: manual restart of Yarn Node Managers on analytics1058/31 (failed due to root partition filled up for the issue logged before)
  • 23:09 elukey: cleaned up tmp files on all analytics hadoop worker nodes, job filling up tmp
  • 21:27 andrew@tin: Finished deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content (duration: 03m 18s)
  • 21:24 andrew@tin: Started deploy [horizon/deploy@2f70002]: updating several submodules, probably breaking static content
  • 21:06 mholloway-shell@tin: Finished deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5 (duration: 05m 46s)
  • 21:00 mholloway-shell@tin: Started deploy [mobileapps/deploy@0639c31]: Update mobileapps to f14bdd5
  • 20:21 andrew@tin: Finished deploy [horizon/deploy@c009388]: updating puppet dashboard (duration: 03m 22s)
  • 20:17 andrew@tin: Started deploy [horizon/deploy@c009388]: updating puppet dashboard
  • 20:13 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/UUID.php: T186909 (duration: 00m 56s)
  • 20:08 andrew@tin: Finished deploy [horizon/deploy@cba66d2]: more submodule tinkering (duration: 01m 15s)
  • 20:07 ppchelko@tin: Finished deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API (duration: 15m 10s)
  • 20:07 andrew@tin: Started deploy [horizon/deploy@cba66d2]: more submodule tinkering
  • 20:01 andrew@tin: Finished deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks (duration: 01m 02s)
  • 20:00 andrew@tin: Started deploy [horizon/deploy@1fcd9ff]: fixes to post-install checks
  • 19:58 andrew@tin: Finished deploy [horizon/deploy@9d73005]: fixes to post-install checks (duration: 01m 01s)
  • 19:57 andrew@tin: Started deploy [horizon/deploy@9d73005]: fixes to post-install checks
  • 19:52 ppchelko@tin: Started deploy [restbase/deploy@b257b4f]: Support batching in the reading lists API
  • 19:50 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 45s)
  • 19:50 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:48 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two (duration: 00m 03s)
  • 19:47 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes -- take two
  • 19:44 andrew@tin: Finished deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes (duration: 01m 06s)
  • 19:43 andrew@tin: Started deploy [horizon/deploy@8cf0c3c]: updating with sudo dashboard fixes
  • 19:17 niharika29@tin: Synchronized wmf-config/filebackend.php: Proxy public wiki thumb.php requests through Thumbor T169144 (duration: 00m 55s)
  • 19:13 andrew@tin: Finished deploy [horizon/deploy@01021b4]: trying another force (duration: 00m 17s)
  • 19:13 andrew@tin: Started deploy [horizon/deploy@01021b4]: trying another force
  • 19:12 niharika29@tin: Synchronized php-1.31.0-wmf.20/extensions/PageAssessments/: Fix 500 error with PageAssessments API T185037 (duration: 00m 56s)
  • 19:07 niharika29@tin: Synchronized wmf-config/InitialiseSettings-labs.php: Stop PHP errors from going to the hhvm channel T45086 (duration: 00m 56s)
  • 18:58 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 07m 39s)
  • 18:50 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:48 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 14s)
  • 18:35 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:34 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 06m 47s)
  • 18:27 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 18:23 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: ores1001.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=ores', 'service=ores'])
  • 18:12 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 12m 30s)
  • 18:09 gehel@tin: Finished deploy [wdqs/wdqs@b6bd483]: new WDQS GUI (duration: 01m 53s)
  • 18:07 gehel@tin: Started deploy [wdqs/wdqs@b6bd483]: new WDQS GUI
  • 18:00 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:47 akosiaris@tin: Finished deploy [ores/deploy@f7e23f4]: T171851 (duration: 13m 18s)
  • 17:45 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards again (duration: 00m 17s)
  • 17:45 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards again
  • 17:34 gilles: added thumborUrl to PrivateSettings.php on labs, in preparation for https://gerrit.wikimedia.org/r/#/c/407611/
  • 17:34 akosiaris@tin: Started deploy [ores/deploy@f7e23f4]: T171851
  • 17:21 demon@tin: Pruned MediaWiki: 1.31.0-wmf.17 [keeping static files] (duration: 02m 08s)
  • 17:18 elukey: home dirs on stat1004 moved to /srv/home (/home symlinks to it)
  • 17:10 andrew@tin: Finished deploy [horizon/deploy@01021b4]: rolling out new dashboards (duration: 00m 54s)
  • 17:09 andrew@tin: Started deploy [horizon/deploy@01021b4]: rolling out new dashboards
  • 16:56 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns5001.wikimedia.org
  • 16:52 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/ApiVisualEditor.php: T186934 (duration: 00m 57s)
  • 16:27 andrew@tin: Finished deploy [horizon/deploy@4d1bdeb]: updating requirements.txt (duration: 01m 04s)
  • 16:26 andrew@tin: Started deploy [horizon/deploy@4d1bdeb]: updating requirements.txt
  • 16:16 andrew@tin: Finished deploy [horizon/deploy@de72527]: scap debugging run (duration: 00m 24s)
  • 16:16 andrew@tin: Started deploy [horizon/deploy@de72527]: scap debugging run
  • 15:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 00m 55s)
  • 15:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1105:3311 - T162807 (duration: 00m 55s)
  • 15:28 marostegui: Stop replication in sync on db1089 and db1105:3311 - T162807
  • 15:23 moritzm: installing libtasn security updates
  • 15:02 reedy@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/maintenance/: Fix maintenance scripts (duration: 00m 56s)
  • 15:01 godog: roll-upgrade thumbor to 1.12 - T186500 T186594 T186492
  • 14:54 elukey: upload prometheus-burrow-exporter 0.0.4 on jessie/stretch-wikimedia
  • 14:51 ottomata: emitting IP field from varnishkafka-eventlogging instance T186833
  • 14:51 zeljkof: EU SWAT finished
  • 14:47 filippo@neodymium: conftool action : set/pooled=no; selector: name=mw1227.eqiad.wmnet
  • 14:44 addshore@tin: Finished scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description) (duration: 19m 56s)
  • 14:44 andrew@tin: Finished deploy [horizon/deploy@de72527]: just checking that this still doesn't work (duration: 00m 04s)
  • 14:44 andrew@tin: Started deploy [horizon/deploy@de72527]: just checking that this still doesn't work
  • 14:38 moritzm: uploading cassandra 3.11.0-wmf5 to component/cassandra311 for stretch-wikimedia/apt.wikimedia.org (T186619)
  • 14:24 addshore@tin: Started scap: T186612 gerrit:409063 TwoColConflict wmf.20 (Remove hint and link from twoColConflict-beta-feature-description)
  • 14:22 otto@tin: Finished deploy [eventlogging/analytics@01d5761]: T186833 (duration: 00m 04s)
  • 14:22 otto@tin: Started deploy [eventlogging/analytics@01d5761]: T186833
  • 14:20 godog: grant group write for wikidev on tin on /srv/mediawiki-staging/php-1.31.0-wmf.20/.git
  • 13:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1105:3311 - T162807 (duration: 01m 06s)
  • 13:11 marostegui: Deploy schema change on db2084 and db2075 - T185128 T153182
  • 12:03 moritzm: upgrading jessie-based servers in deployment-prep/beta to the HHVM build using ICU 57 (component/icu57)
  • 11:15 jdrewniak@tin: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 58s)
  • 11:14 jdrewniak@tin: Synchronized portals/prod/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 57s)
  • 10:36 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1066 - T162807 (duration: 00m 55s)
  • 10:07 marostegui: Stop replication in sync on db1089 and db1066 - T162807
  • 10:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1066 - T162807 (duration: 00m 55s)
  • 09:57 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 00m 55s)
  • 09:51 elukey: reboot mw1302 (hhvm defunct processes, hangs registered in dmesg, very high load)
  • 09:46 marostegui: Stop replication in sync on db1089 and db1067 - T162807
  • 09:46 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1067 - T162807 (duration: 00m 56s)
  • 09:29 moritzm: installing libdatetime-timezone-perl SUA update
  • 09:25 godog: install swift stretch updates on ms-be eqiad - T177739
  • 09:19 marostegui: Deploy schema change on s5 - T185128 T153182
  • 09:05 marostegui: Stop replication in sync on db1089 and db2048 - T162807
  • 09:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T162807 (duration: 00m 55s)
  • 08:57 moritzm: installing glibc security updates on trusty (harmless in our environment; CVE-2018-1000001 is non-exploitable due to disabled unprivileged user namespaces)
  • 08:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1087 - T184599 (duration: 00m 55s)
  • 08:36 marostegui: Reboot db1087 to pick new kernel
  • 08:34 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1092, depool db1087 - T184599 (duration: 00m 55s)
  • 08:25 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318, depool db1092 - T184599 (duration: 00m 55s)
  • 08:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318, depool db1099:3318 - T184599 (duration: 00m 55s)
  • 08:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1104, depool db1101:3318 - T184599 (duration: 00m 55s)
  • 08:01 hashar: Upgrading CI Jenkins plugins
  • 07:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1109, depool db1104 - T184599 (duration: 00m 55s)
  • 07:46 moritzm: installing exim security updates on remaining hosts
  • 07:35 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:20 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1109 - T184599 (duration: 00m 55s)
  • 07:05 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 - T184599 (duration: 00m 55s)
  • 06:53 marostegui: Reboot db1109 to pick up new kernel
  • 06:53 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T184599 (duration: 00m 56s)
  • 06:40 marostegui: Drop dewiki database from s8 servers - T184599
  • 02:38 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.20) (duration: 11m 40s)

2018-02-11

  • 14:06 moritzm: installing exim4 security updates on MXs

2018-02-10

  • 16:51 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/specials/SpecialLog.php: SpecialLog: Fix results when no offender is specified - T186950 (duration: 00m 57s)
  • 01:10 demon@tin: Finished deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend (duration: 00m 10s)
  • 01:10 demon@tin: Started deploy [gerrit/gerrit@5d5193e]: one last gitiles rebuild before the weekend

2018-02-09

  • 23:28 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:26 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (now includes actual code change!) (duration: 00m 55s)
  • 23:01 jynus: restart haproxy on dbproxy1005
  • 22:47 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again) (duration: 00m 03s)
  • 22:47 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again again)
  • 22:45 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/TextExtracts/includes/ApiQueryExtracts.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:43 tgr@tin: Synchronized php-1.31.0-wmf.20/extensions/MobileFrontend/includes/api/ApiMobileView.php: emergency fix for T186927 (duration: 00m 55s)
  • 22:42 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again) (duration: 00m 40s)
  • 22:42 tgr@tin: Synchronized php-1.31.0-wmf.20/includes/parser/ParserOutput.php: emergency fix for T186927 (duration: 00m 57s)
  • 22:42 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (trying again)
  • 22:36 halfak@tin: Finished deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901 (duration: 09m 59s)
  • 22:26 halfak@tin: Started deploy [ores/deploy@c98ec8b]: (non-production) experimenting with stretch deploy T185901
  • 22:10 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again (duration: 00m 03s)
  • 22:10 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches, again
  • 22:08 andrew@tin: Finished deploy [horizon/deploy@de72527]: Doing this while halfaker watches (duration: 00m 17s)
  • 22:08 andrew@tin: Started deploy [horizon/deploy@de72527]: Doing this while halfaker watches
  • 21:40 andrew@tin: Finished deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try (duration: 00m 14s)
  • 21:40 andrew@tin: Started deploy [horizon/deploy@de72527]: At this point I'm just hoping scap will really deploy the wheels on my second try
  • 21:28 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/AbuseFilter/includes/api/ApiQueryAbuseLog.php: T186914 (duration: 00m 54s)
  • 21:20 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Block/TopicList.php: T186911 (duration: 00m 55s)
  • 21:10 ejegg@tin: Synchronized php-1.31.0-wmf.20/extensions/CentralNotice/CentralNoticePageLogPager.php: Sync CentralNotice for banner content log fix (duration: 00m 56s)
  • 20:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/user/User.php: Avoid pointless DB_MASTER connections in User::clearSharedCache() (duration: 00m 55s)
  • 20:08 demon@tin: Synchronized php-1.31.0-wmf.20/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 55s)
  • 20:07 demon@tin: Synchronized php-1.31.0-wmf.20/includes/MediaWiki.php: Catch Error exceptions in MediaWiki::run() (duration: 00m 57s)
  • 19:16 demon@tin: Synchronized php-1.31.0-wmf.20/extensions/Scribunto/common/Hooks.php: silence divide by zero / no such index 0 errors (duration: 00m 56s)
  • 18:31 demon@tin: rebuilt and synchronized wikiversions files: group2 to wmf.20
  • 18:12 demon@tin: Synchronized php-1.31.0-wmf.20/includes/filerepo/file/LocalFile.php: Fix CommentStore->createComment() call in LocalFile.php (duration: 01m 12s)
  • 18:08 bblack: cp4023: after a brief period of levelling off a bit: sharp, steep recovery of mbox lag ramp back to ~6K. not sure if this is a new floor or will drop further, but seems pretty ok.
  • 18:03 bblack: cp4023: now seems to be leveling off on lag and decreasing objhdr locks. either expiry thread prio helped (which argues for our prio-related patches) or it was naturally going to end?
  • 17:44 bblack: cp4023: experimental, "renice -19 39007" (backend cache-timeout aka expiry thread), to see if mbox lag resolves on its own quicker
  • 17:19 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 16:53 andrew@tin: Finished deploy [horizon/deploy@de72527]: Rolling out pyldap wheel (duration: 02m 26s)
  • 16:51 andrew@tin: Started deploy [horizon/deploy@de72527]: Rolling out pyldap wheel
  • 16:38 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1067 - T162807 (duration: 01m 12s)
  • 16:29 demon@tin: Finished deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change (duration: 00m 10s)
  • 16:29 demon@tin: Started deploy [gerrit/gerrit@7ca3b02]: no-op to gerrit: deploying scap config change
  • 15:49 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1 T186866
  • 15:47 akosiaris: upgrade etherpad.wikimedia.org to 1.6.3-1
  • 15:47 akosiaris: upload etherpad-lite 1.6.3-1 to apt.wikimedia.org/jessie-wikimedia/main T186866
  • 15:00 herron: upgraded mailman on fermium for security updates
  • 14:24 demon@tin: Synchronized php-1.31.0-wmf.20/tests/phpunit/includes/db/LBFactoryTest.php: no-op to prior (duration: 01m 12s)
  • 13:33 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T162807 (duration: 01m 12s)
  • 13:33 demon@tin: Finished deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin (duration: 00m 10s)
  • 13:33 demon@tin: Started deploy [gerrit/gerrit@9c0acf6]: updating gitiles plugin
  • 10:51 marostegui: Stop replication in sync on db1067 and db1089 - T162807
  • 10:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1089 and db1067 for data checksumming - T162807 (duration: 01m 11s)
  • 10:36 moritzm: uploaded php-luasandbox 2.0.14~stretch2 for stretch-wikimedia to apt.wikimedia.org (this removes the php-luasandbox binary from our internal luasandbox build in favour of the php-luasandbox package maintained by legoktm from stretch-backports). As such, the php-luasandbox source package we build internally now only provides the HHVM extension (and we can retire it entirely when migrating to PHP7).
  • 10:29 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1080 - T162807 (duration: 01m 11s)
  • 10:02 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 12s)
  • 09:37 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1080 - T162807 (duration: 01m 11s)
  • 09:06 marostegui: Fix data drifts on db1067 - T162807
  • 08:45 demon@tin: Synchronized wmf-config/: rm cleanchanges (duration: 01m 14s)
  • 08:44 demon@tin: Synchronized multiversion/submodules.json: rm CleanChanges (duration: 01m 13s)
  • 07:57 marostegui: Stop replication on labsdb1004 to fix replication issues
  • 07:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Slowly repool db1080 - T162807 (duration: 01m 11s)
  • 07:39 elukey: forced remount of /mnt/hdfs on stat1005
  • 06:52 marostegui: Fix replication on labsdb1010 - T186579
  • 06:47 marostegui: Fix data drifts, upgrade kernel, mariadb and socket path on db1080 - T162807
  • 06:44 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T162807 (duration: 01m 12s)
  • 02:41 andrew@tin: Finished deploy [horizon/deploy@60cac8e]: updating with designate dashboard (duration: 02m 42s)
  • 02:38 andrew@tin: Started deploy [horizon/deploy@60cac8e]: updating with designate dashboard
  • 00:18 demon@tin: rebuilt and synchronized wikiversions files: surprise, it broke. revert group1 back to wmf.20
  • 00:16 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20 *duck and cover*

2018-02-08

  • 23:49 ppchelko@tin: Finished deploy [restbase/deploy@c0f0dcd]: Fix a type that prevented the mobile partial content to have an etag (duration: 15m 44s)
  • 23:33 ppchelko@tin: Started deploy [restbase/deploy@c0f0dcd]: Fix a type that prevented the mobile partial content to have an etag
  • 22:37 bsitzmann@tin: Finished deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95 (duration: 05m 07s)
  • 22:32 bsitzmann@tin: Started deploy [mobileapps/deploy@75a2ebb]: Update mobileapps to e93ab95
  • 22:17 demon@tin: rebuilt and synchronized wikiversions files: mw.org back to wmf.20
  • 22:08 XioNoX: rebooting cr1-eqsin
  • 21:59 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess? (duration: 00m 03s)
  • 21:58 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take... six, I guess?
  • 21:53 ottomata: finished upgrade of scb to librdkafka 0.11 and node-rdkafka 2
  • 21:49 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 49s)
  • 21:49 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there (duration: 00m 35s)
  • 21:48 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Finally we're there
  • 21:48 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 46s)
  • 21:47 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:40 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 15s)
  • 21:40 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:40 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 04s)
  • 21:40 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:39 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 47s)
  • 21:38 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2 (duration: 00m 24s)
  • 21:38 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:38 otto@tin: Started deploy [eventstreams/deploy@7629e16]: upgrade to librdkafka 0.11 node-rdkafka 2
  • 21:32 herron: restarted rsyslogd services on lithium and wezen to clear rsyslog tls listener on port 6514 icinga alerts
  • 21:23 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 54s)
  • 21:23 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 01m 03s)
  • 21:23 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:22 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 25s)
  • 21:22 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:22 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:13 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 21:12 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 21:11 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 45s)
  • 21:10 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 21:09 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 21s)
  • 21:09 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five (duration: 01m 25s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@5e53829]: updating with designate dashboard -- take five
  • 20:52 andrew@tin: Finished deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four (duration: 01m 36s)
  • 20:50 andrew@tin: Started deploy [horizon/deploy@7d4a2d9]: updating with designate dashboard -- take four
  • 20:34 ppchelko@tin: Started restart [changeprop/deploy@5fdc03a]: Restart CP to force rule rebalance
  • 20:27 ppchelko@tin: Finished deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary (duration: 00m 46s)
  • 20:26 ppchelko@tin: Started deploy [changeprop/deploy@5fdc03a]: Update node-rdkafka to 2.0+. Canary
  • 20:26 ppchelko@tin: Finished deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary (duration: 00m 13s)
  • 20:26 ppchelko@tin: Started deploy [cpjobqueue/deploy@9adaa92]: Update node-rdkafka to 2.0+. Canary
  • 20:24 otto@tin: Finished deploy [eventstreams/deploy@7629e16]: (no justification provided) (duration: 00m 22s)
  • 20:24 otto@tin: Started deploy [eventstreams/deploy@7629e16]: (no justification provided)
  • 20:20 ottomata: starting deploy process to update scb cluster to librdkafka 0.11 and node-rdkafka 2. we will depool, stop puppet, deploy, test, start puppet on each node
  • 20:03 no_justification: gerrit: killed about 12 parallel clones of mediawiki/extensions/Math that had been running between 2-3 days (wtf?)
  • 19:24 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Flow/includes/Model/AbstractRevision.php: T186077 (duration: 01m 11s)
  • 19:19 catrope@tin: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateStyles on svwiki (T176082) (duration: 01m 11s)
  • 19:17 catrope@tin: Synchronized php-1.31.0-wmf.20/extensions/Campaigns/CampaignsSecondaryAuthenticationProvider.php: T185870 (duration: 01m 13s)
  • 19:02 bsitzmann@tin: Finished deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94 (duration: 08m 21s)
  • 19:00 bblack: lvs@ulsfo - all back to normal
  • 18:55 bblack: lvs@ulsfo - puppet disabled, trying tagged vlan deploy
  • 18:54 bsitzmann@tin: Started deploy [mobileapps/deploy@541a7f7]: Update mobileapps to e6fbc94
  • 18:38 arlolra: Updated Parsoid to 961a5cf (T186630)
  • 18:27 arlolra@tin: Finished deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf (duration: 08m 11s)
  • 18:26 andrew@tin: Finished deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three (duration: 01m 16s)
  • 18:25 andrew@tin: Started deploy [horizon/deploy@9e9d458]: updating with designate dashboard -- take three
  • 18:19 arlolra@tin: Started deploy [parsoid/deploy@1367057]: Updating Parsoid to 961a5cf
  • 18:10 ema: upgrade cp2026 to varnish 5
  • 17:55 bblack@neodymium: conftool action : set/pooled=yes; selector: name=dns400[12].wikimedia.org
  • 17:21 akosiaris: repool sca1004 (zotero) for T181121
  • 17:21 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 17:16 ema: upgrade cp2024 to varnish 5
  • 16:58 ema: upgrade cp2022 to varnish 5
  • 16:39 moritzm: installing PHP7 security updates
  • 16:32 moritzm: installing mysql security updates on auth*
  • 16:31 ema: upgrade cp2020 to varnish 5
  • 16:30 bblack: puppet disabled on all ntp servers for initial ulsfo recdns/ntp config process
  • 16:25 bblack: puppet disabled on lvs400[67] for initial ulsfo recdns config process
  • 16:23 elukey: stop archiva on meitnerium to swap /var/lib/archiva from the root partition to a new separate one - T186020
  • 16:20 akosiaris: reboot ganeti1005 T181121
  • 16:18 akosiaris: depool sca1004 (zotero) for T181121
  • 16:17 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: sca1004.eqiad.wmnet (tags: ['dc=eqiad', 'cluster=sca', 'service=zotero'])
  • 16:13 bblack: rebooting dns400[12] (downtimed, currently spare::system)
  • 16:13 ema: upgrade cp2017 to varnish 5
  • 16:11 andrew@tin: Finished deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two (duration: 01m 24s)
  • 16:10 andrew@tin: Started deploy [horizon/deploy@9af532a]: updating with designate dashboard -- take two
  • 16:05 bblack: ntp servers back to normal
  • 16:04 andrew@tin: Finished deploy [horizon/deploy@2f176e2]: updating with designate dashboard (duration: 01m 11s)
  • 16:03 andrew@tin: Started deploy [horizon/deploy@2f176e2]: updating with designate dashboard
  • 15:57 ema: upgrade cp2014 to varnish 5
  • 15:48 moritzm: installing libio-socket-ssl-perl update from jessie point release
  • 15:47 bblack: disabling puppet on all global dns recursors for controlled config deploy
  • 15:35 ema: upgrade cp2011 to varnish 5
  • 15:18 ema: upgrade cp2008 to varnish 5
  • 15:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1073 - T162807 (duration: 01m 12s)
  • 14:59 moritzm: installing icu security updates from jessie/stretch point releases
  • 14:56 ema: upgrade cp2005 to varnish 5
  • 14:49 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 14:47 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on meta and mediawiki.org (duration: 01m 12s)
  • 14:43 zeljkof: EU SWAT finished
  • 14:31 moritzm: upgrading deployment-mediawiki04 to HHVM linked against ICU 57
  • 14:23 ema: upgrade cp2002 to varnish 5
  • 13:54 marostegui: Rename dewiki tables on s8 slaves - T184599
  • 13:53 ariel@tin: Finished deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file , T185454 (duration: 00m 02s)
  • 13:53 ariel@tin: Started deploy [dumps/dumps@9b7841f]: make sure all hashes appear in dumpstatus file , T185454
  • 13:49 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1073 with low weight - T162807 (duration: 01m 11s)
  • 13:41 marostegui: Drop dewiki already renamed tables and database on s8 master (db1071) - T184599
  • 13:22 marostegui: Fixing data drifts on db1073, also upgrade kernel, socket location and mysql - T162807
  • 13:21 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1073 - T162807 (duration: 01m 12s)
  • 13:15 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1051 - T184599 (duration: 01m 12s)
  • 13:09 moritzm: upgrade deployment servers and script runners to HHVM 3.18.7
  • 13:09 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1051 - T184599 (duration: 01m 11s)
  • 13:03 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1110 - T184599 (duration: 01m 11s)
  • 13:02 moritzm: upgrade mwdebug servers to HHVM 3.18.7
  • 12:50 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1110 - T184599 (duration: 01m 11s)
  • 12:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1106 - T184599 (duration: 01m 11s)
  • 12:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1106 - T184599 (duration: 01m 11s)
  • 12:14 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1100 - T184599 (duration: 01m 11s)
  • 12:08 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T184599 (duration: 01m 11s)
  • 11:59 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 - T184599 (duration: 01m 11s)
  • 11:42 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:37 marostegui: Fix replication on labsdb1010 - T186579
  • 11:33 akosiaris: reboot ganeti1005 T181121
  • 11:26 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 - T184599 (duration: 01m 11s)
  • 11:19 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1082 (duration: 01m 11s)
  • 11:12 akosiaris: migrate all running VMs off ganeti1005 T181121
  • 11:06 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1082 (duration: 01m 12s)
  • 11:00 marostegui: Drop wikidata renamed tables and database from s5 eqiad hosts - T184599
  • 10:07 marostegui: Drop deleted databases from sanitarium and labsdb hosts - T186685
  • 10:07 moritzm: upgrading remaining nginx-full packages on mw* in eqiad to 1.13.6-2+wmf1~jessie1
  • 08:07 moritzm: upgrade remaining app servers to HHVM 3.18.7
  • 07:27 _joe_: depooled mw1256 from traffic, scap (faulty disk, T186535); now powering it off
  • 02:26 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 05m 58s)
  • 02:20 eileen: Update CiviCRM civicrm revision changed from 71b1e35b99 to 61acc9175e (deploy citibank, benevity import updates)
  • 01:30 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 01:30 andrew@tin: Finished deploy [horizon/deploy@9223ba7]: Now with static content, I hope (duration: 01m 15s)
  • 01:29 andrew@tin: Started deploy [horizon/deploy@9223ba7]: Now with static content, I hope
  • 00:35 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/VisualEditor/: Revert "Use wgEditSubmitButtonLabelPublish from upstream", Assume wpTextbox1 has an API registered already (duration: 01m 12s)
  • 00:33 ebernhardson@tin: Synchronized php-1.31.0-wmf.20/extensions/CirrusSearch/: T186765: Add special handling for profiles into config dump (duration: 01m 27s)

2018-02-07

  • 23:59 mutante: restarted icinga-wm, too quiet
  • 21:53 ebernhardson: mwdebug1001 back to standard deployed versions
  • 21:51 bsitzmann@tin: Finished deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643) (duration: 06m 41s)
  • 21:44 bsitzmann@tin: Started deploy [mobileapps/deploy@fe3cd60]: Update mobileapps to 7a3b19c (T186745 T186643)
  • 21:40 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:40 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:39 otto@tin: Finished deploy [eventstreams/deploy@ee854df]: (no justification provided) (duration: 00m 02s)
  • 21:39 otto@tin: Started deploy [eventstreams/deploy@ee854df]: (no justification provided)
  • 21:33 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png (duration: 03m 55s)
  • 21:29 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png
  • 21:27 ebernhardson: deploying wmf.20 to en* (except enwiki) on mwdebug1001 to debug new cirrus errors in wmf.20/wmf.19 mixed sister search
  • 21:13 andrew@tin: Finished deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times (duration: 01m 24s)
  • 21:12 andrew@tin: Started deploy [horizon/deploy@9773454]: This isn't working and I'm going to do this 1000 times
  • 21:07 demon@tin: rebuilt and synchronized wikiversions files: mw.org also back to wmf.17
  • 21:06 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 03s)
  • 21:06 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:04 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 02m 38s)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:01 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 44s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 21:00 andrew@tin: Finished deploy [horizon/deploy@9773454]: (no justification provided) (duration: 00m 05s)
  • 21:00 andrew@tin: Started deploy [horizon/deploy@9773454]: (no justification provided)
  • 20:39 demon@tin: rebuilt and synchronized wikiversions files: revert, huge spike in db lag
  • 20:36 demon@tin: rebuilt and synchronized wikiversions files: group1 to wmf.20
  • 19:47 ejegg: updated SmashPig from 1f56978c0c to 1ebee97a45
  • 19:43 ejegg: updated payments-wiki from 39a7ef32e5 to fe311c2d26
  • 19:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS_MAIN to $wgNamespacesWithSubpages for cawikimedia T185436 (duration: 01m 12s)
  • 19:11 chasemp: after conversation with andrew we moved labweb to public for T186729
  • 19:09 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Rename Project NS on Wikimedia Canada Chapter wiki T185661 (duration: 01m 11s)
  • 18:55 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove old "accountcreator" rules now handled by default T185417 T186462 (duration: 01m 12s)
  • 18:16 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Tidy: Re-do this as a sorted negative list that gets shorter over time (duration: 01m 13s)
  • 18:07 jynus: fixing ferm breakage by restarting the service on db1051
  • 17:38 awight: ORES celery workers restarted on scb100[1-4]
  • 16:53 legoktm@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options - https://gerrit.wikimedia.org/r/408718 (Unbreak ExtensionDistributor) (duration: 01m 12s)
  • 16:47 gehel: upgrade of tilerator / kartotherian on maps eqiad completed, sorry for the noise...
  • 16:46 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 21s)
  • 16:46 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:44 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:44 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:43 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:42 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:39 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 18s)
  • 16:39 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:38 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 24s)
  • 16:38 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:37 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 17s)
  • 16:37 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:31 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 20s)
  • 16:31 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:30 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 17s)
  • 16:28 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:28 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1001.eqiad.wmnet
  • 16:27 gehel: upgrading tilerator / kartotherian on maps eqiad
  • 16:00 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: mw1271.eqiad.wmnet
  • 14:37 moritzm: installing poppler security updates
  • 14:33 zeljkof: EU SWAT finished
  • 14:30 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Updates to enable transliteration for crhwiki (T23582) (duration: 01m 11s)
  • 14:18 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add "Portal" namespace on it.wikiquote (T185232) (duration: 01m 13s)
  • 14:05 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 47s)
  • 14:03 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:58 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: (no justification provided) (duration: 03m 02s)
  • 13:55 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:38 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 02m 45s)
  • 13:36 moritzm: installing p7zip security updates
  • 13:35 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:35 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 21s)
  • 13:34 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:33 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 06s)
  • 13:32 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:20 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 00m 55s)
  • 13:19 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:18 akosiaris@tin: Finished deploy [ores/deploy@eb0f776]: T171851 (duration: 01m 22s)
  • 13:17 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: T171851
  • 13:16 akosiaris@tin: Started deploy [ores/deploy@eb0f776]: (no justification provided)
  • 13:16 marostegui: Drop wikidata tables and database from s5 codfw hosts - T184599
  • 13:13 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1089 (duration: 01m 11s)
  • 12:47 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1089 weight (duration: 01m 11s)
  • 12:30 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1089 with low weight (duration: 01m 40s)
  • 11:54 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1069 - T186321 (duration: 01m 11s)
  • 11:09 elukey: install libc6-dbg on phab1001 to get a more precise gdb stack trace - T182832
  • 11:04 marostegui: Stop MySQL on db1069 for MySQL upgrade, kernel upgrade and change binlog format to statement - T186321
  • 10:58 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1069 - T186321 (duration: 01m 09s)
  • 09:52 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1051 after the BBU change - T186049 (duration: 01m 14s)
  • 09:41 kartik@tin: Finished deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901) (duration: 03m 44s)
  • 09:38 marostegui: Failover back labsdb1010 to labsdb1009 - T174569
  • 09:37 kartik@tin: Started deploy [cxserver/deploy@eabb6d7]: Update cxserver to e164ead and Matxin MT deployment (T184901)
  • 09:18 marostegui: Failover labsdb1009 to labsdb1010 - T174569
  • 09:16 marostegui: Failover back labsdb1010 to labsdb1011 - T174569
  • 09:05 marostegui: Failover labsdb1011 to labsdb1010 - T174569
  • 08:43 marostegui: Change triggers for s3 on db1095 - T174569
  • 08:21 marostegui: Change triggers for s1 on db1095 - T174569
  • 08:11 marostegui: Change triggers for s5 on db1095 - T174569
  • 07:53 marostegui: Change triggers for s8 on db1095 - T174569
  • 07:17 marostegui: Change triggers for s7 on db1102 - T174569
  • 07:05 marostegui: Change triggers for s6 on db1102 - T174569
  • 07:00 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Start repooling db1051 after the BBU change - T186049 (duration: 01m 15s)
  • 02:25 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 34s)
  • 01:14 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable fine grained usage tracking, another batch. (T186645) (duration: 01m 11s)
  • 01:05 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable AICaptcha data collection everywhere (T186244) (duration: 01m 11s)
  • 00:45 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Support fallback values for referrer policy (T180921) (duration: 01m 12s)
  • 00:31 ladsgroup@tin: Synchronized php-1.31.0-wmf.20/includes/http/MWHttpRequest.php: MWHttpRequest: Restore ability to pass null for $options (duration: 01m 11s)
  • 00:28 ladsgroup@tin: Synchronized wmf-config/InitialiseSettings.php: Enable RemexHtml on wikis with < 10 errors in all high-priority categories (T184656) (duration: 01m 09s)

2018-02-06

  • 23:02 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 04s)
  • 23:02 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 23:00 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 00m 03s)
  • 23:00 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:56 andrew@tin: Finished deploy [horizon/deploy@48c51e9]: (no justification provided) (duration: 02m 45s)
  • 22:53 andrew@tin: Started deploy [horizon/deploy@48c51e9]: (no justification provided)
  • 22:42 ejegg: updated SmashPig standalone from 778e8f87b4 to 1f56978c0c
  • 22:23 hashar: Zuul/CI seems to work all fine now
  • 21:49 hashar: Flushing Zuul queue and upgrading to zuul_2.5.1-wmf2 | T186381
  • 21:49 hashar: Flushing Zuul queue and upgrading
  • 21:41 hashar: Going to shutdown Zuul in a few for an emergency hotfix | T186381
  • 21:35 andrew@tin: Finished deploy [horizon/deploy@a316e45]: (no justification provided) (duration: 01m 00s)
  • 21:34 andrew@tin: Started deploy [horizon/deploy@a316e45]: (no justification provided)
  • 21:14 legoktm: restarted zuul due to patch being stuck (T186381)
  • 20:25 andrew@tin: Finished deploy [horizon/deploy@fbf761e]: (no justification provided) (duration: 01m 21s)
  • 20:23 andrew@tin: Started deploy [horizon/deploy@fbf761e]: (no justification provided)
  • 20:13 demon@tin: rebuilt and synchronized wikiversions files: group0 to wmf.20
  • 20:11 demon@tin: Synchronized php: symlink swap (duration: 01m 17s)
  • 19:25 hashar: Restarted Zuul due to T186381
  • 18:55 demon@tin: Finished scap: bootstrap wmf.20 @ testwiki (duration: 26m 09s)
  • 18:55 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 00m 15s)
  • 18:55 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:47 arlolra: Updated Parsoid to 8a0ff6c (T183515, T129372, T181408)
  • 18:46 mlitn@tin: Finished deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo (duration: 06m 23s)
  • 18:40 mlitn@tin: Started deploy [3d2png/deploy@8135c2d]: Updating 3d2png repo
  • 18:39 arlolra@tin: Finished deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c (duration: 03m 47s)
  • 18:35 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 18:29 demon@tin: Started scap: bootstrap wmf.20 @ testwiki
  • 18:22 demon@tin: Pruned MediaWiki: 1.31.0-wmf.16 (duration: 07m 29s)
  • 18:15 arlolra@tin: Started deploy [parsoid/deploy@211ea5d]: Updating Parsoid to 8a0ff6c
  • 16:56 elukey: restart httpd on phab1001
  • 16:50 gehel: upgrading kartotherian / tilerator on maps codfw completed
  • afk: restarting jenkins for updates
  • 16:41 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2004.codfw.wmnet
  • 16:41 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:40 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:40 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 36s)
  • 16:39 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:38 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2004.codfw.wmnet
  • 16:37 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2003.codfw.wmnet
  • 16:36 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 30s)
  • 16:36 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:35 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 01m 01s)
  • 16:34 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:32 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2003.codfw.wmnet
  • 16:30 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 31s)
  • 16:30 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:29 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 02m 34s)
  • 16:29 mutante: mw1262 started hhvm, it had Unhandled server exception: Class undefined: Psr\Log\LogLevel
  • 16:27 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:26 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:24 gehel@tin: Finished deploy [tilerator/deploy@29d633e]: new tilerator packaging (duration: 00m 34s)
  • 16:24 gehel@tin: Started deploy [tilerator/deploy@29d633e]: new tilerator packaging
  • 16:17 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:15 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2001.codfw.wmnet
  • 16:14 gehel@tin: Finished deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging (duration: 00m 22s)
  • 16:14 gehel@tin: Started deploy [kartotherian/deploy@ecdda41]: new kartotherian packaging
  • 16:11 gehel@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2001.codfw.wmnet
  • 16:10 gehel: upgrading kartotherian / tilerator on maps codfw
  • 15:36 elukey: drain + shutdown of analytics1038 to replace faulty BBU - T185409
  • 15:02 zeljkof: EU SWAT finished
  • 15:01 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow eliminators to undelete at urwiki (T185829) (duration: 00m 55s)
  • 14:53 marostegui: Poweroff db1051 for BBU replacement - T186049
  • 14:50 akosiaris: upgrade service-checker to 0.1.4 on scb1001
  • 14:45 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: Typo, it's 2018 not 2017 (T185794) (duration: 00m 55s)
  • 14:39 zfilipin@tin: Synchronized wmf-config/throttle.php: SWAT: New throttle rule (T186530) (duration: 00m 55s)
  • 14:35 chasemp: disable puppet on labs things for a cautious change rollout
  • 14:33 anomie@tin: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = MIGRATION_WRITE_BOTH on test wikis (duration: 00m 56s)
  • 14:28 marostegui: Changing triggers on s2 - T174569
  • 14:26 zfilipin@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RemexHtml on fiwiki, hewiki, ruwiki, svwiki (T185945) (duration: 00m 55s)
  • 14:14 mlitn@tin: Synchronized php-1.31.0-wmf.17/extensions/UploadWizard/resources/details/uw.DescriptionsDetailsWidget.js: T184380 (duration: 00m 55s)
  • 14:10 ladsgroup@tin: Synchronized wmf-config/Wikibase.php: Add entityUsageModifierLimits config for Wikibase (T185693) (duration: 00m 55s)
  • 14:07 urandom: re-enable smartpath on restbase1010 (revert experiment) - T178177
  • 13:35 gehel: upgrading prometheus-elasticsearch-exporter across all elasticsearch nodes
  • 12:32 marostegui: Power cycled dbstore1001 after it crashed - T186596
  • 11:54 marostegui: Sanitize s4 - T174569
  • 11:11 _joe_: forcing a resync of /dev/md1 on conf2001 to verify if the higher timeouts avoid consensus loss in etcd
  • 11:02 ema: restart pybal on codfw primary LVSs to make them reconnect to etcd
  • 11:01 ema: restart pybal on codfw secondary LVSs to make them reconnect to etcd
  • 10:57 ema: restart pybal on eqiad primary LVSs to make them reconnect to etcd
  • 10:55 ema: restart eqiad secondary LVSs to make them reconnect to etcd
  • 10:47 _joe_: rolling restart of the eqiad etcd cluster
  • 10:39 _joe_: rolling restart of the codfw cluster to pick up the config changes
  • 09:38 marostegui: Sanitizing s2 - T174569
  • 08:41 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Fully repool db1077 (duration: 00m 55s)
  • 08:21 elukey: rollback apache/httpd changes on phab1001 (restart required)
  • 08:07 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:43 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Increase db1077 weight (duration: 00m 55s)
  • 07:28 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Repool db1077 with low weight (duration: 00m 53s)
  • 07:06 marostegui: Stop MySQL on db1077 for a full upgrade
  • 07:01 marostegui@tin: Synchronized wmf-config/db-eqiad.php: Depool db1077 for MariaDB and kernel upgrade (duration: 00m 56s)
  • 06:49 marostegui: Fix replication on labsdb1010 - T186579
  • 03:32 demon@tin: Finished deploy [gerrit/gerrit@f25f017]: adding gitiles plugin (duration: 00m 10s)
  • 03:32 demon@tin: Started deploy [gerrit/gerrit@f25f017]: adding gitiles plugin
  • 03:17 foks: reset email for User:Andrewman327
  • 02:32 demon@tin: Synchronized tests/Defines.php: no op (duration: 00m 55s)
  • 02:30 l10nupdate@tin: scap sync-l10n completed (1.31.0-wmf.17) (duration: 06m 15s)
  • 01:47 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable AICaptcha data collection on group0/group1 T186244 (duration: 00m 56s)
  • 00:25 thcipriani@tin: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ps.svg: SWAT: Update the ps mobile wordmark T184442 (duration: 00m 55s)
  • 00:14 thcipriani@tin: Synchronized wmf-config/InitialiseSettings.php: SWAT: Configure settings feedback link T182217 (duration: 00m 56s)

2018-02-05

  • 23:21 mutante: nihal - restarted puppetdb service
  • 23:07 mobrovac@tin: Finished deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395 (duration: 03m 29s)
  • 23:04 mobrovac@tin: Started deploy [citoid/deploy@7bbc583]: Fix TypeError bug - T186395
  • 22:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 55s)
  • 22:45 mobrovac@tin: Synchronized wmf-config/jobqueue.php: EventBus: Enable htmlCacheUpdate jobs for all projects - T182023 (duration: 00m 56s)
  • 22:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate (duration: 00m 54s)
  • 22:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@4543102]: Revert the switch to librdkafka 0.11 and enable htmlCacheUpdate
  • 21:47 mholloway-shell@tin: Finished deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a (duration: 06m 38s)
  • 21:47 ppchelko@tin: Finished deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023 (duration: 02m 27s)
  • 21:45 chasemp: asw-b-codfw# rollback 0 pending questions on T183167
  • 21:45 ppchelko@tin: Started deploy [cpjobqueue/deploy@aebfded]: Enable htmlCacheUpdate job for all wikis T182023
  • 21:41 mholloway-shell@tin: Started deploy [mobileapps/deploy@6cae404]: Update mobileapps to 3140b1a
  • 21:07 tgr@tin: Finished scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki (duration: 18m 24s)
  • 20:48 tgr@tin: Started scap: T186244 backporting patches and enabling AICaptcha data collection on testwiki
  • 19:44 demon@tin: Synchronized wmf-config/InitialiseSettings.php: collation for abwiki (duration: 00m 55s)
  • 19:32 demon@tin: Finished scap: adding collation for Abkhaz (duration: 05m 12s)
  • 19:27 demon@tin: Started scap: adding collation for Abkhaz
  • 19:26 demon@tin: Synchronized multiversion/MWWikiversions.php: drop php5.3 support (duration: 00m 56s)
  • 19:22 demon@tin: Synchronized wmf-config/InitialiseSettings.php: Enable ArticlePlaceholder extension for urwiki (duration: 00m 56s)
  • 19:05 elukey: executed 'echo '/srv/apache2_dump/core.%h.%e.%p.%t' > /proc/sys/kernel/core_pattern' on phab1001 - T182832
  • 18:57 ppchelko@tin: Finished deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps (duration: 14m 42s)
  • 18:42 ppchelko@tin: Started deploy [restbase/deploy@44f2d2b]: Pass cache-control headers to /sys/mobileapps
  • 18:37 mutante: added bstorm to acl*operations-team (project 29) on Phabricator (T185493)
  • 18:35 elukey: add 'ulimit -c unlimited' to /etc/default/apache2 to see if httpd's CoreDumpDirectory works properly on phab1001
  • 18:35 mutante: welcome new root shell user bstorm
  • 18:31 mutante: added bstorm to the 'wmf' and 'ops' LDAP groups (modify-ldap-groups on terbium) (T185493)
  • 18:30 ppchelko@tin: Finished deploy [restbase/deploy@55e9d87]: Enable ensure_content_type filter for mobile content (duration: 12m 04s)
  • 18:18 ppchelko@tin: Started deploy