Server Admin Log/Archive 36

From Wikitech

2018-12-31

  • 16:27 ejegg: disabled SmashPig recurring donation charge job for investigation

2018-12-30

  • 07:17 elukey: restart pdfrender on scb1002 (alarms flapping)
  • 02:07 bawolff@deploy1001: Synchronized private/PrivateSettings.php: fine-tune antispam measure T212667 (duration: 00m 47s)

2018-12-29

  • 14:53 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T212667 218371fd35 - Adjust mw.org abusefilter emergency shutoff threshold down to 0.3 (duration: 00m 46s)
  • 14:30 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T212667 fe72284c Adjust account throttle limits (duration: 00m 46s)
  • 13:30 bawolff@deploy1001: Synchronized private/PrivateSettings.php: T212667 - adjust spam block (duration: 00m 44s)
  • 12:34 bawolff@deploy1001: Synchronized private/PrivateSettings.php: T212667 - make spam mitigation global (duration: 00m 49s)
  • off: manually recreated /dev/log symlink on kubernetes1001, restarting systemd-journald.socket didn't work (this should fix cron-spam emails from the host every hour)
  • off: restarted pdfrender on scb1004
  • 09:21 elukey: restart pdfrender on scb1004

2018-12-28

  • 18:14 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 97446843a27 T212667 - Temp increase abusefilter emergency cutoff on mw.org to deal with spam attack (duration: 00m 46s)
  • 18:02 bawolff@deploy1001: Synchronized private/PrivateSettings.php: T212667 - adjust account creation (duration: 00m 47s)
  • 15:28 bawolff@deploy1001: Synchronized private/PrivateSettings.php: Attempt to adjust captcha settings for T212667 (duration: 00m 46s)

2018-12-27

  • 19:31 mobrovac@deploy1001: Finished deploy [restbase/deploy@a67f38e]: Fix rate-limiter crash (with increased deploy delays), take #3 (duration: 25m 49s)
  • 19:05 mobrovac@deploy1001: Started deploy [restbase/deploy@a67f38e]: Fix rate-limiter crash (with increased deploy delays), take #3
  • 18:15 mobrovac@deploy1001: Started deploy [restbase/deploy@a67f38e]: Fix rate-limiter crash (with increased deploy delays), take #2
  • 18:15 mobrovac@deploy1001: Finished deploy [restbase/deploy@a67f38e]: Fix rate-limiter crash (with increased deploy delays) - T212631 (duration: 11m 20s)
  • 18:04 mobrovac@deploy1001: Started deploy [restbase/deploy@a67f38e]: Fix rate-limiter crash (with increased deploy delays) - T212631
  • 18:04 mobrovac@deploy1001: deploy aborted: Fix rate-limiter crash (with increased deploy delays) - T212631 (duration: 00m 09s)
  • 18:04 mobrovac@deploy1001: Started deploy [restbase/deploy@a67f38e]: Fix rate-limiter crash (with increased deploy delays) - T212631
  • 18:03 mobrovac@deploy1001: deploy aborted: Fix rate-limiter crash - T212631 (duration: 13m 09s)
  • 17:50 mobrovac@deploy1001: Started deploy [restbase/deploy@70c4752]: Fix rate-limiter crash - T212631
  • 17:20 mobrovac@deploy1001: Finished deploy [restbase/deploy@ae7a537]: Fix rate-limiter crash - T212631 - deploy only on canary restbase1007 (duration: 04m 24s)
  • 17:15 mobrovac@deploy1001: Started deploy [restbase/deploy@ae7a537]: Fix rate-limiter crash - T212631 - deploy only on canary restbase1007
  • 16:00 cmjohnson1: powercycling frdb1001 for troubleshooting
  • 15:21 bawolff: Updated interwiki cache https://biblio.wiki to https://wikilivres.org -> T212650
  • 15:19 bawolff@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 03m 34s)
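
Entries like the ones above follow a regular shape: a bullet, an HH:MM timestamp, an author (optionally with an `@host` suffix), a message, and sometimes a trailing `(duration: MMm SSs)`. As a minimal sketch, assuming that shape holds, a line can be split into fields as follows. The names `Entry`, `parse_entry`, and `duration_s` are illustrative and not part of any Wikimedia tooling.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape: "  • HH:MM author[@host]: message (duration: MMm SSs)"
ENTRY_RE = re.compile(
    r"^\s*•\s+(?P<time>\d{2}:\d{2})\s+(?P<who>[^:\s]+):\s+(?P<msg>.*)$"
)
DURATION_RE = re.compile(r"\(duration:\s*(?P<m>\d+)m\s+(?P<s>\d+)s\)\s*$")


class Entry(NamedTuple):
    time: str                   # "HH:MM" as written in the log
    who: str                    # author, possibly "user@host"
    message: str                # everything after "author: "
    duration_s: Optional[int]   # trailing duration in seconds, if present


def parse_entry(line: str) -> Optional[Entry]:
    """Return the parsed entry, or None if the line lacks a timestamp/author."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    d = DURATION_RE.search(m.group("msg"))
    duration = int(d.group("m")) * 60 + int(d.group("s")) if d else None
    return Entry(m.group("time"), m.group("who"), m.group("msg"), duration)


finished = parse_entry(
    "  • 19:31 mobrovac@deploy1001: Finished deploy [restbase/deploy@a67f38e]: "
    "Fix rate-limiter crash (with increased deploy delays), take #3 "
    "(duration: 25m 49s)"
)
manual = parse_entry("  • 16:00 cmjohnson1: powercycling frdb1001 for troubleshooting")
```

Lines without a leading timestamp return `None` rather than a partial entry, so they can be reviewed by hand.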

2018-12-26

  • 16:37 godog: clear logstash persistent queue /var/lib/logstash on logstash100[789]
  • 15:47 godog: roll-restart logstash on logstash100[789]
  • 14:26 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=wtp1028.eqiad.wmnet
  • 14:18 godog: powercycle wtp1028 - nothing on console
  • 09:13 godog: swift eqiad-prod: more weight for ms-be10[44-50].eqiad.wmnet - T209618
  • 07:20 marostegui: Fix replication db1124:3318

2018-12-24

  • 05:38 _joe_: powercycling lvs3001

2018-12-23

  • 11:53 marostegui: Fix replication on db1124:3318
  • 10:25 ariel@deploy1001: Finished deploy [dumps/dumps@af74350]: python3 fixup for show runtimes (duration: 00m 05s)
  • 10:25 ariel@deploy1001: Started deploy [dumps/dumps@af74350]: python3 fixup for show runtimes
  • 08:53 apergos: restarted pdfrender on scb1003

2018-12-22

  • 18:45 elukey: manual cleanup of old log files on an-coord1001 (disk space issues)
  • 15:58 godog: reboot ms-be2018, stuck on sd 0:1:0:1: rejecting I/O to offline device
  • 10:21 cwd: re-enabled process-control
  • 08:36 cwd: took down sidebar fr campaign
  • 08:36 cwd: disabled process-control
  • 01:00 AaronSchulz: Deployed b47e9fcfece99 to navtiming
  • 00:59 aaron@deploy1001: Finished deploy [performance/navtiming@b47e9fc]: (no justification provided) (duration: 00m 05s)
  • 00:59 aaron@deploy1001: Started deploy [performance/navtiming@b47e9fc]: (no justification provided)

2018-12-21

  • 21:29 mutante: phab1002 - temp hack to unbreak phd / systemd alert, real fix will be phab deployment to new server
  • 21:28 mutante: phab1002 - mkdir -p /srv/phab/libext/ava/src  ; touch __phutil_library_init__.php
  • 21:05 mutante: phab1002 - apt autoremove
  • 21:04 mutante: phab1002 - removing all php related packages and letting puppet reinstall them
  • 20:38 mutante: phab1002 - restart php-fpm, restart phd for testing. phd fails
  • 18:50 mutante: [scb1001:~] $ sudo systemctl restart pdfrender
  • 14:20 dcausse: elastic@eqiad deleting unused index enwiki_general_1537906513
  • 13:24 moritzm: installing subversion updates from stretch point release
  • 12:34 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: T211494 (duration: 00m 44s)
  • 12:31 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T211494 (duration: 00m 45s)
  • 12:10 moritzm: rebooting url downloaders to pick up SSBD-enabled QEMU
  • 11:38 moritzm: rebooting debug proxies to pick up SSBD-enabled QEMU
  • 09:58 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ms-fe2006.codfw.wmnet
  • 09:57 ema: repool ms-fe2006 with old certs, test successful T212215#4839960
  • 09:23 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=ms-fe2006.codfw.wmnet
  • 09:22 ema: depool ms-fe2006 to test new TLS certs T212215
  • 08:45 moritzm: upgrading nginx on sodium
  • 08:41 moritzm: upgrading nginx on debug proxies
  • 06:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove old comments T211973 (duration: 00m 46s)
  • 00:52 twentyafterfour: SWAT Finished. See you all next year!
  • 00:50 twentyafterfour@deploy1001: Synchronized php-1.33.0-wmf.9/extensions/MobileFrontend/: SWAT: sync https://gerrit.wikimedia.org/r/c/mediawiki/extensions/MobileFrontend/+/481026 (duration: 00m 48s)

2018-12-20

  • 22:46 mutante: phab1001 / phabricator: upgraded nodejs package
  • 22:39 mutante: phab1001 / phabricator: installing php5 package upgrades
  • 19:29 kaldari@deploy1001: Synchronized wmf-config/Wikibase.php: syncing Wikibase for SWAT deployment (duration: 00m 45s)
  • 19:27 kaldari@deploy1001: Synchronized wmf-config/InitialiseSettings.php: syncing InitialiseSettings for SWAT deployment (duration: 00m 46s)
  • 19:03 elukey: restart hdfs namenode on an-master1002 with new heap settings (currently standby, 8->12G)
  • 18:30 elukey: remove hdfs journalnode config+packages from analytics10(28|35) - not used anymore - T209929
  • 18:29 elukey: restart hdfs namenode on an-master1001 with new heap settings (currently standby, 8->12G)
  • 18:06 mutante: doc1001 - merged gerrit:480881 and then manually moved the entire /srv/org/wikimedia/doc/ structure into /srv/docroot/srv/org/wikimedia/ and deleted the old dirs T137890
  • 17:53 arturo: updating puppet compiler facts: `PUPPET_COMPILER=compiler1002.puppet-diffs.eqiad.wmflabs modules/puppet_compiler/files/compiler-update-facts`
  • 17:49 arturo: updating puppet compiler facts: `PUPPET_COMPILER=compiler1001.puppet-diffs.eqiad.wmflabs modules/puppet_compiler/files/compiler-update-facts`
  • 17:05 XioNoX: add 208.80.155.88/29 to cloud-in4 term icmp - T207663
  • 17:02 XioNoX: configure additional 208.80.155.88/29 IPs on cloud-instance-transport1-b-eqiad - T207663
  • 16:59 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.9
  • 16:51 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.9/extensions/Wikibase: SWAT: Revert "Fail hard if an entity namespace is not configured." (T212427) (duration: 01m 17s)
  • 16:31 elukey: remove two journal nodes from the Analytics hadoop cluster - T209929
  • 14:53 moritzm: installing nodejs security updates on maps* (was tested via T211419)
  • 14:41 moritzm: restarted etherpad for nodejs security updates
  • 14:39 elukey: add two journal nodes to the Analytics Hadoop cluster - T209929
  • 14:30 moritzm: installing libdap updates from stretch point release
  • 14:26 moritzm: rearmed keyholder on netmon1002 after reboot
  • 14:22 moritzm: rebooting netmon1002 for kernel security update
  • 13:09 moritzm: installing xapian-core updates from stretch point release
  • 12:53 arturo: T209616 installing cloudvirt1030, icinga downtime for 1 day
  • 12:48 kartik@deploy1001: Finished deploy [cxserver/deploy@16f65cb]: Update cxserver to 803baa4 (T210581, T211889, T144467, T209473) (duration: 04m 42s)
  • 12:47 moritzm: installing libxcursor security updates
  • 12:43 kartik@deploy1001: Started deploy [cxserver/deploy@16f65cb]: Update cxserver to 803baa4 (T210581, T211889, T144467, T209473)
  • 12:31 moritzm: installing fuse updates from stretch point release
  • 12:09 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546, T202497) (duration: 00m 52s)
  • 12:08 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546, T202497) (duration: 00m 53s)
  • 12:02 moritzm: draining restbase1016 for eventual reboot for kernel security update
  • 11:46 moritzm: draining restbase1015 for eventual reboot for kernel security update
  • 11:34 moritzm: powercycling restbase1014, similar EFI ASSERT error to T212305
  • 11:21 moritzm: draining restbase1014 for eventual reboot for kernel security update
  • 10:59 banyek: executing schema change on db1068 (s4 master) - T85757
  • 10:58 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1121 after schema change - T85757 (duration: 00m 52s)
  • 10:53 banyek: repooling db1121 after schema change T85757
  • 10:45 moritzm: draining restbase1013 for eventual reboot for kernel security update
  • 10:44 banyek: stopping replication on db1121 - T85757
  • 10:40 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1121 for schema change - T85757 (duration: 00m 52s)
  • 10:35 banyek: depooling db1121 for schema change T85757
  • 10:33 banyek: executing schema change on dbstore1002 - T85757
  • 10:22 banyek: executing schema change on db1102 - T85757
  • 10:20 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1081 after schema change - T85757 (duration: 00m 51s)
  • 10:19 moritzm: draining restbase1012 for eventual reboot for kernel security update
  • 10:15 banyek: repooling db1081 after schema change T85757
  • 10:10 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1081 for schema change - T85757 (duration: 00m 51s)
  • 10:05 banyek: depooling db1081 for schema change T85757
  • 10:00 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1103:3314 after schema change - T85757 (duration: 00m 52s)
  • 09:56 banyek: repooling db1103:3314 after schema change T85757
  • 09:49 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1103:3314 for schema change - T85757 (duration: 00m 51s)
  • 09:46 banyek: depooling db1103:3314 for schema change T85757
  • 09:45 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1097:3314 after schema change - T85757 (duration: 00m 51s)
  • 09:41 legoktm@deploy1001: Synchronized php-1.33.0-wmf.8/includes/: T199540 (duration: 01m 06s)
  • 09:40 legoktm@deploy1001: Synchronized php-1.33.0-wmf.9/includes/: T199540 (duration: 01m 14s)
  • 09:39 banyek: repooling db1097:3314 after schema change T85757
  • 09:35 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T187299 Increase ruwiki navtiming rate + frwiki survey rate (duration: 00m 52s)
  • 09:29 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1097:3314 for schema change - T85757 (duration: 00m 52s)
  • 09:23 banyek: depooling db1097:3314 for schema change T85757
  • 08:06 elukey: roll restart of druid middlemanagers on druid* to pick up new port settings
  • 07:17 marostegui: Re-start codfw s4 backup as the previous one failed
  • 07:11 elukey: restart pdfrender on scb1002
  • 07:10 elukey: restart rsyslog on lithium - in:imtcp stuck in recvfrom ms-be2047.codfw.wmnet - T199406
  • 06:35 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2057 T212277 (duration: 00m 57s)
  • 01:41 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@c7977a7]: Update mobileapps to 42c011e (duration: 04m 08s)
  • 01:37 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@c7977a7]: Update mobileapps to 42c011e
  • 01:09 eileen_: civicrm revision changed from 9d727e4708 to b33dcd3c94, config revision is 0f94a475b7
  • 00:09 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/479911/ (duration: 00m 53s)
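
The T85757 entries above repeat one pattern per replica: depool the host, apply the schema change, then repool it before moving to the next host. A minimal sketch of that loop, assuming three caller-supplied callables (`depool`, `apply_change`, `repool`) that are hypothetical placeholders and not real conftool, scap, or MediaWiki APIs:

```python
from typing import Callable, Iterable, List


def rolling_schema_change(
    replicas: Iterable[str],
    depool: Callable[[str], None],
    apply_change: Callable[[str], None],
    repool: Callable[[str], None],
) -> List[str]:
    """Apply a change to one replica at a time so the rest keep serving traffic.

    All three callables are hypothetical hooks supplied by the caller.
    If apply_change raises, the loop stops and the failing host stays
    depooled so it can be inspected before anything else is touched.
    """
    completed: List[str] = []
    for host in replicas:
        depool(host)          # take the host out of rotation first
        apply_change(host)    # run the change while no traffic reaches it
        repool(host)          # return it to rotation only after success
        completed.append(host)
    return completed


# Dry run that only records the order of operations.
steps: List[str] = []
done = rolling_schema_change(
    ["db1097:3314", "db1103:3314", "db1081", "db1121"],
    depool=lambda h: steps.append(f"depool {h}"),
    apply_change=lambda h: steps.append(f"alter {h}"),
    repool=lambda h: steps.append(f"repool {h}"),
)
```

Processing hosts strictly one at a time matches the ordering visible in the log, where each repool is recorded before the next depool.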

2018-12-19

  • 23:24 mutante: syncing facts from puppetmaster1001 to compiler1001/compiler1002
  • 23:21 eileen_: civicrm revision changed from d5c3d5fd17 to 9d727e4708, config revision is 0f94a475b7
  • 23:05 krinkle@deploy1001: Finished deploy [performance/navtiming@64e3f63]: (no justification provided) (duration: 00m 05s)
  • 23:05 krinkle@deploy1001: Started deploy [performance/navtiming@64e3f63]: (no justification provided)
  • 22:15 mutante: scb1003: systemctl restart pdfrender
  • 20:44 hashar: 1.33.0-wmf.9 on group1 looks fine.
  • 20:03 hashar@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.9 (duration: 00m 51s)
  • 20:02 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.9
  • 19:33 XioNoX: Revert "Redirect eqsin/ulsfo caches to eqiad" - T210467
  • 19:32 XioNoX: repool codfw - T210467
  • 19:28 XioNoX: reactivate BGP sessions to telia on cr1-codfw - T211715
  • 19:07 chasemp: cp of files to ext drive on labstore1007
  • 18:41 marostegui: Stop MySQL on db2057
  • 18:28 XioNoX: replace `interface-range vlan-private1-b-eqiad member ge-6/0/*` with individual interfaces on asw2-b-eqiad
  • 18:09 ejegg: updated fundraising CiviCRM from 8e18485697 to d5c3d5fd17
  • 17:32 addshore: SWAT done
  • 17:22 addshore@deploy1001: Synchronized wmf-config: Wikibase: wikidatawiki upsert idGenerator, T194299 (duration: 00m 52s)
  • 17:13 addshore@deploy1001: Synchronized wmf-config: Wikibase: testwikidatawiki upsert idGenerator, T194299 (duration: 00m 52s)
  • 17:05 XioNoX: remove 2nd port to AS8220 (cf. email to peering@)
  • 17:04 addshore@deploy1001: Synchronized wmf-config: Wikibase: prepare to set $wgWBRepoSettings idGenerator, T194299 (duration: 00m 53s)
  • 16:59 XioNoX: deactivate BGP sessions to telia on cr1-codfw - T211715
  • 16:58 fsero: DNS: updating wmnet to include new registries T212212
  • 16:57 hashar@deploy1001: Synchronized php-1.33.0-wmf.9/extensions/ExtensionDistributor/includes/specials/SpecialBaseDistributor.php: Follow-up f686d348: No need for an <img> tag any more - T212217 (duration: 00m 52s)
  • 16:52 XioNoX: codfw row D maintenance finished without issues - T210467
  • 16:39 cmjohnson1: swapping disk in slot 2 on db1072
  • 16:33 XioNoX: shutdown asw-d4-codfw - T210467
  • 16:21 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting comment migration to new on group 2 (T166733) (duration: 00m 52s)
  • 16:04 XioNoX: Redirect eqsin/ulsfo caches to eqiad - T210467
  • 16:01 moritzm: installing php5 security updates on jessie
  • 15:48 XioNoX: depool codfw - T210467
  • 15:39 chasemp: various cp jobs on labstore1007 to ext media
  • 15:33 chasemp: labstore1007 mount /dev/sde /mnt/T211327
  • 15:13 moritzm: draining restbase1011 for eventual reboot for kernel security update
  • 15:07 marostegui: Drop image_comment_temp from labswiki and labtestwiki - T209591
  • 14:58 moritzm: draining restbase1010 for eventual reboot for kernel security update
  • 14:42 moritzm: draining restbase1009 for eventual reboot for kernel security update
  • 14:16 moritzm: draining restbase1008 for eventual reboot for kernel security update
  • 14:03 moritzm: draining restbase1007 for eventual reboot for kernel security update
  • 13:57 moritzm: installing nodejs updates on wtp*
  • 13:36 marostegui: Correction from the previous !log: Rename table valid_tag on db1089 (s1) - T212254
  • 13:35 marostegui: Rename table valid_tag on db1081 (s1) - T212254
  • 13:31 marostegui: Drop image_comment_temp on s4 - T209591
  • 12:46 marostegui: Drop image_comment_temp on s3 - T209591
  • 12:24 zeljkof: EU SWAT finished
  • 12:23 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.9/extensions/Kartographer/: SWAT: Fix using at-ease functions in namespaced class (T212218) (duration: 00m 53s)
  • 12:06 moritzm: rearmed keyholder after netmon2001 reboot
  • 11:58 moritzm: rebooting netmon2001 for kernel security update
  • 11:50 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1091 after schema change - T85757 (duration: 00m 52s)
  • 11:49 moritzm: rebooting matomo1001 to pick up SSBD-enabled qemu
  • 11:46 banyek: repooling db1091 after schema change - T85757
  • 11:41 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1091 for schema change - T85757 (duration: 00m 52s)
  • 11:37 banyek: depooling db1091 for schema change - T85757
  • 11:34 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: repool db1084 after schema change - T85757 (duration: 00m 51s)
  • 11:30 banyek: repooling db1084 after schema change - T85757
  • 11:29 moritzm: upgrading nodejs on restbase2013-2018
  • 11:26 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2050 after recloning db2057 T212275 (duration: 00m 52s)
  • 11:25 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1084 for schema change - T85757 (duration: 00m 52s)
  • 11:15 banyek: depooling db1084 for schema change - T85757
  • 11:13 marostegui: Stop MySQL and power off db2057 for firmware upgrade - T212277
  • 11:11 marostegui: Drop image_comment_temp from s5 - T209591
  • 10:37 moritzm: draining restbase2012 for eventual reboot for kernel security update
  • 10:28 banyek: executing schema change in db2051 (s4 codfw master) with replication enabled - T85757
  • 10:25 banyek: stopping replication on db2073 as executing schema change on codfw master - T85757
  • 10:14 moritzm: draining restbase2011 for eventual reboot for kernel security update
  • 09:53 banyek: dropping tables 'flagged%' on db1066 ptwiki with replication enabled - T211544
  • 09:41 moritzm: draining restbase2010 for eventual reboot for kernel security update
  • 09:37 banyek: dropping tables with 'T211544' prefix on db1122 - T211544
  • 09:24 moritzm: draining restbase2009 for eventual reboot for kernel security update
  • 09:14 moritzm: rebooting restbase2008 for kernel security update
  • 08:53 elukey: roll restart of cassandra on aqs1005-1009 for openjdk upgrades
  • 08:50 marostegui: Drop image_comment_temp from s7 - T209591
  • 08:46 marostegui: Drop image_comment_temp from s6 - T209591
  • 08:43 akosiaris: rebalance row_A ganeti01.svc.codfw.wmnet nodegroup after recabling T210447
  • 08:40 marostegui: Drop image_comment_temp from s8 - T209591
  • 08:37 moritzm: draining restbase2007 for eventual reboot for kernel security update
  • 08:28 marostegui: Drop image_comment_temp from s2 - T209591
  • 08:27 marostegui: Drop image_comment_temp from s1 - T209591
  • 08:16 godog: swift eqiad-prod: more weight for ms-be10[44-50].eqiad.wmnet - T209618
  • 07:37 marostegui: Drop nodepooldb on m5 master - T212230
  • 07:22 marostegui: Stop MySQL on db2050 to clone db2057 - T212275
  • 07:15 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2050 to clone db2057 T212275 (duration: 00m 52s)
  • 07:00 marostegui: Enable GTID on s8 codfw master (db2045) - T211973
  • 06:58 marostegui: Enable GTID on s1 codfw master (db2048) - T211973
  • 06:44 marostegui: Remove nodepool@10.64.16.155 user from m5 master - T212230
  • 06:34 marostegui: Hard reboot db2057 - T212275
  • 06:27 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2057 - storage crashed T212275 (duration: 01m 08s)
  • 01:27 tstarling@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/AbuseFilter/maintenance/normalizeThrottleParameters.php: g 480681 make maintenance script dry run more useful (duration: 00m 52s)
  • 01:25 tstarling@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/AbuseFilter/includes/AbuseFilter.php: g 480680 fix exception in maintenance script (duration: 00m 54s)
  • 00:11 mutante: contint1001 - rsyncing /srv/org/wikimedia/docs to rsync://docs1001.eqiad.wmnet/docs T211974

2018-12-18

  • 23:49 XioNoX: remove BGP session to AS50629 from cr2-esams (not in AMS-IX anymore)
  • 20:33 otto@deploy1001: Finished deploy [eventlogging/analytics@104adb5]: Send JSON string of event for validation errors in EventError (duration: 00m 04s)
  • 20:33 otto@deploy1001: Started deploy [eventlogging/analytics@104adb5]: Send JSON string of event for validation errors in EventError
  • 19:43 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@963d704]: Parse summaries from lead objects only (T202642) (duration: 05m 26s)
  • 19:38 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@963d704]: Parse summaries from lead objects only (T202642)
  • 19:30 robh: migration of ulsfo pdus complete
  • 19:19 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Production configuration for GrowthExperiments Help Panel (duration: 00m 52s)
  • 19:06 XioNoX: redirect ns1 back to authdns2001 - T210447
  • 18:51 moritzm: installing libx11 security updates
  • 18:48 XioNoX: depool ulsfo - T209101
  • 18:42 XioNoX: repool codfw - T210447
  • 18:41 XioNoX: Revert "Redirect eqsin/ulsfo caches to eqiad" - T210447
  • 18:05 chasemp: stat1004:~# umount /mnt/T211327
  • 17:58 XioNoX: shutdown fpc4 for replacement - T210447
  • 17:51 bblack: deploying php7 cache-splitter patch to cache_text - https://gerrit.wikimedia.org/r/c/operations/puppet/+/478680 - T206339
  • 17:49 godog: bounce rsyslog on lithium, tls listener timeout
  • 17:34 volans: triggered restart of ircecho on icinga1001 while applying https://gerrit.wikimedia.org/r/480509
  • 16:37 jijiki: librsvg* 2.40.20-3+wmf1+stretch1 uploaded to components/thumbor to stretch-wikimedia - T209886
  • 16:22 bblack: powercycle cp1075 from console (crashed, apparently)
  • 16:07 XioNoX: starting codfw row A recabling - T210447
  • 16:05 ejegg: updated fundraising python tools from af5dbee8eb to 5f44d9dd43
  • 15:53 XioNoX: redirect ns1 to authdns1001 for T210447
  • 15:47 XioNoX: redirect eqsin/ulsfo caches to eqiad for T210447
  • 15:44 XioNoX: depool codfw for T210447
  • 15:30 akosiaris: empty ganeti2005, ganeti2006 for T210447
  • 15:16 Amir1: mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintEntities.php --wiki=testwikidatawiki --config-format=wgConf | tee WikibaseQualityConstraints-config.php
  • 14:55 akosiaris: restart pybal on lvs1006, lvs2003 for blubberoid LVS deployment. T205919
  • 14:40 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.9
  • 14:26 akosiaris: restart ircecho, seems to have croaked with Dec 17 19:39:52 icinga1001 ircecho[861]: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 174: ordinal not in range(128)
  • 14:25 zfilipin@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.9 and rebuild l10n cache (duration: 34m 11s)
  • 14:15 akosiaris: restart pybal on lvs1016, lvs2006 for blubberoid LVS deployment. T205919
  • 13:51 zfilipin@deploy1001: Started scap: testwiki to php-1.33.0-wmf.9 and rebuild l10n cache
  • 13:12 addshore@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/Wikibase/lib/includes/Formatters/ControlledFallbackEntityIdFormatter.php: T201930 ControlledFallbackEntityIdFormatter, track unique value formats (duration: 00m 46s)
  • 12:52 Amir1: deployed a patch on wmf.8 for T207814
  • 12:41 filippo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 45s)
  • 12:17 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Perform even more PHP constraint checks before falling back (T209504) (duration: 00m 46s)
  • 12:16 moritzm: installing fuse updates from stretch point release
  • 11:41 moritzm: installing remaining libgd2 security updates
  • 09:38 godog: swift eqiad-prod: initial weights for ms-be10[44-50].eqiad.wmnet - T209618
  • 08:44 marostegui: Enable GTID on s7 codfw master (db2040) - T211973
  • 08:26 marostegui: Enable GTID on es3 - T211973
  • 08:23 marostegui: Enable GTID on es2 - T211973
  • 07:57 elukey: restart cassandra-{a,b} on aqs1004 for openjdk upgrades
  • 07:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 T86338 T202167 (duration: 00m 45s)
  • 07:49 marostegui: Deploy schema change on db1075 (s3 master) T86338 T202167
  • 06:07 marostegui: Deploy schema change on db1078 T86338 T202167
  • 06:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 T86338 T202167 (duration: 00m 45s)
  • 06:03 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2038 after mysql and kernel upgrade (duration: 00m 47s)
  • 00:13 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable MediaViewer thumbnail URL guessing for private wikis (T212099) (duration: 00m 45s)
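
The ircecho failure quoted in the 14:26 entry is the usual symptom of decoding non-ASCII bytes with the `ascii` codec: byte 0xe2 is the lead byte of many three-byte UTF-8 sequences, such as an en dash or curly quotes. The sketch below reproduces the error class in Python 3 and shows one tolerant alternative. The sample message is invented for illustration, and this is not the actual ircecho code or its fix.

```python
# Hypothetical alert text containing an en dash (U+2013, UTF-8 bytes e2 80 93).
raw = "cp1075 \u2013 powercycled".encode("utf-8")


def decode_strict_ascii(data: bytes) -> str:
    """Decode as ASCII; raises UnicodeDecodeError on any byte >= 0x80."""
    return data.decode("ascii")


def decode_utf8(data: bytes) -> str:
    """Decode as UTF-8, replacing malformed bytes instead of raising."""
    return data.decode("utf-8", errors="replace")


try:
    decode_strict_ascii(raw)
    failed = False
    bad_byte = None
except UnicodeDecodeError as exc:
    failed = True
    bad_byte = exc.object[exc.start]  # the first byte the codec rejected

text = decode_utf8(raw)
```

Using `errors="replace"` trades fidelity for availability: a malformed byte becomes U+FFFD instead of stopping the process.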

2018-12-17

  • 23:21 tgr: deployed security patch per T207750#4829275
  • 23:17 chasemp: launch gpg jobs for files on stat1004
  • 23:17 chasemp: launch gpg jobs for files on labstore1007
  • 22:00 mutante: puppetmaster - signed cert for doc1001 (ganeti VM), initial puppet run
  • 21:58 chasemp: stat1004:/var/log# mkfs.exfat /dev/sde && mkdir /mnt/T211327 && mount /dev/sde /mnt/T211327/
  • 21:55 arlolra: Updated Parsoid to 4eba44e (T204622, T211941)
  • 21:42 arlolra@deploy1001: Finished deploy [parsoid/deploy@1bf4dab]: Updating Parsoid to 4eba44e (duration: 08m 55s)
  • 21:33 arlolra@deploy1001: Started deploy [parsoid/deploy@1bf4dab]: Updating Parsoid to 4eba44e
  • 21:20 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@92fdf43]: Update mobileapps to d244439 (duration: 04m 31s)
  • 21:15 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@92fdf43]: Update mobileapps to d244439
  • 20:25 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@c1b6b32]: Rollback to c1b6b32 until the checks are fixed (duration: 01m 56s)
  • 20:23 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@c1b6b32]: Rollback to c1b6b32 until the checks are fixed
  • 20:17 mutante: creating new ganeti VM doc1001.eqiad.wmnet for doc.wikimedia.org - specs as requested by hashar on T211974
  • 20:05 XioNoX: remove sandbox-out4 from all routers - T212155
  • 20:02 XioNoX: remove sandbox-out4 from ulsfo - T212155
  • 19:47 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@f183af7]: Update to 657a515. All hosts (duration: 10m 45s)
  • 19:36 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@f183af7]: Update to 657a515. All hosts
  • 19:33 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@f183af7]: Update to 657a515. Canary on scb2001 (duration: 00m 18s)
  • 19:32 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@f183af7]: Update to 657a515. Canary on scb2001
  • 19:29 raynor: Morning SWAT finished
  • 19:27 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Page issues treatment enabled on all wikis except enwiki(T210553) (duration: 00m 45s)
  • 18:58 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@8036f9b]: Update to 2991db1. Canary on scb2001 (duration: 00m 51s)
  • 18:57 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@8036f9b]: Update to 2991db1. Canary on scb2001
  • 18:42 herron: manually restarting pybal on lvs2003 to add kibana service T205850
  • 18:34 herron@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=logstash,service=kibana,name=logstash2006.codfw.wmnet
  • 18:34 herron@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=logstash,service=kibana,name=logstash2005.codfw.wmnet
  • 18:34 mobrovac@deploy1001: Finished deploy [proton/deploy@ff7c8a2]: Config: Use extdomain and add printable=yes to the req query - T210793 (duration: 01m 32s)
  • 18:32 herron: performing manual puppet run and manually restarting pybal on lvs2006 to add kibana service T205850
  • 18:32 mobrovac@deploy1001: Started deploy [proton/deploy@ff7c8a2]: Config: Use extdomain and add printable=yes to the req query - T210793
  • 18:29 gtirloni: prometheus-node-exporter: Ignore Docker/Kubelet mount points T211810
  • 18:26 akosiaris@deploy1001: scap-helm blubberoid finished
  • 18:26 akosiaris@deploy1001: scap-helm blubberoid cluster codfw completed
  • 18:26 akosiaris@deploy1001: scap-helm blubberoid install --name production --set docker.registry=docker-registry.discovery.wmnet --set main_app.version=2018-12-13-183249-production --set service.deployment=production --set service.externalIP=10.2.2.31 --set service.port=8748 stable/blubberoid [namespace: blubberoid, clusters: codfw]
  • 18:25 akosiaris@deploy1001: scap-helm blubberoid finished
  • 18:25 akosiaris@deploy1001: scap-helm blubberoid cluster eqiad completed
  • 18:25 akosiaris@deploy1001: scap-helm blubberoid install --name production --set docker.registry=docker-registry.discovery.wmnet --set main_app.version=2018-12-13-183249-production --set service.deployment=production --set service.externalIP=10.2.1.31 --set service.port=8748 stable/blubberoid [namespace: blubberoid, clusters: eqiad]
  • 18:24 akosiaris@deploy1001: scap-helm blubberoid install --name production --set docker.registry=docker-registry.discovery.wmnet --set main_app.version=2018-12-13-183249-production --set service.deployment=production --set service.externalIP=10.2.1.31 --set service.port=8748 stable/blubberoid [namespace: blubberoid, clusters: eqiad]
  • 18:23 akosiaris@deploy1001: scap-helm blubberoid install --name production --set docker.registry=docker-registry.discovery.wmnet --set main_app.version=2018-12-13-183249-production --set service.deployment=production --set service.port=8748 stable/blubberoid [namespace: blubberoid, clusters: eqiad,codfw]
  • 18:21 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@69c000a]: DelayQueue fix and mitigation of stale index for updater (duration: 11m 52s)
  • 18:21 moritzm: installing libgd2 security updates
  • 18:10 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@69c000a]: DelayQueue fix and mitigation of stale index for updater
  • 18:07 akosiaris@deploy1001: scap-helm blubberoid finished
  • 18:07 akosiaris@deploy1001: scap-helm blubberoid cluster staging completed
  • 18:07 akosiaris@deploy1001: scap-helm blubberoid install --name staging --set docker.registry=docker-registry.discovery.wmnet --set main_app.version=2018-12-13-183249-production stable/blubberoid [namespace: blubberoid, clusters: staging]
  • 18:04 herron@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=logstash,service=kibana,name=logstash2004.codfw.wmnet
  • 16:41 moritzm: installing libarchive security updates on jessie
  • 16:33 moritzm: installing php5 security updates on jessie
  • 16:13 anomie: Aborted migrateActors.php run, queries were too slow.
  • 16:07 anomie@mwmaint1002: Running migrateActors.php on test wikis and mediawikiwiki for T188327
  • 16:07 mobrovac@deploy1001: Started restart [electron-render/deploy@94d27d7]: Electron stuck
  • 14:33 marostegui: Upgrade mysql and kernel on db2093 (tendril sby host on codfw)
  • 14:10 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: wmgWikibaseMaxItemIdForNewItemIdHtmlFormatter 200million everywhere T201837 (duration: 00m 44s)
  • 14:09 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wmgWikibaseMaxItemIdForNewItemIdHtmlFormatter 200million everywhere T201837 (duration: 00m 44s)
  • 14:06 marostegui: Stop MySQL on db2038 for mysql and kernel upgrade
  • 14:06 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2038 for mysql and kernel upgrade (duration: 00m 44s)
  • 13:58 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wmgWikibaseMaxItemIdForNewItemIdHtmlFormatter 10million for wikidatawiki T201837 (duration: 00m 45s)
  • 13:49 marostegui: Enable GTID on s3 codfw master db2043 - T211973
  • 13:49 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: wmgWikibaseMaxItemIdForNewItemIdHtmlFormatter 1million for wikidatawiki T201837 (duration: 00m 45s)
  • 13:46 fsero: installing new version of php-excimer on mwdebug* - T205059
  • 13:41 fsero: installing new version of php-excimer on mwdebug2001 - T205059
  • 13:31 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki incubatorwiki --force-protocol https
  • 13:29 Amir1: deleting everything from site_identifiers@incubatorwiki
  • 13:15 ariel@deploy1001: Finished deploy [dumps/dumps@0393ca7]: switch dumps to python3 (duration: 00m 03s)
  • 13:15 ariel@deploy1001: Started deploy [dumps/dumps@0393ca7]: switch dumps to python3
  • 13:04 Amir1: ladsgroup@mwmaint1002:~$ foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https (T209820)
  • 13:02 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=fawiki --force-protocol https
  • 12:52 ema: lvs1016: bounce pybal for new service kartotherian-ssl T211970
  • 12:50 ladsgroup@deploy1001: Synchronized wmf-config/InterwikiSortOrders.php: Update InterwikiSortOrders.php (T209820) (duration: 00m 45s)
  • 12:49 ladsgroup@deploy1001: sync-file aborted: Use the right index for change_tag (T211896) (duration: 00m 02s)
  • 12:38 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add be.wikisource new 1.5x and 2x logos to wgLogoHD (T150618) (duration: 00m 44s)
  • 12:38 ema: lvs2003: bounce pybal for new service kartotherian-ssl T211970
  • 12:29 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Update be.wikisource logo (T211795) (duration: 00m 45s)
  • 12:29 ema: lvs1006: bounce pybal to pick up new service kartotherian-ssl T211970
  • 12:24 ema: lvs2006: bounce pybal to pick up new service kartotherian-ssl
  • 12:22 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable extendedmover user group at ur.wiki (T211978) (duration: 00m 45s)
  • 12:15 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use new minerva logos for cswiki in IS.php (T210979) (duration: 00m 45s)
  • 12:08 zfilipin@deploy1001: Synchronized static/images/mobile/copyright/: SWAT: Upload custom minerva logo for cswiki (T210979) (duration: 00m 44s)
  • 12:02 ladsgroup@deploy1001: Finished deploy [ores/deploy@18d3657]: T206333 T211267 (duration: 14m 14s)
  • 11:48 ladsgroup@deploy1001: Started deploy [ores/deploy@18d3657]: T206333 T211267
  • 10:52 paravoid: reprepro include python 3.4, 3.6, 3.7 to component/pyall (use with care)
  • 10:28 moritzm: installing openssl security updates on stretch
  • 10:18 marostegui: Repool labsdb1011 - T86338
  • 09:55 moritzm: remove debmonitor entries for restbase2001-restbase2006
  • 09:44 marostegui: Enable GTID on s4 codfw master - T211973
  • 09:08 marostegui: Depool labsdb1011 - T86338
  • 09:04 marostegui: Repool labsdb1010 - T86338
  • 09:01 elukey: stop kafkatee on oxygen and rsync /srv/log data to weblog1001
  • 08:47 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 T86338 T202167 (duration: 00m 45s)
  • 08:21 marostegui: Deploy schema change on db1077 with replication (lag will be generated on labsdb:s3) T86338 T202167
  • 08:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 T86338 T202167 (duration: 00m 45s)
  • 08:16 marostegui: Stop replication on labsdb1011
  • 08:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 T86338 T202167 (duration: 00m 45s)
  • 07:58 marostegui: Enable GTID on db2052 (s5 master) - T211973
  • 06:48 marostegui: Enable GTID on db2035 (s2 master) - T211973
  • 06:36 marostegui: Enable GTID on db2034 (x1 master) - T211973
  • 06:29 marostegui: Enable replication consistency options on codfw masters - T211973
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 T86338 T202167 (duration: 00m 46s)
  • 06:28 marostegui: Depool labsdb1010 - T86338
  • 04:22 tstarling@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/AbuseFilter/maintenance/normalizeThrottleParameters.php: gerrit 479998 for testing of normalizeThrottleParameters.php (duration: 00m 44s)
  • 04:20 tstarling@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/AbuseFilter/includes/AbuseFilter.php: gerrit 479998 for testing of normalizeThrottleParameters.php (duration: 00m 59s)

2018-12-16

  • 09:52 elukey: mask + reset-failed kafkatee default instance on sulfur (kafkatee-webrequest works fine)
  • 07:30 marostegui: Reboot db1115 after OOM
  • 07:28 marostegui: Stop MySQL on db1115 so tendril can get back to work

2018-12-15

  • off: restarted pdfrender on scb1004
  • 09:22 elukey: mask + reset-failed kafkatee default instance on weblog1001

2018-12-14

  • 23:59 mutante: LDAP: added aezell to wmf group (T211945) for grafana access
  • 22:00 XioNoX: increase accepted-prefix-limit for HE to 200000
  • 20:38 mutante: sulfur systemctl restart nagios-nrpe-server
  • 20:32 andrew@deploy1001: Finished deploy [horizon/deploy@1a830b9]: Rolling out fix for T177855 (duration: 03m 17s)
  • 20:29 andrew@deploy1001: Started deploy [horizon/deploy@1a830b9]: Rolling out fix for T177855
  • 20:13 otto@deploy1001: Finished deploy [analytics/refinery@ef1f7c6]: deploying refinery-source 0.0.82 with fix for T211833 (duration: 06m 04s)
  • 20:07 otto@deploy1001: Started deploy [analytics/refinery@ef1f7c6]: deploying refinery-source 0.0.82 with fix for T211833
  • 20:07 otto@deploy1001: deploy aborted: (no justification provided) (duration: 00m 00s)
  • 20:07 otto@deploy1001: Started deploy [analytics/refinery@ef1f7c6]: (no justification provided)
  • 17:22 cmjohnson1: mw1272 down for h/w troubleshooting
  • 14:38 marostegui: Enable GTID on db2039 (s6 codfw master) - T211973
  • 14:03 marostegui: Compare ruwiki.revision between db2039 (s6 master) and db1085 - T211973
  • 13:59 marostegui: Enable notifications for db1095 (s3 lag check) - T211973
  • 13:57 marostegui: Enable notifications for db2068 (s7 lag check) - T211973
  • 13:47 dcausse: elastic@codfw copying index data from the main cluster to psi & omega (test disk usage & import speed)
  • 13:42 marostegui: Enable GTID on db1124:3318 - T211973
  • 11:11 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2083 db2068 db2067 after mysql and kernel upgrade (duration: 00m 44s)
  • 10:51 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2083 db2068 db2067 for mysql and kernel upgrade (duration: 00m 45s)
  • 10:50 marostegui: Stop MySQL on db2083, db2068 and db2067 for mysql and kernel upgrade
  • 10:19 filippo@deploy1001: Synchronized wmf-config/logging.php: wmf-config/InitialiseSettings-labs.php (duration: 00m 44s)
  • 10:18 filippo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no justification provided) (duration: 00m 45s)
  • 09:33 marostegui: Deploy schema change on db1095:3313 T86338 T202167
  • 09:28 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2049, db2059, db2070 after mysql and kernel upgrade (duration: 00m 45s)
  • 09:21 banyek: global user rename is in progress - T209488
  • 08:58 marostegui: Stop MySQL on db2049, db2059 and db2070 for mysql and kernel upgrade
  • 08:56 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2049, db2059, db2070 for mysql and kernel upgrade (duration: 00m 43s)
  • 08:50 elukey: swap oxygen with weblog1001
  • 08:48 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2084 and db2088 after mysql and kernel upgrade (duration: 00m 44s)
  • 08:47 elukey: disabled kafkatee-webrequest logstash output on oxygen (prep step before weblog1001)
  • 07:59 marostegui: Deploy schema change on dbstore1002:s3 T86338 T202167
  • 07:49 marostegui: Upgrade mysql and kernel on db2084 and db2088
  • 07:49 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2084 and db2088 for mysql and kernel upgrade (duration: 00m 45s)
  • 06:54 marostegui: Deploy schema change on db2043 (s3 codfw master) - this will generate lag on s3 codfw T86338 T202167
  • 06:53 marostegui: Deploy schema change on db2043 (s3 codfw master) - this will generate lag on s3 codfw T86338 T20216
  • 06:14 marostegui: Deploy schema change on db1062 (s7 primary master) T86338 T202167
  • 06:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1094 T86338 T202167 (duration: 00m 44s)
  • 06:10 marostegui: Deployed schema change on db1094 T86338 T202167
  • 05:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1094 T86338 T202167 (duration: 00m 47s)
  • 01:39 mutante: install2002 deleted /srv/ contents, then mounted /mnt/vdb on /srv, so same content but now / is only 7% used and /srv 57% (T211850)
  • 00:43 mutante: install2002 (T211850) restarted instance, created ext4 filesystem on new /dev/vdb, mounted on /mnt/vdb, rsyncing /srv/ to /mnt/vdb/
  • 00:33 mutante: rebooting install2002 via ganeti2003, to add new virtual disk
  • 00:31 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Enforce a 10-byte password for +staff users, I4ecac70e (duration: 00m 44s)
  • 00:27 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT I7867277d Make wmgBabelMainCategory consistent for sr* wikis (duration: 00m 45s)

2018-12-13

  • 23:57 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T208246: Increase default minimum new password length to 10 for privileged groups (duration: 00m 44s)
  • 23:17 andrew@deploy1001: Finished deploy [horizon/deploy@18c4ca6]: Rolling out fix for T131367 (duration: 03m 25s)
  • 23:13 andrew@deploy1001: Started deploy [horizon/deploy@18c4ca6]: Rolling out fix for T131367
  • 22:31 reedy@deploy1001: Synchronized php-1.33.0-wmf.8/includes/http/HttpRequestFactory.php: T211886 (duration: 00m 44s)
  • 22:29 reedy@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/MobileFrontend: T211903 (duration: 00m 48s)
  • 21:55 mutante: Ganeti - creating new 120G virtual disk on install2002 (T211850)
  • 21:00 otto@deploy1001: Finished deploy [analytics/superset/deploy@UNKNOWN]: revert to version 0.26.3 (duration: 00m 32s)
  • 20:59 otto@deploy1001: Started deploy [analytics/superset/deploy@UNKNOWN]: revert to version 0.26.3
  • 20:33 dcausse: creating 300+ wikis indices in elastic-psi @eqiad and @codfw
  • 20:21 volans: imported elasticsearch-curator_5.2.0-1~deb9u1 into apt.w.o stretch-wikimedia component/spicerack - T205884
  • 20:10 mutante: LDAP - added mneisler to wmf (T211742) - existing shell user, so no gerrit change needed
  • 19:53 hoo: Ran scap pull on snapshot1005 to undo live changes done for dump performance testing
  • 19:49 dcausse: creating 300 wiki indices in elastic-omega@eqiad
  • 19:46 dcausse: SF Morning SWAT done
  • 19:44 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T210381: [cirrus] fix cluster settings for temp clusters psi&omega (duration: 00m 44s)
  • 19:40 moritzm: removed labvirt1014 from debmonitor DB, has been renamed to cloudvirt1014
  • 19:39 volans: imported python-elasticsearch_5.4.0-1~deb9u1 into apt.w.o stretch-wikimedia component/spicerack - T205884
  • 19:35 dcausse@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/TwoColConflict/: T210501: Add missing code to not lose edits on the other side (duration: 00m 45s)
  • 19:29 bblack: authdns2001: upgrading gdnsd to 2.99.9944-beta
  • 19:27 bblack: multatuli: upgrading gdnsd to 2.99.9944-beta
  • 19:22 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T211234: Enable Block notice stats on top blocking wikis (duration: 00m 45s)
  • 19:19 bblack: authdns1001: upgrading gdnsd to 2.99.9944-beta
  • 18:52 arlolra: Updated Parsoid to 4242ad0 (T204622, T211738)
  • 18:40 arlolra@deploy1001: Finished deploy [parsoid/deploy@e27574c]: Updating Parsoid to 4242ad0 (duration: 09m 17s)
  • 18:31 arlolra@deploy1001: Started deploy [parsoid/deploy@e27574c]: Updating Parsoid to 4242ad0
  • 18:30 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@93443fe]: Refine MW API queries (duration: 03m 41s)
  • 18:26 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@93443fe]: Refine MW API queries
  • 18:24 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: T204748 [Beta only] Use newly-fixed config for Wikibase->Commons federation (duration: 00m 44s)
  • 18:22 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T204748 Rename repo-only Wikibase config for clarity [no-op] (duration: 00m 45s)
  • 18:11 bblack: restart pybal on lvs1006 for config updates
  • 18:07 ladsgroup@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/FlaggedRevs/frontend/specialpages/reports/ProblemChanges_body.php: Use the right index for change_tag (T211896) (duration: 00m 46s)
  • 17:45 ladsgroup@deploy1001: Finished deploy [ores/deploy@1a3de73]: T211267 (duration: 13m 53s)
  • 17:41 akosiaris: reapply the zotero calico policy to allow LVS endpoints
  • 17:35 dcausse: elastic@eqiad created cirrus metastore on psi&omega
  • 17:32 ladsgroup@deploy1001: Started deploy [ores/deploy@1a3de73]: T211267
  • 17:31 arturo: T168967 added shiny-server .deb to stretch-wikimedia
  • 17:20 godog: run puppet and bounce pybal on lvs in eqiad to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/479184
  • 17:19 akosiaris: T205919 create namespace for blubberoid on eqiad/codfw/staging clusters
  • 17:00 anomie: Set comment migration to new on group 1 (T166733)
  • 16:33 mutante: icinga1001 - started service again, enabled puppet
  • 16:27 mutante: icinga1001 - disable puppet, stopped icinga, for cable replacement
  • 16:18 anomie: Deployed fix for T210937: API: Use parenthesized join in ApiQueryBase::showHiddenUsersAddBlockInfo
  • 15:44 moritzm: installing openssl 1.1 security updates on Hadoop workers
  • 15:33 moritzm: rebooting pybal-test hosts to pick up SSBD-enabled qemu
  • 15:18 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting actor migration to write-both/read-old on all wikis (T188327) (duration: 00m 45s)
  • 15:12 anomie@deploy1001: Synchronized php-1.33.0-wmf.8/includes/api/ApiPageSet.php: Backport fix for T211804: ApiPageSet::initFromPageIds: Default $filterIds to true (duration: 00m 46s)
  • 15:03 akosiaris: ores2* deploy https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/479206/
  • 14:55 moritzm: installing openssl 1.1 security updates on mw canaries (along with nginx restart/upgrade)
  • 14:46 cdanis: updating grafana/stretch-wikimedia to 5.4.2: reprepro --restrict grafana update stretch-wikimedia
  • 14:40 akosiaris: disable puppet on ores1* and ores2* machines to deploy https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/479206/
  • 14:36 ladsgroup@deploy1001: Finished deploy [ores/deploy@a9d5e95]: noop (duration: 14m 59s)
  • 14:23 ladsgroup@deploy1001: Started deploy [ores/deploy@a9d5e95]: noop
  • 13:22 dcausse: creating 300+ wiki indices on elastic-omega@codfw
  • 13:15 godog: stop restbase and cassandra on restbase200[1-6] - T211070
  • 12:59 elukey: superset on analytics-tool1003 upgraded to 0.28.1
  • 12:41 arturo: icinga downtime (30 mins) cloudcontrol1003, cloudnet1003 and cloudnet1004 for package upgrades
  • 12:38 zeljkof: EU SWAT finished
  • 12:35 zfilipin@deploy1001: Synchronized static/images/project-logos: SWAT: Update maiwiki logo (T211845) (duration: 00m 52s)
  • 12:35 oblivian@deploy1001: scap-helm zotero finished
  • 12:35 oblivian@deploy1001: scap-helm zotero cluster codfw completed
  • 12:35 oblivian@deploy1001: scap-helm zotero cluster eqiad completed
  • 12:35 oblivian@deploy1001: scap-helm zotero upgrade production -f ../zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: eqiad,codfw]
  • 12:26 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable flood user group at ne.wiki (T211181) (duration: 00m 51s)
  • 12:14 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on shnwiki (T210699) (duration: 00m 53s)
  • 12:12 mobrovac@deploy1001: Finished deploy [restbase/deploy@55fcd4b]: Remove restbase200[1-6], ensure body.tfa exists for feed responses and disable Citoid check - T211070 T211871 T211411 (duration: 18m 59s)
  • 12:10 oblivian@deploy1001: scap-helm -h finished
  • 12:10 oblivian@deploy1001: scap-helm -h cluster codfw completed
  • 12:10 oblivian@deploy1001: scap-helm -h cluster eqiad completed
  • 12:10 oblivian@deploy1001: scap-helm -h [namespace: -h, clusters: eqiad,codfw]
  • 12:00 moritzm: rebooting ununpentium to pick up SSBD-enabled qemu
  • 11:53 mobrovac@deploy1001: Started deploy [restbase/deploy@55fcd4b]: Remove restbase200[1-6], ensure body.tfa exists for feed responses and disable Citoid check - T211070 T211871 T211411
  • 11:51 elukey@deploy1001: Finished deploy [analytics/superset/deploy@35841a7]: (no justification provided) (duration: 00m 38s)
  • 11:51 elukey@deploy1001: Started deploy [analytics/superset/deploy@35841a7]: (no justification provided)
  • 11:45 mobrovac@deploy1001: Finished deploy [restbase/deploy@29a0902]: Remove restbase200[1-6] and ensure body.tfa exists for feed responses - T211070 T211871 (duration: 06m 08s)
  • 11:43 moritzm: rebooting vega/bromine to pick up SSBD-enabled qemu
  • 11:39 mobrovac@deploy1001: Started deploy [restbase/deploy@29a0902]: Remove restbase200[1-6] and ensure body.tfa exists for feed responses - T211070 T211871
  • 11:39 mobrovac@deploy1001: Finished deploy [restbase/deploy@29a0902]: Remove restbase200[1-6] and ensure body.tfa exists for feed responses - T211070 T211871 (duration: 07m 08s)
  • 11:38 fsero@deploy1001: scap-helm zotero finished
  • 11:38 fsero@deploy1001: scap-helm zotero cluster codfw completed
  • 11:38 fsero@deploy1001: scap-helm zotero upgrade production -f ../zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 11:38 moritzm: rebooting krypton to pick up SSBD-enabled qemu
  • 11:31 mobrovac@deploy1001: Started deploy [restbase/deploy@29a0902]: Remove restbase200[1-6] and ensure body.tfa exists for feed responses - T211070 T211871
  • 11:31 fsero@deploy1001: scap-helm zotero finished
  • 11:31 fsero@deploy1001: scap-helm zotero cluster eqiad completed
  • 11:31 fsero@deploy1001: scap-helm zotero upgrade production -f ../zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 11:14 moritzm: rebooting webperf hosts to pick up SSBD-enabled qemu
  • 10:51 moritzm: rebooting dbmonitor hosts to pick up SSBD-enabled qemu
  • 10:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1101:3317 T86338 T202167 (duration: 00m 51s)
  • 10:00 elukey: upgrade nodejs on aqs100[5-9]
  • 09:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101:3317 T86338 T202167 (duration: 00m 51s)
  • 09:55 moritzm: removed openssl 1.1.0f-3+deb9u2+wmf1 from stretch-wikimedia/component/node10 (superseded by openssl update in DSA 4348 for stretch)
  • 09:35 moritzm: rebooting etcd/kubernetes hosts in codfw to pick up SSBD-enabled qemu
  • 09:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3317 T86338 T202167 (duration: 00m 51s)
  • 09:06 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2006.codfw.wmnet
  • 09:06 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2005.codfw.wmnet
  • 09:06 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2004.codfw.wmnet
  • 09:06 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2003.codfw.wmnet
  • 09:05 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2002.codfw.wmnet
  • 09:05 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2001.codfw.wmnet
  • 08:50 godog: stress-test ms-be10[44-50] - T209618
  • 08:45 marostegui: Deploy schema change on db1090:3317 T86338 T202167
  • 08:45 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3317 T86338 T202167 (duration: 00m 53s)
  • 08:45 moritzm: installing openssl security updates on stretch
  • 08:26 vgutierrez: Use certcentral managed TLS certificates in mx[12]001.wikimedia.org - T207050
  • 08:15 marostegui: Drop unused flaggedrevs tables from srwikinews - T209761
  • 08:11 moritzm: rolling reboot of scb in eqiad for kernel security update (combined with nodejs update)
  • 08:09 marostegui: Repool labsdb1011 T86338
  • 08:08 moritzm: installing nodejs updates on restbase1007
  • 07:28 marostegui: Depool labsdb1011 T86338
  • 07:25 marostegui: Repool labsdb1010 T86338
  • 07:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1079 T86338 T202167 (duration: 00m 52s)
  • 06:46 marostegui: Deploy schema change on db1079 with replication, lag will be generated on labsdb:s7 T86338 T202167
  • 06:45 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1079 T86338 T202167 (duration: 00m 53s)
  • 06:43 marostegui: Depool labsdb1010 T86338
  • 06:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1086 T86338 T202167 (duration: 00m 51s)
  • 06:17 marostegui: Deploy schema change on db1086 T86338 T202167
  • 06:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1086 T86338 T202167 (duration: 00m 51s)
  • 06:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1098:3316 db1098:3317 after kernel and mysql upgrade (duration: 00m 54s)
  • 00:52 reedy@deploy1001: Synchronized php-1.33.0-wmf.8/includes/http/GuzzleHttpRequest.php: T211806 (duration: 00m 51s)
  • 00:41 mutante: einsteinium - rm /lib/systemd/system/update-etcd-mw-config-lastindex.service ; systemctl reset-failed
  • 00:36 tzatziki: changing two passwords for compromised accounts
  • 00:32 dcausse: elastic@codfw created cirrus metastore on psi&omega clusters
  • 00:22 dcausse@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/MobileFrontend/extension.json: T210390: Reset default mobilefrontend provider (duration: 00m 53s)
  • 00:13 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T210381: [cirrus] fix temp clusters for codfw (duration: 00m 52s)

2018-12-12

  • 22:47 tzatziki: change email for User:Denrique
  • 21:48 reedy@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/EventBus: T211805 (duration: 00m 53s)
  • 21:26 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@ced6fab]: Update mobileapps to 55981a8. Summary: Get modified date with regexes to avoid unneeded Document parse (duration: 04m 03s)
  • 21:22 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@ced6fab]: Update mobileapps to 55981a8. Summary: Get modified date with regexes to avoid unneeded Document parse
  • 19:45 hashar: contint1001: sudo chown -R zuul:zuul /etc/zuul/wikimedia/.git
  • 19:28 XioNoX: repool codfw - T210456
  • 19:28 XioNoX: revert redirecting eqsin/ulsfo caches to eqiad - T210456
  • 19:16 XioNoX: re-enable BGP to telia on cr1-codfw - T211715
  • 19:08 XioNoX: disable BGP to telia on cr1-codfw - T211715
  • 19:06 bblack: uploading gdnsd 2.99.9944-beta-1+wmf1 to stretch-wikimedia
  • 19:04 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: pool db1098 for recentchanges and recentchangeslinked (duration: 00m 50s)
  • 18:53 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: pool db1098 for recentchanges and recentchangeslinked (duration: 02m 58s)
  • 18:46 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: pool db1098 for recentchanges and recentchangeslinked (duration: 03m 00s)
  • 18:41 banyek: pool db1098 for recentchanges and recentchangeslinked
  • 18:12 catrope@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/GrowthExperiments/extension.json: Temporarily disable help panel / VisualEditor integration (duration: 03m 00s)
  • 17:55 banyek: pooling db1098
  • 17:54 XioNoX: shutting down asw-b4-codfw - T210456
  • 17:02 zfilipin@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.8 (duration: 00m 50s)
  • 16:42 moritzm: installing lxml security updates
  • 16:40 moritzm: installing cups updates on trusty (only client libs used)
  • 16:34 jijiki: Merged 477472 "mcrouter: replace codfw proxy before maintenance", eqiad mcrouters are picking up the change - T210467
  • 16:19 ladsgroup@deploy1001: Synchronized php-1.33.0-wmf.8/includes/specials/pagers/ImageListPager.php: T211774 (duration: 00m 52s)
  • 16:17 ladsgroup@deploy1001: Synchronized php-1.33.0-wmf.8/includes/specials/pagers/ImageListPager.php: T211774 (duration: 00m 52s)
  • 16:01 XioNoX: Redirect eqsin/ulsfo caches to eqiad - T210456
  • 15:54 XioNoX: Depool codfw for row B recabling - T210456
  • 15:35 elukey: upload matomo 3.7.0 to stretch-wikimedia, removed 3.5.1 from jessie-wikimedia
  • 15:27 moritzm: installing PHP security updates on matomo1001 (piwik host)
  • 15:24 godog: poweroff ms-be2044 for hardware inspection - T209921
  • 15:22 ladsgroup@deploy1001: Synchronized php-1.33.0-wmf.8/includes/api/ApiBase.php: T211769 (duration: 00m 52s)
  • 15:21 urandom: decommissioning cassandra-c, restbase2006 -- T210843
  • 15:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1098:3316 db1098:3317 after kernel and mysql upgrade (duration: 00m 53s)
  • 14:43 banyek: renaming tables on db1122 ptwiki: flagged* -> T211544_flagged* - T211544
  • 14:29 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.33.0-wmf.8"
  • 14:18 moritzm: restart uwsgi-netbox on netmon2001
  • 14:16 zfilipin@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.8 (duration: 00m 51s)
  • 14:07 mobrovac@deploy1001: Started restart [restbase/deploy@5946231]: Restart RB to pick up the new seeds in codfw - T211416
  • 14:04 marostegui: Stop MySQL on db1098:3316 and db1098:3317 for kernel and mysql upgrade
  • 13:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 db1098:3317 for kernel and mysql upgrade (duration: 00m 52s)
  • 13:26 moritzm: installing PHP security updates
  • 12:52 zeljkof: EU SWAT finished
  • 12:49 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add NS_PROJECT localised name for tt.wiktionary (T211312) (duration: 00m 52s)
  • 12:41 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add new namespace aliases for zhwikiversity (T207544) (duration: 00m 52s)
  • 12:35 arturo: T205969 icinga downtime load-avg check for labstore1007 until January (1 month)
  • 12:32 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable extension SandboxLink for nowiki (T210325) (duration: 00m 52s)
  • 12:27 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use HD logos for cawikimedia in IS.php (T198507) (duration: 00m 52s)
  • 12:17 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Upload new logos for cawikimedia (T198507) (duration: 00m 52s)
  • 12:03 hoo@deploy1001: Synchronized wmf-config/Wikibase.php: Display Kartographer mapframes for geocoordinate statements (T184933) (duration: 00m 52s)
  • 11:30 volans: re-enabled puppet on icinga[12]001, re-activated crontab to sync files on 2001 and manually run it + run puppet
  • 11:14 volans: restarting icinga with dropped downtimes from last night (start_date > 1544489652)
  • 11:03 volans: restarting Icinga with debug log on icinga1001
  • 10:51 mobrovac@deploy1001: Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2b - T211416 (duration: 10m 11s)
  • 10:41 mobrovac@deploy1001: Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2b - T211416
  • 10:40 mobrovac@deploy1001: Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2 - T211416 (duration: 00m 15s)
  • 10:40 mobrovac@deploy1001: Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date, try #2 - T211416
  • 10:26 volans: Icinga is having issues restarting properly, investigation ongoing
  • 10:08 banyek: executing schema change on db1070 (s5 master) - T85757
  • 09:40 banyek: repooling labsdb1010 - T210693
  • 09:34 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1082 (duration: 00m 52s)
  • 09:29 banyek: repooling db1082 - T85757
  • 09:25 banyek: restarting replication on db1082 after schema change - T85757
  • 09:25 banyek: fixing triggers on db1124:3315 - T85757
  • 09:22 banyek: executing schema change with replication on db1082 - T85757
  • 09:20 banyek: stopping replication on db1082 for schema change - T85757
  • 09:15 moritzm: installing pixman security updates on trusty (Debian already fixed)
  • 08:48 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1082 (duration: 00m 51s)
  • 08:38 banyek: depooling db1082 for schema change - T85757
  • 08:37 marostegui: Remove old backup directory from db1116 - T206743
  • 08:18 godog: decommissioning cassandra-b, restbase2006 -- T210843
  • 08:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1088 (duration: 00m 51s)
  • 07:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1088 (duration: 00m 52s)
  • 07:38 marostegui: Deploy schema change on db2040 (s7 codfw master), this will generate lag on codfw T86338 T202167
  • 07:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1088 (duration: 00m 51s)
  • 07:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1088 (duration: 00m 51s)
  • 06:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1088 (duration: 00m 52s)
  • 06:44 marostegui: Stop MySQL on db1088 for mysql and kernel upgrade
  • 06:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1088 for mysql upgrade (duration: 01m 07s)
  • 06:37 marostegui: Deploy schema change on s4 primary master (db1068) T86338
  • 06:00 marostegui: Deploy schema change on s8 primary master (db1071) T86338 T202167
  • 00:47 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add some missing groups to the privileged list (duration: 00m 51s)
  • 00:46 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Bring up password change logging to the same standards as login logging Add some missing groups to the privileged list (duration: 00m 53s)

2018-12-11

  • 22:50 chasemp: ssh to tar archive data from logstash1006 /mnt (external) to labstore1007
  • 22:06 XioNoX: push loopback filter term return-tcp to all routers - T207962
  • 22:04 urandom: decommissioning cassandra-a, restbase2006 -- T210843
  • 19:57 XioNoX: apply BGP_IXP_RS_in and avoid HE to cr4-ulsfo - T211079
  • 19:28 jforrester@deploy1001: Synchronized php-1.33.0-wmf.8/extensions/Flow/modules/engine/misc/flow-handlebars.js: T211707 Hot-deploy fix for StructuredDiscussions pages (duration: 00m 52s)
  • 19:22 jforrester@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: T209619: Hot-deploy Ibcf3c93e (duration: 00m 52s)
  • 18:59 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@ecd5fb6]: fix site CSS URL (duration: 03m 58s)
  • 18:55 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@ecd5fb6]: fix site CSS URL
  • 18:54 XioNoX: push BGP_IXP_RS_in to all routers (but don't apply it to any peers, needs to be done manually) - T211079
  • 18:33 cmjohnson1: swapping disk slot 0 db1063
  • 18:23 XioNoX: push BGP_IXP_RS_in to cr2-eqord - T211079
  • 18:06 anomie@deploy1001: Synchronized php-1.33.0-wmf.6/includes/user/User.php: Backport fix for T210621, for real this time (duration: 00m 53s)
  • 18:00 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: T211237 Hot-deploy disabling Wikibase federation, unused except in Beta Cluster (duration: 00m 52s)
  • 17:33 _joe_: restarting pybal on lvs2003
  • 17:23 XioNoX: push changes tested on cr4-ulsfo to all routers - T211079
  • 17:16 _joe_: restarting pybal on lvs2006
  • 17:15 oblivian@puppetmaster1001: conftool action : set/weight=1; selector: service=search-omega-ssl
  • 17:15 oblivian@puppetmaster1001: conftool action : set/weight=1; selector: service=search-psi-ssl
  • 17:15 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: service=search-psi-ssl
  • 17:14 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: service=search-omega-ssl
  • 17:05 XioNoX: remove redundant term classification from BGP_transit_in on cr4-ulsfo - T211079
  • 16:55 banyek: executing schema change on dbstore1002 - T85757
  • 16:50 XioNoX: replace local-preference/default-action by next policy for BGP_IXP_in and BGP_Private_Peer_in on cr4-ulsfo - T211079
  • 16:49 banyek: executing schema change on db1102 - T85757
  • 16:48 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1113:3315 (duration: 00m 51s)
  • 16:44 banyek: repooling db1113:3315 after schema change - T85757
  • 16:38 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1113:3315 (duration: 00m 51s)
  • 16:35 banyek: depooling db1113:3315 for schema change - T85757
  • 16:33 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1110 (duration: 00m 51s)
  • 16:32 banyek: repooling db1110 after schema change - T85757
  • 16:28 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1110 (duration: 00m 50s)
  • 16:25 banyek: depooling db1110 for schema change - T85757
  • 16:23 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1100 (duration: 00m 51s)
  • 16:22 XioNoX: Remove static routes for NS v6 IPs - T211699
  • 16:20 banyek: repooling db1100 after schema change - T85757
  • 16:15 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1100 (duration: 00m 52s)
  • 16:09 banyek: depooling db1100 for schema change - T85757
  • 15:49 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.8
  • 15:45 papaul: re-installing OS on ms-be2047.codfw.wmnet
  • 15:31 zfilipin@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.8 and rebuild l10n cache (duration: 36m 08s)
  • 15:06 vgutierrez: Use certcentral managed TLS certificate in mirrors.wikimedia.org - T207050
  • 14:55 zfilipin@deploy1001: Started scap: testwiki to php-1.33.0-wmf.8 and rebuild l10n cache
  • 14:50 zfilipin@deploy1001: Pruned MediaWiki: 1.33.0-wmf.3 (duration: 02m 51s)
  • 14:49 banyek: depooling labsdb1010 - T210693
  • 14:46 zfilipin@deploy1001: Pruned MediaWiki: 1.33.0-wmf.2 (duration: 03m 12s)
  • 14:43 zfilipin@deploy1001: Pruned MediaWiki: 1.33.0-wmf.1 (duration: 11m 40s)
  • 14:42 volans: 'sudo systemctl reload icinga' on icinga1001
  • 14:32 godog: decommissioning cassandra-c, restbase2005 -- T210843
  • 14:31 bblack: removed unused public IPv6 IPs from authdnses manually with "ip -6 addr del ..." - https://gerrit.wikimedia.org/r/c/operations/puppet/+/478939
  • 14:23 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2013.codfw.wmnet
  • 14:13 mobrovac@deploy1001: Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date (duration: 00m 38s)
  • 14:13 mobrovac@deploy1001: Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date
  • 14:10 mobrovac@deploy1001: Finished deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date - T211416 (duration: 01m 53s)
  • 14:08 mobrovac@deploy1001: Started deploy [restbase/deploy@44e0955]: Bring restbase201[3-8] up to date - T211416
  • 13:48 vgutierrez: Use certcentral TLS managed certificate in lists.wikimedia.org - T207050
  • 13:06 zeljkof: EU SWAT finished
  • 13:02 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: HD Logos: Add fr and fy wikibooks and fr wikinews variants to InitialiseSettings.php (T150618) (duration: 00m 46s)
  • 12:52 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update settings to include new HD logos (T150618) (duration: 00m 47s)
  • 12:50 hoo@deploy1001: Finished deploy [wdqs/wdqs@f914415]: Fix WDQS UI embeds (T211629) (duration: 10m 31s)
  • 12:39 hoo@deploy1001: Started deploy [wdqs/wdqs@f914415]: Fix WDQS UI embeds (T211629)
  • 12:38 bblack: Authdns CI/config refactoring done, all is well, resume normal DNS ops!
  • 12:37 elukey: updated nodejs nodejs-legacy on aqs1004 (security upgrades)
  • 12:36 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: HD Logos: Add 1.5x and 2x variants of fr and fy wikibooks and fr wikinews (T150618) (duration: 00m 46s)
  • 12:16 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Add HD logos for 3 projects (T150618) (duration: 00m 47s)
  • 12:12 marostegui: Repool labsdb1011 - T86338
  • 11:59 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@dcde39f]: GUI update (duration: 01m 05s)
  • 11:58 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@dcde39f]: GUI update
  • 11:39 bblack: puppet disabled on authdnses for attempting https://gerrit.wikimedia.org/r/q/topic:%22authdns-ci%22
  • 11:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1087 T86338 T202167 (duration: 00m 46s)
  • 11:10 marostegui: Depool labsdb1011 - T86338
  • 11:06 marostegui: Repool labsdb1010 - T86338
  • 11:03 volans: restarted pdfrender on scb1003 [last time, we need an automatic restart]
  • 10:47 fsero: pooling mw1272
  • 10:42 fsero: scap pull mw1272
  • 09:30 ema: mw1272 down for the past 12h. Nothing in console, power-cycling
  • 09:08 marostegui: Deploy schema change on db1087 with replication (this will generate lag on labsdb:s8) T202167 T86338
  • 09:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1087 T86338 T202167 (duration: 00m 46s)
  • 09:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1092 T86338 T202167 (duration: 00m 46s)
  • 09:01 marostegui: Depool labsdb1010 - T86338
  • 08:15 marostegui: Deploy schema change on db1092 T202167 T86338
  • 08:15 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092 T86338 T202167 (duration: 00m 46s)
  • 08:10 godog: decommissioning cassandra-b, restbase2005 -- T210843
  • 07:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 T86338 T202167 (duration: 00m 46s)
  • 07:32 oblivian@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: Hotfix for logging on php7 (2/2) (duration: 02m 51s)
  • 07:29 oblivian@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw1272.*
  • 07:28 oblivian@deploy1001: Synchronized wmf-config/php7.php: Hotfix for logging on php7 (1/2) (duration: 02m 50s)
  • 07:06 marostegui: Deploy schema change on db1104 T202167 T86338
  • 07:06 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 T86338 T202167 (duration: 02m 51s)
  • 06:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 T86338 T202167 (duration: 02m 52s)
  • 06:45 marostegui: Rename flaggedrevs tables on srwikinews on db1078 - T209761
  • 06:13 marostegui: Deploy schema change on db1109 T202167 T86338
  • 06:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 T86338 T202167 (duration: 02m 55s)
  • 05:57 marostegui: Deploy schema change on s4 primary master (db1068) T202167 T86338
  • 01:03 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure localized logos for nywiki (T211570) (duration: 01m 36s)
  • 01:01 catrope@deploy1001: Synchronized static/images/project-logos/: Add localised logos for nywiki (T211570) (duration: 01m 00s)
  • 00:52 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Use new HD logos for zhwiktionary, zhwikivoyage, zhwikinews (T150618) (duration: 01m 16s)
  • 00:50 RoanKattouw: mw1272 is down (does not respond to ping), but scap still tries to deploy to it
  • 00:50 catrope@deploy1001: Synchronized static/images/project-logos/: Add HD logos for zhwikinews, zhwikivoyage, zhwiktionary (T150618) (duration: 02m 30s)
  • 00:15 mutante: icinga2001 - killed all nagios processes, restarted nsca service, something is different from icinga1001, service failed when trying to restart (T211641)

2018-12-10

  • 23:51 andrewbogott: silencing the kvm process count alert on cloudvirt1023 until I can figure out why it's misfiring
  • 22:13 mutante: Welcome new Mediawiki deployer Christoph 'WMDE-Fisch' Jauera (T211014)
  • 21:29 arlolra@deploy1001: Finished deploy [parsoid/deploy@dc9b3a1]: Updating Parsoid to 19560da (duration: 11m 15s)
  • 21:20 ladsgroup@deploy1001: Finished deploy [ores/deploy@03b9c98]: Add celery4 configs back to the deploy repo (duration: 15m 25s)
  • 21:19 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@9f4b567]: More internal promisification and other performance tweaks (T202642) (duration: 04m 17s)
  • 21:17 arlolra@deploy1001: Started deploy [parsoid/deploy@dc9b3a1]: Updating Parsoid to 19560da
  • 21:14 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@9f4b567]: More internal promisification and other performance tweaks (T202642)
  • 21:05 ladsgroup@deploy1001: Started deploy [ores/deploy@03b9c98]: Add celery4 configs back to the deploy repo
  • 20:35 cdanis: T210416: grafana.wikimedia.org switched to point to grafana1001.eqiad.wmnet (running grafana 5.4.1)
  • 20:32 jforrester@deploy1001: Synchronized wmf-config/extension-list: Uninstall the ParserMigration extension, Part III I332939809 (duration: 00m 46s)
  • 20:30 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Uninstall the ParserMigration extension, Part II I1f7266f55a (duration: 00m 46s)
  • 20:29 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Uninstall the ParserMigration extension, Part I I338a3d8a87fd (duration: 00m 47s)
  • 20:26 cdanis: T210416: switching grafana.wikimedia.org to point to grafana1001.eqiad.wmnet
  • 20:25 robh: messing with ulsfo power for 103.02.23 tower b, shouldn't disrupt anything T209101
  • 20:20 cdanis: T210416: setting grafana.wikimedia.org (currently served by krypton) to read-only and copying to grafana1001 (serving grafana-beta)
  • 20:13 urandom: decommissioning cassandra-a, restbase2005 -- T210843
  • 19:58 cdanis: T210416: updating grafana to 5.4.1 in stretch-wikimedia: reprepro --restrict grafana update stretch-wikimedia
  • 18:15 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@dcde39f]: GUI Update (duration: 09m 31s)
  • 18:05 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@dcde39f]: GUI Update
  • 17:59 banyek: restarting mysql instance on labsdb1004 to restore replication filters to the original state - T211210
  • 17:58 banyek: restarting mysql instance on labsdb1004 to restore replication filters to the original state
  • 17:29 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T211527 Hot-deploy Disable ParserMigration now that Raggett has been dropped (duration: 00m 47s)
  • 16:03 moritzm: installing PHP updates on netmon1002
  • 15:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 T86338 T202167 (duration: 00m 46s)
  • 15:36 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1097:3315 (duration: 00m 45s)
  • 15:31 banyek: repooling db1097:3315 after schema change - T85757
  • 15:24 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1097:3315 (duration: 00m 46s)
  • 15:13 banyek: depooling db1097:3315 for schema change - T85757
  • 15:05 anomie@deploy1001: Synchronized php-1.33.0-wmf.6/includes/user/User.php: Backport fix for T210621 (duration: 00m 46s)
  • 14:55 marostegui: Deploy schema change db1101:3318 T86338 T202167
  • 14:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 T86338 T202167 (duration: 00m 46s)
  • 14:50 _joe_: uploading php-mongodb 1.5.3 to stretch-wikimedia thirdparty/php72 T206152
  • 14:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099 T86338 T202167 (duration: 00m 47s)
  • 14:00 marostegui: Deploy schema change db1099:3318 T86338 T202167
  • 14:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099 T86338 T202167 (duration: 00m 46s)
  • 13:34 ema: trafficserver 8.0.1-1wm1 uploaded to stretch-wikimedia T207048
  • 12:56 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T187299 T197607 Oversample performance survey on specific ruwiki articles (duration: 00m 46s)
  • 12:49 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 12:48 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 12:43 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add http://idb.ub.uni-tuebingen.de/digitue to the wgCopyUploadsDomains (T211466) (duration: 00m 47s)
  • 12:37 zfilipin@deploy1001: Synchronized dblists/flaggedrevs.dblist: SWAT: Remove FlaggedRevs for ptwikipedia (T211433) (duration: 00m 46s)
  • 12:25 moritzm: installing imagemagick security update for jessie
  • 12:21 zfilipin@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Set FileImporter config help location (T199108) (duration: 00m 47s)
  • 12:20 fsero: running puppet agent on icinga to add fsero
  • 12:13 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove the "wikibase-debug" log channel (T207850) (duration: 00m 47s)
  • 11:04 godog: decommissioning cassandra-c, restbase2004 -- T210843
  • 10:42 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 10:41 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 52s)
  • 10:39 marostegui: Deploy schema change on db1116:3318 T86338 T202167
  • 10:36 marostegui: Deploy schema change on dbstore1002:s8 T86338 T202167
  • 10:35 marostegui: Repool labsdb1011 T86338
  • 09:07 marostegui: Deploy schema change on s8 codfw master with replication (db2045) - lag will be generated on codfw - T202167 T86338
  • 09:01 marostegui: Depool labsdb1011 - T86338
  • 08:58 moritzm: installing chromium security updates on proton*
  • 08:57 marostegui: Repool labsdb1010 - T86338
  • 08:39 elukey: roll restart of aqs on aqs100* to pick up new Druid backend settings
  • 08:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1121 T86338 T202167 (duration: 00m 51s)
  • 08:23 godog: final round of weight addition to new ms-be codfw hosts - T209395
  • 07:18 _joe_: reenabling puppet given my changes were useless
  • 06:52 _joe_: running puppet on the puppetmasters in codfw, twice, then restarting apache to ensure cleanup of any cache
  • 06:50 _joe_: disabled puppet across the fleet for merge of hiera change
  • 06:47 marostegui: Stop slave on s4 on labsdb1011
  • 06:25 marostegui: Deploy schema change on db1121 with replication (this will generate lag on labs) - T86338 T202167
  • 06:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1121 T86338 T202167 (duration: 00m 49s)
  • 06:23 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 - T86338
  • 02:53 urandom: decommissioning cassandra-b, restbase2004 -- T210843
  • 00:55 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Ic07ff9acfbe17 - T211529, T205546 (duration: 00m 47s)

2018-12-09

  • 20:16 urandom: decommissioning cassandra-a, restbase2004 -- T210843
  • 18:39 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Icb9ad2f554e1 - Fix ruwikinews logo config (duration: 00m 57s)
  • 11:55 godog: decommissioning cassandra-c, restbase2003 -- T210843
  • 03:00 urandom: decommissioning cassandra-b, restbase2003 -- T210843

2018-12-08

  • 23:07 krinkle@deploy1001: Synchronized docroot/wikipedia.org/speed-tests/: T185446 - I6cf29d598a11 (duration: 00m 47s)
  • 22:21 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209794 (duration: 00m 56s)
  • 20:59 urandom: decommissioning cassandra-a, restbase2003 -- T210843
  • 13:54 godog: decommissioning cassandra-c, restbase2002 -- T210843

2018-12-07

  • 23:54 ejegg: updated payments-wiki from b99cd0816e to b8acb95a2a
  • 23:41 urandom: decommissioning cassandra-b, restbase2002 -- T210843
  • 22:43 ejegg: re-enabled Thank You mail sender
  • 22:31 ejegg: updated fundraising CiviCRM from 3e5d74f17e to 8e18485697
  • 22:29 ejegg: Turned off Thank You mailing job for letter update
  • 17:47 herron: rebooting logstash1006 for security updates
  • 17:17 bstorm_: T207377 rebooted labstore1007 for kernel upgrades
  • 17:15 _joe_: uploading php-tideways (rebuilt with php 7.2 support) to stretch-wikimedia thirdparty/php72 T206152
  • 15:37 moritzm: rebooting sarin/neodymium
  • 14:52 urandom: decommissioning cassandra-a, restbase2002 -- T210843
  • 14:21 godog: more weight to new ms-be codfw hosts - T209395
  • 13:37 moritzm: rolling reboot of scb in codfw (along with nodejs update)
  • 12:18 moritzm: installing nodejs security updates on stat/notebook
  • 12:05 mobrovac@deploy1001: Finished deploy [restbase/deploy@44e0955]: Fix: Encode recommendation api title (duration: 21m 11s)
  • 11:43 mobrovac@deploy1001: Started deploy [restbase/deploy@44e0955]: Fix: Encode recommendation api title
  • 11:42 mobrovac@deploy1001: Finished deploy [restbase/deploy@9e4af13]: Fix: Encode recommendation api title (duration: 00m 21s)
  • 11:42 mobrovac@deploy1001: Started deploy [restbase/deploy@9e4af13]: Fix: Encode recommendation api title
  • 11:41 mobrovac@deploy1001: Finished deploy [restbase/deploy@9e4af13]: Fix: Encode recommendation api title (duration: 18m 58s)
  • 11:22 mobrovac@deploy1001: Started deploy [restbase/deploy@9e4af13]: Fix: Encode recommendation api title
  • 11:20 moritzm: rolling upgrade of nginx on swift frontends
  • 11:19 mobrovac@deploy1001: Finished deploy [restbase/deploy@31c44e8]: Fix: Encode recommendation api title (duration: 03m 49s)
  • 11:15 mobrovac@deploy1001: Started deploy [restbase/deploy@31c44e8]: Fix: Encode recommendation api title
  • 11:09 mobrovac@deploy1001: Finished deploy [citoid/deploy@269c9c7]: Add an explicit check for Zotero (duration: 06m 19s)
  • 11:03 mobrovac@deploy1001: Started deploy [citoid/deploy@269c9c7]: Add an explicit check for Zotero
  • 10:58 mobrovac@deploy1001: Finished deploy [citoid/deploy@6b36331]: Add an explicit check for Zotero (duration: 02m 42s)
  • 10:55 mobrovac@deploy1001: Started deploy [citoid/deploy@6b36331]: Add an explicit check for Zotero
  • 09:40 banyek: importing back linkwatcher_linklog into database s51230__linkwatcher on host labsdb1004.eqiad.wmnet. - T211210
  • 09:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1096:3315, db1096:3316 after kernel and mysql upgrade (duration: 00m 46s)
  • 09:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1096:3315, db1096:3316 after kernel and mysql upgrade (duration: 00m 46s)
  • 08:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1096:3315, db1096:3316 after kernel and mysql upgrade (duration: 00m 46s)
  • 08:21 marostegui: Stop MySQL on db1096:3315,3316 for kernel and mysql upgrade
  • 08:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315, db1096:3316 for kernel and mysql upgrade (duration: 00m 46s)
  • 08:15 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1084 (duration: 00m 47s)
  • 07:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 46s)
  • 07:50 godog: decommissioning cassandra-c, restbase2001 -- T210843
  • 07:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1084 (duration: 00m 46s)
  • 07:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1084 (duration: 00m 46s)
  • 07:11 marostegui: Stop MySQL on db1084 for mysql and kernel upgrade
  • 07:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 (duration: 00m 49s)
  • 00:47 XioNoX: done troubleshooting bird bfd on dns2001/cr1-codfw
  • 00:10 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: turn off wbsearchentities ab test T209402 (duration: 00m 47s)

2018-12-06

  • 23:47 ppchelko@deploy1001: Finished deploy [recommendation-api/deploy@299b268]: Add 'morelike' article recommendations API T201192 (duration: 02m 06s)
  • 23:47 XioNoX: troubleshoot bird bfd on dns2001/cr1-codfw
  • 23:45 ppchelko@deploy1001: Started deploy [recommendation-api/deploy@299b268]: Add 'morelike' article recommendations API T201192
  • 23:21 ppchelko@deploy1001: Finished deploy [restbase/deploy@be8f0c0]: Add 'morelike' recommendation public API specification T201192 (duration: 22m 46s)
  • 22:58 ppchelko@deploy1001: Started deploy [restbase/deploy@be8f0c0]: Add 'morelike' recommendation public API specification T201192
  • 22:12 urandom: decommissioning cassandra-b, restbase2001 -- T210843
  • 21:39 gtirloni: reimaging cloudvirt1019 with jessie T196507
  • 21:33 ppchelko@deploy1001: Finished deploy [changeprop/deploy@f675fcc]: Added performer to the revision-scores event (duration: 01m 15s)
  • 21:32 ppchelko@deploy1001: Started deploy [changeprop/deploy@f675fcc]: Added performer to the revision-scores event
  • 21:07 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@cbe4551]: Install new Updater with INSERT DATA (duration: 09m 18s)
  • 21:00 XioNoX: remove 2 esams avoid path + 4 preferred/selected transits - T194542
  • 20:58 smalyshev@deploy1001: Started deploy [wdqs/wdqs@cbe4551]: Install new Updater with INSERT DATA
  • 20:51 XioNoX: remove 2 eqiad avoid path - T194542
  • 20:48 gtirloni: reimaging cloudvirt1019 with stretch T196507
  • 20:45 XioNoX: remove codfw/eqdfw avoid path - T194542
  • 19:32 gehel: shutting down elasticsearch on elastic2001-2024 (third time is a charm) - T211023
  • 18:21 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@1dba3cd]: Internally promisify page processing steps (T202642) (duration: 03m 54s)
  • 18:17 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@1dba3cd]: Internally promisify page processing steps (T202642)
  • 17:31 moritzm: installing nodejs updates on proton*
  • 17:11 moritzm: uploaded nodejs 6.11~dfsg-1+wmf5 for jessie-wikimedia (the upstream patch for CVE-2018-12122 had a regression, this update fixes it)
  • 16:27 urandom: decommissioning cassandra-a, restbase2001 -- T210843
  • 16:17 gehel: shutting down elasticsearch on elastic2001-2024 (second try) - T211023
  • 15:51 moritzm: upgrading spamassassin on mx1001/fermium
  • 15:45 fsero@deploy1001: scap-helm zotero finished
  • 15:45 fsero@deploy1001: scap-helm zotero cluster codfw completed
  • 15:45 fsero@deploy1001: scap-helm zotero upgrade production -f ../zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 15:42 fsero: modifying zotero deploy CLUSTER=codfw scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero - T211322
  • 15:42 _joe_: disabling puppet fleet-wide for a change in the role() function
  • 15:38 moritzm: uploaded nodejs 6.11~dfsg-1+wmf5 for stretch-wikimedia (the upstream patch for CVE-2018-12122 had a regression, this update fixes it)
  • 15:08 gehel: restarting new elasticsearch masters on codfw - T211023
  • 15:05 gehel: upgrade nginx on wdqs servers
  • 14:59 elukey@deploy1001: Finished deploy [analytics/turnilo/deploy@6bd6e2f]: upgrade deps to nodejs 10 (duration: 00m 09s)
  • 14:59 elukey@deploy1001: Started deploy [analytics/turnilo/deploy@6bd6e2f]: upgrade deps to nodejs 10
  • 14:46 moritzm: uploaded nodejs 10.4.0~dfsg-1+wmf2 to apt.wikimedia.org/component/node10 (backports of recent security fixes)
  • 14:16 moritzm: installing nginx security updates on mw in eqiad
  • 13:46 moritzm: upgrading spamassassin on mx2001
  • 12:56 gehel: depooling and shutting down elasticsearch on elastic2001-2024 - T211023
  • 12:55 moritzm: installing nginx updates on mw in codfw
  • 11:59 moritzm: installing nginx updates on mw canaries
  • 10:59 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1096:3315 (duration: 00m 47s)
  • 10:56 volans: disable event handler on Icinga for ms-be2047 MD Raid and MegaRAID checks, it's spamming Phabricator - T209921
  • 10:56 banyek: repooling db1096 for schema change - T85757
  • 10:36 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1096:3315 (duration: 00m 49s)
  • 10:29 banyek: depooling db1096 for schema change - T85757
  • 09:56 dcausse: elastic@codfw cleanup: deleting wikidatawiki_content_1537469318 index (failed reindex probably)
  • 01:03 tstarling@deploy1001: Synchronized w/fatal-error.php: (no justification provided) (duration: 00m 46s)
  • 00:55 tstarling@deploy1001: Synchronized w/fatal-error.php: (no justification provided) (duration: 00m 47s)
  • 00:42 tstarling@deploy1001: Synchronized w/fatal-error.php: (no justification provided) (duration: 00m 46s)
  • 00:40 tstarling@deploy1001: Synchronized private/FatalErrorSettings.php: (no justification provided) (duration: 00m 46s)
  • 00:38 tstarling@deploy1001: Synchronized private/FatalErrorSettings.php: (no justification provided) (duration: 00m 46s)
  • 00:15 mutante: MPM prefork tweaks for high load systems are applied again (apparently they were not since a change in the past that resulted in 2 competing configs in mods-enabled and conf-enabled, with the latter one being loaded last and containing the package defaults)
  • 00:13 mutante: re-enabling puppet on phabricator, applying change that adds php-fpm support on stretch, which doesn't affect phab1001 (prod) on jessie, BUT re-adds tuning config from the past for mpm_prefork.conf (more SpareServers etc) that was not actually applied due to a bug
  • 00:10 urandom: bootstrapping cassandra-c, restbase2018 -- T210843

2018-12-05

  • 23:50 ejegg: updated payments-wiki from 20595cca97 to b99cd0816e
  • 23:40 ejegg: re-enabled fundraising queue consumer jobs
  • 23:33 ejegg: updated fundraising CiviCRM from e757753a46 to 3e5d74f17e
  • 23:32 ejegg: turned off fundraising queue jobs for base queue consumer logic update
  • 22:43 jijiki: restarting pdfrender on scb* hosts in eqiad
  • 21:44 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@243a503]: Update mobileapps to 2f44362 (duration: 02m 47s)
  • 21:41 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@243a503]: Update mobileapps to 2f44362
  • 21:41 mdholloway: mobileapps deployment failed for group default03, rolling back and retrying
  • 21:39 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@243a503]: Update mobileapps to 2f44362 (duration: 18m 46s)
  • 21:39 arlolra: Updated Parsoid to a6058e3 (T210647, T208360, T205333)
  • 21:37 urandom: bootstrapping cassandra-b, restbase2018 -- T210843
  • 21:23 arlolra@deploy1001: Finished deploy [parsoid/deploy@5e9a496]: Updating Parsoid to a6058e3 (duration: 11m 36s)
  • 21:20 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@243a503]: Update mobileapps to 2f44362
  • 21:12 arlolra@deploy1001: Started deploy [parsoid/deploy@5e9a496]: Updating Parsoid to a6058e3
  • 20:08 banyek: repooling labsdb1010 - T210693
  • 19:12 cmjohnson1: cloudvirt1019 for an all inclusive part swap by HPE
  • 18:53 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Fix formatting of help panel links (duration: 00m 47s)
  • 18:01 jijiki: uploaded thumbor_6.3.2+git20170607-1+deb9u1 to stretch-wikimedia
  • 17:23 hoo@deploy1001: Synchronized wmf-config/Wikibase.php: Enable Kartographer maps on testwikidatawiki (T184933) (duration: 00m 46s)
  • 17:22 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: (no justification provided) (duration: 00m 46s)
  • 17:20 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: (no justification provided) (duration: 00m 46s)
  • 17:19 XioNoX: remove private IPs from codfw cloud-instance-transport1-b T207663
  • 17:11 XioNoX: add public IPs to codfw cloud-instance-transport1-b T207663
  • 16:58 XioNoX: re-deactivate ams-ix prefix list entry on cr2-esams
  • 16:53 XioNoX: activate ams-ix prefix list entry on cr2-esams
  • 16:27 jijiki: uploaded python-thumbor-wikimedia_2.2-1+deb9u1 to stretch-wikimedia
  • 16:18 akosiaris@deploy1001: scap-helm zotero finished
  • 16:18 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 16:18 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-codfw.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 16:17 akosiaris@deploy1001: scap-helm zotero finished
  • 16:17 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 16:17 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 16:17 akosiaris@deploy1001: scap-helm zotero finished
  • 16:17 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 16:17 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 16:15 akosiaris@deploy1001: scap-helm zotero finished
  • 16:15 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 16:15 akosiaris@deploy1001: scap-helm zotero upgrade production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 16:08 fsero: redeploying zotero on eqiad
  • 16:02 akosiaris@deploy1001: scap-helm zotero finished
  • 16:02 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 16:02 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 16:02 akosiaris@deploy1001: scap-helm zotero upgrade production --set resources.replicas=16 stable/zotero [namespace: zotero, clusters: eqiad,codfw]
  • 16:02 akosiaris@deploy1001: scap-helm zotero upgrade production --set resources.replicas=16 [namespace: zotero, clusters: eqiad,codfw]
  • 15:34 thcipriani: restarting ci jenkins for update
  • 15:33 akosiaris: add back pods/portforward right to kubernetes deploy user. T211040
  • 15:07 anomie: Running cleanupUsersWithNoId.php on potentially missed s3 and s7 wikis for T181731
  • 15:03 fsero: repooling citoid mathoid eqiad
  • 15:03 godog: bootstrap cassandra-c, restbase2017 - T210843
  • 14:56 banyek: executing schema change on s5 codfw master; replication lag could be expected - T85757
  • 14:54 fsero: upgrading k8s on eqiad to 1.10.11
  • 14:54 elukey: restart HDFS namenode and Yarn resource manager on an-master100[1,2] to update rack topology config - T209929
  • 14:51 fsero: depool mathoid/eqiad: pooled changed True => False
  • 14:51 fsero: depool citoid/eqiad: pooled changed True => False
  • 14:43 anomie: Running cleanupUsersWithNoId.php on metawiki for T181731 / T210985
  • 14:09 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable arbitrary item/property access for all wiktionaries (T175273) (duration: 00m 47s)
  • 13:53 akosiaris: repool citoid/mathoid codfw
  • 12:59 onimisionipe: banning elastic2001-elastic2024 from codfw production, psi and omega clusters
  • 12:53 jijiki: uploaded python-thumbor-community-core_0.4.0-1+deb9u1 to stretch-wikimedia
  • 12:47 dcausse: EU SWAT done
  • 12:45 oblivian@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=(cit|math)oid,name=codfw
  • 12:44 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T210381: [cirrus] Add temp clusters but still write to the old ones 2/2 (duration: 00m 46s)
  • 12:42 dcausse@deploy1001: Synchronized wmf-config/CommonSettings.php: T210381: [cirrus] Add temp clusters but still write to the old ones 1/2 (duration: 00m 46s)
  • 12:31 fsero: pooling mathoid and citoid again on codfw
  • 12:30 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T202497) (duration: 00m 46s)
  • 12:29 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T202497) (duration: 00m 49s)
  • 12:23 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T211188: Increase autoconfirmed count for Meta-Wiki to 5 (duration: 00m 47s)
  • 12:15 fsero: upgrading codfw k8s cluster to 1.10.11
  • 12:15 dcausse: running namespaceDupes & cirrus indexNamespaces on yuewiktionary
  • 12:11 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T205546: Define 2 new namespaces for yuewiktionary (duration: 00m 47s)
  • 12:02 fsero: depooling mathoid and citoid servers on codfw for k8s upgrade
  • 11:28 mobrovac@deploy1001: Finished deploy [citoid/deploy@b10e034]: Truncate Zotero-reported time stamp to date - T211127 (duration: 05m 55s)
  • 11:23 mobrovac@deploy1001: Started deploy [citoid/deploy@b10e034]: Truncate Zotero-reported time stamp to date - T211127
  • 11:05 akosiaris: upgrade kubernetes-client and kubernetes-master on staging to 1.10.11
  • 11:04 godog: bootstrap cassandra-b, restbase2017 - T210843
  • 10:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3312 (duration: 00m 45s)
  • 10:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3317 (duration: 00m 45s)
  • 10:51 ema: cache hosts: begin nginx rolling upgrade to 1.13.6-2+wmf2
  • 10:46 marostegui: Reboot db1090 for kernel upgrade
  • 10:44 moritzm: uploaded jenkins 2.138.4 to jessie-wikimedia/thirdparty and stretch-wikimedia/thirdparty/ci
  • 10:42 marostegui: Stop MySQL on db1090:3312 and db1090:3317 for MySQL upgrade
  • 10:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3312 (duration: 00m 46s)
  • 10:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3317 (duration: 00m 46s)
  • 10:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1080 (duration: 00m 46s)
  • 10:21 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=elasticsearch
  • 10:17 arturo: T205969 icinga downtime the load avg check in labstore1007 for 1 week
  • 10:10 banyek: depooling labsdb1010 for testing materialized views - T210693
  • 10:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1080 (duration: 00m 46s)
  • 09:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly pool db1080 (duration: 00m 46s)
  • 09:32 gehel: setting up new elasticsearch servers on codfw - elastic2045-2054 - T210265
  • 09:22 marostegui: Stop MySQL on db1080 for mysql and kernel upgrade
  • 09:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 for MySQL upgrade (duration: 00m 46s)
  • 09:07 elukey: matomo read only + upgrade to matomo 3.7.0 on matomo1001 - T209808
  • 09:00 _joe_: disabled puppet on mw1261, used for logging tests for T211184
  • 08:48 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T207850 Define a new Wikibase log channel to use (duration: 00m 47s)
  • 08:01 moritzm: installing pdns-recursor security update in esams
  • 07:59 godog: bootstrap cassandra-a, restbase2017 - T210843
  • 07:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 T86338 T202167 (duration: 00m 46s)
  • 06:55 marostegui: Deploy schema change on db1091 T86338 T202167
  • 06:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 T86338 T202167 (duration: 00m 51s)
  • 04:03 urandom: bootstrapping cassandra-c, restbase2016 -- T210843
  • 03:11 kartik@deploy1001: Finished deploy [cxserver/deploy@a3dd2ca]: Update cxserver to c4240e6 and enable Youdao MT (T208985, T210578) (duration: 04m 26s)
  • 03:06 kartik@deploy1001: Started deploy [cxserver/deploy@a3dd2ca]: Update cxserver to c4240e6 and enable Youdao MT (T208985, T210578)
  • 00:32 dcausse: Evening SWAT done
  • 00:30 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [cirrus] prepare multi-instance services (T210381) (duration: 00m 46s)
  • 00:28 dcausse@deploy1001: Synchronized wmf-config/ProductionServices.php: [cirrus] prepare multi-instance services (T210381) (duration: 00m 46s)
  • 00:16 dcausse@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable Block notice stats on itwiki (T210452) (duration: 00m 47s)
  • 00:10 urandom: bootstrapping cassandra-b, restbase2016 -- T210843

2018-12-04

  • 23:46 eileen: civicrm revision changed from a411d6bd64 to e757753a46, config revision is 0e6ccc37fe
  • 23:33 XioNoX: update prefix-list peering4 on cr1-eqsin to match jnt
  • 22:46 XioNoX: remove neodymium/sarin from term labs-in4 on cr1/2-eqiad - T210612
  • 22:40 ejegg: Updated payments-wiki from 7403a196b4 to 20595cca97
  • 21:58 XioNoX: clear ethernet-switching table for labvirt1004:eth1's switch port
  • 21:57 XioNoX: clear ethernet-switching table for labvirt1009:eth1's switch port
  • 21:49 XioNoX: make cr1/2-codfw conform to jnt
  • 21:44 XioNoX: make cr2-eqord/eqdfw conform to jnt
  • 21:41 XioNoX: make cr3/4-ulsfo conform to jnt
  • 20:04 urandom: bootstrapping cassandra-a, restbase2016 -- T210843
  • 19:35 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@81dac18]: Install new Updater for T210044 investigation (duration: 10m 36s)
  • 19:24 smalyshev@deploy1001: Started deploy [wdqs/wdqs@81dac18]: Install new Updater for T210044 investigation
  • 18:04 joal@deploy1001: Finished deploy [analytics/aqs/deploy@e7d48e9]: Add underestimate and offset to uniques-devices endpoint (duration: 17m 33s)
  • 18:03 akosiaris: bump zotero pod number from 4 to 16 in eqiad/codfw
  • 17:47 joal@deploy1001: Started deploy [analytics/aqs/deploy@e7d48e9]: Add underestimate and offset to uniques-devices endpoint
  • 17:46 ppchelko@deploy1001: Finished deploy [changeprop/deploy@e1aeb27]: Do not initialize scores and errors arrays in advance T210465 (duration: 01m 13s)
  • 17:45 ppchelko@deploy1001: Started deploy [changeprop/deploy@e1aeb27]: Do not initialize scores and errors arrays in advance T210465
  • 17:22 Reedy: created oathauth tables on punjabiwikimedia T211110
  • 17:16 godog: bootstrap cassandra-c on restbase2015 - T210843
  • 16:48 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure 'api-warning' log channel (duration: 00m 47s)
  • 15:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103:3314 T86338 T202167 (duration: 00m 47s)
  • 14:04 elukey: upgrade turnilo on analytics-tool1002 to nodejs-10 - T210705
  • 13:56 addshore: addshore@mwmaint1002:~$ mwscript namespaceDupes.php --wiki=bnwikisource --fix --add-prefix=T210472
  • 13:53 addshore: addshore@mwmaint1002:~$ mwscript namespaceDupes.php --wiki=euwiki --fix
  • 13:52 marostegui: Deploy schema change on db1103:3314 T86338 T202167
  • 13:52 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103:3314 T86338 T202167 (duration: 00m 47s)
  • 13:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1081 T86338 T202167 (duration: 00m 46s)
  • 13:34 cdanis: T210416: adding grafana 5 to wikimedia-stretch: reprepro --restrict grafana update stretch-wikimedia
  • 13:33 moritzm: installing nodejs security updates on restbase in codfw
  • 13:32 godog: bootstrap cassandra-b on restbase2015 - T210843
  • 13:29 moritzm: installing nodejs security updates on proton*
  • 13:19 marostegui: Deploy schema change on db1081 T86338 T202167
  • 13:19 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1081 T86338 T202167 (duration: 00m 46s)
  • 13:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 T86338 T202167 (duration: 00m 46s)
  • 12:39 Lucas_WMDE: EU SWAT done
  • 12:38 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create namespace "Work" on bnwikisource (T210472) (duration: 00m 46s)
  • 12:33 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create List namespace on euwiki (T209834) (duration: 00m 47s)
  • 12:28 lucaswerkmeister-wmde@deploy1001: Synchronized static/images/project-logos/: SWAT: Revert "Milestone logo for atjwiki" (T200713) (duration: 00m 47s)
  • 12:17 moritzm: installing tiff security updates
  • 11:49 mobrovac@deploy1001: Finished deploy [restbase/deploy@8abcbda] (dev-cluster): (no justification provided) (duration: 04m 47s)
  • 11:44 mobrovac@deploy1001: Started deploy [restbase/deploy@8abcbda] (dev-cluster): (no justification provided)
  • 11:41 moritzm: rebooting puppetboard2001 to pick up SSBD-enabled qemu
  • 11:38 mobrovac@deploy1001: Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1004 - T197242 (duration: 00m 21s)
  • 11:38 mobrovac@deploy1001: Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1004 - T197242
  • 11:37 moritzm: rebooting puppetboard1001 to pick up SSBD-enabled qemu
  • 11:36 akosiaris: enable puppet on scb1004, run puppet T197242
  • 11:36 mobrovac@deploy1001: Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1003 - T197242 (duration: 00m 20s)
  • 11:35 marostegui: Deploy schema change on db1084 T86338 T202167
  • 11:35 mobrovac@deploy1001: Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1003 - T197242
  • 11:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 T86338 T202167 (duration: 00m 47s)
  • 11:34 akosiaris: enable puppet on scb1003, run puppet T197242
  • 11:33 mobrovac@deploy1001: Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1002 - T197242 (duration: 00m 28s)
  • 11:33 mobrovac@deploy1001: Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1002 - T197242
  • 11:31 akosiaris: enable puppet on scb1002, run puppet T197242
  • 11:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3314 T86338 T202167 (duration: 00m 46s)
  • 11:18 mobrovac@deploy1001: Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1001 - T197242 (duration: 00m 30s)
  • 11:18 mobrovac@deploy1001: Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb1001 - T197242
  • 11:17 akosiaris: enable puppet on scb1001, run puppet T197242
  • 11:12 elukey@deploy1001: Finished deploy [analytics/aqs/deploy@e9a63cc]: Expose offset and underestimate numbers on unique devices - T164201 (duration: 09m 06s)
  • 11:04 mobrovac@deploy1001: Finished deploy [restbase/deploy@8abcbda]: Disable Citoid test for switching it to Zotero v2 - T211088 T197242 (duration: 20m 59s)
  • 11:03 elukey@deploy1001: Started deploy [analytics/aqs/deploy@e9a63cc]: Expose offset and underestimate numbers on unique devices - T164201
  • 10:59 fdans@deploy1001: Finished deploy [analytics/aqs/deploy@e9a63cc]: Deploying offset and underestimate numbers for uniques (duration: 00m 37s)
  • 10:58 fdans@deploy1001: Started deploy [analytics/aqs/deploy@e9a63cc]: Deploying offset and underestimate numbers for uniques
  • 10:57 fdans: deploying AQS to expose offset and underestimate numbers on unique devices
  • 10:51 moritzm: rebooting analytics-tool1003 to pick up SSBD-enabled qemu
  • 10:47 moritzm: rebooting analytics-tool1002 to pick up SSBD-enabled qemu
  • 10:43 mobrovac@deploy1001: Started deploy [restbase/deploy@8abcbda]: Disable Citoid test for switching it to Zotero v2 - T211088 T197242
  • 10:41 moritzm: rebooting analytics-tool1001 to pick up SSBD-enabled qemu
  • 10:03 mobrovac@deploy1001: Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 in codfw - T197242 (duration: 01m 45s)
  • 10:02 godog: bootstrap cassandra-a on restbase2015 - T210843
  • 10:01 mobrovac@deploy1001: Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 in codfw - T197242
  • 09:59 gehel: upgrading nginx on elasticsearch eqiad
  • 09:54 akosiaris: enable puppet on all scb2*, run puppet T197242
  • 09:52 gehel: upgrading nginx on elasticsearch codfw
  • 09:52 mobrovac@deploy1001: Finished deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb2001 - T197242 (duration: 00m 30s)
  • 09:51 mobrovac@deploy1001: Started deploy [citoid/deploy@b902865]: Switch Citoid to Zotero v2 on scb2001 - T197242
  • 09:50 akosiaris: enable puppet on scb2001, run puppet T197242
  • 09:46 akosiaris: disable puppet on scb for citoid migration to zoterov2 T197242
  • 09:46 gehel@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=elasticsearch
  • 09:31 gehel: add elastic2039-2044 to cirrus eqiad (new server) - T210265
  • 09:11 gehel: add elastic2038 to cirrus eqiad (new server) - T210265
  • 09:00 addshore: graphite1004 & graphite2003, /var/lib/carbon/whisper/MediaWiki/electronpdf/action # Ran https://phabricator.wikimedia.org/P7882 for T157012
  • 08:54 marostegui: Deploy schema change on db1097:3314 T86338 T202167
  • 08:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3314 T86338 T202167 (duration: 00m 47s)
  • 08:46 addshore: graphite1004 & graphite2003, /var/lib/carbon/whisper/daily/wikidata/api/actions$ sudo -u _graphite find . -type f -name "*-*.wsp" -delete # T120639
  • 08:46 addshore: graphite1004 & graphite2003, /var/lib/carbon/whisper/daily/wikidata/api/actions$ sudo -u _graphite find . -type f -name "*_*.wsp" -delete # T120639
  • 08:43 marostegui: Deploy schema change on db1102:3314 T86338 T202167
  • 08:42 marostegui: Deploy schema change on dbstore1002:s4 T86338 T202167
  • 08:41 addshore: graphite1004 & graphite2003, /var/lib/carbon/whisper/daily/wikidata/datamodel$ sudo -u _graphite rm wikipedia_references.wsp # T121521
  • 08:36 gehel: restarting stuck tilerator on maps* - T204047
  • 08:35 addshore: graphite1004 & graphite2003, /var/lib/carbon/whisper/daily/wikidata/api/wbgetclaims$ sudo -u _graphite find . -type f -name "*.wsp" -delete # T140280
  • 08:11 moritzm: installing perl security updates on jessie/trusty (stretch already updated)
  • 07:49 godog: bootstrap cassandra-c on restbase2014 - T209615
  • 07:26 marostegui: Deploy schema change on s4 codfw master (db2051) with replication T86338 T202167
  • 07:22 marostegui: Deploy schema change on wikitech primary master (db1073) for labswiki and labtestwiki T86338 T202167
  • 06:39 marostegui: Deploy schema change on s5 primary master (db1070) T86338 T202167
  • 06:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 T86338 T202167 (duration: 00m 49s)
  • 06:12 marostegui: Deploy schema change on db1110 T86338 T202167
  • 06:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1110 T86338 T202167 (duration: 00m 53s)
  • 01:22 foks: Removing 2FA per request at https://phabricator.wikimedia.org/T210703
  • 01:22 foks: Reset password for user "Orangemike"

2018-12-03

  • 23:54 legoktm@deploy1001: Synchronized php-1.33.0-wmf.6/tests/: for completeness (duration: 00m 58s)
  • 23:53 legoktm@deploy1001: Synchronized php-1.33.0-wmf.6/resources/src/mediawiki.legacy/: Restore gray coloring for autocomments (T165189 part 2) (duration: 00m 47s)
  • 23:51 legoktm@deploy1001: Synchronized php-1.33.0-wmf.6/includes/Linker.php: Restore old HTML structure for history section links (T165189 part 1) (duration: 00m 47s)
  • 22:56 sbassett@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/AbuseFilter/includes/api/ApiQueryAbuseLog.php: Deploy security fix for T210329 (duration: 00m 47s)
  • 21:24 mutante: temp. disabling puppet on logstash1007 and logstash1008 to carefully deploy gerrit:476916
  • 21:17 XioNoX: push firewall change to pfw3-eqiad - T211028
  • 21:10 catrope@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: Fix ChangesListFilters validation errors (duration: 00m 49s)
  • 21:00 urandom: bootstrapping restbase2014-a -- T210843
  • 20:20 ppchelko@deploy1001: Finished deploy [changeprop/deploy@7470c85]: Start emitting revision-score events with new schema (duration: 01m 13s)
  • 20:19 ppchelko@deploy1001: Started deploy [changeprop/deploy@7470c85]: Start emitting revision-score events with new schema
  • 20:14 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@0c94e5f]: New GUI and updater build (duration: 09m 31s)
  • 20:05 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@0c94e5f]: New GUI and updater build
  • 20:05 XioNoX: push firewall change to pfw3-eqiad - T211028
  • 19:59 ppchelko@deploy1001: Finished deploy [changeprop/deploy@867c571]: TEMP: stop production of revision-score events for schema change (duration: 01m 13s)
  • 19:58 ppchelko@deploy1001: Started deploy [changeprop/deploy@867c571]: TEMP: stop production of revision-score events for schema change
  • 19:39 jforrester@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/MobileFrontend/resources/mobile.toc/TableOfContents.js: SWAT T210869 Fix Table of contents rendering (duration: 00m 47s)
  • 19:30 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT I3b906e8b1 CS part of setting enhanced RC (duration: 00m 46s)
  • 19:27 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT Ic2787309e59e IS part of setting enhanced RC (duration: 00m 47s)
  • 19:25 jforrester@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/Echo/modules/styles/mw.echo.ui.PaginationWidget.less: SWAT T210487 I914b94515 (duration: 00m 47s)
  • 19:15 jforrester@deploy1001: Synchronized wmf-config/ProductionServices.php: SWAT T210381 I73c7596818b Actual config (duration: 00m 46s)
  • 19:09 jforrester@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: SWAT T210381 I2ae162f5 Part II (duration: 00m 47s)
  • 19:08 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT T210381 I2ae162f5 Part I (duration: 00m 46s)
  • 18:42 anomie@deploy1001: Synchronized wmf-config/CommonSettings.php: Updating SkinBuildSidebar hook function for T210528 (duration: 00m 47s)
  • 18:08 gehel: add elastic2037 to cirrus eqiad (new server) - T210265
  • 18:06 godog: bootstrap cassandra-c on restbase2013 - T209615
  • 17:09 bstorm_: T207377 reboot labstore1006 for upgrades
  • 16:28 godog: poweroff ms-be2021 for battery replacement - T208269
  • 15:23 gehel: start configuration of elastic2037-2044 (new servers) - T210265
  • 15:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1100 T86338 T202167 (duration: 00m 46s)
  • 14:54 marostegui: Deploy schema change on db1100 T86338 T202167
  • 14:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1100 T86338 T202167 (duration: 00m 48s)
  • 14:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3315 T86338 T202167 (duration: 00m 47s)
  • 14:17 marostegui: Deploy schema change on db1113:3315 T86338 T202167
  • 14:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3315 T86338 T202167 (duration: 00m 46s)
  • 14:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 T86338 T202167 (duration: 00m 46s)
  • 13:47 marostegui: Deploy schema change on db1082 (sanitarium master) with replication, lag will be generated on labs (s5) T86338 T202167
  • 13:47 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 T86338 T202167 (duration: 00m 47s)
  • 13:01 Lucas_WMDE: EU SWAT done
  • 12:44 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Cleaning of wgLogoHD (T150618) (duration: 00m 46s)
  • 12:39 lucaswerkmeister-wmde@deploy1001: Synchronized static/images/project-logos/: SWAT: Upload HD logos for multiple projects (T150618) (duration: 00m 47s)
  • 12:37 moritzm: installing nodejs security updates on scb1001
  • 12:34 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Close internalwiki (T205584) (duration: 00m 46s)
  • 12:28 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change sitename of shnwiki (T206777) (duration: 00m 47s)
  • 12:23 godog: bootstrap cassandra-b on restbase2013 - T209615
  • 12:19 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Perform more PHP constraint checks before falling back (T209504) (duration: 00m 48s)
  • 12:14 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Don’t send SPARQL prefixes in WikibaseQualityConstraints (T204317) (duration: 00m 49s)
  • 11:39 godog: more weight to new ms-be codfw hosts - T209395
  • 11:02 moritzm: installing nodejs security updates on stat/notebook hosts
  • 10:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1097:3315 T86338 T202167 (duration: 00m 47s)
  • 10:52 moritzm: rolling upgrade of scb in codfw to nodejs security update
  • 10:39 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 10:39 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 47s)
  • 10:34 moritzm: installing nodejs security updates on scb2001
  • 10:31 marostegui: Deploy schema change on db1097:3315 T86338 T202167
  • 10:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1097:3315 T86338 T202167 (duration: 00m 46s)
  • 10:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315 T86338 T202167 (duration: 00m 45s)
  • 09:48 marostegui: Deploy schema change on db1096:3315 T86338 T202167
  • 09:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315 T86338 T202167 (duration: 00m 47s)
  • 09:27 banyek: executing schema change on db1066 (s2 master) - T85757
  • 09:22 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1074 (duration: 00m 47s)
  • 09:16 banyek: repooling db1074 - T85757
  • 09:16 banyek: repooling db1074
  • 08:50 banyek: stopping replication on db1074 - T85757
  • 08:49 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1074 (duration: 00m 48s)
  • 08:44 godog: bootstrap cassandra-a on restbase2013 - T209615
  • 08:43 banyek: depooling db1074 - T85757
  • 08:32 moritzm: restarted keyholder agents/proxies on netmon1002/netmon2001 to pick up removal of netbox key
  • 08:30 marostegui: Deploy schema change on s5 codfw master (db2052) with replication, lag will be generated on codfw T86338 T202167
  • 08:13 moritzm: rearmed keyholders on netmon1002/netmon2001
  • 08:07 marostegui: Deploy schema change on s2 master (db1066) T86338 T202167
  • 07:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1074 T86338 T202167 (duration: 00m 47s)
  • 07:32 marostegui: Deploy schema change db1074 with replication (lag will appear on labs) T86338 T202167
  • 07:32 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1074 T86338 T202167 (duration: 00m 46s)
  • 07:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1122 T86338 T202167 (duration: 00m 47s)
  • 07:09 marostegui: Stop MySQL on pc1004, pc1005 and pc1006 as they will be decommissioned - T210969
  • 06:52 marostegui: Remove pc1004, pc1005 and pc1006 from tendril and zarcillo - T210969
  • 06:38 marostegui: Deploy schema change db1122 T86338 T202167
  • 06:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 T86338 T202167 (duration: 00m 48s)
  • 06:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1076 T86338 T202167 (duration: 00m 47s)
  • 06:16 marostegui: Deploy schema change db1076 T86338 T202167
  • 06:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1076 T86338 T202167 (duration: 00m 50s)
  • 00:34 legoktm@deploy1001: Synchronized php-1.33.0-wmf.6/includes/Title.php: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/477182 (duration: 00m 52s)

2018-12-02

  • 22:39 addshore: addshore@mwmaint1002:~$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=testwikidatawiki addless # This is my account, and apparently I no longer have the 2fa for it

2018-12-01

  • 16:48 andrewbogott: rebuilding labvirt1014 as cloudvirt1014, T210904

2018-11-30

  • 20:23 ottomata: temporarily disabled puppet on stat1005 to test rsyncd changes
  • 20:02 ssastry@deploy1001: Finished deploy [parsoid/deploy@9981ddf]: Update Parsoid to 310edecd (deploy-20181130 branch) (duration: 11m 38s)
  • 19:50 ssastry@deploy1001: Started deploy [parsoid/deploy@9981ddf]: Update Parsoid to 310edecd (deploy-20181130 branch)
  • 15:29 urandom: bootstrapping restbase2013-a -- T210843
  • 15:22 moritzm: uploaded nodejs 6.11.0~dfsg-1+wmf4+jessie to apt.wikimedia.org/jessie-wikimedia (fixes a dependency compared to the initial jessie update)
  • 12:55 moritzm: uploaded nodejs 6.11.0~dfsg-1+wmf3+jessie to apt.wikimedia.org/jessie-wikimedia (backporting the current security fixes)
  • 11:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105:3312 T86338 T202167 (duration: 00m 47s)
  • 11:07 banyek: deploy schema change on dbstore1002 - T85757
  • 10:59 marostegui: Deploy schema change on db1105:3312 T86338 T202167
  • 10:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1105:3312 T86338 T202167 (duration: 00m 45s)
  • 09:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3312 T86338 T202167 (duration: 00m 46s)
  • 09:00 marostegui: Deploy schema change on db1090:3312 T86338 T202167
  • 09:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3312 T86338 T202167 (duration: 00m 47s)
  • 08:55 moritzm: removed rutherfordium from debmonitor DB (T210036)
  • 08:21 marostegui: Deploy schema change on dbstore1002 T86338 T202167
  • 08:12 moritzm: installing perl security updates
  • 07:28 marostegui: Deploy schema change on db1095:3312 T86338 T202167
  • 07:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103:3312 T86338 T202167 (duration: 00m 47s)
  • 06:55 marostegui: Purge binary logs on pc1005
  • 06:52 marostegui: Deploy schema change on db1103:3312 T86338 T202167
  • 06:51 marostegui: Deploy schema change on db1103:3312 T86338 T20216
  • 06:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 T86338 T202167 (duration: 00m 48s)
  • 06:12 marostegui: Deploy schema change on s2 codfw master with replication (db2035), this will generate lag on codfw - T86338 T202167
  • 02:17 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: update wbsearchentities ab test configuration T209402 (duration: 00m 47s)
  • 01:06 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/WikimediaEvents: SWAT: wbsearchentities ab test improvements T209402 (duration: 00m 46s)
  • 01:04 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/MobileFrontend: SWAT: Change config flag for enabling Block Notice stats T201719 (duration: 00m 48s)
  • 00:57 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/VisualEditor: SWAT: Rename configs for tracking block notices on visual editor ve.init.mw.ArticleTarget: Stop when we fail to load metadata T209542 (duration: 00m 48s)

2018-11-29

  • 23:26 mutante: puppetmaster: sudo puppet cert revoke rutherfordium.eqiad.wmnet; sudo puppet node clean rutherfordium.eqiad.wmnet ; sudo puppet node deactivate rutherfordium.eqiad.wmnet ; run puppet on icinga1001.. removed host from monitoring (decom for ganeti VM) (T210036)
  • 22:21 hashar: 1.33.0-wmf.6 is on all wikis and looks stable.
  • 22:04 hashar: hashar@deploy1001 rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.6
  • 21:54 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T210636 - I9ebbc6 (duration: 00m 55s)
  • 21:41 tzatziki: changing email for User:Mathounette
  • 21:34 XioNoX: removed unused vc-port on asw2-c-eqiad:fpc8 - T210788
  • 20:50 mutante: people - rsynced /home one last time, switched DNS people.eqiad CNAME over, varnish change merged (T210036)
  • 20:42 mutante: people.wikimedia.org is switching backends from rutherfordium to people1001, please stand by during a short maintenance period.. data has been copied | https://wikitech.wikimedia.org/wiki/People.wikimedia.org#Backend_upgrade_November_2018 | T210036
  • 20:01 XioNoX: Apply Icinga:check_vcp to all VC switches - T201097
  • 19:42 XioNoX: remove neodymium/sarin from mgmt routers - T210612
  • 19:16 hashar@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/MobileFrontend/: RecordRevision::getUser() returns UserIdentity not int - T210737 (duration: 00m 55s)
  • 19:09 ejegg: updated payments-wiki from 34250e80b5 to 7403a196b4
  • 18:53 XioNoX: Netbox: remove Napalm integration
  • 18:28 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting comment migration to write-new/read-new on group 0 (T166733) (duration: 00m 52s)
  • 17:55 XioNoX: remove test netbox user from cr3-ulsfo - T205898
  • 17:49 mforns@deploy1001: Finished deploy [analytics/refinery@40b1972]: deploying refinery to refinery-source version v0.0.81 (duration: 06m 01s)
  • 17:43 mforns@deploy1001: Started deploy [analytics/refinery@40b1972]: deploying refinery to refinery-source version v0.0.81
  • 17:41 anomie@deploy1001: Synchronized php-1.33.0-wmf.6/includes/revisiondelete/RevisionDeleteUser.php: Fix RevisionDeleteUser rev_actor query for MySQL, for real this time (T210628) (duration: 00m 53s)
  • 17:34 anomie@deploy1001: Synchronized php-1.33.0-wmf.6/includes/revisiondelete/RevisionDeleteUser.php: Fix RevisionDeleteUser rev_actor query for MySQL (T210628) (duration: 00m 53s)
  • 17:02 robh: decom of labvirt101[01] continuing
  • 16:10 anomie@mwmaint1002: Running Wikibase/populateSitesTable.php and cleanupUsersWithNoId.php on more wiktionaries, incubatorwiki, and sourceswiki for T210732
  • 15:54 papaul: shutting down ms-be2047 for maintenance
  • 15:35 gtirloni: T196507 downtimed and powercycled cloudvirt1019
  • 15:32 anomie@mwmaint1002: Running Wikibase/populateSitesTable.php and cleanupUsersWithNoId.php on several other wiktionaries for T210732
  • 15:29 gehel: activating multiple elasticsearch instances on cirrus / eqiad - T207918
  • 15:18 $WHO: Running Wikibase populateSitesTable.php on eswiktionary for T210732
  • 14:31 hashar@deploy1001: rebuilt and synchronized wikiversions files: Revert all wikis to 1.33.0-wmf.6
  • 14:29 hashar@deploy1001: scap failed: average error rate on 11/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 14:17 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.6
  • 14:10 hashar: test stashbot
  • 14:03 moritzm: uploaded nodejs 6.11.0~dfsg-1+wmf3 to apt.wikimedia.org/stretch-wikimedia (backporting the current security fixes)
  • 13:45 hashar@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/Wikibase: feature flag for globe coordinator formatter using kartographer - T184933 T210617 (duration: 01m 18s)
  • 13:40 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Clarify parsercache keys section (duration: 00m 53s)
  • 13:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Clarify parsercache keys section (duration: 00m 52s)
  • 13:17 moritzm: rebooting certcentral1001 to pick up SSBD-enabled qemu/kernel update
  • 13:14 moritzm: rebooting certcentral2001 to pick up SSBD-enabled qemu/kernel update
  • 13:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool pc1009 in pc3 - T208383 (duration: 00m 53s)
  • 13:05 marostegui: Upgrade pc3 tendril topology - T208383
  • 13:00 mobrovac@deploy1001: Started restart [eventstreams/deploy@07033d4]: Restart ES on scb1004 due to possible memory leak (again)
  • 12:49 dcausse: EU swat done
  • 12:48 dcausse@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/TwoColConflict/: Fix unescaped HTML injected into conflict resolution interface (duration: 00m 53s)
  • 12:30 jynus: run puppet on notebook1004, people1001, rutherfordium to fix failures
  • 12:13 dcausse@deploy1001: Synchronized dblists/cirrussearch-big-indices.dblist: T210381: [cirrus] multi-instance: add cirrussearch-big-indices.dblist (duration: 00m 53s)
  • 12:09 dcausse@deploy1001: Synchronized wmf-config/CirrusSearch-production.php: T210381: [cirrus] Use normal config for labswiki (duration: 00m 55s)
  • 12:01 banyek: repooling labsdb1010 after upgrades - T209517
  • 10:17 elukey: remove zookeeper's crontabs from conf100[1-3] to fix cronspam
  • 10:16 arturo: T209626 icinga downtime labvirt1011 for 1 month to avoid bogus pages
  • 10:01 banyek: depooling labsdb1010 due to maintenance - T209517
  • 09:55 gehel: restarting prometheus-elasticsearch-exporter-9200 on all elastic cirrus nodes
  • 09:11 akosiaris: increase nofile of process to 20k and maxclients to 15k to account for the backlog of ores scorings
  • 08:48 _joe_: restarting uwsgi-ores on ores1003
  • 08:04 vgutierrez: replacing TLS certificates in gerrit - T207050
  • 06:59 marostegui: Stop MySQL on pc1006 to clone pc1009 - T208383
  • 06:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1006 - T208383 (duration: 00m 53s)
  • 06:36 marostegui: Deploy schema change on db1061 (s6 master) - T202167
  • 06:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1088 T202167 (duration: 00m 53s)
  • 06:31 marostegui: Deploy schema change on db1088 - T202167
  • 06:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1088 T202167 (duration: 00m 56s)
  • 06:15 reedy@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/OATHAuth: revert logging (loldeployingfromaplane) (duration: 00m 59s)
  • 06:11 marostegui: Deploy schema change on s6 primary master - T86338
  • 04:08 ejegg: updated fundraising internal dashboard from 5e9fb9a3ef to 1e4dd2a9ec
  • 03:30 ejegg: updated payments-wiki from 42283c73d0 to 34250e80b5
  • 00:08 catrope@deploy1001: Synchronized wmf-config/throttle.php: T210681 (duration: 01m 04s)

2018-11-28

  • 23:23 tzatziki: changing a few passwords for compromised accounts
  • 21:35 mutante: gnt-instance reboot proton1001.eqiad (stopped working, no SSH)
  • 21:28 arlolra: Updated Parsoid to 18a98af (T209236, T210437, T184755, T187142, T208470, T207286, T206777, T205710, T205546, T204477)
  • 21:26 mutante: rebooting proton1002
  • 21:19 herron: restarted nagios-nrpe-server on proton1002
  • 21:16 arlolra@deploy1001: Finished deploy [parsoid/deploy@9ed8c47]: Updating Parsoid to 18a98af (duration: 09m 35s)
  • 21:06 arlolra@deploy1001: Started deploy [parsoid/deploy@9ed8c47]: Updating Parsoid to 18a98af
  • 20:41 herron: rebooting logstash1005 for security updates
  • 20:22 paravoid: neodymium: mv /srv/jnt{,.old} (use cumin1001 instead!)
  • 19:40 herron: rebooting logstash1004 to pick up security updates
  • 19:31 ebernhardson: start goreplay logging of port 9200 across eqiad elastic cluster to track down T208248
  • 19:05 ejegg: updated payments-wiki from 3e0c97a969 to 6552acdc0f
  • 18:58 ladsgroup@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/ORES/includes/Hooks/ApiHooksHandler.php: Don't try to add scores in API where there is nothing to add (T210610) (duration: 00m 55s)
  • 18:33 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/Wikibase/lib/includes/Formatters/CachingKartographerEmbeddingHandler.php: Never return null in CachingKartographerEmbeddingHandler::getParserOutput T210617 (duration: 00m 53s)
  • 18:26 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/EventBus/includes/EventBus.php: SWAT: Revert "Revert "Revert "Set event datetime with microsecond resolution.""" T210608 (duration: 00m 55s)
  • 17:16 bblack: rebooting lvs1006, console was unresponsive
  • 17:11 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable classic_entity wbsearchentities AB test T209402 T210618 (duration: 00m 55s)
  • 17:01 chasemp: stat1004:~# aptitude install exfat-fuse exfat-utils (elukey fyi)
  • 16:57 mutante: restarting gerrit
  • 16:54 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/WikimediaIncubator/extension.json: Revert "Replace wiki with wikipedia as wmf-config has been updated" T117023 (duration: 00m 54s)
  • 16:53 mutante: gerrit about to restart for logging config change
  • 16:50 gtirloni: T206916 created shnwiki views/index in labsdb replicas
  • 15:55 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1122 (duration: 00m 53s)
  • 15:52 banyek: repooling db1122 after schema change (T85757)
  • 15:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1085 T86338 (duration: 00m 52s)
  • 15:49 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: wikidatawiki back to 1.33.0-wmf.4
  • 15:24 vgutierrez: use a certcentral managed TLS certificate in dumps.wm.o - T207050
  • 15:19 banyek: Deploy schema change on db1122 - T85757
  • 15:17 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1122 (duration: 00m 53s)
  • 15:10 banyek: depooling db1122 due to schema change (T85757)
  • 15:09 marostegui: Deploy schema change on db1085 with replication - T86338
  • 15:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1085 T86338 (duration: 00m 56s)
  • 14:05 hashar@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.6 (duration: 00m 52s)
  • 14:04 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.6
  • 13:59 zeljkof: EU SWAT (finally) done
  • 13:57 zfilipin@deploy1001: Finished scap: SWAT: Add user preference to disable the advanced interface (T210479) (duration: 38m 12s)
  • 13:19 zfilipin@deploy1001: Started scap: SWAT: Add user preference to disable the advanced interface (T210479)
  • 12:54 moritzm: installing serf update from stretch point release
  • 12:43 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/Cite: SWAT: Make backlink highlighting robust for community customized HTML (T205270 T210520) (duration: 00m 55s)
  • 12:34 moritzm: installing ghostscript security updates
  • 12:24 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Make AdvancedSearch default on all wikis (T207639) (duration: 00m 54s)
  • 11:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1085 T202167 (duration: 00m 53s)
  • 11:17 marostegui: Deploy schema change on db1085 (s6) (sanitarium master) with replication - T202167
  • 11:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1085 T202167 (duration: 00m 53s)
  • 11:15 banyek: repooling labsdb1009 after maintenance
  • 10:46 moritzm: rolling reboot of logstash1007-1009 to pick up new SSBD instructions and OpenJDK security updates
  • 10:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool pc1008 - T208383 (duration: 00m 50s)
  • 10:26 ladsgroup@deploy1001: Finished deploy [ores/deploy@9b9ba06]: T206333 (duration: 14m 48s)
  • 10:11 ladsgroup@deploy1001: Started deploy [ores/deploy@9b9ba06]: T206333
  • 10:08 jynus: disable mailing list mediation-en-l
  • 10:00 banyek: depooling labsdb1009 (T209517)
  • 09:55 marostegui: Update tendril topology for pc1 - T208383
  • 09:51 marostegui: Change pc2008 to replicate from pc1008 instead of pc1005 - T208383
  • 09:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1113:3316 T86338 T202167 (duration: 00m 53s)
  • 09:14 vgutierrez: Use a TLS certificate managed by certcentral in icinga.wm.o - T207050
  • 09:00 marostegui: Deploy schema change db1113:3316 T86338 T202167
  • 09:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1113:3316 T86338 T202167 (duration: 00m 53s)
  • 08:41 moritzm: installing git security updates on trusty (Debian already fixed)
  • 08:28 moritzm: installing samba security updates (client libs)
  • 08:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 T86338 (duration: 00m 53s)
  • 08:12 elukey: apply -R 200 to memcached on mc1022 (cache wipe) - T208844
  • 07:45 marostegui: Deploy schema change db1096:3316 T86338
  • 07:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 T86338 (duration: 00m 53s)
  • 07:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1098:3316 T86338 T202167 (duration: 00m 54s)
  • 07:17 marostegui: Deploy schema change db1098:3316 T86338 T202167
  • 07:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 T86338 T202167 (duration: 00m 53s)
  • 06:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1093 T86338 T202167 (duration: 00m 52s)
  • 06:40 marostegui: Deploy schema change db1093 T86338 T202167
  • 06:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1093 T86338 T202167 (duration: 00m 53s)
  • 06:14 marostegui: Stop MySQL on pc1005 to clone pc1008 - T208383
  • 06:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1005 - T208383 (duration: 01m 04s)
  • 06:03 marostegui: Deploy schema change on s6 codfw - T86338 T202167
  • 02:17 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/WikimediaEvents/modules/wikibase/ext.wikimediaEvents.completionClicks.js: Fix wbsearchentities collection of non-test bucketed data (duration: 00m 56s)
  • 02:16 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/modules/wikibase/ext.wikimediaEvents.completionClicks.js: Fix wbsearchentities collection of non-test bucketed data (duration: 00m 54s)
  • 01:45 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/WikimediaEvents/: wbsearchentities needed extension.json to be deployed as well. Sync the whole directory (duration: 00m 53s)
  • 01:44 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/: wbsearchentities needed extension.json to be deployed as well. Sync the whole directory (duration: 00m 53s)
  • 01:34 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209402: Start wbsearchentities ab test at 10% (duration: 00m 54s)
  • 01:29 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.6/includes/parser/Parser.php: SWAT: T209236 Protect legacy URL parameter syntax in link and alt options (duration: 00m 51s)
  • 01:22 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/ParsoidBatchAPI/includes/ApiParsoidBatch.php: SWAT: Revert ApiParsoidBatch update (duration: 00m 54s)
  • 01:12 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T209402: Configuration for wbsearchentities AB test (duration: 00m 53s)
  • 01:09 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/modules/wikibase/ext.wikimediaEvents.completionClicks.js: T209402: AB testing support for wbsearchentities (duration: 00m 52s)
  • 01:08 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/WikimediaEvents/modules/wikibase/ext.wikimediaEvents.completionClicks.js: T209402: AB testing support for wbsearchentities (duration: 00m 53s)
  • 01:06 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/Wikibase/view/resources/jquery/wikibase/jquery.wikibase.entityselector.js: Allow AB test to modify entityselector api request (duration: 00m 56s)
  • 01:02 ejegg: updated payments-wiki config to 51abff4b66
  • 00:19 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.6/extensions/ParsoidBatchAPI/includes/ApiParsoidBatch.php: SWAT I4e4373a7 revert Modernize ApiParsoidBatch using ApiResult to generate prettier output (duration: 00m 54s)
  • 00:07 ebernhardson@deploy1001: Synchronized wmf-config/CommonSettings-labs.php: SWAT: no-op labs sync for 476027 (duration: 00m 53s)
  • 00:06 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: no-op labs sync for 476027 (duration: 00m 55s)

2018-11-27

  • 23:49 ejegg: updated payments-wiki config to 04f718e8f1
  • 23:29 ejegg: updated payments-wiki config to 082eab3566
  • 22:35 ejegg: rolled back payments-wiki to 86742dd4fd
  • 22:33 ejegg: updated payments-wiki from 86742dd4fd to 3ba10e49b0
  • 21:26 ejegg: updated payments-wiki config to d497b7e880
  • 19:40 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: no op sync wikiversions to test scap 3.8.10-1
  • 18:03 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: no op sync wikiversions to test scap 3.8.9-1
  • 17:51 godog: upload scap 3.8.9-1 - T210469
  • 17:40 bstorm_: T209517 rebooted labsdb1005 after upgrades
  • 17:30 bstorm_: T209517 icinga downtime labsdb1004
  • 17:30 XioNoX: repool ulsfo
  • 17:25 arturo: T209517 icinga downtime labsdb1005
  • 17:23 arturo: T207377 icinga downtime labnet1001
  • 16:17 mutante: einsteinium - apt-get remove --purge icinga nsca; apt-get autoremove ; apt-get remove --purge icinga-doc icinga-common icinga-cgi-bin icinga-cgi; apt-get remove --purge monitoring-plugin* ; rm /etc/rsync.d/frag-icinga* T209738
  • 16:14 hashar@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.6
  • 16:09 moritzm: rebooting planet1001 for kernel security update
  • 16:04 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105 (duration: 00m 53s)
  • 15:53 moritzm: rebooting planet2001 for kernel security update
  • 15:44 moritzm: rebooting kafkamon1001 for kernel security update
  • 15:41 mutante: einsteinium - removed icinga package
  • 15:39 moritzm: rebooting kafkamon2001 for kernel security update
  • 15:35 mutante: einsteinium - stopped icinga, stopped nsca, stopped rsyncd, killall -u icinga, killall -u nagios ... T209738
  • 15:35 hashar@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.6 and rebuild l10n cache | T206660 (duration: 32m 42s)
  • 15:21 akosiaris: create graphoid namespace on kubernetes eqiad, codfw, staging clusters T203091
  • 15:02 hashar@deploy1001: Started scap: testwiki to php-1.33.0-wmf.6 and rebuild l10n cache | T206660
  • 14:52 hashar@deploy1001: Pruned MediaWiki: 1.32.0-wmf.26 (duration: 09m 48s)
  • 14:40 hashar: Applied security patches for 1.33.0-wmf.6 | T206660
  • 14:33 hashar: scap prep 1.33.0-wmf.6 | T206660
  • 14:04 hashar: Cutting 1.33.0-wmf.6 branches | T206660
  • 14:00 banyek: repooling db1105 due to a schema change (T85757)
  • 13:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool pc1010 in pc1 - T208383 (duration: 00m 46s)
  • 13:34 marostegui: Change pc2007 and pc2010 to replicate from pc1010 instead of from pc1004 - T208383
  • 13:18 banyek: executing schema change on db1105 (T85757)
  • 13:18 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1105 (duration: 00m 46s)
  • 13:17 pmiazga@deploy1001: Finished deploy [proton/deploy@9efc072]: Proton: Rewrite Queue to promise-way flow (T204055) (duration: 02m 43s)
  • 13:15 pmiazga@deploy1001: Started deploy [proton/deploy@9efc072]: Proton: Rewrite Queue to promise-way flow (T204055)
  • 13:14 banyek: depooling db1105 due to a schema change (T85757)
  • 13:09 raynor: proton deploying 9efc072
  • 12:39 mobrovac@deploy1001: Synchronized php-1.33.0-wmf.4/includes/page/WikiPage.php: Convert $archivedRevisionCount to integer - T210013 T210451 (duration: 00m 47s)
  • 12:32 gilles@deploy1001: Synchronized docroot/wikipedia.org/speed-tests/http2priorities/upload.wikimedia.org.html: T210141 Add variant of HTTP/2 priorities test pointing to upload (duration: 00m 46s)
  • 12:25 mobrovac@deploy1001: Synchronized static/images/project-logos: Add localised logos for the Minangkabau Wikipedia - T210387 (duration: 00m 47s)
  • 12:19 mobrovac@deploy1001: Synchronized rpc/RunSingleJob.php: RunSingleJob: Check that JobExecutor has been loaded - T208922 (duration: 00m 47s)
  • 11:56 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1103 (duration: 00m 46s)
  • 11:50 banyek: repooling db1103 after schema change (T85757)
  • 11:32 banyek: executing schema change on db1103:3312 (T85757)
  • 11:32 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1103 (duration: 00m 46s)
  • 11:28 banyek: depooling db1103 due to a schema change (T85757)
  • 10:59 banyek: executing schema change on db1095 (T85757)
  • 10:55 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1090 (duration: 00m 46s)
  • 10:51 banyek: repooling db1090 after schema change (T85757)
  • 10:51 banyek: repooling db1090 due to a schema change (T85757)
  • 10:42 gilles@deploy1001: Synchronized docroot/wikipedia.org/speed-tests/http2priorities: T210141 HTTP/2 priorities speed test (duration: 00m 47s)
  • 10:36 arturo: T209948 disable puppet in all WMCS hw servers
  • 10:30 arturo: T209948 schedule 2h icinga downtime in all WMCS hw servers
  • 10:26 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1090 (duration: 00m 46s)
  • 10:19 banyek: depooling db1090 due to a schema change (T85757)
  • 10:18 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1076 (duration: 00m 45s)
  • 10:13 banyek: repooling db1076 after schema change (T85757)
  • 10:06 banyek: executing schema change on db1076 (T85757)
  • 10:02 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1076 (duration: 00m 46s)
  • 09:57 banyek: depooling db1076 due to a schema change (T85757)
  • 09:35 marostegui: Stop MySQL on pc1004 to clone pc1010 - T208383
  • 09:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1004 - T208383 (duration: 00m 46s)
  • 09:00 banyek: executing schema change on db2035 for s2 (T85757)
  • 08:43 elukey: roll restart of all druid daemons on druid100[1-6] for openjdk-8 upgrades
  • 08:41 vgutierrez: Use a TLS certificate managed by certcentral in apt.wm.o - T207050
  • 08:15 godog: more weight to new ms-be hosts in codfw - T209395
  • 07:35 _joe_: depooling mw1261 for benchmarking, T206341
  • 07:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3312 after cloning db1095:3312 (duration: 00m 46s)
  • 07:16 marostegui: Start MySQL on db1090:3312 after recloning db1095:3312
  • 07:16 marostegui: Start MySQL on db1095:3312 after recloning it
  • 06:16 marostegui: Stop mysql on db1090:3312 to clone db1095:3312
  • 06:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3312 (duration: 00m 49s)
  • 06:15 marostegui: Stop mysql on db1095:3312 to get it recloned
  • 04:33 ejegg: updated payments-wiki from d2b66c5bab to 86742dd4fd
  • 04:31 ejegg: updated fundraising CiviCRM from 013807a7b9 to a411d6bd64
  • 04:27 ejegg: updated standalone SmashPig deployment from f65daa8550 to fb3268897b
  • 00:38 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix weird caching issue maybe for Ian's patch (duration: 00m 46s)
  • 00:29 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Enable SVGs in page language everywhere T208899 (duration: 00m 45s)
  • 00:27 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable SVGs in page language everywhere T208899 (duration: 00m 46s)
  • 00:17 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set wgMinervaSchemaMainMenuClickTrackingSampleRate T205008 (duration: 00m 46s)
  • 00:12 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Delete a namespace from ruwikisource T210171 (duration: 00m 46s)
  • 00:05 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Enable trust survey on labs T209882 (duration: 00m 46s)
  • 00:03 niharika29@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 03s)

2018-11-26

  • 23:50 ottomata: temporarily disabling puppet on stat1007 to copy over eventbus validation logs (not using stat1007 after all)
  • 23:48 ottomata: temporarily disabling puppet on stat1007 to copy over eventbus validation logs
  • 23:24 sbassett@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove unblockself right everywhere 2046d8df3 T150826 (duration: 00m 47s)
  • 23:15 XioNoX: depool ulsfo for transport providers maintenance
  • 20:14 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/Wikibase/repo/includes/Search/Elastic/EntitySearchElastic.php: SWAT T209402 Make wbsearchentities tie-breaker configurable (duration: 00m 47s)
  • 19:33 ebernhardson@deploy1001: Synchronized wmf-config/WikibaseSearchSettings.php: SWAT T209402 Search profiles for wbsearchentities AB test (duration: 00m 46s)
  • 19:25 ebernhardson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/includes/: SWAT T210003 T210004 EditorJourney: Adjust DeferredUpdates usage (duration: 00m 46s)
  • 19:24 XioNoX: remove IP from blacklist on Amsterdam routers - T201411
  • 19:24 cmjohnson1: cloudvirt1019 is going down to check something for HPE
  • 19:16 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T209432 Enable NewUserMessage extension on tcy.wikipedia (duration: 00m 48s)
  • 18:50 mutante: stopping icinga service on einsteinium, is a role(spare) now T209738
  • 18:46 mutante: removed allowed sender addresses from AQL (mail2SMS gateway) portal: @einsteinium @tegmen addresses T208824 T209738
  • 18:45 ppchelko@deploy1001: Finished deploy [changeprop/deploy@c89bff5]: Prepared for ORES error response T197000 (duration: 01m 13s)
  • 18:44 ppchelko@deploy1001: Started deploy [changeprop/deploy@c89bff5]: Prepared for ORES error response T197000
  • 18:41 XioNoX: re-activate BGP to AS41692 on cr2-esams
  • 18:15 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@8bfea2b]: GUI updates and updater with dump and revision logging. (duration: 11m 13s)
  • 18:04 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@8bfea2b]: GUI updates and updater with dump and revision logging.
  • 17:54 bblack@neodymium: conftool action : set/pooled=yes; selector: name=cp5001.eqsin.wmnet
  • 17:53 bblack: re-pooling cp5001 - T199675
  • 17:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1122 (duration: 00m 46s)
  • 17:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1122 (duration: 00m 46s)
  • 17:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1122 (duration: 00m 46s)
  • 17:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1122 (duration: 00m 45s)
  • 17:23 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Reverting: Bumping portals to master (T128546) (duration: 00m 46s)
  • 17:22 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Reverting: Bumping portals to master (T128546) (duration: 00m 46s)
  • 17:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3312 and db1122 (duration: 00m 46s)
  • 16:01 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3312 to clone db1122 (duration: 00m 46s)
  • 16:00 marostegui: Stop MySQL on db1090:3312 to clone db1122
  • 15:57 marostegui: Stop MySQL on db1122 - it will be recloned
  • 15:50 ppchelko@deploy1001: Finished deploy [changeprop/deploy@77be2c6]: Change schema for revision-score events and start emitting again T197000 (duration: 01m 24s)
  • 15:48 ppchelko@deploy1001: Started deploy [changeprop/deploy@77be2c6]: Change schema for revision-score events and start emitting again T197000
  • 15:32 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 - actor table missing (duration: 00m 46s)
  • 15:27 ppchelko@deploy1001: Finished deploy [changeprop/deploy@b97e8eb]: Temporary stop emitting revision-score events for schema change T197000 (duration: 01m 21s)
  • 15:26 ppchelko@deploy1001: Started deploy [changeprop/deploy@b97e8eb]: Temporary stop emitting revision-score events for schema change T197000
  • 15:24 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting actor migration to write-both/read-old on group 1 (T188327) (duration: 00m 46s)
  • 13:47 moritzm: rebooting releases1001
  • 13:36 moritzm: rebooting releases2001
  • 13:33 moritzm: rebooting netmon1003
  • 12:04 _joe_: uploaded php-excimer for component thirdparty/php72 to stretch-wikimedia T205059
  • 12:00 kartik@deploy1001: Finished deploy [cxserver/deploy@da41227]: Disable Youdao MT until service is back (duration: 04m 01s)
  • 11:56 kartik@deploy1001: Started deploy [cxserver/deploy@da41227]: Disable Youdao MT until service is back
  • 11:41 arturo: icinga downtime cloudcontrol1004 for some systemd slice tests
  • 11:01 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 46s)
  • 11:00 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 47s)
  • 10:32 moritzm: rebooting boron
  • 09:58 banyek: performing schema change on s6 master - db1061 for frwiki (T85757)
  • 09:51 banyek: performing schema change on s6 master - db1061 for ruwiki (T85757)
  • 09:44 banyek: performing schema change on s6 master - db1061 for jawiki (T85757)
  • 08:35 moritzm: removed labvirt1015 from debmonitor DB (got renamed to cloudvirt1015)
  • 08:22 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3316 - T202167 (duration: 00m 46s)
  • 08:16 marostegui: Deploy schema change on db1096:3316 - T202167
  • 08:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3316 - T202167 (duration: 00m 47s)
  • 08:10 moritzm: installing pixman security updates
  • 07:37 elukey: restart memcached on mc1021 (cache wipe) to add -R 200 - T208844
  • 07:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1088 - T86338 (duration: 00m 46s)
  • 07:00 marostegui: Deploy schema change on db1088 - T86338
  • 07:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1088 - T86338 (duration: 00m 46s)
  • 06:46 marostegui: Deploy schema change on db2067 - T86338
  • 06:36 marostegui: Deploy schema change on s3 master (db1075) - T86339
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 - T86339 (duration: 00m 46s)
  • 06:22 marostegui: Reload haproxy on dbproxy1005
  • 06:22 marostegui: Deploy schema change on db1077 (s3 sanitarium master) with replication - T86339
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 - T86339 (duration: 00m 50s)
  • 06:02 kartik@deploy1001: Finished deploy [cxserver/deploy@d2173ca]: Update cxserver to 218a543 (duration: 04m 21s)
  • 05:58 kartik@deploy1001: Started deploy [cxserver/deploy@d2173ca]: Update cxserver to 218a543

2018-11-25

  • 10:40 _joe_: restarting pdfrender on scb1002, flapping since 3:00 UTC

2018-11-23

  • 21:15 bawolff: deploy patch for T210192
  • 16:53 bblack: cleaned up remnants of globalsign-2017 unified cert (OCSP cache/config, unmanaged cert files, etc) on all cpNNNN - T206804
  • 14:02 gehel: restore wdqs-updater heap to 2G - T210235
  • 13:32 moritzm: installing confuse security updates
  • 12:04 gehel: manually increasing wdqs-updater heap to 4G - T210235
  • 11:37 gehel: restarting updater on all wdqs nodes
  • 08:41 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=.*,service=zotero,cluster=kubernetes,name=.*
  • 08:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 - T86339 (duration: 00m 46s)
  • 07:48 moritzm: installing libtirpc security updates
  • 07:14 marostegui: Deploy schema change db1078 - T86339
  • 07:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T86339 (duration: 00m 45s)
  • 07:00 marostegui: Deploy schema change db1095 - T86339
  • 06:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 - T86339 (duration: 00m 49s)
  • 06:32 marostegui: Deploy schema change db1123 - T86339
  • 06:31 marostegui: Deploy schema change dbstore1002:s3 - T86339
  • 06:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 - T86339 (duration: 00m 48s)
  • 06:15 marostegui: Deploy schema change on s3 codfw master (db2043) with replication - T86339
  • 06:13 marostegui: Deploy schema change on db1067 (s1) master - T86339

2018-11-22

  • 21:08 godog: disable raid handler for ms-be2021 - T208096
  • 19:01 moritzm: installing uriparser security updates
  • 18:46 moritzm: installing openjpeg2 security updates
  • 18:45 arturo: enable puppet in all CloudVPS HW servers
  • 18:38 arturo: disable puppet in all CloudVPS HW servers to test a patch (T209948)
  • 17:46 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1016 (duration: 00m 46s)
  • 16:49 jynus: upgrading, and restarting es1016 (but not deleting, that was a mistake)
  • 16:49 jynus: upgrading, deleting at and restarting es1016
  • 16:35 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool es1016 (duration: 00m 47s)
  • 16:06 ema: trafficserver_8.0.0-1wm3 uploaded to stretch-wikimedia
  • 15:23 akosiaris@deploy1001: scap-helm zotero finished
  • 15:23 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 15:23 akosiaris@deploy1001: scap-helm zotero install --name production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 15:22 akosiaris@deploy1001: scap-helm zotero finished
  • 15:22 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 15:22 akosiaris@deploy1001: scap-helm zotero install --name production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 14:50 akosiaris@deploy1001: scap-helm zotero install --name production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 13:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T86339 (duration: 00m 45s)
  • 13:52 marostegui@deploy1001: sync-file aborted: Depool db1080 - T86339 (duration: 00m 00s)
  • 13:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T86339 (duration: 00m 46s)
  • 13:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 - T86339 (duration: 00m 46s)
  • 13:45 marostegui: Deploy schema change on s1 eqiad hosts T86339
  • 13:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T86339 (duration: 00m 46s)
  • 13:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T86339 (duration: 00m 45s)
  • 13:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T86339 (duration: 00m 46s)
  • 13:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 - T86339 (duration: 00m 45s)
  • 13:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1114 - T86339 (duration: 00m 43s)
  • 13:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1119 - T86339 (duration: 00m 46s)
  • 13:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 - T86339 (duration: 00m 47s)
  • 12:19 jynus: upgrading, deleting at and restarting dbstore2002
  • 10:21 jynus: upgrading and restarting dbstore1001
  • 10:01 godog: bounce rsyslog on lithium, tls listener timeout on icinga
  • 09:54 godog: bounce rsyslog on wezen, tls listener timeout on icinga
  • 09:49 jynus: stop and upgrade dbstore2001
  • 09:24 moritzm: installing ruby-i18n security updates
  • 09:20 moritzm: installing ruby-rack security updates
  • 09:06 moritzm: installing jasper security updates
  • 08:21 marostegui: Deploy schema change on s1 codfw master (db2048) with replication - T86339
  • 08:19 marostegui: Deploy schema change on db1062 (s7 master) - T86339
  • 08:19 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T86339 (duration: 00m 46s)
  • 08:17 marostegui: Deploy schema change on db1094 - T86339
  • 08:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T86339 (duration: 00m 49s)
  • 06:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T86339 (duration: 00m 46s)
  • 06:55 marostegui: Deploy schema change on db1086 - T86339
  • 06:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T86339 (duration: 00m 46s)
  • 06:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T86339 (duration: 00m 46s)
  • 06:48 marostegui: Deploy schema change on db1079 (sanitarium master) with replication - T86339
  • 06:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T86339 (duration: 00m 51s)
  • 02:25 mutante: stat1005 - started nagios-nrpe-server

2018-11-21

  • 23:34 mutante: rsyncing /home from rutherfordium.eqiad to people1001.eqiad (people.wikimedia.org) T210036
  • 21:16 robh: cp5001 is offline running hardware tests after firmware updates to see if memory error still exists. ref: T199675
  • 20:55 robh: cp5001 reboot for firmware update
  • 19:54 milimetric@deploy1001: Finished deploy [analytics/aqs/deploy@e114d99]: Fixing sorting bug on top endpoints (duration: 05m 34s)
  • 19:49 milimetric@deploy1001: Started deploy [analytics/aqs/deploy@e114d99]: Fixing sorting bug on top endpoints
  • 19:43 ejegg: updated fundraising internal dashboard from b01458b260 to 5e9fb9a3ef
  • 17:21 elukey: manually started systemd-journald.service on scb1001 after OOM
  • 17:20 jynus: stop and upgrade db2081
  • 17:04 jynus: stop and upgrade db2080
  • 16:40 jynus: stop and upgrade db2066
  • 16:37 bawolff: deploy patch T209794
  • 16:25 jynus: stop and upgrade db2063
  • 15:15 jynus: stop and upgrade db2073
  • 15:15 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1085 (duration: 00m 46s)
  • 15:03 banyek: repooling db1085 after schema change (T85757)
  • 15:00 banyek: restarting replication on db1085 (T85757)
  • 14:31 banyek: stopping replication on db1085 (T85757)
  • 14:27 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1085 (duration: 00m 46s)
  • 14:22 banyek: depooling db1085 due to schema change (T85757)
  • 13:39 jynus: stop and upgrade db2077
  • 13:05 XioNoX: remove BGP session to 2603 on cr4-ulsfo
  • 12:13 jynus: stop and upgrade db2076
  • 12:08 banyek: running schema change on dbstore1001:3316 (T85757)
  • 12:08 banyek: running schema change on dbstore1001 (T85757)
  • 12:04 jynus: stop and upgrade db2075
  • 11:53 banyek: running schema change on dbstore1002 (T85757)
  • 11:00 akosiaris: disable puppet on ores2* ores1* for gradual rollout of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/474694/1/modules/ores/manifests/web.pp
  • 10:55 jynus: stop and upgrade db2074
  • 10:51 _joe_: uploading prometheus-php-fpm-exporter to stretch-wikimedia main, T209573
  • 10:44 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1096:3316 (duration: 00m 46s)
  • 10:42 banyek: repooling db1096:3316 after schema change (T85757)
  • 10:30 jynus: stop and upgrade db2095
  • 10:26 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1096:3316 (duration: 00m 45s)
  • 10:26 godog: initial weight for new ms-be2* hosts (all but ms-be2047) - T209395
  • 10:23 banyek: depooling db1096:3316 due to schema change (T85757)
  • 10:14 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1098:3316 (duration: 00m 46s)
  • 10:13 banyek@deploy1001: sync-file aborted: T85757: depool db1098:3316 (duration: 00m 03s)
  • 10:11 banyek: repooling db1098 after schema change (T85757)
  • 09:57 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1098:3316 (duration: 00m 46s)
  • 09:52 banyek: depooling db1098:3316 due to schema change (T85757)
  • 09:49 volans: restarted pdfrender on scb1003
  • 09:29 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1113 (duration: 00m 46s)
  • 09:21 banyek: repooling db1113 after schema change (T85757)
  • 09:09 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1113 (duration: 00m 46s)
  • 09:01 banyek: depooling db1113 due to schema change (T85757)
  • 08:48 marostegui: Deploy schema change on s7 codfw master - T86339
  • 08:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 - T86339 (duration: 00m 45s)
  • 08:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T86339 (duration: 00m 45s)
  • 08:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 - T86339 (duration: 00m 46s)
  • 08:30 marostegui: Deploy schema changes on s8 eqiad hosts - T86339
  • 08:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T86339 (duration: 00m 46s)
  • 07:50 marostegui: Deploy schema change on s8 codfw master (db2045) with replication - T86339
  • 07:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 - T86339 (duration: 00m 45s)
  • 07:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T86339 (duration: 00m 46s)
  • 07:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 - T86339 (duration: 00m 46s)
  • 07:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 - T86339 (duration: 00m 46s)
  • 07:32 marostegui: Deploy schema change on s4 eqiad hosts - T86339
  • 07:19 marostegui: Deploy schema change on db2051 (s4 codfw master) with replication - T86339
  • 07:10 marostegui: Drop foundationwiki.petition_data from s3 master (db1075) with replication - T208979
  • 07:06 marostegui: Deploy schema change on db1066 (s2 master) - T86339
  • 07:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1076 - T86339 (duration: 00m 46s)
  • 07:02 marostegui: Deploy schema change on db1076 - T86339
  • 07:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1076 - T86339 (duration: 00m 46s)
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1074 - T86339 (duration: 00m 46s)
  • 06:48 marostegui: Deploy schema change on db1074 (sanitarium master) - T86339
  • 06:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1074 - T86339 (duration: 00m 46s)
  • 06:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1122 - T86339 (duration: 00m 46s)
  • 06:39 marostegui: Deploy schema change on db1122 - T86339
  • 06:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 - T86339 (duration: 00m 51s)
  • 06:30 marostegui: Drop schema change on db1103:3312 and db1105:3312 - T86339
  • 00:48 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:474967 Disable FlaggedRevs, enable RC patrol and add rights on srwikinews (duration: 00m 47s)
  • 00:39 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:475005 Enable SVGs in page in group1, rest of group0 (duration: 00m 46s)
  • 00:32 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:474976 Enable suppressredirect on srwiki (duration: 00m 47s)
  • 00:24 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:472744 Enable RCPatrol and add some rights on srwikibooks (duration: 00m 46s)
  • 00:07 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/GrowthExperiments/includes/Specials/SpecialWelcomeSurvey.php: gerrit:474946 WelcomeSurvey: indicate that the special page does write (duration: 00m 47s)

2018-11-20

  • 23:01 XioNoX: create vol.ans account on switches - T208726
  • 22:54 pmiazga@deploy1001: Synchronized wmf-config: SYNC: noop Doc: add repoConceptBaseUri comment (T209352); noop: Remove utf-8 characters from DOC comment for better readability (T209352); beta: Wikibase: override repoConceptBaseUri (T209352) (duration: 00m 49s)
  • 22:13 XioNoX: create volans account on routers - T208726
  • 21:12 eileen: civicrm revision changed from f4127d5316 to 013807a7b9, config revision is 684ec9b7c0
  • 21:08 jgleeson: civicrm changed from a31dbefc61 to f4127d5316
  • 18:56 ebernhardson: start loading dumps into elastic codfw omega and psi from mwmaint2001
  • 18:19 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: Create Federated Wikibase instance on Beta Commons, part II (T204748) (duration: 00m 47s)
  • 18:17 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Create Federated Wikibase instance on Beta Commons (T204748) (duration: 00m 48s)
  • 18:08 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7553087]: Deploy 2018 app fundraising announcement config (T204821) (duration: 03m 37s)
  • 18:04 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7553087]: Deploy 2018 app fundraising announcement config (T204821)
  • 18:03 bstorm_: rebooting labsdb1006 for upgrades T209517
  • 17:25 bstorm_: rebooting labsdb1004 for upgrades T209517
  • 17:19 gehel: reload nginx configuration on elasticsearch codfw
  • 16:59 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2072 (duration: 00m 46s)
  • 16:53 XioNoX: rollback all BFD tests on cr1-codfw
  • 16:10 jynus: stop and upgrade db2072
  • 15:59 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2071, depool db2072 (duration: 00m 47s)
  • 15:51 banyek: repooling labsdb1011 (T209517)
  • 15:41 banyek: uploaded wmf-pt-kill_2.2.20-1+wmf5 packages to stretch-wikimedia (T209517)
  • 15:38 vgutierrez: switching to certcentral managed TLS certificate for librenms.wikimedia.org - T209856
  • 15:36 XioNoX: add test term allow BFD multihop on cr1-codfw loopback4 filter
  • 15:36 ejegg: updated fundraising CiviCRM from e648be0d9e to a31dbefc61
  • 15:20 moritzm: installing libopenmpt security updates
  • 15:12 XioNoX: enable bfd traceoptions on cr1-codfw
  • 15:02 XioNoX: Add BFD multihop support to Bird anycast DNS
  • 14:55 jijiki: libthumbor_1.3.2-0+wmf1+stretch1 uploaded to stretch-wikimedia T209886
  • 14:43 chasemp: puppet temp disable on es2001 for data transfer work
  • 14:32 jynus: stop and upgrade db2033
  • 13:19 jynus: stop and upgrade db2082
  • 13:06 banyek: depooling labsdb1011 (T209517)
  • 13:03 zeljkof: EU SWAT finished
  • 13:03 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: deployment-prep: Update parsoid09 IP (T208101) (duration: 00m 47s)
  • 12:55 zfilipin@deploy1001: Synchronized wmf-config/db-labs.php: SWAT: deployment-prep: Update deployment-db* IPs (T208101) (duration: 00m 47s)
  • 12:55 banyek: setting innodb_flush_log_at_trx_commit to 2 on dbstore2002 (s3 instance only!) (T208320)
  • 12:53 banyek: setting innodb_flush_log_at_trx_commit to 2 on dbstore2002 (T208320)
  • 12:49 zfilipin@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 12:45 zfilipin@deploy1001: Synchronized wmf-config/reverse-proxy-staging.php: SWAT: deployment-prep: Update cache-upload private IP (T208101) (duration: 00m 45s)
  • 12:30 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use HD logos in InitialiseSettings.php for multiple projects (T150618) (duration: 00m 48s)
  • 12:25 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add tboverride permission to extendedmover group on enwiki (T209753) (duration: 00m 47s)
  • 12:19 jynus: powercycling db2087, stuck on reboot
  • 12:13 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Upload HD logos for multiple projects (T150618) (duration: 00m 48s)
  • 11:55 moritzm: rolling reboot of proton hosts for kernel security update
  • 11:27 jynus: stop and upgrade db2087
  • 11:16 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: (now really) repool db1093 (duration: 00m 47s)
  • 11:11 banyek: repooling db1093 (T85757)
  • 11:05 banyek: executing schema change on db1093 (T85757)
  • 11:00 jynus: stop and upgrade db2086
  • 10:59 banyek: db1093 was depooled; wrong message sent
  • 10:51 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1093 (duration: 00m 47s)
  • 10:48 banyek: depooling db1093 (T85757)
  • 10:48 banyek: depooling db1093
  • 10:47 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2018 (duration: 00m 46s)
  • 10:17 jynus: upgrade and reboot es2018
  • 10:13 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2018 (duration: 00m 46s)
  • 09:34 marostegui: Deploy schema change on s2 hosts: dbstore1002, db1090:3312 and db1095:3312 - T86339
  • 09:26 marostegui: Deploy schema change on s2 codfw master (db2035) with replication - T86339
  • 09:25 jynus: upgrade and reboot es2014
  • 09:23 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2014 (duration: 00m 46s)
  • 09:23 godog: stress-test new ms-be hardware - T209395
  • 09:13 marostegui: Stop MySQL on pc2004, pc2005 and pc2006 for decommission - T209858
  • 09:05 gehel: powercycle elastic2021
  • 09:04 marostegui: Remove pc2004, pc2005 and pc2006 from tendril and zarcillo - T209858
  • 08:53 jynus: upgrade and reboot es2011
  • 08:48 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool es2011 (duration: 00m 47s)
  • 06:28 marostegui: Deploy schema change on db1070 (s5 master) - T86339
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T86339 (duration: 00m 47s)
  • 06:21 marostegui: Deploy schema change on db1082 - T86339
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T86339 (duration: 00m 52s)
  • 00:55 catrope@deploy1001: Synchronized static/images/project-logos/: Correct logos for Sindhi Wiktionary (duration: 00m 47s)
  • 00:51 mutante: Gerrit: added Jeena Huneidi to wmf-deployers (T209722)
  • 00:26 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/resources/src/mediawiki.rcfilters/ui/mw.rcfilters.ui.FilterTagMultiselectWidget.js: RCFilters bug fix (T209657) (duration: 00m 47s)
  • 00:23 XioNoX: registering librenms IRC bot
  • 00:15 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Increase Schema.org page split test to 100% sampling (T208755) (duration: 00m 48s)

2018-11-19

  • 23:18 ejegg: updated fundraising CiviCRM from 275716f000 to e648be0d9e
  • 22:41 ejegg: updated fundraising CiviCRM from bbc0dddd1e to 275716f000
  • 22:21 ejegg: updated fundraising CiviCRM from 6b279509f8 to bbc0dddd1e
  • 21:58 XioNoX: restart bird on dns2001 to try to establish the BFD sessions
  • 20:27 catrope@deploy1001: Finished scap: Full scap for special alias changes for GrowthExperiments (duration: 21m 03s)
  • 20:06 catrope@deploy1001: Started scap: Full scap for special alias changes for GrowthExperiments
  • 19:50 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WelcomeSurvey on cswiki and kowiki (T209725) (duration: 00m 46s)
  • 19:44 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/: EditorJourney fixes (T207307) (duration: 00m 46s)
  • 19:36 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/GrowthExperiments/: WelcomeSurvey fixes (T206371) (duration: 00m 46s)
  • 19:25 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WelcomeSurvey on testwiki (T209725) (duration: 00m 49s)
  • 18:29 cmjohnson1: connecting eqiad asw2-b fpc2 and fpc8
  • 18:17 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@a25eb30]: GUI Update, new executor limits and new blazegraph build (duration: 08m 53s)
  • 18:08 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@a25eb30]: GUI Update, new executor limits and new blazegraph build
  • 15:45 ladsgroup@deploy1001: Finished deploy [ores/deploy@e957b24]: T209587 T170950 (duration: 17m 09s)
  • 15:28 ladsgroup@deploy1001: Started deploy [ores/deploy@e957b24]: T209587 T170950
  • 15:23 milimetric@deploy1001: Finished deploy [analytics/aqs/deploy@b399c34]: Removing empty fields from unique result (duration: 05m 17s)
  • 15:18 milimetric@deploy1001: Started deploy [analytics/aqs/deploy@b399c34]: Removing empty fields from unique result
  • 15:08 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting actor migration to write-both/read-old on group 0 (T188327) (duration: 00m 47s)
  • 14:02 joal@deploy1001: Finished deploy [analytics/aqs/deploy@7cde8c8]: Update unique-devices schema adding 2 fields (duration: 20m 57s)
  • 13:55 gtirloni: T207377 reboot cloudcontrol1004
  • 13:43 moritzm: installing chromium security update on proton* (tested new upstream release in deployment-prep)
  • 13:41 joal@deploy1001: Started deploy [analytics/aqs/deploy@7cde8c8]: Update unique-devices schema adding 2 fields
  • 13:39 fdans@deploy1001: Finished deploy [analytics/aqs/deploy@7cde8c8]: Deploying AQS to add two new fields to uniques (duration: 06m 18s)
  • 13:39 akosiaris: cumin -b1 -s 300 'ores2*' 'enable-puppet "merge of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/474158/" ; puppet agent -t ; service uwsgi-ores restart ; service celery-ores-worker restart'
  • 13:36 akosiaris: disable puppet on ores1* and ores2* for slow deployment of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/474158/
  • 13:33 fdans@deploy1001: Started deploy [analytics/aqs/deploy@7cde8c8]: Deploying AQS to add two new fields to uniques
  • 13:21 gtirloni: T207377 icinga downtime and reboot of labcontrol1001 and labservices1001
  • 13:08 arturo: T207377 icinga downtime and reboot of cloudcontrol1003 and cloudservices1003
  • 13:06 raynor: EU SWAT finished
  • 13:03 pmiazga@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: gerrit:474679 In SecurePoll use gpg1 to avoid gpg-agent autostart (T209802) (duration: 00m 48s)
  • 12:50 raynor: EU SWAT reopened
  • 12:40 raynor: EU SWAT finished
  • 12:38 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:472918 Enable autopatrol, patrol, rollback rights and RCPatrol on srwiktionary (T209252) (duration: 00m 46s)
  • 12:21 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:474124 Remove wgMetaNamespaceTalk for shnwiki (T206777) (duration: 00m 46s)
  • 12:10 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:473225 Enable Schema.org page split test at 50% sampling (T208755) (duration: 00m 46s)
  • 11:33 gtirloni: upgraded packages on labsdb1011 (pre-work for T209517)
  • 11:20 elukey: restart memcached on mc1020 to apply -R 200 settings (shard wiped) - T208844
  • 10:41 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1078 T209754 (duration: 00m 46s)
  • 10:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 T209754 (duration: 00m 46s)
  • 10:21 banyek: stopping replication on db2076 (T85757)
  • 10:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 T209754 (duration: 00m 46s)
  • 09:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1123 and increase weight for db1078 T209754 (duration: 00m 46s)
  • 09:48 marostegui: Rename table foundationwiki.petition_data on db1078 - T208979
  • 09:46 marostegui: Drop empty testwiki.petition_data from db1075 with replication - T208979
  • 09:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1123 and db1078 T209754 (duration: 00m 46s)
  • 09:10 Nikerabbit: Rebuilt message group stats cache for T208521
  • 08:43 banyek: executing schema change on db2095 (T85757)
  • 07:57 marostegui: Stop MySQL on db1123 - T209754
  • 07:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 to clone db1078 T209754 (duration: 00m 47s)
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Add db1078 line back to config file but depooled T209754 (duration: 00m 51s)
  • 06:20 marostegui@deploy1001: sync-file aborted: Add db1078 line back to config file but depooled T209754 (duration: 00m 02s)

2018-11-18

  • 17:26 andrewbogott: restarting cp1078 from mgmt console
  • 09:00 elukey: cleaned up analytics1039 and restarted Yarn

2018-11-17

  • off: 'reset modified attributes' on IcingaUI for db1078 (and mgmt) and all its services
  • 06:38 oblivian@deploy1001: Synchronized wmf-config/db-eqiad.php: Depooling db1078 (duration: 00m 59s)
  • 02:54 RoanKattouw: Deployed patches for T208112, T208109, T208110

2018-11-16

  • 23:13 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@ee91c41]: Deploy on test wdqs1010 (duration: 00m 23s)
  • 23:12 smalyshev@deploy1001: Started deploy [wdqs/wdqs@ee91c41]: Deploy on test wdqs1010
  • 16:23 Trey314159: reindexing Chinese wikis on elastic@eqiad and elastic@codfw (T209156)
  • 15:46 moritzm: rebooting debmonitor* instances for kernel security update and to pick up SSBD
  • 15:01 hashar: restarting zuul with 1300 events to process
  • 14:56 marostegui: Create ipblocks_restrictions on labswiki and labtestwiki on db1073 - T209674
  • 14:56 ema: upgrade cp-ats to 8.0.0-1wm2 T204225 T204209
  • 14:39 ema: trafficserver 8.0.0-1wm2 uploaded to stretch-wikimedia T204225 T204209
  • 14:36 hashar: Gracefully stopping zuul (kill -SIGUSR1)
  • 14:29 _joe_: re-depooling mw1261 for php-fpm testing
  • 14:28 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw126[1-6].*,dc=eqiad,cluster=appserver
  • 14:26 _joe_: repooling the mw canaries
  • 13:02 moritzm: installing spamassassin security update on mendelevium
  • 12:10 godog: reboot restbase1014, nothing on console
  • 11:13 kartik@deploy1001: Finished deploy [cxserver/deploy@473b0de]: Update cxserver to b7cdb26 (T208831, T203077, T203160, T206777) (duration: 04m 26s)
  • 11:09 kartik@deploy1001: Started deploy [cxserver/deploy@473b0de]: Update cxserver to b7cdb26 (T208831, T203077, T203160, T206777)
  • 10:39 akosiaris: upgrade OTRS to 5.0.32 T209691
  • 09:31 marostegui: Set back sync_binlog=1 and trx_commit=1 after dbstore2002:3313 has caught up
  • 09:25 moritzm: installing postgres updates on labsdb1006
  • 09:21 moritzm: removed labvirt1016 from debmonitor db, got renamed to cloudvirt1016
  • 08:40 moritzm: installing curl security updates on jessie
  • 07:32 elukey: forced reboot + fsck + removal of /var/lib/hadoop/data/l from fstab on analytics1029
  • 06:36 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Pool pc2009 in pc3 - T208383 (duration: 00m 56s)
  • 06:28 marostegui: Set sync_binlog=0 and trx_commit=2 on dbstore2002:3313 to let it catch up
  • 05:47 vgutierrez: uploaded certcentral 0.7 to apt.wikimedia.org (stretch) - T208967 T209475
  • 00:55 mutante: some users reported missing files in home dirs on mwmaint1002; reversed rsyncd/ferm setup and rsynced /home from mwmaint2001 to /root on mwmaint1002; restored individually where requested. The rsync is not fully automatic but is puppetized with rsync::quickdatacopy
  • 00:35 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/includes/PageViews.php: SWAT: Exclude users where getRegistration() returns null (duration: 00m 47s)
  • 00:26 eileen: civicrm revision changed from 71755d021b to 6b279509f8, config revision is 684ec9b7c0 (lybunt report)
  • 00:22 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/GrowthExperiments: SWAT: gerrit:473843 gerrit:473844 gerrit:473845 (duration: 00m 49s)
  • 00:21 sbisson@deploy1001: sync aborted: php-1.33.0-wmf.4/extensions/GrowthExperiments SWAT: gerrit:473843 gerrit:473844 gerrit:473845 (duration: 06m 16s)
  • 00:15 sbisson@deploy1001: Started scap: php-1.33.0-wmf.4/extensions/GrowthExperiments SWAT: gerrit:473843 gerrit:473844 gerrit:473845

2018-11-15

  • 21:58 mutante: mwmaint1002 - restoring entire /home of mwmaint1001 from Bacula (job queued and to tmp dir, not directly into /home)
  • 21:06 hashar: Deleting Nodepool instances on contintcloud T209361
  • 21:05 hashar: Stopped nodepool on labnodepool1001.eqiad.wmnet. Service is no longer used. T209361 T209642
  • 20:22 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@UNKNOWN]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471) (duration: 03m 55s)
  • 20:18 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@UNKNOWN]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471)
  • 20:17 mutante: re-added Chase to pwstore, signed .users file, re-encrypted all pwstore files, git pushed
  • 20:15 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@48a1e83]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471) (duration: 04m 26s)
  • 20:11 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@48a1e83]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471)
  • 20:04 urandom: dropping disused keyspaces -- T208616
  • 20:04 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable data collection for UnderstandingFirstDay on cswiki and kowiki (duration: 00m 53s)
  • 19:51 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure sensitive namespaces for EditorJourney schema (T207307) (duration: 00m 53s)
  • 19:49 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/: cherry-picks for T208773 (duration: 00m 54s)
  • 18:55 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Hot-deploy I26f2dc2e: Don't over-ride default Wikibase string limits (duration: 00m 53s)
  • 18:50 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: Hot-deploy Ib10de2e3: Don't set Wikibase string limits when null (duration: 00m 55s)
  • 18:31 ladsgroup@deploy1001: Finished deploy [ores/deploy@dba11e9]: Another small update (duration: 13m 42s)
  • 18:18 ladsgroup@deploy1001: Started deploy [ores/deploy@dba11e9]: Another small update
  • 18:17 ladsgroup@deploy1001: Finished deploy [ores/deploy@51cdf6b]: T208623 (duration: 14m 41s)
  • 18:03 ladsgroup@deploy1001: Started deploy [ores/deploy@51cdf6b]: T208623
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 7 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 6 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 5 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 3 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 2 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 1 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:20 gehel: upgrade prometheus-blazegraph-exporter on all wdqs nodes - T206123
  • 16:41 bstorm_: rebooted labsdb1007 for upgrades
  • 16:37 andrewbogott: rebuilding labvirt1015 and cloudvirt1015
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on wikitech for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 8 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 7 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 5 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 4 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 3 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:20 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 2 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:07 gehel: plugin and JVM upgrade on elasticsearch / cirrus / eqiad completed - T209293
  • 13:59 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.4
  • 13:55 XioNoX: push firewall policies to pfw3-eqiad - T209421
  • 13:50 banyek: Deploy schema change on s6 codfw master, this will generate lag on s6 codfw - T85757
  • 13:48 moritzm: installing qemu security updates (which also backport support for SSBD passthrough) on ganeti clusters
  • 13:48 moritzm: installing qemu security updates (which also backport support for SSBD passthrough)
  • 13:12 mobrovac@deploy1001: Finished deploy [restbase/deploy@22cb0ec]: Add new wikis to RESTBase - T206777 T205710 T205546 T204477 (duration: 19m 56s)
  • 13:00 Lucas_WMDE: EU SWAT finished
  • 12:59 tarrow@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js: gerrit:473710 Fix (accidentally?) reversed blue and yellow lines SWAT T208238 T162119 again (duration: 00m 55s)
  • 12:57 _joe_: upping pm.maxworkers to 40 on mw1261 on php7.2-fpm, benchmarking T206341
  • 12:56 tarrow@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 12:53 moritzm: installing nginx security updates
  • 12:52 mobrovac@deploy1001: Started deploy [restbase/deploy@22cb0ec]: Add new wikis to RESTBase - T206777 T205710 T205546 T204477
  • 12:47 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove wgWBQualityConstraintsCacheCheckConstraintsResults (T207854) (duration: 00m 54s)
  • 12:42 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Prod: increase Schema.org page split test to 25% sampling T208755 (duration: 00m 53s)
  • 12:40 arturo: T207377 downtime and reboot labmon1001
  • 12:38 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js: Fix (accidentally?) reversed blue and yellow lines (T162119, T208238) (duration: 00m 54s)
  • 12:37 lucaswerkmeister-wmde@deploy1001: sync aborted: php-1.33.0-wmf.3/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js Fix (accidentally?) reversed blue and yellow lines (T162119, T208238) (duration: 00m 11s)
  • 12:37 lucaswerkmeister-wmde@deploy1001: Started scap: php-1.33.0-wmf.3/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js Fix (accidentally?) reversed blue and yellow lines (T162119, T208238)
  • 12:30 tarrow@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:473716 Read WikibaseStringLimit in Wikibase.php T154660 (duration: 00m 53s)
  • 12:30 moritzm: draining ganeti1001 for reboot/kernel security update
  • 12:26 tarrow@deploy1001: sync-file aborted: gerrit:473716 Read WikibaseStringLimit in Wikibase.php (duration: 00m 01s)
  • 12:21 tarrow@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:473694 Set Wikibase string-limits for wikidata dblist T154660 (duration: 00m 54s)
  • 12:08 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Make AdvancedSearch the default on de-, fa-, ar-, and hu-wiki T207640 (duration: 00m 55s)
  • 11:34 _joe_: depooling mw1261 for early benchmarks of php7.2-fpm
  • 11:04 akosiaris: enable puppet across the fleet. puppetdb1001 reboot done, ganeti migration_downtime setting applied
  • 10:49 akosiaris: disable puppet across the fleet for puppetdb1001 reboot
  • 10:49 moritzm: fail over ganeti master in eqiad to ganeti1003
  • 10:34 akosiaris: set migration_downtime=2000 for puppetdb1001. Should help with migration stalls
  • 10:15 banyek: sanitizing db1124 ( T205714 T207584 T205713 T206916 )
  • 10:07 banyek: sanitizing db2094 ( T205714 T207584 T205713 T206916 )
  • 09:57 moritzm: draining ganeti1002 for reboot/kernel security update
  • 09:40 volans: restarting icinga on icinga1001
  • 09:32 vgutierrez: restarting icinga on icinga1001
  • 09:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 52s)
  • 09:22 moritzm: draining ganeti1003 for reboot/kernel security update
  • 09:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 53s)
  • 09:16 marostegui: Stop MySQL on db1110 for upgrade
  • 09:08 moritzm: reset failed debmonitor session in ms-be2038
  • 09:08 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1088 (duration: 00m 53s)
  • 09:03 banyek: repooling db1088 (T85757)
  • 08:58 ema: upload fifo-log-demux 0.1 to stretch-wikimedia T204225
  • 08:56 banyek: Deploy schema change on db1088 (T85757)
  • 08:49 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1088 (duration: 00m 53s)
  • 08:45 banyek: depooling db1088 due to a schema change (T85757)
  • 08:21 moritzm: draining ganeti1004 for reboot/kernel security update
  • 07:42 marostegui: Drop site_stats.ss_total_views from labswiki - T86339
  • 07:08 elukey: memcached on mc1019 restarted to apply -R 200 - T208844
  • 06:57 marostegui: Stop MySQL on db2071 to upgrade MySQL and kernel
  • 06:57 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 00m 54s)
  • 06:06 marostegui: Stop MySQL on pc2006 to clone pc2009 - T208383
  • 06:06 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2006 - T208383 (duration: 00m 53s)
  • 06:05 marostegui@deploy1001: sync-file aborted: Dool pc2006 - T208383 (duration: 00m 00s)
  • 05:59 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Pool pc2008 in pc2 - T208383 (duration: 00m 56s)
  • 00:50 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: increase Schema.org page split test to 5% sampling T208755 (duration: 00m 54s)

2018-11-14

  • 22:46 ejegg: updated payments-wiki from 5751286f1c to d2b66c5bab
  • 21:34 thcipriani: restart gerrit to load JavaMelody dependency library
  • 21:28 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on cobalt (duration: 00m 09s)
  • 21:28 thcipriani@deploy1001: Started deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on cobalt
  • 21:27 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on gerrit2001 (duration: 00m 11s)
  • 21:26 thcipriani@deploy1001: Started deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on gerrit2001
  • 20:13 hashar@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.4 (duration: 00m 52s)
  • 20:12 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.4
  • 19:14 hashar@deploy1001: Synchronized php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php: Actually return the value from getRootJobCacheKey() - T209429 (duration: 00m 53s)
  • 18:33 hashar@deploy1001: Finished scap: php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php Add trace to debug memcached bad key error - T209429 (duration: 34m 07s)
  • 17:58 hashar@deploy1001: Started scap: php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php Add trace to debug memcached bad key error - T209429
  • 17:53 arturo: T207377 downtime and reboot cloudnet1003 (cloudnet1004 is the active one already)
  • 17:37 arturo: T207377 downtime and reboot cloudnet1004 (cloudnet1003 is the active one already)
  • 17:31 bawolff_: Running importImage.php for 'Opening ceremony of First accusation protest against presumption of guilt of judicial branch.webm' per request T209495
  • 17:26 sbisson@deploy1001: Synchronized dblists/wikidataclient.dblist: SWAT: Add incubatorwiki to wikidataclient.dblist (duration: 00m 48s)
  • 17:19 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/MobileFrontend/resources/mobile.editor.common/schemaEditAttemptStep.js: SWAT: schemaEditAttemptStep.js: Use correct config var name for sampling rate (duration: 00m 54s)
  • 17:12 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: SWAT: Fix EditAttemptStepSamplingRate variable export (duration: 00m 54s)
  • 16:33 bblack: [Done] replacement of GlobalSign unified TLS cert at US edges complete - T206804
  • 16:25 moritzm: rebooting ganeti1005 for kernel security update
  • 16:16 moritzm: rebooting restbase-dev1006 for kernel security update and OpenJDK security update
  • 16:10 bblack: disabling puppet as precaution on all caches (cumin A:cp) - T206804
  • 16:09 bblack: starting replacement of GlobalSign unified TLS cert at US edges (affects all public TLS termination for US traffic edges) - T206804
  • 16:08 moritzm: rebooting restbase-dev1005 for kernel security update and OpenJDK security update
  • 15:58 moritzm: rebooting restbase-dev1004 for kernel security update and OpenJDK security update
  • 15:53 jiji: Restarting pdfrender on scb*.eqiad.wmnet
  • 15:50 godog: roll restart swift-proxy in eqiad to apply statsd changes
  • 15:41 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T85757: repool db2046 (duration: 00m 52s)
  • 15:39 banyek: repooling db2046 (T85757)
  • 15:12 godog: roll-restart swift on ms-be1* to pick up statsd changes
  • 15:07 Amir1: ladsgroup@mwmaint1002:/srv/mediawiki-staging/php-1.33.0-wmf.4$ mwscript sql.php --wiki=incubatorwiki extensions/Wikibase/client/sql/entity_usage.sql (T209207)
  • 14:56 addshore@deploy1001: Synchronized wmf-config: Prod: Enable Schema.org page split test at 1% sampling (again) (duration: 00m 54s)
  • 14:29 godog: roll-restart swift-proxy in codfw to pick up statsd changes
  • 14:14 addshore@deploy1001: Synchronized wmf-config: Revert Prod: Enable Schema.org page split test at 1% sampling (duration: 00m 54s)
  • 14:10 Reedy: Wiki created T205714 T207584 T205713 T206916
  • 14:07 gehel: starting plugin and JVM upgrade on elasticsearch / cirrus / eqiad - T209293
  • 14:07 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 02m 25s)
  • 14:02 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix shnwiki TZ (duration: 00m 54s)
  • 13:57 reedy@deploy1001: rebuilt and synchronized wikiversions files: (no justification provided)
  • 13:55 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Add pc2010 as spare - T208383 (duration: 00m 53s)
  • 13:54 reedy@deploy1001: Synchronized multiversion/MWMultiVersion.php: (no justification provided) (duration: 00m 53s)
  • 13:51 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: new wikis (duration: 00m 53s)
  • 13:51 gehel: restarting tilerator on maps1004 for config change
  • 13:50 reedy@deploy1001: Synchronized static/images/project-logos/: (no justification provided) (duration: 00m 53s)
  • 13:49 reedy@deploy1001: Synchronized dblists/: new wikis! (duration: 00m 53s)
  • 13:48 reedy@deploy1001: Synchronized langlist: shn (duration: 00m 52s)
  • 13:45 gehel: plugin and JVM upgrade completed on elasticsearch / cirrus / codfw - T209293
  • 13:45 reedy@deploy1001: rebuilt and synchronized wikiversions files: (no justification provided)
  • 13:07 moritzm: installing ghostscript security updates on stretch
  • 13:01 reedy@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Fixing addshores code... (duration: 00m 53s)
  • 13:00 reedy@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMaintenance/addWiki.php: Fixing addshores code... (duration: 00m 55s)
  • 12:56 moritzm: installing gettext "security" updates for trusty
  • 12:48 moritzm: installing python3.4 security updates on trusty (Debian already fixed)
  • 12:40 reedy@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding wiktionary (duration: 00m 52s)
  • 12:39 reedy@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding wiktionary (duration: 00m 53s)
  • 12:39 moritzm: installing python security updates on trusty
  • 12:36 Amir1: EU SWAT is done
  • 12:35 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert the language of votewiki to English (en) (T207560) (duration: 00m 55s)
  • 12:30 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Start reading from change_tag_def on wikidatawiki (T208846) (duration: 00m 55s)
  • 12:19 pmiazga@deploy1001: Synchronized wmf-config: SWAT: [[gerrit:473079|Enable Schema.org page split test at 1% sampling (T208755)]] (duration: 00m 54s)
  • 12:02 reedy@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding TMH tables (duration: 00m 53s)
  • 12:01 reedy@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding TMH tables (duration: 00m 55s)
  • 11:11 moritzm: draining ganeti1005 for reboot/kernel security update
  • 11:09 banyek: Deploy schema change on db2046 (T85757)
  • 10:07 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T85757: depooling db2046 (duration: 00m 55s)
  • 09:59 banyek: depooling db2046 (T85757)
  • 09:22 moritzm: updated stretch netinst image for 9.6 point release
  • 09:17 marostegui: Deploy schema change on db2053 - T86339
  • 08:24 marostegui: Deploy schema change on s5 codfw master, this will generate lag on s5 codfw - T205913
  • 08:22 marostegui: Deploy schema change on s7 codfw master, this will generate lag on s7 codfw - T205913
  • 08:19 marostegui: Deploy schema change on s2 codfw master, this will generate lag on s2 codfw - T205913
  • 08:17 marostegui: Deploy schema change on s6 codfw master, this will generate lag on s6 codfw - T205913
  • 08:14 marostegui: Deploy schema change on s4 codfw master, this will generate lag on s4 codfw - T205913
  • 08:08 marostegui: Deploy schema change on s3 codfw master, this will generate lag on s3 codfw - T205913
  • 08:07 godog: rollout rsyslog_exporter to eqiad
  • 07:42 marostegui: Deploy schema change on s3 codfw master, this will generate lag on s3 codfw - T203709
  • 07:19 marostegui: Deploy schema change on s7 codfw master, this will generate lag on s7 codfw - T203709
  • 07:07 marostegui: Deploy schema change on s2 codfw master, this will generate lag on s2 codfw - T203709
  • 06:52 marostegui: Deploy schema change on s4 codfw master, this will generate lag on s4 codfw - T203709
  • 06:40 marostegui: Deploy schema change on s6 codfw master, this will generate lag on s6 codfw - T203709
  • 06:32 marostegui: Stop MySQL on pc2005 to clone it to pc2008 - T208383
  • 06:27 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2005 - T208383 (duration: 01m 04s)
  • 05:46 _joe_: restarting gerrit
  • 01:02 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/Wikibase/client/includes: SWAT: Update: use wikibase-debug logger instead of "PageRandomLookup" T208796 (duration: 00m 56s)
  • 00:42 mutante: restarted smokeping on netmon1002 and netmon2001

2018-11-13

  • 22:42 XioNoX: restart librenms irc bot
  • 22:24 XioNoX: add term labnet-nova-api to cloud-in4 on cr1/2-eqiad - T209424
  • 20:22 herron: updated labs realm smarthosts (via hiera) to mx-out0[12].wmflabs.org T41785
  • 19:49 otto@deploy1001: Finished deploy [analytics/refinery@62d6f4b]: Deploy hive jars from CDH 5.10.0 to workaround Refine bug: T209407 (duration: 05m 57s)
  • 19:43 otto@deploy1001: Started deploy [analytics/refinery@62d6f4b]: Deploy hive jars from CDH 5.10.0 to workaround Refine bug: T209407
  • 19:31 herron: uploaded librdkafka_0.11.6-1~bpo9+1+wikimedia1 packages to stretch-wikimedia T209300
  • 18:11 mutante: the CUSTOM message from ores.svc.codfw was the (one-time) test of the new Icinga server
  • 18:03 mutante: icinga migration has concluded, we are now on stretch and icinga1001, einsteinium is passive (T202782)
  • 17:27 mutante: re-enabled puppet on icinga1001, einsteinium becoming passive
  • 17:21 mutante: ran puppet on einsteinium; re-enabling puppet on tegmen and icinga1001
  • 17:13 bstorm_: Added 172.16.0.0/21 to the allowed connections for wikilabels postgresql on labsdb1004
  • 17:04 mutante: disabled puppet on all 3 icinga servers, re-enabling on einsteinium, going through https://wikitech.wikimedia.org/wiki/Icinga#Failover_Icinga_between_the_active_and_passive_servers
  • 17:02 ejegg: updated payments-wiki from 20542c9184 to 5751286f1c
  • 17:01 mutante: starting migration of icinga server - maintenance windows
  • 16:33 thcipriani: restarting gerrit service for upgrade to 2.15.6
  • 16:32 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@d2763c6]: v2.15.6 to cobalt (duration: 00m 10s)
  • 16:32 thcipriani@deploy1001: Started deploy [gerrit/gerrit@d2763c6]: v2.15.6 to cobalt
  • 16:29 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@d2763c6]: v2.15.6 to gerrit2001 (duration: 00m 11s)
  • 16:29 thcipriani@deploy1001: Started deploy [gerrit/gerrit@d2763c6]: v2.15.6 to gerrit2001
  • 16:22 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting actor migration to write-both/read-old on test wikis and mediawikiwiki (T188327) (duration: 00m 54s)
  • 16:07 anomie@mwmaint1002: Running refreshExternallinksIndex.php on labtestwiki for T209373
  • 16:07 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 3 wikis in group 0 for T209373
  • 15:48 _joe_: upgrading extensions on all appservers / jobrunners while upgrading to php 7.2
  • 15:45 gehel: restart tilerator on maps1004
  • 15:21 moritzm: draining ganeti1006 for reboot/kernel security update
  • 15:18 marostegui: Restore replication consistency options on dbstore2002:3313 as it has caught up - T208320
  • 14:59 akosiaris: increase the migration downtime for kafkamon1001. It should make live migration of these VMs easier and without the need for manual fiddling
  • 14:54 hashar@deploy1001: rebuilt and synchronized wikiversions files: group to 1.33.0-wmf.4 | T206658
  • 14:40 hashar@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.4 | T206658 (duration: 19m 34s)
  • 14:27 moritzm: draining ganeti1007 for reboot/kernel security update
  • 14:20 hashar@deploy1001: Started scap: testwiki to php-1.33.0-wmf.4 | T206658
  • 14:20 akosiaris: reboot logstash1007, logstash1008, logstash1009 with 500 secs of sleep between them for the migration_downtime ganeti setting to be applied
  • 14:18 akosiaris: increase the migration downtime for logstash1007, logstash1008, logstash1009. It should make live migration of these VMs easier and without the need for manual fiddling
  • 14:15 hashar@deploy1001: Pruned MediaWiki: 1.32.0-wmf.24 (duration: 08m 55s)
  • 14:03 hashar: Applied security patches to 1.33.0-wmf.4 | T206658
  • 14:03 gehel: start plugin and JVM upgrade on elasticsearch / cirrus / codfw - T209293
  • 14:00 hashar: scap prep 1.33.0-wmf.4 # T206658
  • 13:58 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Pool pc2007 to replace pc2004 (duration: 00m 48s)
  • 13:41 marostegui: Deploy schema change on s8 codfw master (db2045) this will generate lag on s8 codfw - T203709
  • 13:40 hashar: Cutting wmf/1.33.0-wmf.4 branch | T206658
  • 13:30 moritzm: draining ganeti1008 for reboot/kernel security update
  • 12:51 phuedx: European Mid-day SWAT finished
  • 12:50 phuedx@deploy1001: Finished scap: SWAT: Define WikimediaMessages for Wikibase SEO change l18n refresh (duration: 21m 43s)
  • 12:28 phuedx@deploy1001: Started scap: SWAT: Define WikimediaMessages for Wikibase SEO change l18n refresh
  • 12:22 phuedx@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMessages/: SWAT: Define WikimediaMessages for Wikibase SEO change (T208755) (duration: 00m 56s)
  • 10:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1092 (duration: 00m 52s)
  • 10:47 marostegui: Deploy schema change on db1116:3318 T203709
  • 10:40 godog: stop sending metrics to old graphite hardware
  • 10:15 gehel: restart elasticsearch on relforge for plugin upgrade - T209293
  • 09:54 moritzm: restarting jenkins on releases1001 to pick up Java security update
  • 09:25 _joe_: uploading new versions of php-msgpack, php-geoip compatible with both php 7.0 and php 7.2 to thirdparty/php72 T208433
  • 09:23 marostegui: Deploy schema change on db1092 T203709
  • 09:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092 (duration: 00m 52s)
  • 09:20 elukey: rollout new prometheus-mcrouter-exporter to mw* - previous rollout didn't work as expected
  • 09:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 (duration: 00m 55s)
  • 08:37 moritzm: updating remaining rsyslog on stretch to 8.38.0-1~bpo9+1wmf1
  • 07:21 marostegui: Deploy schema change on db1104 T203709
  • 07:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 (duration: 00m 53s)
  • 07:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 (duration: 00m 54s)
  • 07:05 elukey: powercycle lvs2006 - mgmt/serial console blank, unresponsive for hours
  • 06:02 marostegui: Add ipb_sitewide column to db1073:labtestwiki
  • 05:43 marostegui: Stop MySQL on pc2004 to transfer its data to pc2007 - T208383
  • 05:42 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2004 - T208383 (duration: 00m 53s)
  • 05:39 marostegui: Deploy schema change on db2048 (s1 codfw master), this will create lag on s1 codfw - T114117
  • 05:34 marostegui: Deploy schema change on db1109 T203709
  • 05:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 55s)

2018-11-12

  • 19:22 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T208663 4ff32d1df - Enable moving files for users with patrol and rollbacker rights on srwiki (duration: 00m 54s)
  • 18:29 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@ee91c41]: GUI update, New Thesaurus endpoint, New updater build and blazegraph update (duration: 11m 28s)
  • 18:17 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@ee91c41]: GUI update, New Thesaurus endpoint, New updater build and blazegraph update
  • 18:03 elukey: rolling restart of aqs on aqs* to pick up new druid datasource settings
  • 17:44 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Optimize s2 for throughput (duration: 00m 53s)
  • 17:19 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool more resources into s2 api (duration: 00m 54s)
  • 17:15 _joe_: restarting HHVM on the high-cpu api hosts in eqiad, to ease the pressure and latencies
  • 17:10 _joe_: depooling mw1222 for debug
  • 16:41 banyek: disabling puppet on parsercache hosts (T208383)
  • 16:14 elukey: upgrade prometheus-mcrouter-exporter on all the mw* hosts to the new version
  • 16:09 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for large wikis
  • 16:02 volans: restarted proton on proton1002
  • 15:45 jynus: stop and upgrade db2094
  • 15:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 (duration: 00m 53s)
  • 14:49 banyek: disabling puppet on parsercache hosts - pc[12]00[456] (T208383)
  • 14:17 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for medium wikis
  • 14:08 moritzm: updating rsyslog on stretch to 8.38.0-1~bpo9+1wmf1
  • 13:59 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for small wikis (small.dblist)
  • 13:59 marostegui: Deploy schema change on db1101:3318 - T203709
  • 13:55 hashar: Upgrading Jenkins on contint1001, contint2001, releases1001 and releases2002 | T209264
  • 13:46 moritzm: updating libfastjson on stretch to 0.99.8-1~bpo9+1wmf1
  • 13:41 gehel: starting rolling restart of elasticsearch codfw for JVM upgrade
  • 13:32 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for mediawikiwiki
  • 13:23 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for testwiki
  • 13:17 zeljkof: EU SWAT finished
  • 13:16 phuedx@deploy1001: Synchronized php-1.33.0-wmf.3/maintenance/resetPageRandom.php: SWAT: Provide a script to reset the page_random column (T208909) (duration: 00m 53s)
  • 13:16 moritzm: updating liblognorm on stretch to 2.0.3-1~bpo9+1wmf1
  • 13:14 phuedx@deploy1001: Synchronized php-1.33.0-wmf.3/autoload.php: SWAT: Provide a script to reset the page_random column (T208909) (duration: 00m 55s)
  • 13:12 elukey: upgrade the Hadoop Analytics cluster to CDH 5.15 (downtime required)
  • 12:54 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule for Wikipedia event in Ireland on 2018-11-13 (T209037) (duration: 00m 53s)
  • 12:15 jiji: Restarting nutcracker on scb200[1-6] - T206450
  • 12:00 moritzm: uploaded jenkins 2.138.3 to apt.wikimedia.org (jessie and stretch)
  • 11:49 hashar: updating puppet CI job for mtail upgrade https://gerrit.wikimedia.org/r/#/c/integration/config/+/472962/
  • 11:37 hashar: contint1001 : cleaning disk | T209123 ?
  • 11:26 moritzm: installing Java security updates on elastic*
  • 10:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 (duration: 00m 55s)
  • 10:48 godog: upload mtail 3.0.0~rc5-1~bpo9+1wmf1 to stretch-wikimedia
  • 10:45 marostegui: Deploy schema change on db2048 (s1 codfw master), this will generate lag on s1 codfw - T51191
  • 10:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 (duration: 00m 53s)
  • 10:32 elukey: upload mcrouter exporter 0.0.0+git20181106 to stretch-wikimedia
  • 09:57 elukey: upgraded cdh packages (cdh 5.10 -> 5.15) for thirdparty/cloudera in jessie/stretch-wikimedia
  • 09:12 marostegui: Deploy schema change on db2048 (s1 codfw master) (replication will be stopped) - T67448
  • 08:53 marostegui: Deploy schema change on db1099:3318 - T203709
  • 08:52 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 (duration: 01m 01s)
  • 08:41 marostegui: Change sync_binlog to 0 and trx_commit to 2 on dbstore2002:3313 to let it catch up
  • 08:26 _joe_: uploading new php-{luasandbox,wikidiff2} to stretch main component, rebuild php-{luasandbox,wikidiff2,geoip,msgpack} for php 7.2, upload to stretch component php72, T208433
  • 08:23 godog: temporarily disable puppet in codfw before enabling rsyslog_exporter

2018-11-10

  • 01:10 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting (duration: 00m 39s)
  • 01:10 smalyshev@deploy1001: Started deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting
  • 01:07 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting (duration: 00m 04s)
  • 01:07 smalyshev@deploy1001: Started deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting

2018-11-09

  • 21:46 SMalyshev: repooled wdqs1004 - looks like other servers feel worse so probably makes sense to share the load equally
  • 21:16 jiji: Reimaging rdb2003, rdb2004 - T206450
  • 20:46 SMalyshev: depooled wdqs1004 to let it catch up
  • 20:40 legoktm@deploy1001: Synchronized docroot/mediawiki/keys/: Add my (20after4) PGP key to mediawiki.org/keys/keys.(txt|html) (duration: 00m 55s)
  • 20:08 andrewbogott: restarted neutron-linuxbridge-agent on cloudvirt1018 and cloudvirt1023
  • away: repooling labsdb1011 (T189158)
  • 15:24 banyek: depooling labsdb1011 (T189158)
  • 15:23 banyek: depooling labsdb1011
  • 15:08 banyek: repooling labsdb1009 (T189158)
  • 15:06 bblack: cp1008/pinkunicorn: puppet disabled, public-facing testing of new globalsign 2018 certs
  • 15:04 ladsgroup@deploy1001: Finished deploy [ores/deploy@bb39f4b]: T191842 T209060, try II (duration: 14m 43s)
  • 14:50 andrewbogott: rebooting cloudvirt1024 to (I hope) cause a page
  • 14:49 ladsgroup@deploy1001: Started deploy [ores/deploy@bb39f4b]: T191842 T209060, try II
  • 14:48 ladsgroup@deploy1001: deploy aborted: T191842 T209060 (duration: 09m 32s)
  • 14:39 ladsgroup@deploy1001: Started deploy [ores/deploy@0728805]: T191842 T209060
  • 14:18 addshore@deploy1001: Synchronized wmf-config: BETA ONLY: Enable SSR termbox for wikibase on beta - T209143 (duration: 00m 56s)
  • 13:32 moritzm: rebooting acrab for some qemu tests
  • 13:21 godog: upload graphite-web_1.0.2+debian-2.1wmf1 to stretch-wikimedia - T208782
  • 13:10 moritzm: upgrading qemu on ganeti2001 (packages supporting SSBD passthrough)
  • 12:40 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T189158: repool db1106 (duration: 00m 53s)
  • 12:34 banyek: repooling db1106 (T208954)
  • 12:17 kartik@deploy1001: Finished deploy [cxserver/deploy@fc21164]: Update cxserver to 01686f6 (T208831) (duration: 01m 09s)
  • 12:16 kartik@deploy1001: Started deploy [cxserver/deploy@fc21164]: Update cxserver to 01686f6 (T208831)
  • 11:45 banyek: data load finished restarting replication on db1106 (T208954)
  • 11:43 akosiaris: set previous normal weight for scb1001 for apertium service T206439
  • 11:39 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 11:30 akosiaris: upgrade apertium apertium-cat apertium-fra apertium-fra-cat apertium-lex-tools apertium-separable cg3 libapertium3-3.5-1 libcg3-1 lttoolbox on all scb boxes and restart apertium-apy
  • 11:26 akosiaris: upgrade apertium apertium-cat apertium-fra apertium-fra-cat apertium-lex-tools apertium-separable cg3 libapertium3-3.5-1 libcg3-1 lttoolbox on scb1002
  • 11:22 jiji: switch scb*.eqiad.wmnet nutcracker rdb1003:6382 with rdb1005:6379
  • 10:51 vgutierrez: uploaded certcentral 0.6 to apt.wikimedia.org (stretch) - T208859 T208948 T208967 T208970
  • 09:48 ema: repool cp2018, cp2025 (cache_upload) T208588
  • 09:45 banyek: truncating enwiki.archive on db1124 and labsdb hosts too (T208954)
  • 09:21 banyek: stopping replication on db1106 (T208954)
  • 09:21 banyek: stopping replication on db1106 (T208672)
  • 09:08 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T189158: depool db1106 (duration: 00m 55s)
  • 09:02 banyek: depooling db1106 (T208954)
  • 08:28 moritzm: installing nginx security updates
  • 08:05 ema: repool cp2006, cp2012 (cache_text) T208588
  • 04:33 ejegg: restarted recurring donation charge jobs
  • 04:24 ejegg: updated fundraising CiviCRM from 1154cca3f2 to 71755d021b
  • 03:25 ejegg: updated fundraising CiviCRM from 02cc1f80d4 to 1154cca3f2
  • 00:07 ejegg: updated fundraising CiviCRM from 07183ed7cc to 02cc1f80d4
  • 00:03 ejegg: updated payments-wiki from 983ce3af0f to 20542c9184

2018-11-08

  • 22:48 mutante: gerrit - adding Thomas Arrow to 'wmf-deployment' group for +2 on mw-config for T208491 access request
  • 22:37 mutante: gerrit - adding Lucas Werkmeister (WMDE) to 'wmf-deployment' group for +2 on mw-config for T208518 access request
  • 20:28 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.3
  • 19:35 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Disable wmgUseTwoColConflict everywhere" T205942 T208840 T209012 T209036 (duration: 00m 54s)
  • 18:19 shdubsh: update statsd-proxy to 0.0.9-2 on graphite1004
  • 17:29 banyek: depooling labsdb1009 (T189158)
  • 17:24 banyek: repooling labsdb1010 (T189158)
  • 17:07 godog: upload libfastjson 0.99.8-1~bpo9+1wmf1 version bump only
  • 16:59 akosiaris@deploy1001: scap-helm zotero finished
  • 16:59 akosiaris@deploy1001: scap-helm zotero cluster staging completed
  • 16:59 akosiaris@deploy1001: scap-helm zotero [namespace: zotero, clusters: staging]
  • 16:50 akosiaris@deploy1001: scap-helm zotero install --name alextest --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:29 XioNoX: enable Zayo transit on cr3-ulsfo
  • 15:42 chasemp: disable /etc/logrotate.d/udp2log-mw for a bit on mwlog1001
  • 15:25 Amir1: rolling restart of celery on ores nodes (T209060)
  • 15:20 akosiaris: 'cd /srv/deployment/ores/deploy/submodules/wheels && sudo -u deploy-service git lfs pull' on all ores1* and ores2* hosts T209060
  • 15:07 XioNoX: zeroize asw-c8-codfw (decom)
  • 14:12 moritzm: rebooting releases2001 for some tests with ssbd for KVM
  • 13:52 moritzm: installing postgres updates on labsdb1006/1007
  • 13:38 jiji: Done reimaging rdb1006 - T206450
  • 13:37 moritzm: draining ganeti2001 for reboot/kernel security update
  • 13:36 moritzm: failing over ganeti master in codfw from ganeti2001 to ganeti2003
  • 13:13 godog: upload rsyslog 8.38.0-1~bpo9+1wmf1 to stretch-wikimedia, version bump only
  • 13:07 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-enable wmgUseTwoColConflict on dewiki - T205942 T208840 T209012 T209036 (duration: 00m 53s)
  • 12:56 akosiaris: increase weight of scb1001 for apertium to 99+%
  • 12:56 akosiaris@puppetmaster1001: conftool action : set/weight=3800; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 12:53 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-enable wmgUseTwoColConflict on group0 only T205942 T208840 T209012 T209036 (duration: 00m 54s)
  • 12:52 moritzm: draining ganeti2002 for reboot/kernel security update
  • 12:41 jiji: Shutdown and reimage rdb200[56] - T206450
  • 12:31 moritzm: draining ganeti2003 for reboot/kernel security update
  • 12:30 zeljkof: EU SWAT finished
  • 12:29 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/TwoColConflict: SWAT: Fix harmless edits turning into conflicts (T205942 T208840 T209012 T209036) (duration: 00m 55s)
  • 12:19 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set AdvancedSearch to default on group0 wikis (T207641) (duration: 00m 55s)
  • 12:18 moritzm: draining ganeti2004 for reboot/kernel security update
  • 11:57 moritzm: draining ganeti2005 for reboot/kernel security update
  • 11:51 akosiaris: increase weight of scb1001 for apertium to 50%
  • 11:50 akosiaris@puppetmaster1001: conftool action : set/weight=38; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 11:41 moritzm: draining ganeti2006 for reboot/kernel security update
  • 11:18 moritzm: draining ganeti2007 for reboot/kernel security update
  • 11:05 moritzm: draining ganeti2008 for reboot/kernel security update
  • 10:52 jiji: Reimaging rdb1006 to stretch - T206450
  • 10:52 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable wmgUseTwoColConflict everywhere T209012 T208840 T195724 (duration: 00m 58s)
  • 10:26 elukey: restart memcached on mc2029 (was depooled yesterday for network maintenance)
  • 10:23 jiji: restarting pdfrender on scb1003
  • 10:19 volans: restarting pdfrender on scb1004
  • 10:18 volans: restarting pdfrender on scb1002
  • 10:18 _joe_: restarting pdfrender on scb1001
  • 10:02 moritzm: installing ppp security updates on trusty
  • 09:37 godog: keep 2x not 3x copies of older (>15d) logstash elasticsearch indices
  • 09:29 moritzm: installing curl security updates
  • 09:29 godog: temporarily set elasticsearch logstash watermark to low:0.85 and high:0.9
  • 06:34 bawolff@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/OpenStackManager/special/SpecialNovaSudoer.php: T203885 (duration: 00m 54s)
  • 05:30 bawolff: deployed patch T208881
  • 01:18 mutante: scb1004 - systemctl restart pdfrender (T174916)
  • 00:43 jforrester@deploy1001: Synchronized php-1.33.0-wmf.3/includes/resourceloader/ResourceLoader.php: ResourceLoader: Fail less hard when JSON serialization of config fails I673f59d93 (duration: 00m 53s)
  • 00:33 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T205368 Enable BotPasswords on Governance wiki (duration: 00m 55s)
  • 00:32 ejegg: updated fundraising CiviCRM from 769dcf6456 to 07183ed7cc
  • 00:26 James_F: Created the bot_passwords table for Governance wiki T205368
  • 00:21 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208449 Disable wgWelcomeSurveyEnabled everywhere in production (duration: 00m 54s)
  • 00:18 jforrester@deploy1001: Synchronized wmf-config/extension-list: T208081 Drop the Petition extension from extension-list (duration: 00m 53s)
  • 00:16 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208081 Drop the Petition extension from InitialiseSettings (duration: 00m 52s)
  • 00:14 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T208081 Drop the Petition extension from CommonSettings (duration: 00m 53s)
  • 00:12 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208899 Enabling wgMediaInTargetLanguage for testwiki (duration: 00m 54s)
  • 00:00 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208081 Disable the Petition extension in production (duration: 00m 52s)

2018-11-07

  • 23:48 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: Make GrowthExperiments flag operative in CommonSettings (duration: 00m 53s)
  • 23:44 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add flag for GrowthExperiments to InitialiseSettings (duration: 00m 53s)
  • 23:37 catrope@deploy1001: Finished scap: Full scap to rebuild i18n for the addition of the GrowthExperiments extension (duration: 39m 40s)
  • 23:21 jiji: Disabled nagios checks on rdb1006 and rdb2005 due to rdb1005 reimaging - T206450
  • 22:57 catrope@deploy1001: Started scap: Full scap to rebuild i18n for the addition of the GrowthExperiments extension
  • 22:13 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Revert "labswiki rollback to 1.33.0-wmf.2"
  • 22:07 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/LdapAuthentication/LdapAuthenticationPlugin.php: Expose methods used by OpenStackManager T208995 (duration: 00m 54s)
  • 22:06 XenoRyet: updated payments-wiki from 34506ce636 to 983ce3af0f
  • 22:02 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: Allow Cloud VPS 172.16.0.0/16 for $wmgAllowLabsAnonEdits wikis T208986 (duration: 00m 54s)
  • 22:02 arlolra: Updated Parsoid to 970751a (T206940)
  • 21:54 arlolra@deploy1001: Finished deploy [parsoid/deploy@4edc771]: Updating Parsoid to 970751a (duration: 09m 34s)
  • 21:45 arlolra@deploy1001: Started deploy [parsoid/deploy@4edc771]: Updating Parsoid to 970751a
  • 21:21 ladsgroup@deploy1001: Finished deploy [ores/deploy@25dfa4f]: T191842 T197096 (duration: 17m 24s)
  • 21:18 krinkle@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/AbuseFilter/includes/AbuseFilter.php: T208144 - I0fdda5 (duration: 00m 53s)
  • 21:16 krinkle@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/VipsScaler: Id9f82afd (duration: 00m 55s)
  • 21:06 krinkle@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/AbuseFilter/includes/AbuseFilter.php: T208144 - I0fdda51010243 (duration: 00m 53s)
  • 21:06 banyek: stopping replication on db2072 (T208954)
  • 21:04 krinkle@deploy1001: Synchronized php-1.33.0-wmf.3/includes/jobqueue/jobs/RefreshLinksJob.php: T208147 -I7f5fafe9439d8a7b4 (duration: 00m 54s)
  • 21:03 ladsgroup@deploy1001: Started deploy [ores/deploy@25dfa4f]: T191842 T197096
  • 20:55 banyek: depool labsdb1010 (T189158)
  • 20:24 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: rollback labswiki to 1.33.0-wmf.2
  • 20:12 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.3 (duration: 00m 53s)
  • 20:11 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.3
  • 20:00 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T206173 Adding namespaces to Governance wiki (duration: 00m 55s)
  • 19:50 chasemp: labstore1007:~# mkdir /srv/security/
  • 19:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-sync for skipped apaches due to maintenance (whoops) (duration: 00m 55s)
  • 19:48 XioNoX: Revert "Redirect eqsin/ulsfo caches to eqiad" - T208272
  • 19:47 XioNoX: repool codfw - T208272
  • 19:45 XioNoX: asw-c-codfw maintenance finished successfully - T208272
  • 18:51 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T201285: Disable wgRawHTML on Governance wiki (duration: 05m 12s)
  • 18:31 onimisionipe: restarting relforge-eqiad and relforge-eqiad-small-alpha clusters on relforge100[1-2]
  • 18:21 XioNoX: power down asw-c4-codfw - T208272
  • 17:31 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sourceswiki to wikidata clients (duration: 00m 53s)
  • 17:25 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Inject wikidata rc records on wikidata itself (duration: 00m 53s)
  • 17:21 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WMEUnderstandingFirstDay on testwiki (duration: 00m 53s)
  • 16:37 XioNoX: remove asw-c-codfw FPC8 from config - T208272
  • 16:35 XioNoX: shutdown asw-c-codfw FPC8 - T208272
  • 16:33 jforrester@deploy1001: Synchronized php-1.33.0-wmf.3/resources/src/mediawiki.rcfilters/styles/mw.rcfilters.ui.less: T208898 Hot-deploy to wmf.3 (duration: 00m 53s)
  • 16:22 jforrester@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/Echo/modules/styles: Hot-deploy T208930 to wmf.3 (duration: 00m 54s)
  • 16:20 XioNoX: Enable all VC ports (except uplinks) on spines - T208272
  • 15:58 XioNoX: Redirect eqsin/ulsfo caches to eqiad - T208272
  • 15:57 XioNoX: depool codfw for row C maintenance - T208272
  • 15:36 moritzm: installing Java security updates on relforge*
  • 15:29 jiji@deploy1001: Synchronized wmf-config/ProductionServices.php: Remove jobqueue_redis references, T198220 (duration: 00m 54s)
  • 15:21 akosiaris: T206439 direct 30% of the apertium.svc.eqiad.wmnet traffic to scb1001. Will increase tomorrow to 50%
  • 15:20 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 15:16 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 15:15 akosiaris: T206439 pool upgraded scb1001 to apertium.svc.eqiad.wmnet as a form of canary
  • 15:13 moritzm: uploaded nginx 1.13.6-2+wmf2 to apt.wikimedia.org/stretch-wikimedia
  • 14:55 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 14:37 oblivian@deploy1001: Synchronized docroot/wwwportal/w/search-redirect.php: Fixing redirects if no language is specified (duration: 00m 54s)
  • 14:33 moritzm: uploaded nginx 1.13.6-2+wmf2~jessie1 to apt.wikimedia.org/jessie-wikimedia
  • 14:32 akosiaris: T206439 upload apertium-cat_2.6.0-1+wmf1 apertium-fra-cat_1.5.0-1+wmf1 apertium-fra_1.5.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 14:26 bblack: rebooting graphite1004
  • 14:16 akosiaris: T206439 upload apertium-separable_0.3.2-1+wmf1 apertium-lex-tools_0.2.1-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 14:07 akosiaris: T206439 upload apertium_3.5.2-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 13:43 bblack: hi
  • 13:21 Amir1: ladsgroup@mwmaint1002:/srv/mediawiki/php-1.33.0-wmf.1$ mwscript sql.php --wiki=sourceswiki extensions/Wikibase/client/sql/entity_usage.sql (T208858)
  • 12:38 zeljkof: EU SWAT finished
  • 12:36 zfilipin@deploy1001: Synchronized wmf-config: SWAT: BC: Enable Schema.org page split test (T208763) (duration: 00m 54s)
  • 12:35 akosiaris: T206439 upload hfst-ospell_0.5.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:27 akosiaris: T206439 upload cg3_1.1.7-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:16 zfilipin@deploy1001: Synchronized wmf-config/InterwikiSortOrders.php: SWAT: Add dty, gor, inh, kbp and lfn to InterwikiSortOrders (T208217) (duration: 00m 53s)
  • 12:12 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for Art&Feminism event in Chile (T208866) (duration: 00m 54s)
  • 12:12 akosiaris: T206439 upload hfst_3.15.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:08 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Remove expired throttle rules (duration: 01m 05s)
  • 11:49 akosiaris: T206439 upload lttoolbox_3.5.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:49 akosiaris: T206439 upload lttoolbox_3.5.0-1+wmf1
  • 10:06 hashar: CI: switched operations/puppet job to be based on Stretch ( T208422 ) and to add python3 ( T208873 )
  • 09:25 _joe_: run systemctl reset-failed on ms-be1029, had a failed debmonitor session
  • 08:16 kartik@deploy1001: Finished deploy [cxserver/deploy@6f97d25]: Update cxserver to f9ffd24 (duration: 04m 59s)
  • 08:11 kartik@deploy1001: Started deploy [cxserver/deploy@6f97d25]: Update cxserver to f9ffd24
  • 02:55 ejegg: disabled recurring charge jobs
  • 00:46 mutante: tegmen - shutting down for renaming and reinstall (T208824)
  • 00:11 dereckson@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 02m 44s)

2018-11-06

  • 23:55 mutante: cp1084 - network went down, powercycled, probably T203194
  • 22:49 ejegg: updated fundraising CiviCRM from e0742d2210 to 769dcf6456
  • 21:50 mutante: icinga1001-"MediaWiki EtcdConfig up-to-date" checks were all UNKNOWN because systemd unit update-etcd-mw-config-lastindex was present but service not running. it was turned off in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/427328/ on purpose. manually ran "systemctl start update-etcd-mw-config-lastindex" and the checks all work (T202782)
  • 21:49 mutante: icinga1001 - the "MediaWiki EtcdConfig up-to-date" checks were all unknown on the new icinga server, this was because systemd unit update-etcd-mw-config-lastindex was present but service not running. that was turned off in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/427328/ on purpose. manually ran "systemctl start update-etcd-mw-config-lastindex" to start it and the checks
  • 21:40 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Group0 wikis to 1.33.0-wmf.3
  • 20:19 thcipriani@deploy1001: Finished scap: testwiki to 1.33.0-wmf.3 and rebuild l10n cache (duration: 34m 06s)
  • 19:45 thcipriani@deploy1001: Started scap: testwiki to 1.33.0-wmf.3 and rebuild l10n cache
  • 19:44 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.23 (duration: 07m 19s)
  • 18:36 thcipriani: cutting branch for MediaWiki and extensions version 1.33.0-wmf.3
  • 17:37 XioNoX: add vlan-analytics1-a-eqiad interface-range on asw2-a-eqiad
  • 16:47 XioNoX: enable cr4-ulsfo zayo transport to cr1-codfw
  • 16:23 akosiaris@deploy1001: scap-helm zotero install --name alextest --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:23 akosiaris@deploy1001: scap-helm zotero install --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:12 mutante: einsteinium - temp disabling icinga notifications and puppet, reloading icinga (for extra caution while deploying global NRPE change)
  • 16:07 mutante: planet1001 - disabling puppet, editing NRPE config, testing allowed_hosts change
  • 16:07 akosiaris@deploy1001: scap-helm -h finished
  • 16:07 akosiaris@deploy1001: scap-helm -h cluster staging completed
  • 16:07 akosiaris@deploy1001: scap-helm -h [namespace: -h, clusters: staging]
  • 15:59 banyek: updating facts for the puppet compilers
  • 15:37 akosiaris: create zotero namespace in eqiad, codfw, staging cluster T201611
  • 15:20 godog: switch all graphite read traffic to graphite1004
  • 15:16 XioNoX: push `lldp port-id-subtype interface-name` to all compatible switches - T208630
  • 15:14 jiji: scb1001/scb1002 switched nutcracker redis from rdb1001:6382 to rdb1009:6379
  • 15:08 XioNoX: push `lldp port-id-subtype interface-name` to all routers - T208630
  • 14:40 jynus_: reducing consistency temp. on db2048 to avoid lagging
  • 14:30 moritzm: installing Ruby 2.1 security updates
  • 14:22 godog: add graphite1004 to graphite cluster for reads
  • 14:21 moritzm: installing clamav security updates on mendelevium/ticket.wikimedia.org
  • 13:25 moritzm: restart HHVM on canaries to pick up new curl
  • 13:10 XioNoX: zeroize asw-b-eqiad (decom) - T208788
  • 12:33 moritzm: installing curl security updates
  • 12:17 moritzm: installing chromium security updates on proton* (new upstream release tested in deployment-prep)
  • 12:12 zeljkof: EU SWAT finished
  • 12:10 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update several Wikidata-related configs (duration: 00m 55s)
  • 11:37 moritzm: installing Ruby 2.3 security updates
  • 11:37 moritzm: installing Ruby 2.3 security updates on trusty
  • 11:30 moritzm: installing Mono security updates
  • 11:23 moritzm: installing Ruby 1.9 security updates on trusty
  • 11:07 banyek: stopping replication on db2077 (T208672)
  • 07:25 _joe_: also restarting on the other eqiad nodes
  • 07:25 _joe_: restarting tilerator on maps1002
  • 04:47 kartik@deploy1001: Finished deploy [cxserver/deploy@ddb0031]: Update cxserver to 17f9a10 (T144467, T198699, T208386) (duration: 05m 26s)
  • 04:42 kartik@deploy1001: Started deploy [cxserver/deploy@ddb0031]: Update cxserver to 17f9a10 (T144467, T198699, T208386)
  • 03:40 eileen: civicrm revision changed from 99895316de to e0742d2210, config revision is e832b5a04a
  • 00:26 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/471875/ (duration: 00m 51s)
  • 00:07 maxsem@deploy1001: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/471874/ (duration: 00m 54s)
  • 00:03 eileen: civicrm revision changed from 042eeaeca9 to 99895316de, config revision is e832b5a04a

2018-11-05

  • 22:41 mutante: sodium - reboot after disk replacement (T202705)
  • 21:51 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@a92fce5]: Increase cirrusSearchLinksUpdate concurrency to 100 (duration: 01m 02s)
  • 21:50 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@a92fce5]: Increase cirrusSearchLinksUpdate concurrency to 100
  • 21:49 arlolra: Updated Parsoid to 8ed698b (T205334, T208360)
  • 21:42 mobrovac@deploy1001: Started restart [zotero/translation-server@50f216a]: Free up some memory
  • 21:41 arlolra@deploy1001: Finished deploy [parsoid/deploy@96d739b]: Updating Parsoid to 8ed698b (duration: 10m 59s)
  • 21:38 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: Removed decommissioned citoid url T133001 (duration: 00m 53s)
  • 21:30 arlolra@deploy1001: Started deploy [parsoid/deploy@96d739b]: Updating Parsoid to 8ed698b
  • 21:23 ppchelko@deploy1001: Finished deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324 take 2 (duration: 09m 18s)
  • 21:14 ppchelko@deploy1001: Started deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324 take 2
  • 21:10 ppchelko@deploy1001: Finished deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324 (duration: 12m 15s)
  • 21:09 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Ensure all wikis on 1.33.0-wmf.2
  • 20:58 ppchelko@deploy1001: Started deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324
  • 20:38 ppchelko@deploy1001: Finished deploy [restbase/deploy@5b8ad3c] (dev-cluster): Update deps, removed sections table (duration: 03m 40s)
  • 20:35 ppchelko@deploy1001: Started deploy [restbase/deploy@5b8ad3c] (dev-cluster): Update deps, removed sections table
  • 19:53 akosiaris: do a depool, scap deploy, scap wikiversions-compile, hhvm restart and then a pool in eqiad mediawiki servers
  • 19:50 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable page issues A/B test (gerrit:470868) (duration: 00m 53s)
  • 19:44 sbisson@deploy1001: Synchronized php-1.33.0-wmf.2/includes/block/BlockRestriction.php: SWAT: BlockRestriction::update() unnecessarily does a SELECT on the page table. (duration: 01m 00s)
  • 19:19 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable error logging for WikimediaEvents (duration: 00m 52s)
  • 19:12 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable GrowthExperiments logging channel (duration: 00m 53s)
  • 18:57 _joe_: restarting hhvm on mwdebug1002
  • 18:44 godog: pool graphite1004 for reads - T196484
  • 18:43 XioNoX: delete asw2-b - asw-b interface - T183585
  • 18:41 XioNoX: remove asw-b-eqiad from LibreNMS - T183585
  • 18:37 XioNoX: remove vrrp priority 70 on cr2-eqiad:ae2 to failback VIPs to cr2 - T183585
  • 18:26 XioNoX: re-enable ae2 on cr2-eqiad - T183585
  • 18:21 thcipriani: rollback mwdebug1001 group2 wikis
  • 18:13 thcipriani: testing php-1.33.0-wmf.2 on group2 wikis on mwdebug1001
  • 18:05 XioNoX: disable ae2 on cr2-eqiad - T183585
  • 18:02 XioNoX: set vrrp priority 70 on cr2-eqiad:ae2 to failover VIP to cr1 - T183585
  • 16:49 XioNoX: Update LLDP config on cr3-ulsfo - T208630
  • 16:48 vgutierrez: uploaded certcentral 0.5 to apt.wikimedia.org (stretch) - T208572 T208378
  • 16:06 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting MCR to read-new on all wikis (T198308) (duration: 00m 55s)
  • 13:57 jynus_: increase consistency of db2050, dbstore2002 s3 after them catching up replication T208462
  • 12:33 ladsgroup@deploy1001: Finished deploy [ores/deploy@096ffb3]: T208577 T181632 T208608 (duration: 22m 58s)
  • 12:23 zeljkof: EU SWAT finished
  • 12:23 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase wikidata dispatchers to 3 (duration: 00m 54s)
  • 12:16 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgForeignUploadTargets to [] for zhwiki (T208397) (duration: 00m 54s)
  • 12:10 ladsgroup@deploy1001: Started deploy [ores/deploy@096ffb3]: T208577 T181632 T208608
  • 12:05 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Revert "Anniversary logo for cswiki" (T207589) (duration: 00m 58s)
  • 10:02 godog: reformat xfs filesystems on ms-be1040 - T199198
  • 09:17 elukey@deploy1001: Finished deploy [analytics/refinery@9d39efa]: fixing stat1004 (duration: 00m 04s)
  • 09:17 elukey@deploy1001: Started deploy [analytics/refinery@9d39efa]: fixing stat1004
  • 09:08 joal@deploy1001: Finished deploy [analytics/refinery@9d39efa]: regular analytics weekly deploy (duration: 05m 21s)
  • 09:02 joal@deploy1001: Started deploy [analytics/refinery@9d39efa]: regular analytics weekly deploy

2018-11-04

  • 23:42 jynus_: deleting the same row on all s8 broken servers
  • 23:39 jynus_: deleting one row on db1104
  • 20:38 krinkle@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/FlaggedRevs/frontend/specialpages/reports/ProblemChanges_body.php: T176232 - Ia43626584e (duration: 01m 17s)
  • 18:32 jynus_: reduce temp. consistency level of s4, s5, and s6 codfw masters to prevent excessive lagging due to ongoing mediawiki core maintenance
  • 08:42 eileen: process-control config revision is e832b5a04a re-enable running job list (all jobs on again now)
  • 08:38 eileen: process-control config revision is e16b2c1c61 re-enable jobs
  • 02:00 eileen: I think I got the rest of the jobs off process-control config revision is 4422254128
  • 01:52 eileen: process-control config revision is 6ec67b3d01 - also turn off omnirecipient repair job
  • 01:40 eileen: process-control config revision is 5b72cfe874 - reapply turn off q jobs

2018-11-03

  • 09:35 elukey: run tcpdump on mc1035 to grab memcache traffic (rotating pcaps, ~30G maximum)

2018-11-02

  • 17:04 thcipriani: rollback group2 wikis to 1.33.0-wmf.1 on mwdebug100{1,2}
  • 16:54 thcipriani: deploying 1.33.0-wmf.2 to group2 wikis on mwdebug1002
  • 16:43 _joe_: live-hacking removal of time limit on mwdebug1001
  • 16:32 thcipriani: deploying 1.33.0-wmf.2 to group2 wikis on mwdebug1001
  • 15:12 jynus: restarting replication @ db2074 after db2094:s3 table fix T208565
  • 15:00 jynus: stopping replication on db2074 to fix db2094:s3 T208565
  • 14:01 vgutierrez: reimaging eeden.wikimedia.org as jessie test system - T208583
  • 11:43 jynus: ignoring cawikimedia.archive replication on db2094:s3 until a reimport happens T208565
  • 11:29 jijiki: Rebooting mw2244 (spare system) for maintenance
  • 10:52 ema: restart varnish-be on cp3032 T208574
  • 08:19 jynus: performing alter table on dbstore2002 s3 and reducing consistency to improve recovery time T208462 T204006
  • 08:01 jynus: reducing consistency on db2050 to improve recovery time T208462
  • 07:59 jynus: performing alter table on db2050 T208462 T204006
  • 07:38 godog: reformat ms-be1043 xfs filesystems - T199198
  • 07:38 jynus: reducing consistency temporarily (flush, binlog sync) at db2040 to prevent lagging
  • 07:26 jynus: reducing consistency temporarily (flush, binlog sync) at db2035 to prevent lagging

2018-11-01

  • 23:01 shdubsh: restart hhvm on mw1261
  • 22:29 ejegg: restarted fundraising queue consumer jobs
  • 22:21 ejegg: updated fundraising CiviCRM from 65130ef3dd to 042eeaeca9
  • 22:18 ejegg: turned off fundraising queue jobs for civi update
  • 22:12 _joe_: rolling restart of hhvm on appservers and api in eqiad
  • 22:09 shdubsh: cumin -b 2 -s 30 "O:mediawiki::appserver and *.eqiad.wmnet" "restart-hhvm"
  • 22:05 _joe_: restarting hhvm on mw1238,1240
  • 22:02 _joe_: restart hhvm on mw1244
  • 21:52 shdubsh: restart hhvm on mw1247
  • 21:49 _joe_: depooling mw1238 for debugging
  • 21:09 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group2 back to 1.33.0-wmf.1
  • 20:55 hoo: Restarted hhvm on mwdebug2002
  • 19:40 hoo: Ran "UPDATE wb_changes_dispatch SET chd_seen = '775203911' WHERE chd_site LIKE '%wikt%' AND chd_seen < '775180000';" on wikidata master (dispatching for wiktionaries)
  • 19:00 hoo@deploy1001: Synchronized php-1.33.0-wmf.1/includes/export/WikiExporter.php: Fix for missing end tag </page> on some exports (T207974) (duration: 01m 01s)
  • 18:38 hoo@deploy1001: Synchronized php-1.33.0-wmf.2/includes/export/WikiExporter.php: Fix for missing end tag </page> on some exports (T207974) (duration: 00m 55s)
  • 18:25 jijiki: Enabling puppet on mw servers (T206923)
  • 18:19 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove now redundant Wikidata config for wiktionary (T208317) (duration: 00m 54s)
  • 18:12 hoo@deploy1001: Synchronized dblists/wikidataclient.dblist: Add all wiktionaries to wikidataclient.dblist, sort list (T208317) (duration: 00m 57s)
  • 18:02 gehel: restart nginx on relforge100*
  • 17:57 jijiki: Disabling puppet on mw servers (T206923)
  • 16:07 anomie@mwmaint1002: Running migrateComments.php on section 4 wikis for T166733
  • 13:46 anomie@mwmaint1002: Running migrateComments.php on remaining section 3 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on section 7 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on wikitech for T166733
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on wikitech for T188132
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on section 6 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on section 8 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 8 wikis for T188132
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 7 wikis for T188132
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 6 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateComments.php on section 5 wikis for T166733
  • 13:36 anomie@mwmaint1002: Running migrateComments.php on section 1 wikis for T166733
  • 13:36 anomie@mwmaint1002: Running migrateComments.php on section 2 wikis for T166733
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 5 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 4 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on remaining section 3 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 2 wikis for T188132
  • 13:35 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 1 wikis for T188132
  • 12:50 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: List wikidataclient-test in CS.php dblists T208488 (duration: 00m 57s)
  • 09:10 elukey: added a tmux session on mw1314, mw1344, mw1316 that checks mcrouter stats every 10s
  • 00:58 onimisionipe: repooling wdqs1004. It has caught up on lag with others
  • 00:22 tgr@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/SyntaxHighlight_GeSHi/extension.json: SWAT: Follow-up I3daca6fb: Fix exception thrown when inserting new code block (invalidate RL cache) (duration: 00m 53s)
  • 00:20 tgr@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightWindow.js: SWAT: Follow-up I3daca6fb: Fix exception thrown when inserting new code block (duration: 00m 54s)
  • 00:13 mobrovac@deploy1001: Finished deploy [restbase/deploy@e8f3a85] (dev-cluster): Add title normalisation and remove Accept-Language header duplicates (duration: 03m 00s)
  • 00:10 mobrovac@deploy1001: Started deploy [restbase/deploy@e8f3a85] (dev-cluster): Add title normalisation and remove Accept-Language header duplicates
  • 00:07 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Move auth logging to different channels for easier counting (T150300, T123243) (duration: 00m 53s)
  • 00:05 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Move auth logging to different channels for easier counting (T150300, T123243) (duration: 00m 53s)

2018-10-31

  • 23:53 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Undeploy RelatedSites from pt and ro Wikivoyages (T202761) (duration: 00m 54s)
  • 23:50 tgr@deploy1001: Synchronized dblists/wikidataclient.dblist: SWAT: Add trwiktionary to wikidataclient.dblist (T204419) (duration: 00m 53s)
  • 23:44 tgr@deploy1001: Synchronized static/images/wmf-hor-googpub.png: SWAT: Update: add Wikimedia logo for SEO (T198946, T207790) (duration: 00m 53s)
  • 23:36 tgr@deploy1001: Synchronized php-1.33.0-wmf.2/skins/MinervaNeue/skin.json: SWAT: MinervaNeue: Article width should not be full screen (T193061) (invalidate RL cache) (duration: 00m 53s)
  • 23:34 tgr@deploy1001: Synchronized php-1.33.0-wmf.2/skins/MinervaNeue/resources/skins.minerva.tablet.styles/common.less: SWAT: MinervaNeue: Article width should not be full screen (T193061) (duration: 00m 53s)
  • 23:25 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable the print schema (T208454) (duration: 00m 55s)
  • 23:07 tstarling@deploy1001: Synchronized php-1.33.0-wmf.1: security patch (duration: 03m 45s)
  • 22:58 tstarling@deploy1001: sync-dir aborted: security patch (duration: 00m 13s)
  • 22:56 tstarling@deploy1001: Synchronized php-1.33.0-wmf.2: security patch (duration: 06m 47s)
  • 22:36 catrope@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/MobileFrontend/includes/specials/SpecialMobileContributions.php: Fix PHP warnings in User.php (T208469) (duration: 00m 53s)
  • 22:19 catrope@deploy1001: Synchronized php-1.33.0-wmf.2/includes/user/User.php: Revert debugging code for T208469 (duration: 00m 52s)
  • 22:17 catrope@deploy1001: Synchronized php-1.33.0-wmf.2/includes/user/User.php: Temporary logging to debug T208469 (duration: 00m 53s)
  • 22:03 catrope@deploy1001: Synchronized php-1.33.0-wmf.2/includes/user/User.php: Temporary logging to debug T208469 (duration: 00m 53s)
  • 21:47 catrope@deploy1001: Synchronized php-1.33.0-wmf.2/includes/user/User.php: Temporary logging to debug T208469 (duration: 00m 53s)
  • 21:43 catrope@deploy1001: Synchronized php-1.33.0-wmf.2/includes/user/User.php: Temporary logging to debug T208469 (duration: 00m 57s)
  • 20:51 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T203821 Fix config typo to point at testwikidatawiki properly (duration: 00m 53s)
  • 20:34 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.2 (duration: 00m 53s)
  • 20:32 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.2
  • 20:28 ladsgroup@deploy1001: Finished deploy [ores/deploy@70ba14b]: Upgrade to celery4 and flask 0.12.4, logstash fixes: T181546 T181630 T168921 T205256 T169586 T208258 T178441 (duration: 21m 29s)
  • 20:06 ladsgroup@deploy1001: Started deploy [ores/deploy@70ba14b]: Upgrade to celery4 and flask 0.12.4, logstash fixes: T181546 T181630 T168921 T205256 T169586 T208258 T178441
  • 19:38 jijiki: Enabling puppet on mw hosts
  • 19:37 thcipriani@deploy1001: Synchronized php: rollback group1 wikis to 1.33.0-wmf.2 (duration: 00m 52s)
  • 19:36 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: rollback group1 wikis to 1.33.0-wmf.2
  • 19:20 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.2 (duration: 00m 53s)
  • 19:19 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.2
  • 19:11 jijiki: Disabling puppet on mw hosts for T206923
  • 19:01 chasemp: install logster on mwlog1001
  • 18:59 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T203821 Enable Partial Blocks on testwiki and testwikidata in case we have to stop the train (duration: 00m 56s)
  • 18:13 bblack: disabling puppet on labstore100[45], ahead of interface::rps change merges
  • 18:08 bblack: disabling puppet on all LVS balancers, ahead of interface::rps change merges
  • 17:41 bblack: reboot cp1085, network interface transmit hang
  • 17:08 onimisionipe: depooling wdqs1004 to catch up on lag
  • 17:04 jforrester@deploy1001: Synchronized php-1.33.0-wmf.2/includes/Block.php: Hot-deploy train blocker T208398 with 3a1f95d4d (duration: 00m 56s)
  • 16:56 onimisionipe: repooling wdqs1005 - It has caught up with others
  • 14:43 cmjohnson1: moved wtp1034 eth0 to new switch; it was left over
  • 14:24 anomie@mwmaint1002: Running migrateComments.php on group0 for T166733
  • 14:22 anomie@mwmaint1002: Running migrateImageCommentTemp.php on group0 for T188132
  • 14:18 anomie: Testing dologmsg on mwmaint1002
  • 14:17 hasharLunch: Adding a Stretch based CI job for operations/puppet (non voting job for now) | T208422
  • 14:11 herron: re-enabling ircecho on einsteinium
  • 13:39 herron: temporarily stopping ircecho on einsteinium
  • 12:35 onimisionipe: depooling wdqs1005 to catch up with others
  • 11:36 dereckson@deploy1001: Synchronized docroot/noc/createTxtFileSymlinks.sh: UNIX-agnostic shebang for createTxtFileSymlinks (Gerrit:470585, no-op in prod) (duration: 00m 54s)
  • 11:08 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Do not load WikibaseQuality (T205064) (duration: 01m 05s)
  • 10:57 hashar: contint1001: upgraded java and restarted Jenkins
  • 10:56 hashar: restarting CI jenkins on contint1001
  • 10:23 volans: restarted pdfrender on scb1003
  • 09:24 elukey: upgraded memkeys to 20181031-1 on all the mc* - T208376
  • 09:16 elukey: upload memkeys 20181031-1 to jessie-wikimedia thirdparty
  • 08:39 godog: start rolling out rsyslog 8.38 to stretch hosts

2018-10-30

  • 23:23 thcipriani@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Throttle lift for Wikidata event at University of Edinburgh T208236 (duration: 00m 54s)
  • 22:53 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: [Beta Cluster] UploadWizard: Enable Structured Data captions when WBMI is enabled (duration: 00m 53s)
  • 22:38 jforrester@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/VisualEditor/: Hot-deploy UBN train blocker VisualEditor bug T208366 (duration: 00m 56s)
  • 20:53 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.33.0-wmf.2
  • 20:45 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/VisualEditor/lib/ve/src/ui/actions/ve.ui.WindowAction.js: ve.ui.WindowAction: Fix exception when opening windows T208347 (duration: 00m 54s)
  • 20:22 twentyafterfour: hotfixing T208254 (restarting apache2 on phab1001)
  • 19:40 thcipriani@deploy1001: Finished scap: testwiki to 1.33.0-wmf.2 and rebuild l10n cache (duration: 19m 45s)
  • 19:20 thcipriani@deploy1001: Started scap: testwiki to 1.33.0-wmf.2 and rebuild l10n cache
  • 18:59 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.22 (duration: 03m 10s)
  • 18:52 ebernhardson: restarted mjolnir-kafka-bulk-daemon on all elastic hosts
  • 18:51 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.20 (duration: 07m 55s)
  • 18:46 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@14dd09e]: adjust kafka bulk daemon timeouts (duration: 03m 42s)
  • 18:42 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@14dd09e]: adjust kafka bulk daemon timeouts
  • 18:42 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@14dd09e]: adjust kafka bulk daemon timeouts (duration: 00m 48s)
  • 18:41 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@14dd09e]: adjust kafka bulk daemon timeouts
  • 18:34 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@14dd09e]: (no justification provided) (duration: 00m 26s)
  • 18:34 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@14dd09e]: (no justification provided)
  • 18:14 andrew@deploy1001: Finished deploy [horizon/deploy@ce0b9b4]: Rolling out fix for T208099 (duration: 03m 19s)
  • 18:10 andrew@deploy1001: Started deploy [horizon/deploy@ce0b9b4]: Rolling out fix for T208099
  • 17:23 thcipriani: starting branch cut for MediaWiki and extensions 1.33.0-wmf.2
  • 16:55 mforns: Finished deployment of AQS using scap
  • 16:48 mforns@deploy1001: Finished deploy [analytics/aqs/deploy@3a1d937]: (no justification provided) (duration: 02m 00s)
  • 16:46 mforns@deploy1001: Started deploy [analytics/aqs/deploy@3a1d937]: (no justification provided)
  • 16:45 mforns: Starting deployment of AQS using scap
  • 16:31 godog: install carbon-c-relay 3.2-1 on graphite1004
  • 16:27 shdubsh: updated graphite-in cname to graphite1004 - T196484
  • 16:11 hashar: contint1001: rm -fR /srv/zuul-debug-logs # old logs from May 2018
  • 16:07 ema: reboot cp-ats hosts for L1TF kernel/microcode updates T203011
  • 15:17 anomie: Running migrateComments.php on test wikis and mediawikiwiki for T166733
  • 15:16 anomie: Running migrateImageCommentTemp.php on test wikis and mediawikiwiki for T188132
  • 14:56 godog: gradually upgrade rsyslog to 8.38 on jessie hosts - T206633
  • 14:12 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add entry to wmgWikibaseClientEntityNamespaces for wiktionaries T208293 (duration: 00m 47s)
  • 13:56 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add item & property to wmgWikibaseClientRepoNamespaces for wiktionaries T208293 (duration: 00m 48s)
  • 13:54 godog: reenable puppet in eqiad
  • 13:41 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase IS.php use 120 for property namespace (duration: 00m 47s)
  • 13:31 ema: prometheus-trafficserver-exporter 0.2.0-1 uploaded to stretch-wikimedia T204232
  • 13:27 godog: temporarily disable puppet in eqiad to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/470382
  • 12:17 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase.php, check for 2 more wmg vars before using them (duration: 00m 47s)
  • 12:13 elukey: start memkeys on mc1035 to periodically dump the status of the most used keys - memkeys will use a bit of resources, please stop it if needed (root tmux) - T203786
  • 12:10 jynus: removing s3 replication filters on dbstore1002 T184805
  • 12:06 jynus: removing s3 replication filters on labsdb1009/10/11 T184805
  • 12:02 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: commonswiki, Wikibase, clientDbList as empty array (duration: 00m 47s)
  • 12:01 jynus: finishing deletion of the s3 wikis that were moved to s5 T184805
  • 11:59 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Changing the language of votewiki to Persian (fa) (T207560) (duration: 00m 48s)
  • 11:52 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase.php, wrap $wmgWikibaseClientPropertyOrderUrl use in condition (duration: 00m 46s)
  • 11:41 addshore@deploy1001: Synchronized wmf-config: 2x beta only patches, 66b8d5b 8fd1614 (duration: 00m 53s)
  • 10:40 moritzm: installing ghostscript security updates
  • 10:07 moritzm: installing exiv2 security updates
  • 09:52 moritzm: installing mysql-5.5 security updates on trusty/jessie (only clients as packaged in Debian/Ubuntu)
  • 09:40 moritzm: installing libmspack security updates
  • 09:24 moritzm: installing paramiko security updates
  • 08:24 jynus: starting to delete the s3 wikis that were moved to s5 T184805
  • 07:01 Krinkle: graphite2001: Remove Graphite data from corrupted names under media_* and ve_* (T189530)
  • 06:58 Krinkle: graphite1004: Remove Graphite data from corrupted names under media_* and ve_* (T189530)
  • 06:57 Krinkle: graphite1001: Remove Graphite data from corrupted names under media_* and ve_* (T189530)

2018-10-29

  • 21:44 mutante: contint2001 - installing libtiff upgrade, with the same command from apt/history.log that had shown as failed
  • 20:45 arlolra: Updated Parsoid to b9fa661 (T100841, T186965, T167349, T198618, T206040, T207956)
  • 20:35 arlolra@deploy1001: Finished deploy [parsoid/deploy@e36608c]: Updating Parsoid to b9fa661 (duration: 12m 27s)
  • 20:23 arlolra@deploy1001: Started deploy [parsoid/deploy@e36608c]: Updating Parsoid to b9fa661
  • 20:01 XioNoX: rollback redirect ns1 to authdns1001
  • 19:56 moritzm: rebooting authdns2001 for kernel security update
  • 19:48 XioNoX: redirect ns1 to authdns1001 - try 3
  • 19:47 XioNoX: replace radon IPs with authdns1001 on cr1/2-eqiad
  • 19:40 XioNoX: redirect ns1 to authdns1001 - try 2
  • 19:31 XioNoX: redirect ns1 to authdns1001
  • 19:27 XioNoX: rollback redirect ns0 to authdns2001
  • 19:23 moritzm: rebooting authdns1001 for kernel security update
  • 19:12 XioNoX: redirect ns0 to authdns2001
  • 19:03 twentyafterfour: restart apache on phab1001 to hotfix T208254
  • 18:53 moritzm: installing tiff security updates for jessie
  • 18:33 thcipriani: deploy1001:scap pull (ref: T208196)
  • 18:31 thcipriani: deploy1001:sudo -u l10nupdate scap cdb-refresh-json --directory /srv/mediawiki-staging/php-1.33.0-wmf.1/cache/l10n (ref: T208196)
  • 18:24 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable hewikivoyage HD logo (T208148) (duration: 00m 47s)
  • 18:23 catrope@deploy1001: Synchronized static/images: Update logo for hewikivoyage, add HD logos (T208148) (duration: 00m 48s)
  • 18:22 catrope@deploy1001: sync-file aborted: Update logo for hewikivoyage, add HD logos (duration: 00m 07s)
  • 18:22 herron: moving mail_smarthost (and wikimail_smarthost) to hiera (gerrit 469524)
  • 18:20 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@7eeede7]: Improved time handling for Kafka, GUI Update and caching removal from updater (duration: 10m 42s)
  • 18:16 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable PageTriage/Copyvio on testwiki and enwiki (duration: 00m 47s)
  • 18:13 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable Wikidata data access on trwiktionary (T204419) (duration: 00m 48s)
  • 18:09 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@7eeede7]: Improved time handling for Kafka, GUI Update and caching removal from updater
  • 16:49 ottomata: reassigning eventlogging_ReadingDepth partition 0 from 1002,1004,1006 to 1003,1001,1005 to move preferred leadership from 1002 to 1003
  • 16:42 addshore@deploy1001: Synchronized wmf-config: phpdoc comments only (duration: 00m 48s)
  • 16:37 addshore@deploy1001: Synchronized wmf-config: Wikibase.php, finish the grand cleanup (duration: 00m 48s)
  • 16:36 ebernhardson: create wikimaniawiki_general indices for eqiad and codfw elasticsearch clusters
  • 16:35 addshore@deploy1001: Synchronized wmf-config: Wikibase, move client list config to nice part of Wikibase.php (duration: 00m 47s)
  • 16:32 addshore@deploy1001: Synchronized wmf-config: Wikibase, move namespace config to IS.php PT 2/2 (duration: 00m 47s)
  • 16:31 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, move namespace config to IS.php PT 1/2 (duration: 00m 47s)
  • 16:20 chasemp: remove emailbot from acl*sre-team in phab, afaik this is unused now and unmaintained
  • 15:56 addshore@deploy1001: Synchronized wmf-config: BETA ONLY (duration: 00m 48s)
  • 15:48 godog: power off ms-be2021 for controller alarms troubleshooting - T208096
  • 15:44 vgutierrez: uploaded certcentral 0.4 to apt.wikimedia.org (stretch) - T207927
  • 15:43 addshore@deploy1001: Synchronized wmf-config: Create and use wikidataclient-test.dblist PT 2/2 (duration: 00m 48s)
  • 15:42 addshore@deploy1001: Synchronized dblists: Create and use wikidataclient-test.dblist PT 1/2 (duration: 00m 48s)
  • 15:19 cmjohnson1: cloudvirt1019 going down to re-seat the backplane cables
  • 15:17 addshore@deploy1001: Synchronized wmf-config: Remove Wikibase-* config files (duration: 00m 47s)
  • 15:14 addshore@deploy1001: Synchronized wmf-config: Stop loading Wikibase-* files (duration: 00m 47s)
  • 15:06 addshore@deploy1001: Synchronized wmf-config: PT2/2 Totally empty Wikibase-* files (duration: 00m 47s)
  • 15:04 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT1/2 Totally empty Wikibase-* files (duration: 00m 46s)
  • 14:58 banyek: shutting down db1117 for hardware maintenance (T208150)
  • 14:52 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase.php, check $wmgWBRepoSettingsSparqlEndpoint is set (duration: 00m 46s)
  • 14:37 addshore@deploy1001: Synchronized wmf-config: PT 2/2 Wikibase, move repo definitions to IS.php gerrit:470373 (duration: 00m 47s)
  • 14:35 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT 1/2 Wikibase, move repo definitions to IS.php gerrit:470373 (duration: 00m 48s)
  • 14:17 gehel: repooling wdqs1003, caught up on lag
  • 14:16 hashar: Restarting Jenkins
  • 13:25 addshore@deploy1001: Synchronized wmf-config: Wikibase, move 2 usage tracking configs to Wikibase.php PT 2/2 (duration: 00m 47s)
  • 13:24 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase, move 2 usage tracking configs to Wikibase.php PT 1/2 (duration: 00m 46s)
  • 13:22 addshore@deploy1001: Synchronized wmf-config: Wikibase, move more misc settings to IS.php PT 2/2 (duration: 00m 47s)
  • 13:21 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, move more misc settings to IS.php PT 1/2 (duration: 00m 49s)
  • 13:19 addshore@deploy1001: Synchronized wmf-config: Remove unused wmgArticlePlaceholderSearchEngineIndexed (duration: 00m 48s)
  • 13:16 addshore@deploy1001: Synchronized wmf-config: Wikibase, set wgArticlePlaceholderSearchEngineIndexed in IS.php PT 2/2 (duration: 00m 47s)
  • 13:15 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, set wgArticlePlaceholderSearchEngineIndexed in IS.php PT 1/2 (duration: 00m 47s)
  • 13:13 addshore@deploy1001: Synchronized wmf-config: Wikibase, Move quality constraints settings to IS.php PT 2/2 (duration: 00m 47s)
  • 13:12 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, Move quality constraints settings to IS.php PT 1/2 (duration: 00m 47s)
  • 13:10 addshore@deploy1001: Synchronized wmf-config: Wikibase, cleanup some duplicated settings PT 2/2 (duration: 00m 47s)
  • 13:09 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase, cleanup some duplicated settings PT 1/2 (duration: 00m 46s)
  • 13:06 addshore@deploy1001: Synchronized wmf-config: Wikibase, Move wgArticlePlaceholderImageProperty to IS.php PT 2/2 (duration: 00m 46s)
  • 13:05 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, Move wgArticlePlaceholderImageProperty to IS.php PT 1/2 (duration: 00m 46s)
  • 13:03 addshore@deploy1001: Synchronized wmf-config: Wikibase, move some property lists to IS php files PT 2/2 (duration: 00m 47s)
  • 13:02 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, move some property lists to IS php files PT 1/2 (duration: 00m 47s)
  • 13:01 addshore@deploy1001: sync-file aborted: Wikibase, move some property lists to IS php files PT 1/2 (duration: 00m 04s)
  • 13:00 addshore@deploy1001: Synchronized wmf-config: BETA FILES only (duration: 00m 47s)
  • 12:57 addshore@deploy1001: Synchronized wmf-config: Wikibase, Move badge related config to IS.php PT 2/2 (duration: 00m 47s)
  • 12:55 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, Move badge related config to IS.php PT 1/2 (duration: 00m 45s)
  • 12:52 addshore@deploy1001: Synchronized wmf-config: PT2/2 Wikibase, Move property suggester settings to IS files (duration: 00m 47s)
  • 12:51 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT1/2 Wikibase, Move property suggester settings to IS files (duration: 00m 47s)
  • 12:47 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase, Move $wgPropertySuggesterMinProbability to IS.php PT 2/2 (duration: 00m 47s)
  • 12:46 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikibase, Move $wgPropertySuggesterMinProbability to IS.php PT 1/2 (duration: 00m 47s)
  • 12:43 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT2/2 gerrit:470348 Wikibase, move dispatching settings to IS.php (duration: 00m 47s)
  • 12:42 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT1/2 gerrit:470348 Wikibase, move dispatching settings to IS.php (duration: 00m 48s)
  • 11:44 moritzm: installing graphicsmagick update for stretch
  • 11:23 zeljkof: EU SWAT finished
  • 11:19 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove global action related permissions (T208035) (duration: 00m 48s)
  • 10:22 gehel: switch wdqs1003 and wdqs1006 completed, wdqs1003 still depooled to catch up on update lag - T207947
  • 10:21 hashar: Restore ti l10n files on deploy1001:/srv/mediawiki-staging/php-1.33.0-wmf.1/cache/l10n/upstream # T208196
  • 10:15 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T187299 Double sampling of performance perception survey on ruwiki (duration: 00m 47s)
  • 10:12 hashar: Moving l10n_cache-ti.cdb files on deploy1001: sudo -u l10nupdate mv l10n_cache-ti.cdb.json l10n_cache-ti.cdb.json-back-T208196 && sudo -u l10nupdate mv l10n_cache-ti.cdb.MD5 l10n_cache-ti.cdb.MD5-back-T208196 # T208196
  • 10:10 addshore@deploy1001: Synchronized wmf-config: final sync (duration: 00m 47s)
  • 10:09 addshore@deploy1001: Synchronized wmf-config/Wikibase-production.php: PT3/3 gerrit:470161 Wikibase, Create and use wmgWikibaseRepoStatementSections (duration: 00m 47s)
  • 10:08 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT2/3 gerrit:470161 Wikibase, Create and use wmgWikibaseRepoStatementSections (duration: 00m 47s)
  • 10:07 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT1/3 gerrit:470161 Wikibase, Create and use wmgWikibaseRepoStatementSections (duration: 00m 50s)
  • 10:04 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:470160 Wikibase.php, move a bunch of config into clean area NOOP (duration: 00m 47s)
  • 10:03 gehel: starting to switch wdqs1003 and wdqs1006 - T207947
  • 10:02 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:470159 PT2/2 Wikibase, add IS.php setting for each possible extension (duration: 00m 47s)
  • 10:01 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:470159 PT1/2 Wikibase, add IS.php setting for each possible extension (duration: 00m 47s)
  • 09:57 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:470158 Wikibase, Remove unused wmgUseWikibaseQualityExternalValidation (duration: 00m 47s)
  • 09:55 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT2/2 gerrit:470156 Wikibase, remove unused wmgWikibaseClientSettings (duration: 00m 47s)
  • 09:54 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT1/2 gerrit:470156 Wikibase, remove unused wmgWikibaseClientSettings (duration: 00m 47s)
  • 09:51 addshore@deploy1001: Synchronized wmf-config/Wikibase-production.php: pt3/3 gerrit:470155 Wikibase, create and use wmgWikibaseClientInjectRecentChanges (duration: 00m 46s)
  • 09:50 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: pt2/3 gerrit:470155 Wikibase, create and use wmgWikibaseClientInjectRecentChanges (duration: 00m 47s)
  • 09:49 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: pt1/3 gerrit:470155 Wikibase, create and use wmgWikibaseClientInjectRecentChanges (duration: 00m 47s)
  • 09:45 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:470154 PT2/2 Wikibase, put all wgNamespaceAliases in IS.php (duration: 00m 46s)
  • 09:44 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:470154 PT1/2 Wikibase, put all wgNamespaceAliases in IS.php (duration: 00m 47s)
  • 09:43 addshore@deploy1001: sync-file aborted: gerrit:470154 PT1/2 (duration: 00m 00s)
  • 09:41 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:470153 PT2/2 Wikibase, define $wgExtraNamespaces in IS.php (duration: 00m 47s)
  • 09:40 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:470153 PT1/2 Wikibase, define $wgExtraNamespaces in IS.php (duration: 00m 46s)
  • 09:37 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:470152 Wikibase, kill $wmgWBSharedSettings (duration: 00m 47s)
  • 09:35 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT2/2 gerrit:470151 Wikibase, move wmgWBSiteLinkGroups to IS.php (duration: 00m 46s)
  • 09:34 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT1/2 gerrit:470151 Wikibase, move wmgWBSiteLinkGroups to IS.php (duration: 00m 47s)
  • 09:30 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT2/2 gerrit:470150 Wikibase, Split specialSiteLinkGroups and manage from IS.php (duration: 00m 46s)
  • 09:30 ema: resume cache hosts rolling reboots for kernel/microcode updates T203011
  • 09:29 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT1/2 gerrit:470150 Wikibase, Split specialSiteLinkGroups and manage from IS.php (duration: 00m 47s)
  • 09:19 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: PT2/2 gerrit:470339 Introduce wmgWikibaseMaxSerializedEntitySize (duration: 00m 46s)
  • 09:17 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: PT1/2 gerrit:470339 Introduce wmgWikibaseMaxSerializedEntitySize (duration: 00m 47s)
  • 09:07 addshore@deploy1001: Finished scap: sync with no changes (duration: 14m 39s)
  • 08:56 elukey: restart yarn on an-master100[1,2] to pick up new zookeeper timeout settings (10s -> 20s) - T206943
  • 08:52 addshore@deploy1001: Started scap: sync with no changes
  • 08:50 addshore@deploy1001: sync aborted: (no justification provided) (duration: 00m 03s)
  • 08:50 addshore@deploy1001: Started scap: (no justification provided)
  • 08:49 addshore@deploy1001: sync-l10n aborted: (no justification provided) (duration: 01m 19s)
  • 08:42 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208088 Enable performance perception survey shuffling (duration: 00m 47s)
  • 08:38 gilles@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/QuickSurveys: T208088 Add ability to shuffle answers display order (duration: 01m 51s)
  • 08:35 gilles@deploy1001: sync aborted: T208088 Enable performance QuickSurvey shuffling (duration: 00m 00s)
  • 08:35 gilles@deploy1001: Started scap: T208088 Enable performance QuickSurvey shuffling
  • 08:35 gilles@deploy1001: sync aborted: T208088 Enable performance QuickSurvey shuffling (duration: 06m 30s)
  • 08:29 gilles@deploy1001: Started scap: T208088 Enable performance QuickSurvey shuffling
  • 08:07 godog: reformat ms-be1042 xfs filesystems - T199198
  • 08:00 gilles: Deploying time-sensitive backport to QuickSurveys
  • 02:08 onimisionipe: repooling wdqs1003. It has caught up with others

2018-10-28

  • 23:36 krinkle@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/Graph: T184128 - I02da92de33 (duration: 00m 58s)
  • 19:17 onimisionipe: depooling wdqs1003 to catch up on lag
  • 17:30 elukey: restart yarn resource manager on an-master1002 to force failover to an-master1001 - T206943
  • 16:36 onimisionipe: repooling wdqs1003 - it didn't really catch up with the others, but lag time on the others is beginning to go up.
  • 13:38 onimisionipe: depooling wdqs1003 again to catch up with others
  • 02:16 onimisionipe: repooling wdqs1003 - it has caught up with others
  • 00:03 krinkle@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/CirrusSearch: T206967 - Ia23d19cf1e6 (duration: 01m 02s)

2018-10-27

  • 22:22 krinkle@deploy1001: Synchronized php-1.33.0-wmf.1/resources/src: T208093 - I25012a2c6f (duration: 00m 58s)
  • 21:24 banyek: resetting power on db1117 as the host is DOWN and the serial console shows nothing
  • 20:56 onimisionipe: depooling wdqs1003 to catch up with others
  • 16:18 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase, fix duplicate specialSiteLinkGroups key T208124 (duration: 00m 54s)
  • 15:57 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase, make sure specialSiteLinkGroups has wikidata group (duration: 00m 54s)
  • 12:32 Amir1: Deployed patch for T207576
  • 12:29 banyek: resuming replication on s1@dbstore2002 as table compression is finished (T204930)
  • 09:17 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY (4x patches) (duration: 00m 55s)
  • 09:09 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Remove wgArticlePlaceholderSearchIntegrationBackend BETA override (duration: 01m 00s)
  • 08:34 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikibase, Set siteLinkGroups settings on all wikis again T208048 T208077 T208074 (duration: 00m 54s)
  • 08:24 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY T208043 (duration: 01m 06s)
  • 03:41 SMalyshev: depool wdqs1003 again to let it catch up some more
  • 02:35 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@7eeede7]: Re-deploy Updater to deal with performance issues (duration: 00m 38s)
  • 02:34 smalyshev@deploy1001: Started deploy [wdqs/wdqs@7eeede7]: Re-deploy Updater to deal with performance issues
  • 02:34 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@e9392f4]: Re-deploy Updater to deal with performance issues (duration: 00m 05s)
  • 02:33 smalyshev@deploy1001: Started deploy [wdqs/wdqs@e9392f4]: Re-deploy Updater to deal with performance issues
  • 00:00 mutante: icinga1001 - using wmf-auto-reimage to reinstall gets stuck at initial puppet run after reboot - Still waiting for Puppet after 105.0 minutes - aborting on cumin, logging in directly and manually running puppet (T202782 T208100)

2018-10-26

  • 22:54 mutante: sodium - attempted to replace broken disk for RAID - did not go well
  • 21:38 ejegg: updated fundraising CiviCRM from 97506677e8 to 65130ef3dd
  • 21:34 aaron@deploy1001: Synchronized php-1.33.0-wmf.1/autoload.php: 86c0b56 (duration: 00m 52s)
  • 21:33 aaron@deploy1001: Synchronized php-1.33.0-wmf.1/tests: 86c0b56 (duration: 01m 08s)
  • 20:03 mutante: icinga1001 - disabled puppet, changed: check_result_reaper_frequency=2 ; max_check_result_reaper_time=10 to test if it lowers latency (T208066)
  • 19:40 chasemp: remove 2fa for charlottepotero and cwd users in phab (so they can readd)
  • 19:09 SMalyshev: repooled wdqs1003 - looks like it caught up now
  • 17:18 SMalyshev: depool wdqs1003 again to let it catch up some more
  • 16:10 ejegg: updated payments-wiki to 34506ce636
  • 15:32 elukey: rolling restart of all prometheus-mcrouter-exporters on app/api servers - metrics not reported after the last mcrouter restart
  • 15:20 gehel: repooling wdqs1003, other nodes are starting to lag as well
  • 14:56 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Prevent sysops from disabling 2FA for other users as part of upcoming feature (duration: 00m 53s)
  • 13:02 gehel: depool wdqs1003 to catch up on updates
  • 07:51 bawolff: adjust patch T207916
  • 07:16 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T207916 13b993ab9f - auth log on in arwiki (duration: 00m 54s)
  • 06:51 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T207916 13b993ab9f - auth log on in group1 (duration: 00m 54s)
  • 06:33 moritzm: uploaded openjdk-8 backport for recent Java 8 security updates to apt.wikimedia.org/jessie-wikimedia
  • 06:24 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: ecf579e9f9 - T207916 - enable auth log group0 (duration: 00m 55s)
  • 06:01 bawolff: adjust patch for T207916
  • 05:07 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: d3b2c346 T207916 (duration: 00m 55s)
  • 04:58 SMalyshev: depooled wdqs1003 again, let's see if it helps it catch up now
  • 04:12 bawolff: deploy patch T207916
  • 03:19 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: bc9b863e - T207900 - enable CSP report only for users w/session everywhere (duration: 00m 55s)
  • 03:10 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 745d0b61 - T207900 - enable CSP report only for users w/session enwiki (duration: 00m 55s)
  • 03:01 bawolff@deploy1001: Synchronized wmf-config/CommonSettings.php: T207900 - Add wikimedia.org (no subdomain) to allow list for math (duration: 00m 53s)
  • 02:54 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: 745d0b61 - T207900 - enable CSP report only for users w/session enwiki (duration: 00m 53s)
  • 02:43 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: a8aa9d6aae - T207900 - enable CSP report only for users w/session fawiki, frwiki, svwiki, eswiki, ruwiki, zhwiki, dewiki (duration: 00m 56s)
  • 02:22 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: d743db261 - T207900 - enable CSP report only for users w/session arwiki (duration: 00m 54s)
  • 02:05 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: bd55034d122 - T207900 - enable CSP report only for users w/session on medium wikis (duration: 00m 55s)
  • 01:41 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T207900 (b74911f6201) enable csp users with session all group1 wikis (duration: 00m 55s)
  • 01:28 bawolff@deploy1001: Synchronized wmf-config/CommonSettings.php: Ia518c031 (duration: 00m 55s)
  • 01:26 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@e9392f4]: Re-deploy Updater to deal with performance issues (duration: 31m 28s)
  • 01:21 bawolff@deploy1001: Synchronized wmf-config/CommonSettings.php: T207900 - deploy CSP to people with session on enwikiquote (duration: 00m 54s)
  • 01:19 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T207900 - deploy CSP to people with session on enwikiquote (duration: 00m 55s)
  • 00:56 twentyafterfour: twentyafterfour@deploy1001 rebuilt and synchronized wikiversions files: group2 wikis to 1.33.0-wmf.1 refs T206655
  • 00:55 smalyshev@deploy1001: Started deploy [wdqs/wdqs@e9392f4]: Re-deploy Updater to deal with performance issues
  • 00:39 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: Post-SWAT: De-register all entities on WBMI installations calling themselves Commons I09e066f2 (duration: 00m 56s)
  • 00:32 ejegg: updated payments-wiki from f5999d963d to 57e8438e9c
  • 00:18 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Define and specify lexeme NS for wikidatawiki (duration: 00m 55s)

2018-10-25

  • 23:59 twentyafterfour@deploy1001: Synchronized php-1.33.0-wmf.1/includes/parser/Parser.php: deploy https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/469799/ refs T208000 (duration: 00m 56s)
  • 23:57 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/469799/
  • 23:53 jforrester@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/CentralNotice/special/SpecialCentralNotice.php: SWAT Sync versions of SpecialCentralNotice to avoid dirty repo checkout T208004 (duration: 00m 56s)
  • 23:39 maxsem@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/CentralNotice/+/469794/ (duration: 00m 57s)
  • 23:26 maxsem@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/GlobalPreferences/: https://gerrit.wikimedia.org/r/c/469793/ (duration: 00m 58s)
  • 21:35 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: rolling back group1 refs T206655 T208000
  • 21:29 XioNoX: configure 208.80.153.185/29 on cr1/2-codfw - T207663
  • 21:25 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group2 wikis to 1.33.0-wmf.1 refs T206655
  • 20:48 twentyafterfour: staying at group1, error rate seems to have stabilized
  • 20:43 ejegg: updated fundraising python tools from 5a2d39b41b to af5dbee8eb
  • 20:37 ejegg: updated standalone SmashPig deployment from b638ca02bc to f65daa8550
  • 20:32 twentyafterfour: db error rate increased again. rolling back
  • 20:31 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.1 refs T206655 (duration: 00m 54s)
  • 20:30 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.1 refs T206655
  • 20:07 jgleeson_: Updated paymentswiki from 5a7a8e7e4f to f5999d963d
  • 20:04 twentyafterfour: still haven't deployed wmf.1, yet error rate increased and icinga is alerting about mediawiki exceptions + wdqs1010 degraded
  • 20:02 twentyafterfour@deploy1001: Finished scap: full sync to be sure that 1.33.0-wmf.1 is fully deployed (duration: 36m 57s)
  • 19:54 mutante: mw1272 - repooled (T207983)
  • 19:51 mutante: mw1272 - rebooting (a stop job is running for HHVM PH/Hack runtime) (T207983)
  • 19:47 mutante: mw1272 - depooled, restarting hhvm (T207983)
  • 19:45 mutante: mw1272 - depooled
  • 19:26 twentyafterfour@deploy1001: Started scap: full sync to be sure that 1.33.0-wmf.1 is fully deployed
  • 19:23 twentyafterfour: beginning mediawiki train. Will start with group1 and then monitor the situation for a few minutes. If everything looks good then we go to group2.
  • 19:20 sbisson@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/ContentTranslation/: SWAT: Add detailed logging for AbuseFilter (duration: 00m 56s)
  • 18:37 sbisson@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/ContentTranslation/: SWAT: Remove the session parameter from AbuseFilter logging (duration: 00m 56s)
  • 17:58 aaron@deploy1001: Synchronized php-1.33.0-wmf.1/extensions/Translate/tag: c5fa239 (duration: 00m 55s)
  • 17:52 aaron@deploy1001: Synchronized php-1.33.0-wmf.1/includes/page/WikiPage.php: f3b5a1d (duration: 00m 54s)
  • 17:52 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@4967dba]: Test deploy new update & scripts (duration: 00m 28s)
  • 17:51 smalyshev@deploy1001: Started deploy [wdqs/wdqs@4967dba]: Test deploy new update & scripts
  • 17:50 aaron@deploy1001: Synchronized php-1.33.0-wmf.1/tests/phpunit/includes/page/WikiPageDbTestBase.php: f3b5a1d (duration: 00m 55s)
  • 17:36 aaron@deploy1001: Synchronized php-1.33.0-wmf.1/includes/changetags/ChangeTags.php: 08f8e6a (duration: 00m 55s)
  • 17:24 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@95452cf]: Update mobileapps to 58cbdff (T206527) (duration: 03m 50s)
  • 17:20 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@95452cf]: Update mobileapps to 58cbdff (T206527)
  • 17:20 mutante: planet - regenerating feeds for 'en' and 'de', others will follow by cron. switching to new theme. replaced bootstrap with bulma. removed jQuery. thanks to paladox
  • 16:34 gehel@puppetmaster1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=wdqs,name=wdqs1005.codfw.wmnet
  • 16:34 gehel@puppetmaster1001: conftool action : set/weight=20; selector: dc=eqiad,cluster=wdqs,name=wdqs1004.codfw.wmnet
  • 16:34 gehel: decreasing relative weight of wdqs1003 in LVS to ease the updater
  • 16:24 shdubsh: installed patched nagios-nrpe-plugin and nagios-nrpe-server on icinga1001 - T207775
  • 15:36 elukey: shutdown aqs1006 to replace one broken disk - T206915
  • 15:31 SMalyshev: depooling wdqs1003 again, it's not catching up like the other hosts
  • 15:16 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-enable WBMI on Beta Commons (duration: 00m 54s)
  • 15:11 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert "logging: Disable Wikibase.NewItemIdFormatter channel" (duration: 00m 55s)
  • 15:08 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Explicitly set wgLexemeEnableRepo for wikidatas gerrit:469625 (duration: 00m 55s)
  • 15:02 godog: test rsyslog 8.38 upgrade on lithium - T136312
  • 14:28 elukey: upgrade druid on druid100[4-6] to Druid 0.12.3
  • 14:20 banyek: running dns update (gerrit patch: 467711)
  • 13:48 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting comment table migration stage to write-new/read-both on all wikis (T166733) (duration: 00m 55s)
  • 13:46 godog: reformat ms-be2043 xfs filesystems - T199198
  • 13:29 XioNoX: test successful, rollback add term return-tcp permit on cr2-codfw
  • 13:28 XioNoX: test add term return-tcp permit on cr2-codfw
  • 12:14 volans: rebooting cumin1001 to pick new kernel and clear any potential weird state after OOMs
  • 12:01 zeljkof: EU SWAT finished
  • 11:17 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for Johannesburg Event on 2018-10-27 (T207742) (duration: 00m 55s)
  • 11:09 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Stop collecting data CitationUsage and CitationUsagePageLoad (T191086 T203253) (duration: 00m 57s)
  • 10:57 volans: restart pdfrender on scb1003
  • 10:11 elukey: upgrade druid100[1-3] to druid 0.12.3
  • 09:51 gehel: resetting deployment directory on wdqs1003
  • 09:15 elukey@deploy1001: Finished deploy [analytics/turnilo/deploy@84bf1ad]: Upgrade to 1.8.1 (duration: 00m 10s)
  • 09:15 elukey@deploy1001: Started deploy [analytics/turnilo/deploy@84bf1ad]: Upgrade to 1.8.1
  • 09:10 ema: resume cache hosts rolling reboots for kernel/microcode updates T203011
  • 07:16 vgutierrez: Uploaded certcentral 0.3 to apt.wikimedia.org (stretch) - T207737 T207478
  • 07:11 moritzm: installing requests security updates on trusty
  • 06:17 SMalyshev: depooling wdqs1003 again, it's not catching up like the other hosts
  • 06:06 elukey: upload druid 0.12.3-1 debs to stretch-wikimedia

2018-10-24

  • 23:24 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/469495/ (duration: 00m 54s)
  • 23:15 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/462040/ (duration: 00m 55s)
  • 23:08 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Deploy csp report-only to small.dblist wikis T207900 (duration: 00m 56s)
  • 22:38 bawolff@deploy1001: Synchronized wmf-config/CommonSettings.php: Deploy csp report-only to outreachwiki T207900 (duration: 00m 54s)
  • 22:36 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Deploy csp report-only to outreachwiki T207900 (duration: 00m 54s)
  • 22:33 bawolff@deploy1001: scap failed: average error rate on 8/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 22:27 eileen_: civicrm revision changed from 1c0a1b2406 to 97506677e8, config revision is c0a8be03a1
  • 21:33 banyek: compressing tables in s1@dbstore2002 (T204930)
  • 21:26 banyek: pausing replication on dbstore2002 (T204930)
  • 19:38 twentyafterfour: The train is now blocked by database lock contention of unknown origin
  • 19:31 twentyafterfour: the errors were all coming from wmf.26 but the error rate skyrocketed after deploying 1.33.0-wmf.1 to group1 so there is some query in the new branch which is holding a lock. T207881
  • 19:19 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.1 refs T206655
  • 18:16 XioNoX: enable BGP sessions to transit/peering on cr2-eqord - T204170
  • 17:20 gehel: repooling all elasticsearch servers in eqiad
  • 17:12 cmjohnson1: rebooting cloudvirt1019
  • 17:04 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s)
  • 17:03 jforrester@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 16:36 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Re-disable WBMI on Beta Commons for now T180981 (duration: 00m 54s)
  • 16:31 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:469444 Wikibase.php, dont load wikidata repo settings on other repos (take 2) (duration: 00m 54s)
  • 16:04 XioNoX: power-off cr1-eqord - T204170
  • 16:00 twentyafterfour: 15:59:06 Synchronized php-1.33.0-wmf.1/extensions/EventBus/: revert "Set event datetime with microsecond resolution." on 1.33.0-wmf.1 refs T207817 (duration: 00m 56s)
  • 15:59 XioNoX: disable BGP sessions to transit/peering on cr1-eqord - T204170
  • 15:54 twentyafterfour: deploying https://gerrit.wikimedia.org/r/469451
  • 14:23 herron: scheduled icinga downtime and disabling puppet on logstash hosts. deploying role::kafka::logging to logstash elasticsearch data hosts
  • 13:35 XioNoX: pre-configure switch ports for labvirt1007/8/9/12:eth1 in cloud-virt-instance-trunk range on asw2-b-eqiad
  • 13:17 ema: begin cache hosts rolling reboots for kernel/microcode updates T203011
  • 12:24 ema: cp-ats: upgrade trafficserver to 8.0.0-1wm1 T204232
  • 12:12 ema: cp1072: upgrade trafficserver to 8.0.0-1wm1 T204232
  • 11:22 ema: cp1071: upgrade trafficserver to 8.0.0-1wm1 T204232
  • 10:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Restore db1092 and db1104 original weight (duration: 00m 52s)
  • 10:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 and starting to restore db1104 original weight (duration: 00m 54s)
  • 10:28 marostegui: Compare revision table on dewiki cebwiki shwiki srwiki mgwiktionary enwikivoyage on db1100 and db2075 - T184805
  • 09:54 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s)
  • 09:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1092 (duration: 00m 54s)
  • 09:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1092 and db1087 (duration: 01m 05s)
  • 08:55 marostegui: Stop MySQL for upgrade and reboot on db1087
  • 08:47 marostegui: Update MySQL on db1092 for upgrade and reboot
  • 08:03 godog: fix aggregation to 'sum' for MediaWiki.RevisionSlider - T205416
  • 07:33 gehel: powercycling wdqs1010 - T207817
  • 07:19 _joe_: powercycling wdqs1009
  • 07:04 elukey: powercycle wdqs1008
  • 06:59 elukey: powercycle wdqs1007
  • 06:55 elukey: powercycle wdqs1006 (depool first)
  • 06:46 elukey: powercycle wdqs1005
  • 06:42 SMalyshev: repooled wdqs1003
  • 06:35 _joe_: powercycling wdqs[2001-2002,2004-2006].codfw.wmnet, one at a time
  • 06:33 elukey: powercycle wdqs1004
  • 05:24 kartik@deploy1001: Finished deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445) (duration: 04m 06s)
  • 05:20 kartik@deploy1001: Started deploy [cxserver/deploy@80dc518]: Update cxserver to 9ad60d9 (T207445)
  • 02:34 mutante: powercycled wdqs1009 - by request
  • 02:24 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 03s)
  • 02:24 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue
  • 02:12 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue (duration: 00m 23s)
  • 02:12 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@d4692ea]: Reverting update on wdqs1003 to fix wdqs-updater issue
  • 01:56 tstarling@deploy1001: Synchronized php-1.33.0-wmf.1/includes/page/WikiPage.php: T207530 (duration: 00m 53s)
  • 01:46 tstarling@deploy1001: Synchronized php-1.32.0-wmf.26/includes/page/WikiPage.php: fix deletion performance regression T207530 (duration: 00m 55s)
  • 01:37 bawolff: deployed T207750
  • 00:28 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.33.0-wmf.1 refs T206655
  • 00:26 twentyafterfour: finished with mediawiki train for group0 refs T206655
  • 00:08 twentyafterfour@deploy1001: Finished scap: syncing 1.33.0-wmf.1 refs T206655 (duration: 36m 58s)

2018-10-23

  • 23:31 twentyafterfour@deploy1001: Started scap: syncing 1.33.0-wmf.1 refs T206655
  • 23:30 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/includes/export/WikiExporter.php: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/469319/ refs T207628 (duration: 01m 39s)
  • 22:16 eileen: civicrm revision changed from bde28d4453 to 1c0a1b2406, config revision is c0a8be03a1
  • 22:14 twentyafterfour: scap prep 1.33.0-wmf.1
  • 21:47 mutante: icinga1001 - replacing check_ping with check_fping as the standard host check command, for faster host checks (another tip from Nagios Tuning guide, still manual testing) (T202782)
  • 21:30 mutante: icinga1001 - changing check_result_reaper_frequency from 10 to 3, trying to lower average check latency. "allow faster check result processing -> requires more CPU" (T202782)
  • 19:31 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/skins/MinervaNeue/resources/skins.minerva.scripts/pageIssuesLogger.js: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/skins/MinervaNeue/+/469244/ refs T207423 (duration: 00m 48s)
  • 19:27 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/mediawiki/skins/MinervaNeue/+/469244/
  • 19:22 bawolff: deploy patch T207778
  • 18:17 mutante: icinga - performance/latency comparison - https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=4 vs https://icinga-stretch.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=4 (T202782)
  • 18:13 mutante: icinga1001 - manually set max_concurrent_checks to 0 (unlimited), restart icinga, keep puppet disabled, for testing (it ran into the limit of 10000 all the time, causing lots of logging, and the CPU power is actually slightly lower than on einsteinium) (T202782) refs: Nagios Tuning, point 7 https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/tuning.html
  • 17:20 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETA: Set wmgWikibaseCachePrefix for commonswiki I0badd355723 (duration: 00m 46s)
  • 17:18 ejegg: updated standalone SmashPig deploy from 2292111bda to b638ca02bc
  • 17:15 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: For WBMI, intentionally rather than implicitly install Wikibase I38574e670 (duration: 00m 47s)
  • 17:13 mutante: icinga1001 rm /var/log/user.log.1 - was 14G and using 25% of the / partition and server out of disk :/
  • 17:06 ejegg: rolled SmashPig back to 2292111bda
  • 17:03 ejegg: updated standalone SmashPig deployment from 2292111bda to 18da9727d8
  • 16:20 volans: restarted pdfrender on scb1004
  • 14:47 herron: added confluent-kafka-2.11 1.1.0-1 package to jessie-wikimedia/thirdparty T206454
  • 14:34 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting comment table migration stage to write-new/read-both on group 1 (T166733) (duration: 00m 46s)
  • 14:22 anomie@deploy1001: Synchronized php-1.32.0-wmf.26/includes/filerepo/file/LocalFile.php: Backport for T207419 (duration: 00m 47s)
  • 14:02 gehel: repooling / banning elastics1031 - T207724
  • 14:01 moritzm: installing spice security updates
  • 14:00 ema: upload trafficserver 8.0.0-1wm1 to stretch-wikimedia/main T204232
  • 13:49 gehel: depooling / banning elastics1031 - T207724
  • 13:43 gehel: depooling / banning elastics1029 - T207724
  • 13:35 gehel: rolling restart of blazegraph for change to blazegraph home dir
  • 13:22 gehel: depooling / banning elastics1018 - T207724
  • 12:29 gehel: depooling / banning elastics1028 and 1030 - T207724
  • 11:23 zeljkof: EU SWAT finished
  • 11:20 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for Wikipedia in Ort (T207714) (duration: 00m 46s)
  • 11:11 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable RCPatrol for srwikiquote (T207732) (duration: 00m 47s)
  • 10:13 ema: upload libc++ 6.0.1 to stretch-wikimedia/main T204232
  • 09:42 jynus: stopping db1087 to fix db1124
  • 09:31 gehel: depooling / banning elastics1017 and 1022 - T207724
  • 09:13 godog: roll-restart thumbor to send statsd traffic through statsd_exporter - T205870
  • 08:08 godog: update hp firmware to 6.60 on ms-be2017 - T141756
  • 07:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1100 - T184805 (duration: 00m 48s)
  • 06:50 elukey: powercycle ms-be2017 (frozen since ~8hrs ago)
  • 06:42 elukey: restart yarn and hdfs daemon on analytics1068 to pick up correct config (the host was down since before we swapped the Hadoop masters due to hw failure)
  • 06:39 marostegui: Stop replication on db1092 and db1087 for checking T206743
  • 06:02 marostegui: Deploy schema change on s3 - T207359
  • 00:35 SMalyshev: temp depooled wdqs1003 to let it catch up
  • 00:17 Amir1: evening SWAT is done

2018-10-22

  • 23:59 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.26/includes/changetags/ChangeTags.php: SWAT: Fix bad join on ChangeTag subquery (T207313) (duration: 00m 47s)
  • 23:39 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@d4692ea]: Redeploy Updater for T207673 (duration: 10m 12s)
  • 23:29 smalyshev@deploy1001: Started deploy [wdqs/wdqs@d4692ea]: Redeploy Updater for T207673
  • 22:12 pmiazga@deploy1001: Synchronized wmf-config//InitialiseSettings-labs.php: SWAT: beta: Disable page issues A/B test on beta cluster only (T200792) (duration: 00m 46s)
  • 21:44 mutante: adding new prod ServerAlias punjabi.wikimedia.org to Apache cluster (T207583)
  • 21:13 ayounsi@deploy1001: Finished deploy [librenms/librenms@0fd8da6]: Revert LibreNMS upgrade - T207481 (duration: 00m 08s)
  • 21:13 ayounsi@deploy1001: Started deploy [librenms/librenms@0fd8da6]: Revert LibreNMS upgrade - T207481
  • 21:08 andrewbogott: rebooting cloudvirt1023
  • 20:52 ayounsi@deploy1001: Finished deploy [librenms/librenms@737683a]: Upgrade LibreNMS to 1.44 - T207481 (duration: 00m 10s)
  • 20:52 ayounsi@deploy1001: Started deploy [librenms/librenms@737683a]: Upgrade LibreNMS to 1.44 - T207481
  • 20:29 ladsgroup@deploy1001: Finished deploy [ores/deploy@e89e880]: Use redis task tracker (T152012) (duration: 22m 02s)
  • 20:06 ladsgroup@deploy1001: Started deploy [ores/deploy@e89e880]: Use redis task tracker (T152012)
  • 18:54 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT Deploy TemplateWizard everywhere T202545, re-try (duration: 00m 45s)
  • 18:50 jforrester@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 18:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: [Beta] Temporarily disable WBMI from Beta Commons whilst Wikibase is fixed T180981 (duration: 00m 46s)
  • 18:38 jforrester@deploy1001: Synchronized php-1.32.0-wmf.26/resources/src/mediawiki.rcfilters/styles/mw.rcfilters.ui.ChangesListWrapperWidget.highlightCircles.seenunseen.less: SWAT RCFilters: Fix highlight circles for unseen changes T207472 (duration: 00m 46s)
  • 18:36 jforrester@deploy1001: Synchronized php-1.32.0-wmf.26/skins/MinervaNeue/resources/skins.minerva.scripts/pageIssuesLogger.js: SWAT Fix reading depth logging part 2 T207423 (duration: 00m 46s)
  • 18:35 jforrester@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.readingDepth.js: SWAT Fix reading depth logging part 1 T207423 (duration: 00m 46s)
  • 18:31 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT Add TemplateWizard to the BF allow list T205290 (duration: 00m 48s)
  • 18:05 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: [Beta Cluster] Load but don't enable MediaInfo on Beta Commons cf. T180981 (duration: 00m 45s)
  • 18:00 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: For WikibaseMediaInfo wikis, load basic Wikibase repo code cf. T180981 (duration: 00m 46s)
  • 17:57 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Allow enablement of the WikibaseMediaInfo, still off everywhere cf. T180981 (duration: 00m 48s)
  • 17:46 jforrester@deploy1001: Synchronized wmf-config/extension-list: Add WikibaseMediaInfo i18n to cache cf. T180981 (duration: 00m 46s)
  • 17:40 mobrovac@deploy1001: Finished deploy [proton/deploy@b3e254a]: Update Puppeteer to v1.9.0 - T207416 (duration: 01m 34s)
  • 17:38 mobrovac@deploy1001: Started deploy [proton/deploy@b3e254a]: Update Puppeteer to v1.9.0 - T207416
  • 17:34 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@975a67b]: WDQS deployment - GUI update and binaries upgrade (duration: 11m 47s)
  • 17:23 XioNoX: enable cr2:xe-4/0/0 (to asw-a) for optics replacement - T203719
  • 17:22 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@975a67b]: WDQS deployment - GUI update and binaries upgrade
  • 17:20 XioNoX: disable cr2:xe-4/0/0 (to asw-a) for optics replacement - T203719
  • 17:19 elukey@deploy1001: Finished deploy [analytics/refinery@1de5f44]: Deploy new version of Camus and pageview whitelist (duration: 07m 05s)
  • 17:16 cmjohnson1: analytics1068 down for mother board swap
  • 17:12 elukey@deploy1001: Started deploy [analytics/refinery@1de5f44]: Deploy new version of Camus and pageview whitelist
  • 17:11 XioNoX: re-enable puppet fleet-wide for puppetmaster1001 uplink move
  • 17:10 XioNoX: moving puppetmaster1001 uplink to asw2-b
  • 17:08 andrew@deploy1001: Finished deploy [horizon/deploy@431a55d]: Rolling out fix for T207510 (duration: 03m 40s)
  • 17:07 XioNoX: disable puppet fleet-wide for puppetmaster1001 uplink move
  • 17:05 andrew@deploy1001: Started deploy [horizon/deploy@431a55d]: Rolling out fix for T207510
  • 16:24 arturo: T206261 2h icinga downtime cloudnet1003/4 for another patch
  • 15:54 ejegg: updated payments-wiki from 06848600ed to 5a7a8e7e4f
  • 15:51 ejegg: updated fundraising CiviCRM from 1f10dc8a18 to bde28d4453
  • 15:35 XioNoX: push firewall changes to pfw3-eqiad - T207175
  • 15:35 ejegg: updated standalone SmashPig deployment from 581c685326 to 2292111bda
  • 15:03 mforns@deploy1001: Finished deploy [analytics/refinery@bbebc20]: deploying refinery together with refinery-source v0.0.79 (duration: 10m 16s)
  • 14:52 mforns@deploy1001: Started deploy [analytics/refinery@bbebc20]: deploying refinery together with refinery-source v0.0.79
  • 14:49 XioNoX: push firewall changes to pfw3-codfw - T207175
  • 13:19 marostegui: Run myloader for enwikivoyage cebwiki shwiki srwiki mgwiktionary on db2052 (s5 codfw master) - T184805
  • 13:12 kartik@deploy1001: Finished deploy [cxserver/deploy@5f53734]: Update cxserver to 7f996f3 (T207445) (duration: 03m 53s)
  • 13:08 kartik@deploy1001: Started deploy [cxserver/deploy@5f53734]: Update cxserver to 7f996f3 (T207445)
  • 11:51 zeljkof: EU SWAT finished
  • 11:49 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable rollbacker right on srwikisource (T206935) (duration: 00m 46s)
  • 11:37 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable autopatroller, patroller and rollbacker rights on srwikiquote (T206936) (duration: 00m 49s)
  • 11:28 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable suppressredirect and markbotedit rights to rollbackers on it.wikiversity (T207300) (duration: 00m 46s)
  • 11:21 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable cx2outreach campaign (T207031) (duration: 00m 47s)
  • 11:09 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Anniversary logo for cswiki (T207589) (duration: 00m 47s)
  • 11:06 zfilipin@deploy1001: sync-file aborted: SWAT: Test if logo specified in wgLogo/wgLogoHD exists (T207053) (duration: 00m 02s)
  • 10:03 arturo: icinga downtime for cloudnet1003/4 for T206261
  • 09:16 marostegui: Remove replication filters from db2052 (s5 codfw master) - T184805
  • 09:04 marostegui: Run mydumper on db1100 for enwikivoyage cebwiki shwiki srwiki mgwiktionary - T184805
  • 08:58 marostegui: Stop replication in sync on db1100 and db2052 (codfw master) to reimport wikis - T184805
  • 08:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1100 - T184805 (duration: 00m 47s)
  • 08:29 moritzm: powercycling ms-be1018, stuck during reboot
  • 08:28 jynus: performing deletes on db1087 to fix wb_terms on labs
  • 08:27 marostegui: Deploy schema change on db2043 (s3 master) without replication - T204006
  • 08:22 marostegui: Disconnect codfw -> eqiad replication on s5 (db1070)
  • 08:19 marostegui: Disconnect codfw -> eqiad replication on s3 (db1075)
  • 08:13 marostegui: Disconnect codfw -> eqiad replication on es3 (es1017)
  • 08:11 marostegui: Disconnect codfw -> eqiad replication on es2 (es1015)
  • 08:08 marostegui: Disconnect codfw -> eqiad replication on x1 (db1069)
  • 08:05 marostegui: Disconnect codfw -> eqiad replication on s8 (db1071)
  • 08:03 marostegui: Disconnect codfw -> eqiad replication on s7 (db1062)
  • 08:01 marostegui: Disconnect codfw -> eqiad replication on s6 (db1061)
  • 07:59 marostegui: Disconnect codfw -> eqiad replication on s4 (db1068)
  • 07:57 marostegui: Disconnect codfw -> eqiad replication on s2 (db1066)
  • 07:52 marostegui: Disconnect codfw -> eqiad replication on s1 (db1067)
  • 07:38 moritzm: rebooting swift-be servers in eqiad for kernel security update
  • 07:24 godog: reformat ms-be2042 - T199198
  • 06:34 marostegui: Deploy schema change on db2036 - T204006
  • 06:11 marostegui: Deploy schema change on db2050 - T204006
  • 06:00 marostegui: Deploy schema change on db2057 - T204006
  • 05:47 marostegui: Deploy schema change on s3 db2074 (and db2094 sanitarium) - T204006
  • 05:31 marostegui: Deploy schema change on dbstore2002:3313 - T204006
  • 05:29 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Clarify db2033 BBU status (duration: 00m 49s)
  • 04:37 kartik@deploy1001: Finished deploy [cxserver/deploy@904151f]: Update cxserver to eee8974 (T207070, T203077, T199529) (duration: 05m 42s)
  • 04:31 kartik@deploy1001: Started deploy [cxserver/deploy@904151f]: Update cxserver to eee8974 (T207070, T203077, T199529)

2018-10-21

  • 22:15 onimisionipe: repooling wdqs1003 as it has caught up on lag
  • 20:42 banyek: resuming replication on s4@dbstore2002 (T204930)
  • 16:15 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 04m 52s)
  • 15:57 bawolff: adjust patch for T194204
  • 12:39 onimisionipe: depooling wdqs1003 to catch up on lag

2018-10-20

  • 23:05 reedy@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/CentralAuth/: Update setEmail (duration: 00m 55s)
  • 21:29 gehel: repooling wdqs1003 (still some lag, but 100[45] start to be impacted)
  • 19:54 gehel: depooling wdqs1003 to catch up on lag
  • 13:53 reedy@deploy1001: Synchronized php-1.32.0-wmf.26/includes/auth/AuthManager.php: (no justification provided) (duration: 00m 55s)
  • 12:46 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add CentralAuth related permissions to stewards at metawiki (T207531) (duration: 01m 09s)
  • 05:38 marostegui: Force writeback on db2033 - T184888

2018-10-19

  • 20:33 twentyafterfour: deployed RCFilters: Fix completely broken highlight circles refs T207472
  • 20:32 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/resources/src/mediawiki.rcfilters/styles/: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/468636/ (duration: 00m 54s)
  • 20:31 twentyafterfour: deploying https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/468636/ to the full cluster.
  • 20:28 twentyafterfour: deployed https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/468636/ to mwdebug1002
  • 19:20 mutante: ns0 / ns1 - authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsdctl reload-zones - to add new language shn (T206777)
  • 19:16 mutante: ns2/multatuli - gdnsdctl reload-zones
  • 19:12 mutante: labweb1001 / wikitech - disabling 2fa for myself, logging in, re-enabling it again
  • 17:49 ejegg: updated fundraising CiviCRM from 83874e75ba to 1f10dc8a18
  • 17:47 mutante: DNS - 'authdns-gen-zones -f /srv/authdns/git/templates /etc/gdnsd/zones && gdnsd checkconf && gdnsd reload-zones' - needed when adding new languages to langs.tmpl - adding "shn" (Shan language) T206777
  • 16:36 XioNoX: deactivate BGP to 15426 in ams-ix (down and no reply to emails) - T207428
  • 14:16 banyek: disconnecting s4 replication on dbstore2002 (T204930)
  • 14:12 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Remove useless comments (duration: 00m 54s)
  • 13:58 vgutierrez: Uploaded certcentral 0.2 to apt.wikimedia.org (stretch) - T207457
  • 11:46 banyek: starting compression of s4 tables @dbstore2002 (T204930)
  • 11:33 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T207313 UBN - Revert back wikidata for change_tag backend (duration: 00m 59s)
  • 10:53 arturo: icinga downtime for 2h for cloudnet1003/1004 to deploy patch related to T206261
  • 09:37 godog: bump /proc/sys/net/core/rmem_default temporarily to 6MB and bounce statsd-proxy statsite-instances on graphite1004 - T196484
  • 08:53 banyek: adding wmf-pt-kill_2.2.20-1+wmf4 package for stretch (T206521)
  • 08:28 jynus: stopping db1092 and db1087 in sync
  • 07:50 godog: bump /proc/sys/net/core/rmem_default temporarily to 2MB and bounce statsd-proxy statsite-instances on graphite1004 - T196484
  • 07:20 marostegui: Remove mwmaint1001 grants from m5 - https://phabricator.wikimedia.org/T201343 https://phabricator.wikimedia.org/T192457
  • 07:15 godog: powercycle ms-be1021, [19601329.556259] sd 0:1:0:1: rejecting I/O to offline device
  • 07:05 godog: bump /proc/sys/net/core/rmem_default temporarily to 1MB and bounce statsd-proxy statsite-instances on graphite1004 - T196484
  • 06:13 marostegui: Deploy schema change on s7 codfw host by host without replication - T204006
  • 05:58 marostegui: Deploy schema change on s2 codfw host by host without replication - T204006
  • 05:25 marostegui: Deploy schema change on s1 codfw host by host without replication - T204006
  • 01:49 krinkle@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: Ic74a9d5601b8c (duration: 00m 55s)

2018-10-18

  • 22:00 mutante: lvs1011,lvs1012 - manually editing nagios NRPE config and restarting service (to make monitoring from icinga1001 work and puppet is disabled)
  • 21:52 mutante: eeden - manually editing nagios NRPE config and restarting service (to make monitoring from icinga1001 work and puppet is disabled)
  • 21:49 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group2 wikis to 1.32.0-wmf.26 refs T191072
  • 21:46 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/includes/filerepo/file/LocalFile.php: sync Id97e1c refs T207419 (duration: 00m 53s)
  • 21:29 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/includes/filerepo/file/LocalFile.php: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/468470/ refs T207419 (duration: 00m 54s)
  • 20:49 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group2 wikis to 1.32.0-wmf.24 refs T191072
  • 20:39 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.26
  • 20:21 volans: start ferm on db2042, it failed to start at reboot due to DNS resolution timeout
  • 19:22 ejegg: updated SmashPig standalone deploy from 5f21d3f2db to 581c685326
  • 19:21 ejegg: updated payments-wiki from a3892e4ed3 to 06848600ed
  • 19:17 shdubsh: rebooting graphite1004
  • 19:11 shdubsh: upping ring buffer size on graphite1004 in an attempt to mitigate dropped packets at the interface -- T196484
  • 19:02 sbisson@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/PageTriage/: SWAT: Use Main Object Stash for keeping track of PageTriage last use (duration: 00m 54s)
  • 18:19 awight: Restarting ORES services for T88997
  • 17:33 ladsgroup@deploy1001: Finished deploy [ores/deploy@4ac4c8b]: Logstash support for ores: T181546 T169586 T168921 T181630 T205256 (duration: 23m 48s)
  • 17:19 herron: aborted enabling kafka on logstash elasticsearch cluster due to puppet errors. reverted change T206454
  • 17:09 ladsgroup@deploy1001: Started deploy [ores/deploy@4ac4c8b]: Logstash support for ores: T181546 T169586 T168921 T181630 T205256
  • 17:00 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.26 refs T191072 (duration: 00m 53s)
  • 16:59 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.26 refs T191072
  • 16:57 herron: enabling kafka on logstash elasticsearch cluster T206454
  • 16:55 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/WikibaseQualityConstraints/src/ServiceWiring.php: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikibaseQualityConstraints/+/468352/ refs T207394 (duration: 00m 54s)
  • 16:52 mobrovac@deploy1001: Finished deploy [restbase/deploy@6c879fa]: Have 100% of traffic directed to Proton as well - T186748 (duration: 20m 52s)
  • 16:31 mobrovac@deploy1001: Started deploy [restbase/deploy@6c879fa]: Have 100% of traffic directed to Proton as well - T186748
  • 15:51 XioNoX: trunk cloud-instances2-b-eqiad between asw-b-eqiad and asw2-b-eqiad
  • 15:50 cmjohnson1: disabling checks on cloudvirt1019 for maintenance
  • 15:42 twentyafterfour@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.24 refs T191072 (duration: 00m 53s)
  • 15:35 twentyafterfour@deploy1001: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 14:46 moritzm: installing tomcat8 security updates
  • 14:34 moritzm: remove labvirt1018 from debmonitor (T207317)
  • 14:28 godog: temporarily bump default socket receive memory to 1MB on graphite1001, restart statsd-proxy and statsite
  • 14:22 godog: begin reformat of ms-be2041 - T199198
  • 14:21 banyek: shutting down mysql and powering down db2042 (T202051)
  • 14:13 godog: corrections to the statements above, graphite1004 not graphite1001
  • 14:11 godog: ditto for statsite instances on graphite1001, temporarily bump receive socket memory to 1MB and bounce the service
  • 14:08 godog: temporarily bump receive socket memory for statsd-proxy on graphite1001 and bounce the service
  • 13:51 moritzm: installing libidn security updates
  • 12:59 moritzm: installing libssh security updates
  • 12:55 godog: bounce statsd-proxy on graphite1001
  • 11:59 addshore: SWAT done
  • 11:59 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Wikidata.org: enable sense data type T203888 (duration: 00m 54s)
  • 11:54 mobrovac@deploy1001: Finished deploy [restbase/deploy@1041a02]: Disable onthisday check - T203588 (duration: 21m 23s)
  • 11:54 zfilipin@deploy1001: Synchronized tests/InitialiseSettingsTest.php: SWAT: Test if logo specified in wgLogo/wgLogoHD exists (T207053) (duration: 00m 53s)
  • 11:49 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix typo in IS.php: use ltwiki instead of ltwikipedia (T207081) (duration: 00m 54s)
  • 11:39 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use testwikidatawiki instead of testwikidata in IS.php (T207089) (duration: 00m 53s)
  • 11:33 mobrovac@deploy1001: Started deploy [restbase/deploy@1041a02]: Disable onthisday check - T203588
  • 11:29 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use new wordmarks in uzwiki (T205226) (duration: 00m 53s)
  • 11:10 zfilipin@deploy1001: Synchronized static/images/mobile/copyright/: SWAT: Upload uz specific wordmark (T205226) (duration: 00m 54s)
  • 10:59 addshore: wikidata senses deploy slot done
  • 10:57 addshore: addshore@mwmaint1002:~$ mwscript purgeList.php --wiki wikidatawiki --namespace 146
  • 10:57 mobrovac@deploy1001: Finished deploy [restbase/deploy@88c8f26]: Parallelise onthisday call, take #4 (duration: 03m 52s)
  • 10:55 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: RejectParserCacheValue Wikidata lexemes before sense deployment T203888 (duration: 00m 54s)
  • 10:54 addshore@deploy1001: sync-file aborted: RejectParserCacheValue Wikidata lexemes before sense deployment T203888 (duration: 00m 00s)
  • 10:53 mobrovac@deploy1001: Started deploy [restbase/deploy@88c8f26]: Parallelise onthisday call, take #4
  • 10:53 mobrovac@deploy1001: Finished deploy [restbase/deploy@88c8f26]: Parallelise onthisday call, take #3 (duration: 04m 13s)
  • 10:51 addshore@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/WikibaseLexeme: Wikidata: Make statement group IDs on Senses unique (duration: 00m 59s)
  • 10:49 mobrovac@deploy1001: Started deploy [restbase/deploy@88c8f26]: Parallelise onthisday call, take #3
  • 10:49 mobrovac@deploy1001: Finished deploy [restbase/deploy@88c8f26]: Parallelise onthisday call, take #2 (duration: 07m 32s)
  • 10:41 mobrovac@deploy1001: Started deploy [restbase/deploy@88c8f26]: Parallelise onthisday call, take #2
  • 10:41 mobrovac@deploy1001: Finished deploy [restbase/deploy@88c8f26]: Parallelise onthisday call - T203588 (duration: 11m 24s)
  • 10:34 addshore@deploy1001: Synchronized wmf-config/Wikibase-production.php: Combine if blocks in Wikibase-production NOOP (duration: 00m 53s)
  • 10:32 volans@deploy1001: Finished deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (5) - T205896 (duration: 00m 29s)
  • 10:31 volans@deploy1001: Started deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (5) - T205896
  • 10:31 volans@deploy1001: Finished deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (5) - T205896 (duration: 01m 37s)
  • 10:31 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: BETA ONLY Remove wgLexemeEnableSenses from IS-labs (duration: 00m 53s)
  • 10:30 volans@deploy1001: Started deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (5) - T205896
  • 10:29 mobrovac@deploy1001: Started deploy [restbase/deploy@88c8f26]: Parallelise onthisday call - T203588
  • 10:28 volans@deploy1001: Finished deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (4) - T205896 (duration: 00m 05s)
  • 10:28 volans@deploy1001: Started deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (4) - T205896
  • 10:15 addshore: purging wikidata lexemes
  • 10:12 volans@deploy1001: Finished deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (3) - T205896 (duration: 00m 29s)
  • 10:11 volans@deploy1001: Started deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (3) - T205896
  • 10:10 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable senses on wikidatawiki T203888 (duration: 00m 53s)
  • 10:09 volans@deploy1001: Finished deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (2) - T205896 (duration: 02m 01s)
  • 10:07 volans@deploy1001: Started deploy [netbox/deploy@1cd4d43]: Upgrade to upstream v2.4.6 (2) - T205896
  • 10:00 volans@deploy1001: Finished deploy [netbox/deploy@438f1c0]: Upgrade to upstream v2.4.6 - T205896 (duration: 03m 07s)
  • 09:57 volans@deploy1001: Started deploy [netbox/deploy@438f1c0]: Upgrade to upstream v2.4.6 - T205896
  • 09:52 XioNoX: activate bgp group Customer6 on cr4-ulsfo
  • 09:20 banyek: enabling replication monitor check on pc1005 pc1006 pc2005 pc2006 (T206992)
  • 09:18 godog: bounce statsd-proxy on graphite1001
  • 09:08 moritzm: powercycling ms-be2019, stuck during reboot
  • 09:01 banyek: enabling replication monitor check on pc1004 (T206992)
  • 08:56 banyek: enabling replication monitor check on pc2004 (T206992)
  • 08:41 banyek: disabling puppet on parser caches (T206992)
  • 08:40 banyek: adding replication monitoring checks to parsercache hosts (T206992)
  • 08:26 vgutierrez: Uploaded certcentral 0.1-2 to apt.wikimedia.org (stretch)
  • 07:56 moritzm: rebooting swift backend servers in codfw for spectre v3/v4/L1TF security updates
  • 07:43 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: Wikidata dispatch: reduce concurrent dispatchers to 2 (duration: 00m 59s)
  • 05:34 marostegui: Restarting a failed s8 backup from dbstore1001 to db1116:3318
  • 05:05 XioNoX: start office-DC link renumbering - T205985
  • 02:51 ejegg: updated fundraising CiviCRM from 7b8d33bb4e to 83874e75ba
  • 00:32 twentyafterfour: restarting apache on phab1001 to apply b3bfff1

2018-10-17

  • 22:56 awight: Restarting ORES uwsgi service for T88997
  • 22:38 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.32.0-wmf.26 refs T191072
  • 22:36 robh: bast4001 reboot is my fault, power cables were jostled when I was decommissioning lvs4002 right above it in the rack
  • 22:31 ejegg: updated fundraising CiviCRM from 5eac0634e6 to 7b8d33bb4e
  • 22:24 ejegg: updated payments-wiki from 0385ad02a7 to a3892e4ed3
  • 22:21 ppchelko@deploy1001: Finished deploy [restbase/deploy@88c8f26] (dev-cluster): Spread requests between MCS nodes for onthisday (duration: 02m 54s)
  • 22:18 ppchelko@deploy1001: Started deploy [restbase/deploy@88c8f26] (dev-cluster): Spread requests between MCS nodes for onthisday
  • 20:50 arlolra: Updated Parsoid to e6b708b (T204622, T187848, T207093)
  • 20:40 arlolra@deploy1001: Finished deploy [parsoid/deploy@babf1da]: Updating Parsoid to e6b708b (duration: 08m 41s)
  • 20:32 arlolra@deploy1001: Started deploy [parsoid/deploy@babf1da]: Updating Parsoid to e6b708b
  • 20:17 mobrovac@deploy1001: Started restart [proton/deploy@a657059]: (no justification provided)
  • 20:10 ejegg: updated fundraising CiviCRM from 4cc21d61c5 to 5eac0634e6
  • 19:26 shdubsh: restart eventlogging for statsd DNS change - T88997
  • 19:23 twentyafterfour: Mediawiki train is still blocked by T207288
  • 19:19 godog: restart zuul for statsd DNS change - T88997
  • 19:12 mutante: scb1003 - restart pdfrender
  • 19:09 godog: roll-restart eventbus for statsd DNS change - T88997
  • 19:00 krinkle@deploy1001: Synchronized php-1.32.0-wmf.26/includes/cache/: T193271 - I25aa0e27200a0 (duration: 01m 01s)
  • 18:57 awight: Restarting ORES cluster to refresh DNS, T88997
  • 18:48 banyek: repooling labsdb1009 (T181650)
  • 18:48 shdubsh: restart navtiming on webperf nodes
  • 18:39 godog: restart jmxtrans on kafka hosts
  • 18:17 shdubsh: moving statsd cname to graphite1004
  • 18:07 banyek: depooling labsdb1009 (T181650)
  • 17:08 banyek: depooling labsdb1009 (T181650)
  • 16:53 banyek: repooling labsdb1011
  • 15:53 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.26/extensions/AbuseFilter/: sync AbuseFilter revision 4e2a6b6 to 1.32.0-wmf.26 refs T207220 (duration: 00m 58s)
  • 15:34 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T206593: Enabling db2096 for x1 (duration: 00m 56s)
  • 15:31 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T206593: Enabling db2096 for x1 (duration: 00m 56s)
  • 15:28 banyek: enabling db2096 for cluster x1 (T206593)
  • 14:33 godog: upload prometheus-statsd-exporter 0.7.0+ds1-2 - T205870
  • 14:01 marostegui: Repool labsdb1010, depool labsdb1011 - T181650
  • 13:08 gehel: applying rps NIC config for all wdqs nodes - T206105
  • 13:05 banyek: depooling labsdb1010 (T181650)
  • 12:56 banyek: enabling notifications on db2096 (T206593)
  • 12:55 banyek: enabling notifications on db2096
  • 11:40 Amir1: EU SWAT is done
  • 11:40 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable reading from new backend of change tag everywhere (T194164) (duration: 00m 57s)
  • 11:32 moritzm: installing graphicsmagick security updates
  • 11:30 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T206702 Enable client side error counting on Minerva (duration: 00m 57s)
  • 11:26 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T207196 gerrit:467736 Wikidata: enable JSON-LD data format on test.wikidata.org (duration: 00m 56s)
  • 11:21 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: T207196 Wikidata: add setting for setting the enabled entity data forms gerrit:467735 PT 2/2 (duration: 00m 56s)
  • 11:19 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: T207196 Wikidata: add setting for setting the enabled entity data forms gerrit:467735 PT 1/2 (duration: 00m 57s)
  • 11:17 Amir1: ladsgroup@mwmaint1002:~$ mwscript deleteLocalPasswords.php --wiki=enwiki --delete --batch-size 200 (This will cause lag on codfw)
  • 11:15 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: T205611 T205330 Remove Wikidata RejectParserCacheValue hook gerrit:467913 (duration: 00m 56s)
  • 11:11 addshore@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Increase wikidata dispatch randomness to 30 (duration: 00m 56s)
  • 11:08 addshore@deploy1001: Synchronized wmf-config/Wikibase-production.php: SWAT: T207019 gerrit:467343 Enable WBQualityConstraintsSuggestionsBetaFeature on wikidatawiki (duration: 00m 56s)
  • 11:04 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT gerrit:467691 Add constraint-suggestions to wgBetaFeaturesWhitelist (duration: 01m 10s)
  • 11:04 ariel@deploy1001: Finished deploy [dumps/dumps@ed7eed9]: use lbzip2 for recombine steps if configured (duration: 00m 03s)
  • 11:04 ariel@deploy1001: Started deploy [dumps/dumps@ed7eed9]: use lbzip2 for recombine steps if configured
  • 09:34 XioNoX: update interfaces and BGP IPs for office-DC link (DC side, interfaces still disabled) - T205985
  • 09:30 banyek: truncating parsercache tables on pc2006 (T206740)
  • 09:12 _joe_: reenabling puppet (not running it) in codfw
  • 09:12 _joe_: change applied to all appservers serving traffic
  • 09:08 _joe_: running puppet on all apaches (appserver/api) in eqiad to pick up the wikipedia.org vhost refactor
  • 09:05 _joe_: running puppet on mwdebug1001, then testing again wikipedia.org for regressions
  • 09:04 _joe_: puppet disabled on the appservers, now merging the wikipedia.org conversion to mediawiki::web::vhost
  • 08:43 mobrovac@deploy1001: Started restart [proton/deploy@a657059]: (no justification provided)
  • 08:30 kartik@deploy1001: Finished deploy [cxserver/deploy@b30a323]: Update cxserver to 29e01e4 (T206305, T204668) (duration: 03m 54s)
  • 08:27 kartik@deploy1001: Started deploy [cxserver/deploy@b30a323]: Update cxserver to 29e01e4 (T206305, T204668)
  • 08:09 banyek: stopping binlog purgers on the parsercache hosts (the binlogs will be kept for 24hrs) - T206740
  • 08:00 banyek: truncating parsercache tables on pc2005 (T206740)
  • 06:52 jynus: fixing s8 master drifts T206743
  • 02:10 ejegg: updated payments-wiki from 7fb1aae963 to 0385ad02a7
  • 01:24 legoktm@deploy1001: Synchronized wmf-config/CommonSettings.php: Add REL1_32 to ExtensionDistributor (duration: 00m 59s)

2018-10-16

  • 22:11 ppchelko@deploy1001: Finished deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 6 (duration: 01m 18s)
  • 22:09 ppchelko@deploy1001: Started deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 6
  • 22:09 ppchelko@deploy1001: Finished deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 5 (duration: 05m 16s)
  • 22:04 ppchelko@deploy1001: Started deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 5
  • 22:04 ppchelko@deploy1001: Finished deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 4 (duration: 03m 53s)
  • 22:00 ppchelko@deploy1001: Started deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 4
  • 22:00 ppchelko@deploy1001: Finished deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 3 (duration: 04m 15s)
  • 21:58 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.32.0-wmf.24 refs T191072
  • 21:55 ppchelko@deploy1001: Started deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 3
  • 21:55 ppchelko@deploy1001: Finished deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 2 (duration: 09m 11s)
  • 21:46 ppchelko@deploy1001: Started deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required, take 2
  • 21:45 ppchelko@deploy1001: Finished deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required (duration: 03m 53s)
  • 21:42 ppchelko@deploy1001: Started deploy [restbase/deploy@d9e3a09]: Downgrade major-greater to minor-greater if no-cache is required
  • 21:18 twentyafterfour@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.32.0-wmf.26 refs T191072
  • 20:55 twentyafterfour@deploy1001: Finished scap: Syncing 1.32.0-wmf.26 refs T191072 (duration: 26m 32s)
  • 20:28 twentyafterfour@deploy1001: Started scap: Syncing 1.32.0-wmf.26 refs T191072
  • 20:14 shdubsh: restarted pdfrender on scb1003
  • 18:44 ppchelko@deploy1001: Started restart [proton/deploy@a657059]: Try restarting again for metrics
  • 18:43 ppchelko@deploy1001: Started restart [proton/deploy@a657059]: Try restarting again for metrics
  • 18:42 ppchelko@deploy1001: Finished deploy [proton/deploy@a657059]: Try restarting for metrics (duration: 00m 20s)
  • 18:42 ppchelko@deploy1001: Started deploy [proton/deploy@a657059]: Try restarting for metrics
  • 17:01 _joe_: restarted pdfrender on scb1004
  • 16:33 akosiaris: depool restbase-async from eqiad in order to test traffic going to parsoid codfw
  • 16:15 _joe_: disabled puppet on all appservers, merging wikidata apache change, re-enabling puppet on mwdebug1001 for testing
  • 14:51 mobrovac@deploy1001: Finished deploy [proton/deploy@a657059]: Rollback to puppeteer v1.5.0 - T186748 (duration: 00m 49s)
  • 14:51 mobrovac@deploy1001: Started deploy [proton/deploy@a657059]: Rollback to puppeteer v1.5.0 - T186748
  • 14:28 godog: roll-restart elasticsearch on logstash100[456] to change elasticsearch data dir - T206454
  • 14:06 godog: depool in turn logstash1008 and logstash1009 to change elasticsearch data dir - T206454
  • 13:55 godog: depool logstash1007 to change elasticsearch data dir - T206454
  • 13:54 XioNoX: router back and healthy, enable external BGP sessions on cr2-eqdfw - T203261
  • 13:51 moritzm: rebooting acamar for update to stretch-proposed-updates kernel
  • 13:44 XioNoX: reboot cr2-eqdfw for upgrade - T203261
  • 13:43 XioNoX: disable external BGP sessions on cr2-eqdfw - T203261
  • 13:43 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting comment table migration stage to write-new/read-both on group 0 (T166733) (duration: 00m 50s)
  • 13:34 XioNoX: start install process on cr2-eqdfw (non impacting before reboot) - T203261
  • 13:11 akosiaris: pool codfw for apertium|citoid|cxserver|eventbus|eventstreams|graphoid|mathoid|mobileapps|ores|parsoid|pdfrender|proton|recommendation-api|restbase|restbase-async|wdqs|wdqs-internal|zotero
  • 13:11 akosiaris@puppetmaster1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=^apertium|citoid|cxserver|eventbus|eventstreams|graphoid|mathoid|mobileapps|ores|parsoid|pdfrender|proton|recommendation-api|restbase|restbase-async|wdqs|wdqs-internal|zotero$
  • 13:08 elukey: restart memcached on mc1035 with -R 200 (will wipe the object cache shard as consequence) - T203786
  • 12:57 akosiaris: pool mathoid eqiad
  • 12:52 gtirloni: T186571 removed legofan4000 user from project-tools group (leftover from T165624 legofan4000->macfan4000 rename)
  • 12:44 akosiaris@deploy1001: scap-helm mathoid finished
  • 12:43 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 12:43 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid --reset-values -f mathoid.yaml [namespace: mathoid, clusters: eqiad]
  • 12:35 akosiaris: depool eqiad mathoid for helm chart upgrade
  • 12:32 akosiaris: pool codfw mathoid
  • 12:14 akosiaris@deploy1001: scap-helm mathoid finished
  • 12:14 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 12:14 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid --reset-values -f mathoid.yaml [namespace: mathoid, clusters: codfw]
  • 12:08 Amir1: EU SWAT is done
  • 12:08 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable reading from new backend of change_tag in s7 (T194164) (duration: 00m 50s)
  • 12:03 akosiaris@deploy1001: scap-helm mathoid finished
  • 12:03 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 12:03 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.24/includes/changetags/ChangeTags.php: SWAT: Avoid fatals when the filter tags is empty (T194164) (duration: 00m 50s)
  • 12:03 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid --set main_app.limits.memory=1G [namespace: mathoid, clusters: codfw]
  • 12:02 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid --set main_app.limits.memory=1g [namespace: mathoid, clusters: codfw]
  • 11:49 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Re-enable search integration for ArticlePlaceholder (T195751) (duration: 00m 50s)
  • 11:38 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Translate on idwikimedia (T204292) (duration: 00m 49s)
  • 11:32 banyek: the binlog purging stopped on pc2004 (T206740)
  • 11:27 akosiaris: upgrade mathoid chart to version 0.0.12
  • 11:26 akosiaris@deploy1001: scap-helm mathoid finished
  • 11:26 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 11:26 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid [namespace: mathoid, clusters: codfw]
  • 11:24 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for editathon at University of North Carolina at Charlotte (T207043) (duration: 00m 49s)
  • 11:18 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule for WMCL Editathon (T206914) (duration: 00m 49s)
  • 11:09 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add throttle rule for "Night of the Digital Language" (T206408) (duration: 00m 49s)
  • 11:05 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Remove expired throttle rule (T207015) (duration: 00m 50s)
  • 11:02 banyek: truncating tables in parsecache@pc2004 (T206740)
  • 10:52 moritzm: rolling reboot of thumbor in eqiad for kernel security updates
  • 10:50 godog: run puppet on scb to deploy db configuration for recommendation-service
  • 10:37 banyek: stopping pc2005 -> pc1005 replication (T206740)
  • 10:37 banyek: stopping pc2006 -> pc1006 replication (T206740)
  • 10:22 jynus: running database maintenance tasks on cumin1001, expect very high memory usage
  • 09:53 akosiaris: upload blubber_0.6.0-1_amd64 to apt.wikimedia.org/jessie-wikimedia/main and apt.wikimedia.org/stretch-wikimedia/main T206766
  • 09:03 moritzm: rolling reboot of thumbor in codfw for kernel security updates
  • 08:56 banyek: stopping pc2004 -> pc1004 replication (T206740)
  • 08:42 moritzm: removed mwmaint1001 from debmonitor (T192457)
  • 07:46 akosiaris: upgrade apertium-apy throughout the fleet T199447
  • 07:46 akosiaris: upgrade apertium-apy throughout the fleet
  • 07:22 akosiaris: upload apertium-apy_0.11.4-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main T199447
  • 07:22 akosiaris: upload apertium-apy_0.11.4-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 07:20 akosiaris@deploy1001: scap-helm mathoid finished
  • 07:19 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 07:19 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid [namespace: mathoid, clusters: codfw]
  • 07:19 akosiaris@deploy1001: scap-helm mathoid upgrade [namespace: mathoid, clusters: codfw]
  • 07:17 moritzm: installing net-snmp security updates
  • 06:32 legoktm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert "Enable reading from new backend of change_tag in s7" (T194164) (duration: 00m 50s)
  • 06:05 jynus: stopping db1092 and db1087 in sync T206743
  • 05:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Remove db1092 BBU comments after BBU replacement (duration: 00m 52s)
  • 00:23 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@c81dd9e]: Redeploy Updater for removal of props channel (duration: 10m 21s)
  • 00:13 smalyshev@deploy1001: Started deploy [wdqs/wdqs@c81dd9e]: Redeploy Updater for removal of props channel

2018-10-15

  • 20:52 arlolra: Updated Parsoid to 8f3ff40 (T205642, T206003, T187848, T205455, T205743)
  • 20:37 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@834d00a]: Update mobileapps to c2a4ef9 (T206701 T206467 T168875) (duration: 03m 47s)
  • 20:34 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@834d00a]: Update mobileapps to c2a4ef9 (T206701 T206467 T168875)
  • 20:32 arlolra@deploy1001: Finished deploy [parsoid/deploy@b758124]: Updating Parsoid to 8f3ff40 (duration: 11m 43s)
  • 20:20 arlolra@deploy1001: Started deploy [parsoid/deploy@b758124]: Updating Parsoid to 8f3ff40
  • 19:37 mforns@deploy1001: Finished deploy [analytics/refinery@3f4adf8]: deploy refinery together with source version 0.0.78 without all removed old jars (duration: 05m 18s)
  • 19:33 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@ff3bf90]: Redeploy 1010 (duration: 00m 28s)
  • 19:33 smalyshev@deploy1001: Started deploy [wdqs/wdqs@ff3bf90]: Redeploy 1010
  • 19:32 mforns@deploy1001: Started deploy [analytics/refinery@3f4adf8]: deploy refinery together with source version 0.0.78 without all removed old jars
  • 19:27 mforns@deploy1001: Finished deploy [analytics/refinery@1fc53d9]: deploy refinery together with source version 0.0.78 (duration: 15m 56s)
  • 19:11 mforns@deploy1001: Started deploy [analytics/refinery@1fc53d9]: deploy refinery together with source version 0.0.78
  • 18:59 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable reading from new backend of change_tag in s7 (T194164) (duration: 00m 49s)
  • 18:59 mutante: LDAP - added crusnov to wmf and ops groups
  • 18:51 tgr: pulled gerrit 467315 to mwdeploy1001 (no-op, no scap needed)
  • 18:47 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@ff3bf90]: GUI updates and new Updater build (duration: 13m 57s)
  • 18:44 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: cswikivoyage has HD logo even though the project doesn't exist (T207066) (duration: 00m 49s)
  • 18:39 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable AICaptcha data collection (T186244) (duration: 00m 49s)
  • 18:33 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix a typo in wgLogoHD (mapwiki => napwiki) T207056, Remove techcomwikis row in wgLogo, techcomwiki doesn't exist T207056 (duration: 00m 48s)
  • 18:33 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@ff3bf90]: GUI updates and new Updater build
  • 18:30 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: Beta: Show share button on mobile web for beta user (no-op) (duration: 00m 49s)
  • 18:14 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT enable senses on testwikidatawiki T203887 (duration: 00m 49s)
  • 18:10 addshore@deploy1001: Synchronized wmf-config/Wikibase-production.php: SWAT: T207019 Enable WBQualityConstraintsSuggestionsBetaFeature on testwikidatawiki (duration: 00m 49s)
  • 18:01 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@ff3bf90]: Test deployment - GUI update and new Updater build(wdqs1009) (duration: 02m 11s)
  • 17:59 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@ff3bf90]: Test deployment - GUI update and new Updater build(wdqs1009)
  • 17:57 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@ff3bf90]: Test deployment - GUI update and new Updater build(wdqs1009) (duration: 02m 10s)
  • 17:55 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@ff3bf90]: Test deployment - GUI update and new Updater build(wdqs1009)
  • 16:54 marostegui: Start replication on db1087 and db1092 to avoid them lagging behind the whole night (nothing running there at this time)
  • 16:36 cmjohnson1: replacing pem0 on asw2-a7-eqiad T206972
  • 16:18 _joe_: restart prometheus-mcrouter-exporter.service across the fleet
  • 15:39 marostegui: Stop MySQL and poweroff db1092 for BBU replacement - T205514
  • 15:31 andrewbogott: restarting slapd on seaborgium as a test for T205463
  • 15:14 cmjohnson1: replacing optics asw2-b fpc2 -fpc8
  • 15:13 mforns@deploy1001: Finished deploy [analytics/refinery@9b288c5]: deploy refinery together with source version 0.0.77 (duration: 20m 19s)
  • 14:53 mforns@deploy1001: Started deploy [analytics/refinery@9b288c5]: deploy refinery together with source version 0.0.77
  • 14:46 marostegui: Ease consistency replication options on db2048 to mitigate lag
  • 14:29 moritzm: rebooting backup2001 for some tests
  • 13:35 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting MCR migration stage to write-both/read-new on Commons (T198308) (duration: 00m 49s)
  • 13:32 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T206593: adding db2096 to hosts (and repooling db2069) (duration: 00m 49s)
  • 13:30 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T206593: adding db2096 to hosts (and repooling db2069) (duration: 00m 49s)
  • 13:16 jynus: stopping db1092 and db1087 in sync T206743
  • 13:10 Jeff_Green: authdns-update to deploy saiph->frpig2001 rename
  • 13:02 godog: upload prometheus-statsd-exporter 0.7.0 - T205870
  • 12:45 banyek: rebooting db2096
  • 12:44 gehel: resetting kafka offsets on wdqs public cluster
  • 12:44 elukey: complete rolling restart of eventbus on kafka[12]00[1-3] for python security upgrades (only codfw was done)
  • 12:41 elukey: upgrade prometheus-memcached-exporter on swift and thumbor
  • 11:57 Amir1: start of mwscript deleteLocalPasswords.php --delete --batch-size 200 on all wikis
  • 11:38 zeljkof: EU SWAT finished
  • 11:29 hoo: Started rebuildItemsPerSite on mwmaint1002 (T44325). Can be killed at any time, if necessary.
  • 11:26 zfilipin@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 11:09 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable reading from ct_tag_id in s7 (T194164) (duration: 00m 49s)
  • 10:57 moritzm: installing ghostscript security updates for jessie
  • 10:47 moritzm: installing tomcat7 security updates
  • 10:43 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
  • 10:42 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
  • 09:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092 for recloning - T206743 (duration: 00m 49s)
  • 09:45 marostegui: Stop MySQL on db1116:3318 to reclone db1092
  • 09:41 banyek: max_binlog_size is set back to 1048576000 on parsercache hosts (T206740)
  • 09:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Restore original weight for db1104 (duration: 00m 49s)
  • 09:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase traffic for db1104 (duration: 00m 48s)
  • 08:58 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T206593: depooling db2069 (duration: 00m 48s)
  • 08:50 elukey: restart hadoop yarn resource managers on an-master* to pick up new jvm settings
  • 08:49 XioNoX: repool eqsin - T206861
  • 08:48 banyek: depooling db2033 (T206593)
  • 08:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1104 - T206743 (duration: 00m 49s)
  • 08:17 moritzm: installing imagemagick security update
  • 07:57 godog: reformat ms-be2040 with crc=1 finobt=0 - T199198
  • 07:32 banyek: reimaging db2096 (T206593)
  • 07:31 banyek: reimaging db2096
  • 07:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T206743 (duration: 00m 48s)
  • 07:15 marostegui: Stop MySQL at db1116:3318 to clone db1104
  • 07:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T206743 (duration: 00m 49s)
  • 07:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1109 (duration: 00m 49s)
  • 06:55 XioNoX: add v6 monitoring for mr1-ulsfo OOB - T206778
  • 06:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1109 (duration: 00m 49s)
  • 06:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1109 (duration: 00m 50s)
  • 05:20 kartik@deploy1001: Finished deploy [cxserver/deploy@fd74c3b]: Update cxserver to b51f363 (T203077, T99934, T203550) (duration: 04m 25s)
  • 05:16 kartik@deploy1001: Started deploy [cxserver/deploy@fd74c3b]: Update cxserver to b51f363 (T203077, T99934, T203550)
  • 05:16 marostegui: Stop MySQL on db1109 for recloning - T206743
  • 05:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 50s)
  • 05:11 marostegui: Stop MySQL on db1116:3318 to use it to clone db1109
  • 03:18 kartik@deploy1001: Finished deploy [cxserver/deploy@5a70ef1]: Update cxserver to 47a864b (T205420, T203077, T205700, T205616) (duration: 04m 44s)
  • 03:14 kartik@deploy1001: Started deploy [cxserver/deploy@5a70ef1]: Update cxserver to 47a864b (T205420, T203077, T205700, T205616)
  • 00:45 krinkle@deploy1001: Synchronized multiversion/MWRealm.php: I79fb3d194a58: use env.php (duration: 00m 49s)
  • 00:08 krinkle@deploy1001: Synchronized wmf-config/: I79fb3d194a: add env.php file (not yet used) (duration: 00m 50s)

2018-10-14

  • 23:42 krinkle@deploy1001: Synchronized multiversion/getMWVersion: Ice9a74e73481 no-op (duration: 00m 49s)
  • 23:21 krinkle@deploy1001: Synchronized wmf-config/ProductionServices.php: If4d8faa4 (duration: 00m 48s)
  • 21:48 krinkle@deploy1001: Synchronized multiversion/MWMultiVersion.php: I83b2bdd53c13e (duration: 00m 50s)
  • 20:47 krinkle@deploy1001: Synchronized wmf-config/import.php: beta-only (duration: 00m 54s)
  • 16:34 volans: forcing a puppet run on all eqsin hosts with batch 1 to clear most of the alarms - T206861
  • 08:54 elukey: restart Yarn resource manager on an-master1002 to force an-master1001 to take the leadership back - T206943
  • 08:34 elukey: powercycle restbase1015 (frozen, no ssh, no metrics, no root console via serial available)
  • 00:48 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/CentralAuth/includes/specials/SpecialGlobalGroupMembership.php: T203767 - If2bfa092b (duration: 00m 50s)

2018-10-13

  • 23:37 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T45086 - I4857e8ac (duration: 00m 51s)
  • 03:07 bblack: eqsin repooled

2018-10-12

  • 18:56 brion: restarted vp9 background transcodes in eqiad, via mwmaint1002
  • 18:37 addshore: modified attachLatest.php script finished running over 9395 pages T206743
  • 18:25 addshore: running modified attachLatest.php script over ~9000 pages on wikidatawiki (with added wait for slaves) T206743
  • 15:50 mutante: repair /dev/sde1 on ms-be2041 - T199198
  • 15:48 mutante: repair /dev/sdh1 on ms-be1043 - T199198
  • 14:23 _joe_: depooling eqsin via geodns due to loss of power redundancy
  • 13:35 gehel: repooling wdqs1003, caught up on lag
  • 12:59 gehel: depooling wdqs1003 to catch up on lag
  • 12:20 bblack: uploading gdnsd 2.99.9942-beta-1+wmf1 to stretch-wikimedia
  • 10:51 _joe_: depooling mw2252 for mcrouter tests T203786
  • 10:27 hoo: Updated the Wikidata property suggester with data from Monday's JSON dump and applied the T132839 workarounds
  • 10:08 addshore@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/WikimediaEvents/extension.json: T205283 gerrit:466843 Update Schema:WMDEBannerEvents rev to 18437830 (duration: 00m 52s)
  • 09:01 elukey: rolling restart of eventbus on kafka[1,2]00[1-3] to pick up python security upgrades
  • 05:54 moritzm: installing git security updates on trusty
  • 02:25 ejegg: updated fundraising tools from 3754f32 to 5a2d39b

2018-10-11

  • 23:33 Reedy: ran mwscript extensions/ShortUrl/populateShortUrlTable.php --wiki=gomwiki T206741
  • 23:32 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable shorturl on gomwiki (duration: 00m 48s)
  • 23:30 Reedy: created shorturl table on gomwiki T206741
  • 23:26 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable FileExporter to Meta-Wiki (duration: 00m 49s)
  • 23:21 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable CongressLookup (duration: 00m 49s)
  • 23:05 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/jobqueue/jobs/ThumbnailRenderJob.php: T203135 - Ib4640e (duration: 00m 49s)
  • 22:56 dzahn@neodymium: conftool action : set/pooled=inactive; selector: name=mwmaint1001.eqiad.wmnet
  • 22:53 mutante: netbox - correction, mwmaint1001 to status "Staged", following new lifecycle docs T192457
  • 22:50 mutante: netbox - renamed mwmaint1001 to mw1279, changed status to inventory, renamed in DNS - T192457
  • 22:45 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/Revision/RenderedRevision.php: I553dba13486 (duration: 00m 51s)
  • 22:30 mutante: mwmaint1001 - shutting down after final backup of /home, renaming back to mw1297 in DNS and DHCP, and reinstalling (T192457)
  • 21:53 mutante: mwmaint1001 - scheduled downtime, is being renamed back to mw1297 and reinstalled
  • 21:47 mutante: mwmaint2001 - rsyncing home dirs from mwmaint1002 to /root/home-mwmaint1002 (which includes home-terbium even!) in case anyone is missing anything from one of mwmaint*
  • 21:41 mutante: mwmaint2001 - deleting 60G of unneeded files from home
  • 20:37 XioNoX: add IPv6 to mr1-ulsfo OOB - T206778
  • 18:46 sbisson@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/PageTriage/: SWAT: Handle page that are unnominated for deletion (duration: 00m 50s)
  • 18:34 sbisson@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/PageTriage/modules/ext.pageTriage.views.list/ext.pageTriage.listControlNav.js: SWAT: Default to deleted and others when no type is selected on mode switch (duration: 00m 50s)
  • 18:22 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove config for RCFilters variables being removed from Core (duration: 00m 49s)
  • 18:14 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2083 and db2085:3318 (duration: 00m 48s)
  • 18:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 (duration: 00m 49s)
  • 18:09 sbisson@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Add copyviobot group management to relevant wikis (duration: 00m 49s)
  • 17:36 gehel: repooling wdqs1003, caught up on lag
  • away: automated binlog purging started on pc2004, pc2005, pc2006
  • 16:54 gehel: depooling wdqs1003 to let it catch up on lag
  • 15:38 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1087 (duration: 00m 50s)
  • 15:12 marostegui: Stop MySQL on db2085:3318 to reclone db1101:3318 - T206743
  • 15:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 (duration: 00m 49s)
  • 15:04 akosiaris: Media storage/Swift Swift set to active/passive
  • 15:01 akosiaris: Media storage/Swift Swift set to active/active
  • 14:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 (duration: 00m 48s)
  • 14:52 jynus: deploying wikidata row fix to db1087 with replication enabled
  • 14:47 END: (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0) (volans@neodymium)
  • 14:47 START: - Cookbook sre.switchdc.services.02-restore-ttl (volans@neodymium)
  • 14:36 END: (PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0) (volans@neodymium)
  • 14:36 Switching: services parsoid, restbase, restbase-async, mobileapps, apertium, citoid, cxserver, eventstreams, graphoid, mathoid, proton, pdfrender, recommendation-api, zotero, eventbus, ores, wdqs, wdqs-internal: codfw => eqiad (volans@neodymium)
  • 14:36 START: - Cookbook sre.switchdc.services.01-switch-dc (volans@neodymium)
  • 14:35 END: (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0) (volans@neodymium)
  • 14:30 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T206743: mariadb: Depool db1087 (duration: 00m 49s)
  • 14:30 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (volans@neodymium)
  • 14:28 banyek: depooling db1087 (T206743)
  • 14:28 banyek: depooling db1087
  • 14:15 elukey: reboot eventlog1002 for kernel upgrades
  • 14:15 jynus: applying row filling to (most) eqiad s8 dbs, including the master
  • 14:13 moritzm: install libxml2 security updates on jessie servers
  • 13:55 jynus: recovering rows to db1092
  • 13:26 jynus: filling in missing rows on dbstore1002
  • 13:23 marostegui: Stop MySQL on db2083 to reclone db1116:3318 - T206743
  • 13:21 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2083 (duration: 00m 49s)
  • 13:20 marostegui: Stop MySQL on db1116:3318 to reclone it from db2083 - T206743
  • 12:43 elukey: upgrade prometheus-memcached-exporter on mc1*
  • 12:38 elukey: upgrade prometheus-memcached-exporter on mc2*
  • 12:15 elukey: upgrade prometheus-memcached-exporter on mc2035
  • 12:14 elukey: upload prometheus-memcached-exporter_0.4.1+git20181010.2fa99eb-1 to (jessie|stretch)-wikimedia
  • 12:12 Amir1: EU SWAT is done
  • 11:30 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set some small wikis to read new for change tag backend (T194164) (duration: 00m 50s)
  • 11:10 marostegui: Stop MYSQL on db2085:3318 and db1099:3318 T206743
  • 11:09 marostegui: Stop MYSQL on db2088:3318 and db1099:3318 T206743
  • 11:08 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2085:3318 and db1099:3318 (duration: 00m 49s)
  • 11:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db2085:3318 and db1099:3318 (duration: 00m 49s)
  • 11:07 banyek: binlog expiration set to 60 days on db2045
  • 08:30 banyek: setting up some automated binlog purge mechanism on pc1004,pc1005,pc1006
  • 08:26 jynus: setting up replication from pc2005 -> pc1005 and from pc2006 -> pc1006
  • 08:20 jynus: setting up replication from pc2004 -> pc1004
  • 08:04 banyek: purging binary logs on pc1006
  • 08:04 banyek: purging binary logs on pc1005
  • 08:04 jynus: running /usr/local/bin/mwscript purgeParserCache.php --wiki=aawiki --age=1900800 --msleep 0
  • 08:04 banyek: purging binary logs on pc1004
  • 07:57 gehel: rolling restart blazegraph on wdqs-internal for config change - T206648
  • 07:43 addshore: deploy https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Wikibase/+/466031 to mwmaint1002 only (increasing tracking of wikidata dispatching) T205865
  • 07:36 elukey: roll restart of aqs on aqs100[4-9] to pick up new Druid settings
  • 06:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase db1092 weight (duration: 00m 49s)
  • 05:43 marostegui: Purge binary logs on pc2005 due to disk space issues - T206740
  • 05:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 (duration: 00m 48s)
  • 05:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 51s)
  • 02:25 krinkle@deploy1001: Synchronized w/static.php: T127233 - Ic6acb70 (duration: 00m 49s)
  • 02:10 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/page/WikiPage.php: T203942 - Ib211d98498f (duration: 00m 49s)
  • 02:07 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/tests/phpunit/includes/page/: Ib211d98498f (duration: 00m 49s)
  • 01:38 krinkle@deploy1001: Synchronized wmf-config/etcd.php: T176370 - I5e7e5d167d517 (duration: 00m 55s)
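
For reference, the --age value passed to purgeParserCache.php in the 08:04 entry above can be converted to days. This is a minimal sketch and assumes the value is expressed in seconds; the variable names are illustrative only.

```python
# Convert the --age argument from the 08:04 purgeParserCache.php entry to days.
# Assumption: --age is expressed in seconds.
SECONDS_PER_DAY = 24 * 60 * 60

age_seconds = 1900800
age_days = age_seconds / SECONDS_PER_DAY

print(age_days)  # 22.0
```

Under that assumption, the purge targets parser cache entries older than 22 days.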

2018-10-10

  • 23:08 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/maintenance/resources/foreign-resources.yaml: Ic865e7077d (duration: 00m 49s)
  • 22:59 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/MultimediaViewer/: T206099 - I53dbce0a (duration: 00m 49s)
  • 22:43 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/specials/SpecialDeletedContributions.php: T187619 - Ic6b0d8020553 (duration: 00m 48s)
  • 22:41 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/ORES/includes/FetchScoreJob.php: T204753 - Icc28230585bc (duration: 00m 49s)
  • 22:25 mutante: icinga1001 - chmod 2710 /var/lib/icinga/rw
  • 22:16 krinkle@deploy1001: Synchronized wmf-config/arclamp.php: T206092 - If607ad111a (duration: 00m 48s)
  • 21:51 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/ContentTranslation/specials/SpecialContentTranslation.php: T205433 - Ib34b28 (duration: 00m 49s)
  • 21:48 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/Echo/includes/DiscussionParser.php: T204291 - Ia5323b401b94 (duration: 00m 51s)
  • 21:45 XioNoX: Add icinga1001 to mr* security policies - T206704
  • 20:34 thcipriani: upgrading ci jenkins install on contint1001
  • 20:19 thcipriani: upgrading releases-jenkins jenkins install on releases1001
  • 20:17 thcipriani: upgrading releases-jenkins jenkins install on releases2001
  • 19:58 mutante: icinga - enabled icinga service on icinga1001 (stretch), but all notifications are disabled
  • 19:43 mutante: awight restarted ORES celery workers on ores2003 (~17:00), ores200* (17:05)
  • 19:35 kaldari@deploy1001: Finished scap: (no justification provided) (duration: 22m 05s)
  • 19:13 kaldari@deploy1001: Started scap: (no justification provided)
  • 19:11 kaldari: scap sync to rebuild i18n cache
  • 18:35 XioNoX: disable VC port 1/2 on asw2-c-eqiad:fpc3 (to fpc8)
  • 18:20 otto@deploy1001: Finished deploy [analytics/refinery@28bbee8]: Add accept header to webrequest logs - T170606 (duration: 10m 34s)
  • 18:19 XioNoX: delete sessions to AS6805 on cr2-esams (left AMS-IX)
  • 18:10 otto@deploy1001: Started deploy [analytics/refinery@28bbee8]: Add accept header to webrequest logs - T170606
  • 18:09 otto@deploy1001: Finished deploy [analytics/refinery@4e2d956]: Add accept header to webrequest logs - T170606 (duration: 04m 35s)
  • 18:05 otto@deploy1001: Started deploy [analytics/refinery@4e2d956]: Add accept header to webrequest logs - T170606
  • 17:49 XioNoX: replace 10.195.0.0/25 with 10.195.0.0/24 in prefix-list fundraising-codfw4 on cr1/2-codfw - T206637
  • 16:25 mutante: LDAP - added isaacj to wmf group (for SWAP access, existing shell user since recently) (T206631) (T205840)
  • 16:16 _joe_: restart of now-unused jobqueue redises for stopping the alerts post-switchover
  • 16:09 ejegg: updated CiviCRM from 1165e7ed79 to 4cc21d61c5
  • 15:59 vgutierrez: Uploaded certcentral 0.1 to apt.wikimedia.org (stretch) - T199711
  • 15:55 cmjohnson1: scheduled downtime for host cloudvirt1019 swap raid card T196507
  • 15:35 moritzm: uploaded jenkins 2.138.2 security release to apt.wikimedia.org (jessie/stretch) (T206234)
  • 15:11 _joe_: started again hhvm on mwmaint2001
  • 14:51 ejegg: turned fundraising scheduled jobs back on
  • 14:43 ejegg: turned off fundraising scheduled jobs
  • 14:42 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) (volans@neodymium)
  • 14:42 START: - Cookbook sre.switchdc.mediawiki.08-restore-ttl (volans@neodymium)
  • 14:42 END: (FAIL) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=99) (volans@neodymium)
  • 14:40 START: - Cookbook sre.switchdc.mediawiki.08-start-maintenance (volans@neodymium)
  • 14:39 END: (FAIL) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=99) (volans@neodymium)
  • 14:38 START: - Cookbook sre.switchdc.mediawiki.08-start-maintenance (volans@neodymium)
  • 14:33 oblivian@puppetmaster1001: conftool action : set/weight=15; selector: cluster=api_appserver,service=apache2,dc=eqiad,name=mw123.*
  • 14:31 oblivian@puppetmaster1001: conftool action : set/weight=15; selector: cluster=api_appserver,service=apache2,dc=eqiad,name=mw122.*
  • 14:19 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) (volans@neodymium)
  • 14:19 START: - Cookbook sre.switchdc.mediawiki.08-update-tendril (volans@neodymium)
  • 14:18 END: (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) (volans@neodymium)
  • 14:18 MediaWiki: read-only period ends at: 2018-10-10 14:18:26.908958 (volans@neodymium)
  • 14:18 START: - Cookbook sre.switchdc.mediawiki.07-set-readwrite (volans@neodymium)
  • 14:18 END: (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) (volans@neodymium)
  • 14:18 START: - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (volans@neodymium)
  • 14:17 END: (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0) (volans@neodymium)
  • 14:17 START: - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (volans@neodymium)
  • 14:17 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-traffic (exit_code=0) (volans@neodymium)
  • 14:15 START: - Cookbook sre.switchdc.mediawiki.04-switch-traffic (volans@neodymium)
  • 14:15 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) (volans@neodymium)
  • 14:14 START: - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (volans@neodymium)
  • 14:14 END: (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) (volans@neodymium)
  • 14:14 START: - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (volans@neodymium)
  • 14:14 END: (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) (volans@neodymium)
  • 14:13 MediaWiki: read-only period starts at: 2018-10-10 14:13:46.068081 (volans@neodymium)
  • 14:13 START: - Cookbook sre.switchdc.mediawiki.02-set-readonly (volans@neodymium)
  • 14:10 END: (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) (volans@neodymium)
  • 14:10 START: - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (volans@neodymium)
  • 14:10 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0) (volans@neodymium)
  • 14:07 START: - Cookbook sre.switchdc.mediawiki.00-warmup-caches (volans@neodymium)
  • 14:07 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0) (volans@neodymium)
  • 14:05 START: - Cookbook sre.switchdc.mediawiki.00-warmup-caches (volans@neodymium)
  • 14:05 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0) (volans@neodymium)
  • 14:01 START: - Cookbook sre.switchdc.mediawiki.00-warmup-caches (volans@neodymium)
  • 14:01 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) (volans@neodymium)
  • 14:01 START: - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (volans@neodymium)
  • 14:00 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) (volans@neodymium)
  • 14:00 START: - Cookbook sre.switchdc.mediawiki.00-disable-puppet (volans@neodymium)
  • 12:18 _joe_: decommissioning conf1001-1003: stopping etcd, nginx, and masking both
  • 11:41 jynus: renaming some s3 wiki tables on eqiad master to prevent split brain T184805
  • 11:29 zeljkof: EU SWAT finished
  • 11:26 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Permissions changes on itwikibooks (T206447) (duration: 00m 57s)
  • 10:54 marostegui: Set a replication filter on db1075 (s3 eqiad) to ignore enwikivoyage, cebwiki, shwiki, srwiki & mgwiktionary - T184805
  • 10:49 marostegui@deploy1001: Synchronized dblists/s5.dblist: Update s5.dblist to reflect the wikis moved from s3 - T184805 (duration: 00m 56s)
  • 10:48 marostegui@deploy1001: Synchronized dblists/s3.dblist: Update s3.dblist to reflect the wikis moved to s5 - T184805 (duration: 00m 58s)
  • 09:12 ema: Traffic: move restbase back to eqiad T203777
  • 09:07 ema: Traffic: set services active/active T203777
  • 09:00 ema: Traffic: route esams caches back to eqiad T203777
  • 08:27 moritzm: installing fuse security updates
  • 08:07 ariel@deploy1001: Finished deploy [dumps/dumps@0714a93]: fix adds/changes dumps generation when prev run is missing (duration: 00m 06s)
  • 08:07 ariel@deploy1001: Started deploy [dumps/dumps@0714a93]: fix adds/changes dumps generation when prev run is missing
  • 08:01 moritzm: rolling out debdeploy 0.0.99.6
  • 07:51 elukey: cleaned up some log files from eventlog1002
  • 02:55 ejegg: updated payments-wiki from 1472604b6e to 7fb1aae963
  • 00:19 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/utils/UIDGenerator.php: T94522 - I2a0c51bea58 (duration: 00m 56s)
  • 00:15 krinkle@deploy1001: sync-file aborted: T205567 - I75f1eb6dc2cb (duration: 00m 01s)
  • 00:14 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/tests/phpunit/includes/utils/: T94522 - I2a0c51bea58 (duration: 01m 02s)
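
The length of the MediaWiki read-only period during the switchover can be worked out from the two timestamps logged at 14:13 and 14:18 above. A minimal sketch, where the helper name read_only_seconds is hypothetical and the timestamp format is inferred from the log entries:

```python
from datetime import datetime

# Timestamp layout as it appears in the "read-only period starts/ends at" entries.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def read_only_seconds(start, end):
    """Return the elapsed seconds between two logged timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


window = read_only_seconds("2018-10-10 14:13:46.068081", "2018-10-10 14:18:26.908958")
print(f"read-only for {window:.3f} s")  # read-only for 280.841 s
```

That is roughly 4 minutes 41 seconds of read-only time for this switchover.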

2018-10-09

  • 22:58 SMalyshev: repooled wdqs2003
  • 22:26 shdubsh: repairing /dev/sdl1 on ms-be2040 - T199198
  • 21:52 bblack: cp1085: varnish backend restart for mbox lag
  • 21:50 mutante: releases1001 - restarted jenkins (it went from 200 -> 503 -> 403) curl localhost:8080 works again after restart, icinga check still getting 403 now
  • food: updated fundraising CiviCRM from 7a0d14015e to 1165e7ed79
  • 20:08 mutante: repair /dev/sdg1 on ms-be2041 - T199198
  • 19:37 XioNoX: disable igmp-snooping on asw2-c-eqiad - T201039
  • 19:25 XioNoX: disable igmp-snooping on asw2-b-eqiad - T201039
  • 19:20 XioNoX: bounce igmp-snooping on asw2-b-eqiad
  • 18:24 ottomata: adding Accept header to all varnishkafka generated webrequest logs
  • 17:21 SMalyshev: depooled wdq23 again, sigh
  • 13:54 moritzm: rebooting prometheus1004 for kernel security update
  • 13:41 moritzm: rebooting prometheus1003 for kernel security update
  • 13:28 moritzm: rebooting prometheus2004 for kernel security update
  • 13:13 moritzm: rebooting prometheus2003 for kernel security update
  • 12:54 gehel: silencing wdqs-public lag alerts (service still functional, and SLO unclear) - T199228
  • 12:45 moritzm: installing imagemagick security updates
  • 11:47 END: (ERROR) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=2) (volans@neodymium)
  • 11:47 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (volans@neodymium)
  • 11:45 akosiaris: dry-run services switchover from codfw to eqiad in preparation for Thursday
  • 11:37 END: (ERROR) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=2) (volans@neodymium)
  • 11:37 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (volans@neodymium)
  • 11:14 volans: live-test of the inverted switchdc (eqiad->codfw) completed, all good - T203777
  • 11:14 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) (volans@neodymium)
  • 11:13 START: - Cookbook sre.switchdc.mediawiki.08-update-tendril (volans@neodymium)
  • 11:12 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) (volans@neodymium)
  • 11:11 START: - Cookbook sre.switchdc.mediawiki.08-start-maintenance (volans@neodymium)
  • 11:11 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) (volans@neodymium)
  • 11:11 START: - Cookbook sre.switchdc.mediawiki.08-restore-ttl (volans@neodymium)
  • 11:11 END: (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) (volans@neodymium)
  • 11:11 [DRY-RUN]: MediaWiki read-only period ends at: 2018-10-09 11:11:05.042622 (volans@neodymium)
  • 11:11 START: - Cookbook sre.switchdc.mediawiki.07-set-readwrite (volans@neodymium)
  • 11:08 END: (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) (volans@neodymium)
  • 11:08 START: - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (volans@neodymium)
  • 11:07 END: (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0) (volans@neodymium)
  • 11:07 START: - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (volans@neodymium)
  • 11:06 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-traffic (exit_code=0) (volans@neodymium)
  • 11:04 START: - Cookbook sre.switchdc.mediawiki.04-switch-traffic (volans@neodymium)
  • 11:03 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) (volans@neodymium)
  • 11:03 START: - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (volans@neodymium)
  • 11:00 END: (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) (volans@neodymium)
  • 10:59 START: - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (volans@neodymium)
  • 10:56 END: (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) (volans@neodymium)
  • 10:56 [DRY-RUN]: MediaWiki read-only period starts at: 2018-10-09 10:56:12.213026 (volans@neodymium)
  • 10:56 START: - Cookbook sre.switchdc.mediawiki.02-set-readonly (volans@neodymium)
  • 10:53 END: (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) (volans@neodymium)
  • 10:53 START: - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (volans@neodymium)
  • 10:51 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0) (volans@neodymium)
  • 10:49 onimisionipe: repooling wdqs2001, caught up on lag - T206423
  • 10:48 START: - Cookbook sre.switchdc.mediawiki.00-warmup-caches (volans@neodymium)
  • 10:47 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0) (volans@neodymium)
  • 10:41 START: - Cookbook sre.switchdc.mediawiki.00-warmup-caches (volans@neodymium)
  • 10:40 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) (volans@neodymium)
  • 10:40 START: - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (volans@neodymium)
  • 10:37 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) (volans@neodymium)
  • 10:36 START: - Cookbook sre.switchdc.mediawiki.00-disable-puppet (volans@neodymium)
  • 10:35 onimisionipe: deploying prometheus-blazegraph-exporter 0.6 on all wdqs clusters - T206123
  • 10:34 volans: about to perform live-test of the inverted switchdc (eqiad->codfw), actions will be real but basically noop due to codfw being already active - T203777
  • 09:25 elukey: swapped Hadoop's hive/oozie from analytics1003 to an-coord1001
  • 09:16 ema: restart pybal on lvs1005 to pick up config changes (conf2001 -> conf1004)
  • 09:00 ema: re-enable puppet/pybal on lvs1002, IPv6 connectivity with phab1001 working again T201039
  • 08:16 elukey: update puppet compiler facts
  • 08:06 onimisionipe: depooling wdqs2001 to catch up on lag - T206423
  • 07:03 akosiaris: restart zuul and zuul-merger on contint1001 for the upgrade of zuul to finish
  • 06:37 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1122 (duration: 00m 57s)
  • 05:19 marostegui: Stop MySQL on db1122 for binlog format change, mysql and kernel upgrade
  • 05:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 (duration: 00m 59s)
  • 02:41 krinkle@deploy1001: Synchronized wmf-config/profiler.php: T176916 / T206092 - Ie86e88777c48 (duration: 00m 56s)
  • 02:21 krinkle@deploy1001: Synchronized wmf-config/arclamp.php: T176916 - Id79baae90: ensure file exists before Ie86e88777c48 (duration: 00m 57s)
  • 00:04 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/libs/rdbms/database: T201900 - I8ae754a2518 (duration: 00m 59s)
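
When reviewing a day with many cookbook runs, it can help to pull out only the steps that did not pass. The following is a rough sketch, not part of any Wikimedia tooling: the regular expression is inferred from the END lines in this log, and the function name failed_steps is hypothetical.

```python
import re

# Pattern inferred from the "END: (STATUS) - Cookbook NAME (exit_code=N)" lines in this log.
END_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}) END: \((?P<status>[A-Z]+)\) - Cookbook (?P<name>\S+) "
    r"\(exit_code=(?P<code>\d+)\)"
)


def failed_steps(lines):
    """Return (time, cookbook, status, exit_code) for every END line that is not PASS."""
    failures = []
    for line in lines:
        match = END_RE.search(line)
        if match and match.group("status") != "PASS":
            failures.append(
                (match.group("time"), match.group("name"),
                 match.group("status"), int(match.group("code")))
            )
    return failures


# Three entries copied from the 2018-10-09 log above.
sample = [
    "11:47 END: (ERROR) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=2) (volans@neodymium)",
    "11:47 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (volans@neodymium)",
    "11:14 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) (volans@neodymium)",
]

for step in failed_steps(sample):
    print(step)
```

START lines and passing END lines are skipped, so only the 11:47 ERROR entry is reported for this sample.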

2018-10-08

  • 22:45 XioNoX: increase accepted-prefix-limit for 24115 on cr4-ulsfo
  • 22:41 XioNoX: clear BGP neighbor cr1-eqsin:AS9583 (bgp limit threshold reached)
  • 21:11 ejegg: updated payments-wiki from d623de9494 to 1472604b6e
  • 20:42 gehel: repooling wdqs2003, caught up on lag - T206423
  • 19:41 XioNoX: troubleshooting asw2-b-eqiad with JTAC - T201039
  • 19:08 gehel: depooling wdqs2003 to catch up on lag - T206423
  • 19:00 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable MCR read-new mode on some small wikis (T198308) (duration: 00m 56s)
  • 18:55 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@bd698bd]: WDQS deployment - New federation whitelist entries (duration: 10m 07s)
  • 18:45 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@bd698bd]: WDQS deployment - New federation whitelist entries
  • 18:37 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@bd698bd]: WDQS test deployment - New federation whitelist entries(wdqs1009) (duration: 00m 33s)
  • 18:37 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@bd698bd]: WDQS test deployment - New federation whitelist entries(wdqs1009)
  • 18:36 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Extension:File exporter to mrwikipedia (T206437) (duration: 00m 57s)
  • 16:29 XioNoX: push firewall filter counters on asw2-b-eqiad - T201039
  • 16:28 elukey: restart eventlogging on eventlog1002 for python security upgrades
  • 14:05 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T184805: Revert 'mariadb: Depool db1110 for testing s3 imports' (duration: 00m 57s)
  • 14:03 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T184805: Revert 'mariadb: Depool db1110 for testing s3 imports' (duration: 00m 56s)
  • 13:43 elukey: restart confd on esams nodes to pick up new srv settings
  • 13:41 elukey: restart navtiming.service on webperf1001 to pick up the dns change for etcd
  • 13:39 marostegui: Enable gtid on the following slaves: db2068 db1122 db1117:3323
  • 13:37 elukey: restart confd on all the other eqiad nodes to pick up new srv records
  • 13:32 elukey: restart confd on cp1* to pick up new srv records
  • 13:11 _joe_: purging the dnsrec cache for eqiad,esams etcd client SRV records
  • 13:09 ema: depool eqiad front-edge traffic T201039
  • 13:05 banyek: converting cebwiki.templatelinks to TokuDB on host dbstore1002.eqiad.wmnet (T205544)
  • 13:04 banyek: downtime notifications for dbstore1002 replication threads (T205544)
  • 12:49 banyek: pt-kill-wmf enabled on the wikireplicas (T203674)
  • 11:59 _joe_: restart pybal in esams, after running puppet, to switch etcd cluster used
  • 11:46 _joe_: restart pybal on lvs1001
  • 11:46 addshore: SWAT done
  • 11:45 addshore@deploy1001: Synchronized wmf-config/throttle.php: Add throttle exception for Netherlands Hackathon October 2018 - Wiki Techstorm T206241, and remove other rules. (duration: 00m 56s)
  • 11:39 addshore: addshore@mwmaint2001:~$ mwscript namespaceDupes.php --wiki fywiktionary --fix --add-prefix=T202769 # T202769
  • 11:35 addshore: addshore@mwmaint2001:~$ mwscript namespaceDupes.php --wiki fywiktionary --fix # Finished, still 111 pages to fix
  • 11:34 addshore: addshore@mwmaint2001:~$ mwscript namespaceDupes.php --wiki fywiktionary --fix # Started
  • 11:33 addshore: addshore@mwmaint2001:~$ mwscript namespaceDupes.php --wiki fywiktionary # (dryrun, 11529 links to fix, 11529 were resolvable.)
  • 11:32 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:455249 Use translated MetaNamespace for fy.wiktionary T202769 (duration: 00m 58s)
  • 11:27 addshore@deploy1001: Synchronized wmf-config/flaggedrevs.php: SWAT: gerrit:464890 Remove the "reviewer" group at ruwikisource T205997 (duration: 00m 57s)
  • 10:41 elukey: restart mcrouter on mw2201 with more verbose logging settings as test
  • 09:55 moritzm: installing python3.5/python2.7 security updates
  • 09:51 godog: rebuild sdc sdh sdj sdi on ms-be2041 with crc=1 finobt=0 - T199198
  • 08:20 marostegui: Disable gtid on es2 and es3 eqiad master
  • 08:20 gehel@puppetmaster1001: conftool action : set/weight=15; selector: dc=codfw,cluster=wdqs,name=wdqs2001.codfw.wmnet
  • 08:20 gehel@puppetmaster1001: conftool action : set/weight=15; selector: dc=codfw,cluster=wdqs,name=wdqs2002.codfw.wmnet
  • 07:50 marostegui: Enabling replication eqiad -> codfw in preparation for DC failover
  • 07:40 marostegui: Disable GTID on s1,s2,s3,s4,s6,s7,s8 eqiad masters in preparation for enabling replication eqiad -> codfw
  • 07:39 _joe_: disabling puppet, doing etcd tests on lvs1006
  • 07:38 gehel@puppetmaster1001: conftool action : set/weight=15; selector: dc=codfw,cluster=wdqs,name=wdqs2002.eqiad.wmnet
  • 07:38 gehel@puppetmaster1001: conftool action : set/weight=15; selector: dc=codfw,cluster=wdqs,name=wdqs2001.eqiad.wmnet
  • 07:38 gehel: reducing relative weight of wdqs2003 in pybal - T206423
  • 07:27 banyek: enabling wmf-pt-kill for the first time on labsdb1010
  • 07:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1092 with low weight - T205514 (duration: 01m 27s)
  • 07:00 moritzm: installing git security updates

2018-10-07

  • 16:40 dereckson: Reset user email for account "Dominic Mayers" (T206421)
  • 16:35 elukey: run a script in tmux (my username) on mw2201 to poll the status of a mcrouter key/route every 10s using its admin api (very lightweight but kill if needed)
  • 14:52 onimisionipe: repooling wdqs2003. Caught up on lag, and lag issues also seem to be creeping up on wdqs200[1|2]
  • 04:29 SMalyshev: temp depooled wdqs2003
  • 03:12 ejegg: disabled all fundraising scheduled jobs - something that looks like disk issues on civi1001

2018-10-06

  • 21:20 gehel: repooling wdqs2003: caught up on updater lag
  • 20:43 _joe_: restarting apache2 on puppetmaster1001
  • 19:16 onimisionipe: depooling wdqs2003
  • 18:10 elukey: restart Yarn Resource Manager on an-master1002 to force an-master1001 to take the active role back (failed over due to a zk conn issue)
  • 17:07 onimisionipe: restarting wdqs-blazegraph on wdqs2003
  • 13:48 bblack: multatuli: update gdnsd package to 2.99.9930-beta-1+wmf1
  • 13:47 bblack: authdns1001: update gdnsd package to 2.99.9930-beta-1+wmf1 (correction to last msg)
  • 13:46 bblack: authdns1001: update gdnsd package to 2.99.9161-beta-1+wmf1
  • 12:57 bblack: rebooting cp1076
  • 12:49 bblack: depool cp1076, apparently has disk issues

2018-10-05

  • 23:50 bblack: <<<<<<< repooling eqiad edge caches, a few days ahead of intended switchback next Weds, to alleviate some traffic engineering concerns over the weekend >>>>>>
  • 20:48 mutante: T191183 - it's still showing the error page as before but that isn't due to apache issues, it just needs additional ferm rules
  • 20:44 mutante: gerrit - adding gerrit.wmfusercontent.org virtual host for avatars. applied first on gerrit2001, then on cobalt (T191183)
  • 20:03 ejegg: updated fundraising CiviCRM from ebc2e0076c to 7a0d14015e
  • 19:48 banyek: repooling labsdb1009 (T195747)
  • 19:44 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@f8776de]: Redeploy 1009 (duration: 00m 26s)
  • 19:44 smalyshev@deploy1001: Started deploy [wdqs/wdqs@f8776de]: Redeploy 1009
  • 18:37 bblack: authdns2001: upgraded gdnsd to 2.99.9930-beta
  • 18:31 bblack: gdnsd-2.99.9930-beta-1+wmf1 uploaded to stretch-wikimedia
  • 18:26 mutante: icinga - noop on all servers, no change, puppet re-enabled, operations normal
  • 18:08 mutante: disabling puppet on icinga for 5 min for extra safety before a change that should be noop
  • 17:58 banyek: depooling labsdb1009 (T195747)
  • 17:50 banyek: repooling labsdb1011 (T195747)
  • 17:12 elukey: set etcd in codfw as read/write (was readonly) and eqiad as readonly (was read/write)
  • 14:57 banyek: depooling labsdb1011 (T195747)
  • 14:56 banyek: depooling labsdb1011
  • 13:26 banyek: adding wmf-pt-kill_2.2.20-1+wmf3 package for stretch
  • 13:25 moritzm: installing python3.5/2.7 security updates
  • 13:02 volans: upgraded spicerack to version 0.0.9 on sarin/neodymium/cumin* - T199079
  • 12:13 vgutierrez: Creating certcentral1001.eqiad.wmnet in ganeti - T206308
  • 12:12 vgutierrez: Creating certcentral2001.codfw.wmnet in ganeti - T206308
  • 11:59 elukey: deleted bohrium from ganeti via gnt-instance
  • 11:43 moritzm: rebooting wezen for kernel security update
  • 11:29 moritzm: rebooting ruthenium for kernel security update
  • 10:40 jynus: restarting replication on labsdb1010/1 on s3 and s5
  • 10:37 volans: uploaded spicerack_0.0.9-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079
  • 10:17 moritzm: rearmed keyholder on netmon2001
  • 10:10 elukey: restart confd on labs-puppetmaster to pick up new etcd settings (eqiad -> codfw)
  • 10:03 _joe_: restarting navtiming.service on webperf1001 to pick up the dns change for etcd
  • 09:37 elukey: restart rsyslog on lithium - broken connection to tegmen - T199406
  • 09:37 banyek: disabling puppet on labsdb1009,labsdb1010,labsdb1011 (T203674)
  • 09:36 banyek: adding wmf-pt-kill_2.2.20-1+wmf2 package for stretch
  • 09:16 volans: rebooting tegmen, console stuck, possible re-occurrence of T199413 (to be confirmed)
  • 09:12 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Move some wikis for s3 to s5 (duration: 00m 56s)
  • 09:06 elukey: stop etcdmirror replication on conf2002
  • 09:05 _joe_: restarting confd on all nodes in eqiad and esams
  • 08:58 _joe_: wiped cached values for the read-only etcd SRV record
  • 08:56 _joe_: read-write connections to etcd only go to codfw now
  • 08:35 _joe_: reenabling notifications for etcdmirror on conf1005
  • 08:02 jynus: start replication on db1069 (x1)
  • 07:54 jynus: starting replication on db1075; db1070, db1070:s3 with disabled gtid
  • 07:50 jynus: stopping dbstore1001:x1
  • 07:33 jynus: changing s3 master for db1070
  • 07:28 jynus: stopping s3 replication on db1070
  • 07:20 jynus: stopping x1 replication on db1069
  • 07:20 godog: temporarily stop prometheus on bast4001 to finalize data transfer - T179050
  • 07:19 jynus: stopping s3 replication on db1075
  • 07:18 jynus: stopping s5 replication on db1070
  • 07:09 moritzm: installing python3.4/2.7 security updates
  • 05:55 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T205599 - Ic28e00c30 (duration: 00m 57s)
  • 05:53 _joe_: upgrading python-etcd on conf1004-6, restarting etcdmirror
  • 05:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Clarify db1092 status - T205514 (duration: 00m 57s)
  • 04:18 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/includes/libs/filebackend/FileBackendStore.php: T205567 - I75f1eb6dc2cb (duration: 00m 56s)
  • 04:16 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/CirrusSearch/includes/DataSender.php: I0769c50c (duration: 01m 01s)
  • 00:31 mutante: LDAP: added user skvjold to group wmf (T204377)

2018-10-04

  • 22:51 ejegg: updated fundraising CiviCRM from 944b954bac to ebc2e0076c
  • 21:27 XioNoX: bounce phab1001 switch port - T201039
  • 20:47 ejegg: updated fundraising CiviCRM from ddf4865650 to 944b954bac
  • 20:23 mforns@deploy1001: Finished deploy [analytics/refinery@3eb9bf2]: deploying refinery together with refinery-source v0.0.76 (duration: 00m 17s)
  • 20:22 mforns@deploy1001: Started deploy [analytics/refinery@3eb9bf2]: deploying refinery together with refinery-source v0.0.76
  • 20:10 mforns@deploy1001: Finished deploy [analytics/refinery@3eb9bf2]: deploying refinery together with refinery-source v0.0.76 (duration: 14m 04s)
  • 19:56 mforns@deploy1001: Started deploy [analytics/refinery@3eb9bf2]: deploying refinery together with refinery-source v0.0.76
  • 19:30 marxarelli: rise in fatals "Fatal error: entire web request took longer than 60 seconds and timed out in /srv/mediawiki/php-1.32.0-wmf.24/includes/Title.php"
  • 19:26 dduvall@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.24
  • 19:15 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@6dc89c0]: Bump cirrusSearchLinksUpdate concurrency to 50 (duration: 00m 53s)
  • 19:14 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@6dc89c0]: Bump cirrusSearchLinksUpdate concurrency to 50
  • 18:49 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:460202 (duration: 00m 59s)
  • 18:24 XioNoX: bounce lvs1002:eth1 switch port
  • 18:23 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable PageTriage/ORES on enwiki (T206149) (duration: 01m 01s)
  • 18:21 bblack: lvs1002: puppet disabled, stopping pybal (fail to 1005)
  • 18:07 _joe_: disabled notifications for etcd replication lag on conf1005, not in production
  • 17:47 banyek: repooling labsdb1010 (T195747)
  • 17:41 _joe_: uploaded new python-etcd packages for jessie, stretch
  • 17:38 XioNoX: asw2-b-eqiad recabling done - T201039
  • 17:34 elukey: pool kafka1002 (eventbus) after maintenance
  • 17:22 elukey: re-enable ircecho after alarms shower
  • 17:15 andrewbogott: triggering some alerts on labvirt1018 to figure out about alert thresholds
  • 17:06 elukey: stop ircecho on einsteinium - alarms shower
  • 17:02 gtirloni: tools - published updated toollabs-* Docker images
  • 16:54 ejegg: updated standalone SmashPig deploy from 82f9d49c23 to 5f21d3f2db
  • 16:52 XioNoX: Step 3) Add missing links - T201039
  • 16:45 shdubsh: etherpad1001 running systemctl reset-failed
  • 16:41 XioNoX: Connect/enable fpc2:0/51-fpc5:1/0 (5m DAC) - T201039
  • 16:39 XioNoX: Enable fpc5-fpc7 - T201039
  • 16:33 twentyafterfour: started phd on phab1001 and re-enabled puppet (I had it disabled to prevent starting phd during read-only)
  • 16:25 twentyafterfour: phabricator is read-write
  • 16:21 jynus: reloading dbproxy1003,8
  • 16:16 marostegui: Stop and reboot db1072 (phabricator master) for maintenance
  • 16:16 twentyafterfour: phabricator is read-only
  • 16:14 XioNoX: Enable all VC ports on FPC2 and FPC7 - T201039
  • 16:13 XioNoX: starting asw2-b-eqiad re-cabling - T201039
  • 16:08 twentyafterfour: logged downtime for phabricator in icinga, stopped phd queue processing in preparation for read-only mode
  • 16:07 jynus: reloading haproxy @ dbproxy1005
  • 16:00 marostegui: Stop MySQL on db1073 for mariadb and kernel upgrade - T201039 T148507
  • 15:58 arturo: icinga downtime every server in the main cloudvps deployment for 2h T201039
  • 15:56 arturo: icinga downtime every server with the cloudXXXX scheme for 2h T201039
  • 15:54 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@55dbb8b]: Proper reconnect on topics change T199444 (duration: 00m 55s)
  • 15:53 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@55dbb8b]: Proper reconnect on topics change T199444
  • 15:52 ppchelko@deploy1001: Finished deploy [changeprop/deploy@5d00448]: Proper reconnect on topics change T199444 (duration: 01m 40s)
  • 15:51 ppchelko@deploy1001: Started deploy [changeprop/deploy@5d00448]: Proper reconnect on topics change T199444
  • 15:41 elukey: depool kafka1002 from eventbus as precautionary step for T201039
  • 14:48 banyek: depooling labsdb1010 (T195747)
  • 14:09 marostegui: Sanitize enwikivoyage cebwiki shwiki srwiki mgwiktionary on db1124:3315 T184805
  • 13:46 pmiazga@deploy1001: Finished deploy [proton/deploy@ecb9a0e]: Bugfix:handle undefined response and fix grafana stats (T186748,T201158) (duration: 02m 55s)
  • 13:43 pmiazga@deploy1001: Started deploy [proton/deploy@ecb9a0e]: Bugfix:handle undefined response and fix grafana stats (T186748,T201158)
  • 13:14 banyek: muting alerts on s2replication @dbstore2002 and resuming compression of s2 database tables (T204930)
  • 13:14 banyek: muting alerts on dbstore2002 and resuming compression of s2 database tables (T204930)
  • 12:23 elukey: deploy etcdmirror on conf1005 - T205814
  • 12:06 zeljkof: EU SWAT finished
  • 12:06 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add permission "move-rootuserpages" to usergroup "eliminator" at ptwiki (T205595) (duration: 00m 57s)
  • 12:01 moritzm: rolling reboot of ms-fe hosts in codfw for kernel security update
  • 12:00 zeljkof: one more patch for EU SWAT
  • 11:57 zeljkof: EU SWAT finished
  • 11:57 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add *.nasimonline.ir to wgCopyUploadsDomains whitelist for Commons (T203371) (duration: 00m 56s)
  • 11:52 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: add Radlines.org to $wgCopyUploadsDomains (T203219) (duration: 00m 57s)
  • 11:42 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add .bollywoodhungama.in to wgCopyUploadsDomains (T203363) (duration: 00m 57s)
  • 11:35 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add some namespaces aliases for zhwikiversity (T201675) (duration: 00m 57s)
  • 11:27 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Change acewiki default time zone to Asia/Jakarta (T205693) (duration: 00m 56s)
  • 11:17 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create Photowalk and Photowalk Talk namespaces for bd.wikimedia.org (T205747) (duration: 00m 57s)
  • 10:44 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.23/README: noop sync to verify that scap 3.8.7-1 works (at least on a basic level) (duration: 00m 59s)
  • 10:38 godog: upload scap 3.8.7-1 - T204383
  • 10:36 _joe_: uploading etcd-mirror to stretch-wikimedia T205814
  • 10:08 moritzm: rolling reboot of ms-fe hosts in eqiad for kernel security update
  • 09:13 arturo: T203177 schedule 8h icinga downtime for cloudcontrol1003,1004 and labmon1001
  • 08:52 moritzm: installing python2.7/python3.4/python3.5 security updates on jessie/stretch
  • 08:34 moritzm: installing ca-certificates updates for jessie/stretch
  • 08:09 marostegui: Restart icinga T196336
  • 08:00 gehel: re-enabling puppet on maps1004
  • 07:31 elukey: move Piwik/Matomo from bohrium to matomo1001 - T202962
  • 07:25 godog: reformat ms-be1041 with crc=1 finobt=0 - T199198
  • 06:57 jynus: starting multisource replication of s3 from s5 at eqiad master
  • 06:51 jynus: reenabling consistency configuration on s5 replica databases
  • 06:24 jynus: create manual backup of databases on eqiad s6, s7, s8, x1
  • 05:36 marostegui: Deploy schema change on db2048 (s1 master) - T205913
  • 05:35 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2062 (duration: 00m 56s)
  • 05:30 marostegui: Deploy schema change on db2062 - T205913
  • 05:30 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2062 (duration: 00m 57s)
  • 04:04 SMalyshev: repooled wdqs2003
  • 03:22 SMalyshev: depool wdqs2003 to let it catch up
  • 03:21 SMalyshev: repooled wdqs2001
  • 03:16 ejegg: re-enabled PayPal EC orphan rectifier
  • 03:06 ejegg: updated CiviCRM from 80cb98e33e to ddf4865650
  • 02:43 SMalyshev: depooled wdqs2001 to see if it catches up faster
  • 01:54 ejegg: updated payments-wiki from 8b673cfb4f to d623de9494

2018-10-03

  • 23:54 mutante: scheduled downtime for wdqs as it's flapping and already known
  • 23:45 catrope@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/VisualEditor/: Require Parsoid HTML 2.0.0, and handle its <audio> tags (T201081); ext.visualEditor.mwlanguage: Actually load all of the code (T205834) (duration: 00m 57s)
  • 23:41 catrope@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/VisualEditor/: Require Parsoid HTML 2.0.0, and handle its <audio> tags (T201081) (duration: 00m 59s)
  • 23:29 catrope@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/PageTriage/: Hide copyvio AFC filter option behind flag (T205918) (duration: 00m 57s)
  • 23:23 catrope@deploy1001: Synchronized php-1.32.0-wmf.24/includes/utils/UIDGenerator.php: Make UID clock drift error have more details (T94522) (duration: 00m 58s)
  • 23:20 XenoRyet: shut off Paypal orphan rectifier
  • 23:12 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bump Minerva A/B test rates to 100% on jawiki, ruwiki, fawiki (T200792) (duration: 00m 56s)
  • 22:49 shdubsh: re-enable puppet on einsteinium
  • 22:45 shdubsh: einsteinium: setting enable_notifications=1 and reloading icinga
  • 22:36 herron: herron@neodymium:~$ sudo cumin -b 15 -p 95 '*' 'run-puppet-agent -q --failed-only'
  • 22:20 shdubsh: einsteinium: setting enable_notifications=0 and starting icinga
  • 22:06 herron: herron@neodymium:~$ sudo cumin -b 40 -p 95 'R:file = /etc/nagios/nrpe_local.cfg' run-puppet-agent
  • 22:02 mutante: mw2242 - started nagios-nrpe-server
  • 22:01 shdubsh: icinga stopped manually
  • 21:57 mutante: einsteinium - disabling puppet
  • 21:25 bblack: upgraded gdnsd to 2.99.9161 on authdns1001
  • 21:17 dduvall@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.24 (duration: 00m 55s)
  • 21:16 dduvall@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.24
  • 21:12 dduvall@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/WikibaseQualityConstraints/src/ServiceWiring.php: deploying fix to 1.32.0-wmf.24 for T206161 (duration: 00m 57s)
  • 20:28 marxarelli: deployed proposed WikibaseQualityConstraints fix and wikiversions bump for wikidatawiki to mwdebug1001 and mwdebug1002 for verification (T206161)
  • 20:18 robh: optic swap on cr4-ulsfo:et-0/0/1
  • 20:03 bblack: upgraded gdnsd to 2.99.9161 on multatuli
  • 19:40 bblack: upgraded gdnsd to 2.99.9161 on authdns2001
  • 19:35 bblack: uploaded 2.99.9161-beta-1+wmf1 to stretch-wikimedia
  • 19:33 mateusbs17: running initial osm import in maps1004
  • 19:23 dduvall@deploy1001: Synchronized php: rollback group1 to 1.32.0-wmf.23 (duration: 00m 54s)
  • 19:18 dduvall@deploy1001: rebuilt and synchronized wikiversions files: rollback group1 to 1.32.0-wmf.23
  • 19:15 marxarelli: rolling back group1 after rapid rise in fatals
  • 19:14 dduvall@deploy1001: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 18:49 RoanKattouw: Deployed patches for T206130
  • 18:36 papaul: reinstalling OS on lvs2010
  • 18:16 mutante: lvs2010 - scheduled downtime for host and services for 12 hours for reinstall
  • 18:09 mutante: lvs2009 - schedule downtime in icinga for 4 hours, reinstall in progress
  • 18:08 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@d5bab41]: Bump cirrusSearchLinksUpdate concurrency to 20 (duration: 00m 57s)
  • 18:07 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@d5bab41]: Bump cirrusSearchLinksUpdate concurrency to 20
  • 18:07 XioNoX: disable ulsfo Zayo transit/transport links
  • 17:42 XioNoX: re-enable cr1-eqiad:ae1 - T201145
  • 17:28 XioNoX: start of recabling asw2-a-eqiad between asw and cr1 - T201145
  • 17:26 XioNoX: disable cr1-eqiad:ae1 - T201145
  • 17:10 papaul: reinstalling OS on lvs2009
  • 16:24 reedy@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/Flow/: fixup flow exporting T203424 (duration: 01m 03s)
  • 15:45 ejegg: updated fundraising CiviCRM from e3e1963915 to 80cb98e33e
  • 14:42 jynus: fixed some prometheus metrics grants on dbstore1001:3306, db1116:3317 and db1116:3318
  • 14:07 banyek: converting wikidatawiki.change_tag to TokuDB on host dbstore1002 (T205544)
  • 12:54 urandom: DROP unused RESTBase tables - T204752
  • 12:26 stephanebisson: Finished mwscript extensions/ORES/maintenance/BackfillPageTriageQueue.php --wiki enwiki (T203286)
  • 12:12 stephanebisson: Starting mwscript extensions/ORES/maintenance/BackfillPageTriageQueue.php --wiki enwiki (T203286)
  • 11:54 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Don't purge articlequality, draftquality scores (T203286) (duration: 00m 57s)
  • 11:45 banyek: converting enwiki.slots to TokuDB on host dbstore1002 (T205544)
  • 11:42 pmiazga@deploy1001: Synchronized wmf-config: SWAT: Remove dead config relating to wgRelatedArticlesEnabledBucketSize (T202306) (duration: 00m 57s)
  • 11:38 arturo: downtime cloudcontrol1003,1004 for 2h for T203177
  • 11:30 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create eliminator group at Vietnamese Wikibooks (T202207) (duration: 00m 58s)
  • 11:25 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix a typo in zhwikiversity's importsources definition (T201328) (duration: 00m 57s)
  • 11:20 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Fix a typo in lift account creation cap for cswiki event (T206119) (duration: 00m 56s)
  • 10:41 jynus: start compressing dbstore1001:x1 tables
  • 09:26 jynus: reducing io overhead temporarily in exchange for crash safety for s5 replicas T184805
  • 09:23 jynus: fixing replication filters on dbstore1002 (again)
  • 08:34 jynus: fixing replication filters on dbstore1002
  • 08:18 jynus: starting importing of certain s3 wikis into eqiad s5 master T184805
  • 07:51 jynus: deploying replication filters to s5 at labsdb1009/10/11 and dbstore1002 T184805
  • 07:06 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@27062b4] (maps1004): Specify WDQS endpoint at wdqs.discovery.wmnet in the service config (T205607) (duration: 00m 28s)
  • 07:05 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@27062b4] (maps1004): Specify WDQS endpoint at wdqs.discovery.wmnet in the service config (T205607)
  • 06:42 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2055 (duration: 00m 55s)
  • 06:37 marostegui: Deploy schema change on db2055 - T205913
  • 06:37 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2055 (duration: 00m 56s)
  • 06:03 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2085:3311 (duration: 00m 56s)
  • 05:59 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@e1aab7b]: Request Parsoid HTML version 2.0.0 (0866a07) (duration: 03m 32s)
  • 05:57 marostegui: Deploy schema change on db2085:3311 - T205913
  • 05:56 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@e1aab7b]: Request Parsoid HTML version 2.0.0 (0866a07)
  • 05:55 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2085:3311 (duration: 00m 58s)
  • 05:26 marostegui: Deploy schema change on db1067 (s1 eqiad master), lag will be generated - T205913
  • 05:25 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 00m 57s)
  • 05:24 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/languages/Language.php: T206030 - I985dfa3eb17 (duration: 00m 56s)
  • 05:21 marostegui: Deploy schema change on db1075 (s3 eqiad master), lag will be generated - T205913
  • 05:20 marostegui: Deploy schema change on db2070 - T205913
  • 05:20 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2070 (duration: 00m 56s)
  • 04:45 krinkle@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/NavigationTiming: T205580 - I04c52658fbf6d (duration: 01m 03s)
  • 00:42 Amir1: Evening SWAT is done
  • 00:41 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/GlobalPreferences/resources/ext.GlobalPreferences.global.ooui.js: SWAT: Fail gracefully if we failed to find associated widget (T205991) (duration: 00m 57s)
  • 00:38 mutante: icinga1001 (not prod yet), removing all icinga packages, running puppet to reinstall them, debugging dpkg issue
  • 00:19 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/GlobalPreferences/resources/ext.GlobalPreferences.global.ooui.js: SWAT: Fail gracefully if we failed to find associated widget (T205991) (duration: 00m 55s)

2018-10-02

  • 23:54 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/PageTriage/i18n/en.json: SWAT: Align copyvio log terminology (T199359) (duration: 00m 56s)
  • 23:38 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/PageTriage/modules/ext.pageTriage.views.list/ext.pageTriage.listControlNav.underscore: SWAT: Hide copyvio, none afc filter options behind flag (T205918) (duration: 00m 56s)
  • 23:33 ejegg: updated fundraising CiviCRM from c353eba283 to e3e1963915
  • 23:26 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.24/extensions/ORES/tests/phpunit/includes/HooksTest.php: SWAT: Disable RCFilters in tests (duration: 00m 54s)
  • 23:16 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/FlaggedRevs/frontend/specialpages/reports/ProblemChanges_body.php: SWAT: Fix using the old index when new indexes are not there (T205904) (duration: 00m 57s)
  • 22:53 shdubsh: powercycling icinga1001 after removing problematic entry from fstab
  • 22:26 gtirloni: labstore2003 re-started service block_sync
  • 21:39 XioNoX: Fix unused vlans XLink1/2 on asw2-a5
  • 21:15 banyek: enabling puppet on es2001
  • 21:12 banyek: re-enabling and starting backups on host es2001 (T205257)
  • 21:01 gtirloni: labstore2003 stopped service block_sync
  • 20:15 dduvall@deploy1001: Finished scap: group0 to php-1.32.0-wmf.24 (duration: 33m 00s)
  • 20:04 Jeff_Green: authdns-update to deploy new IP for frbast2001.frack.eqiad.wmnet
  • 19:50 XioNoX: update prefix-list fundraising-codfw-internal4 to /24 on pfw3-codfw - T204271
  • 19:42 dduvall@deploy1001: Started scap: group0 to php-1.32.0-wmf.24
  • 19:36 dduvall@deploy1001: Pruned MediaWiki: 1.32.0-wmf.19 (duration: 07m 25s)
  • 19:21 XioNoX: update fw policies on pfw3-eqiad - T204271
  • 19:19 XioNoX: update fw policies on pfw3-codfw - T204271
  • 18:39 XioNoX: replace 10.195.0.73/29 with 10.195.0.65/28 on pfw3-codfw - T204271
  • 18:26 XioNoX: remove old 10.195.0.65/29 from pfw3-codfw - T204271
  • 18:24 jynus: restarting ferm on dbstore2002 T205257
  • 18:08 arlolra: Updated Parsoid to 65d6f82 (T163438, T205674, T205673)
  • 18:07 ariel@deploy1001: Finished deploy [dumps/dumps@a9570fb]: fix incr dumps multiversion conf setting (duration: 00m 06s)
  • 18:07 ariel@deploy1001: Started deploy [dumps/dumps@a9570fb]: fix incr dumps multiversion conf setting
  • 18:01 arlolra@deploy1001: Finished deploy [parsoid/deploy@19053a3]: Updating Parsoid to 65d6f82 (duration: 10m 44s)
  • 17:51 arlolra@deploy1001: Started deploy [parsoid/deploy@19053a3]: Updating Parsoid to 65d6f82
  • 17:37 XioNoX: update NAT for frbast2001 on pfw3-codfw - T204271
  • 17:25 XioNoX: update fw policies on pfw3-eqiad - T204271
  • 17:22 XioNoX: update fw policies on pfw3-codfw - T204271
  • 17:22 andrewbogott: upgraded wikitech-static to remotes/origin/REL1_31
  • 17:18 andrewbogott: upgrading debian packages and MediaWiki version on wikitech-static
  • 16:53 jynus: setup test s3 replication channel on db1110 (filtered)
  • 16:49 XioNoX: assign 10.195.0.129/29 to pfw3-codfw:reth0.2133 - T204271
  • 16:38 cmjohnson1: swapping failed disk db1067 T205780
  • 16:04 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@093551f]: Increase cirrusSearchLinksUpdate concurrency (duration: 01m 06s)
  • 16:03 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@093551f]: Increase cirrusSearchLinksUpdate concurrency
  • 15:50 marxarelli: cutting 1.32.0-wmf.24 branch
  • 15:33 gehel: cleanup old cronjob (cleanup GC logs) on all elasticsearch servers
  • 15:24 akosiaris: upgrade mathoid chart version to 0.0.11
  • 15:24 akosiaris@deploy1001: scap-helm mathoid finished
  • 15:23 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 15:23 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 15:23 akosiaris@deploy1001: scap-helm mathoid upgrade production stable/mathoid [namespace: mathoid, clusters: eqiad,codfw]
  • 15:21 akosiaris@deploy1001: scap-helm mathoid finished
  • 15:21 akosiaris@deploy1001: scap-helm mathoid cluster codfw completed
  • 15:21 akosiaris@deploy1001: scap-helm mathoid cluster eqiad completed
  • 15:21 akosiaris@deploy1001: scap-helm mathoid upgrade -h [namespace: mathoid, clusters: eqiad,codfw]
  • 14:11 banyek: powering off dbstore2002.codfw.wmnet for BBU change (T205257)
  • 13:47 marostegui: Deploy schema change on s4 eqiad, this will generate lag on eqiad - T205913
  • 13:06 marostegui: Deploy schema change on s7 eqiad, this will generate lag on eqiad - T205913
  • 12:47 banyek: converting enwiki.content to TokuDB on host dbstore1002 (T205544)
  • 12:47 banyek: converting enwiki.contents to TokuDB on host dbstore1002 (T205544)
  • 11:58 banyek: converting wikidatawiki.slots to TokuDB on host dbstore1002 (T205544)
  • 11:41 arturo: downtime labstore1007 load check in icinga for 1d
  • 11:21 zeljkof: EU SWAT finished
  • 11:19 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/FlaggedRevs/frontend/specialpages/reports/ProblemChanges_body.php: SWAT: Use proper index on change_tag table (T205904) (duration: 00m 57s)
  • 10:58 mobrovac@deploy1001: Synchronized rpc/RunSingleJob.php: RunSingleJob: Delay job execution while in read-only mode - T204154 (duration: 00m 57s)
  • 10:34 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2092 (duration: 00m 56s)
  • 10:24 marostegui: Deploy schema change on db2092 - T203709
  • 10:24 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2092 (duration: 00m 56s)
  • 09:30 marostegui: Deploy schema change on s2 eqiad master, lag will be generated T205913
  • 08:43 banyek: disabling puppet on es2001 and disabling backups too
  • 08:28 marostegui: Deploy schema change on s6 eqiad master, lag will be generated T205913
  • 08:16 jynus: test recover some s3 wiki data onto db1110 (s5)
  • 08:04 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 56s)
  • 08:04 marostegui: Deploy schema change on s5 eqiad master, lag will be generated T205913
  • 08:01 banyek: converting wikidatawiki.content to TokuDB on host dbstore1002 (T205544)
  • 07:54 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2071 (duration: 00m 55s)
  • 07:50 marostegui: Deploy schema change on db2071 T205913
  • 07:50 mholloway-shell@deploy1001: Finished deploy [tilerator/deploy@6c80537] (maps1004): Disable event logging requests and remove HTTP proxy (duration: 00m 17s)
  • 07:49 mholloway-shell@deploy1001: Started deploy [tilerator/deploy@6c80537] (maps1004): Disable event logging requests and remove HTTP proxy
  • 07:49 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 00m 56s)
  • 07:48 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@0bf513a] (maps1004): Remove HTTP proxy (duration: 00m 16s)
  • 07:48 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@0bf513a] (maps1004): Remove HTTP proxy
  • 07:42 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2088:3311 (duration: 00m 56s)
  • 07:36 marostegui: Deploy schema change on db2088:3311 T205913
  • 07:36 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2088:3311 (duration: 00m 55s)
  • 07:32 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2072 (duration: 00m 55s)
  • 07:18 marostegui: Deploy schema change on db2072 T205913
  • 07:17 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2072 (duration: 01m 02s)
  • 05:22 _joe_: stopped tilerator on maps1004, was spamming like crazy
  • 01:18 ejegg: updated CiviCRM from e7a620a00c to c353eba283

2018-10-01

  • 23:44 eileen: update process control revision is b9c7ab286e - define but not enable Redis
  • 23:43 foks: disabling 2FA for two users
  • 23:31 twentyafterfour: finished creating database tables
  • 23:18 twentyafterfour: creating ipblocks_restrictions table (command run on mwmaint2001: foreachwiki sql.php maintenance/archives/patch-ipblocks_restrictions-table.sql)
  • 22:52 ppchelko@deploy1001: Finished deploy [restbase/deploy@babfe80]: Don't log the request for transform failures, take 3, feeds check timeouts (duration: 06m 22s)
  • 22:46 ppchelko@deploy1001: Started deploy [restbase/deploy@babfe80]: Don't log the request for transform failures, take 3, feeds check timeouts
  • 22:45 ppchelko@deploy1001: Finished deploy [restbase/deploy@babfe80]: Don't log the request for transform failures, take 2, feeds check timeouts (duration: 03m 57s)
  • 22:41 ppchelko@deploy1001: Started deploy [restbase/deploy@babfe80]: Don't log the request for transform failures, take 2, feeds check timeouts
  • 22:41 ppchelko@deploy1001: Finished deploy [restbase/deploy@babfe80]: Don't log the request for transform failures (duration: 12m 27s)
  • 22:29 ppchelko@deploy1001: Started deploy [restbase/deploy@babfe80]: Don't log the request for transform failures
  • 21:17 arlolra: Updated Parsoid to 224ecde (T198504, T133673, T202666)
  • 20:45 arlolra@deploy1001: Finished deploy [parsoid/deploy@8ff45db]: Updating Parsoid to 224ecde (duration: 08m 22s)
  • 20:37 arlolra@deploy1001: Started deploy [parsoid/deploy@8ff45db]: Updating Parsoid to 224ecde
  • 20:35 gehel@deploy1001: Finished deploy [wdqs/wdqs@a637583]: New version of WDQS GUI, updater and blazegraph (duration: 14m 00s)
  • 20:21 gehel@deploy1001: Started deploy [wdqs/wdqs@a637583]: New version of WDQS GUI, updater and blazegraph
  • 19:52 gehel@deploy1001: Finished deploy [wdqs/wdqs@a637583]: New version of WDQS GUI, updater and blazegraph (wdqs1009 only) (duration: 00m 30s)
  • 19:51 gehel@deploy1001: Started deploy [wdqs/wdqs@a637583]: New version of WDQS GUI, updater and blazegraph (wdqs1009 only)
  • 19:27 ppchelko@deploy1001: Finished deploy [restbase/deploy@7caf4d8]: Content-negotiation filter going live T128040 (duration: 03m 38s)
  • 19:24 ppchelko@deploy1001: Started deploy [restbase/deploy@7caf4d8]: Content-negotiation filter going live T128040
  • 19:11 thcipriani: restarting ci jenkins for new plugins
  • 18:33 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable page issues A/B test at 20% rate (T200792) (duration: 00m 56s)
  • 18:28 Amir1: ladsgroup@mwmaint2001:~$ mwscript extensions/CentralAuth/maintenance/deleteLocalPasswords.php --wiki=enwiki --prefix (T201009)
  • 18:23 catrope@deploy1001: Synchronized php-1.32.0-wmf.23/maintenance/includes/DeleteLocalPasswords.php: T201009 (duration: 00m 56s)
  • 18:17 catrope@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/PageTriage/: Ensure valid AFC option is selected (T205324, T205168); hide copyvio behind a global var and URL param (duration: 00m 57s)
  • 18:12 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable page issues A/B test at 5% rate (T200792) (duration: 00m 59s)
  • 17:59 XioNoX: push fw change on pfw3-eqiad - T205888
  • 17:57 XioNoX: push fw change on pfw3-codfw - T205888
  • 17:28 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@a637583]: Test deployment for recent updater build and GUI changes. Also blazegraph updates (wdqs1009) (duration: 01m 46s)
  • 17:27 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@a637583]: Test deployment for recent updater build and GUI changes. Also blazegraph updates (wdqs1009)
  • 17:06 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1093, db1064 (duration: 00m 57s)
  • 17:02 jynus: stopping some mariadb instances on dbstore1001 and starting compression T201392
  • 16:26 ppchelko@deploy1001: Started restart [cpjobqueue/deploy@58f9ed3]: Fix KafkaConsumer not connected error
  • 15:16 jynus: stopping db1064 to clone it to dbstore1001
  • 15:00 akosiaris: upgrade etherpad to 1.7.0-2
  • 14:14 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting MCR migration stage to write-both/read-new on mediawikiwiki (T198308) (duration: 00m 56s)
  • 13:51 banyek: Downtimed the slave lag monitoring on dbstore1002 while the tables are getting converted (T205544)
  • 12:38 akosiaris: upload hfst_3.13.0~r3461-1+wmf2 to apt.wikimedia.org/jessie-wikimedia/main. T199962
  • 12:26 banyek: converting enwiki.categorylinks to TokuDB on host dbstore1002 (T205544)
  • 12:19 banyek: stopping replication on s2@dbstore2002: the tables are being compressed (T204930)
  • 12:19 banyek: stopping replication on s2@dbstore2002: the tables are being compressed
  • 12:15 banyek: enabling puppet on labsdb1009, labsdb1010, labsdb1011 (T183983)
  • 12:13 zeljkof: EU SWAT finished
  • 12:12 zfilipin@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/ContentTranslation/: SWAT: Fix error in CXTransclusionNode#afterRender method (T205521) (duration: 00m 59s)
  • 11:56 jynus: stopping db1093 to clone it to dbstore1001
  • 11:52 arturo: install prometheus-openstack-exporter 0.0.8-3 in reprepro T203177
  • 11:41 zfilipin@deploy1001: Synchronized wmf-config: SWAT: Remove unused default source language config for CX (duration: 00m 57s)
  • 11:16 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2058 (duration: 00m 55s)
  • 11:09 _joe_: killed bash runner.sh by user ladsgroup on mwmaint2001
  • 10:58 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2058 (duration: 00m 57s)
  • 10:52 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1093, db1064 (duration: 00m 57s)
  • 10:42 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 56s)
  • 10:41 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 59s)
  • 10:21 godog: repair /dev/sdf1 /dev/sde1 on ms-be1041 - T199198
  • 10:15 Amir1: ladsgroup@mwmaint2001:~$ mwscript extensions/CentralAuth/maintenance/deleteLocalPasswords.php --prefix on all CentralAuth wikis (T201009)
  • 10:10 Amir1: mwscript extensions/CentralAuth/maintenance/deleteLocalPasswords.php --wiki=fawiki --delete (T201009)
  • 09:33 godog: test formatting sdh and sdi on ms-be2040 with crc=0 - T199198
  • 09:15 volans: Set Racktables in read-only mode - T199083
  • 08:56 _joe_: rolling restart of parsoid in codfw; afterwards, parsoid will connect to the MediaWiki API via HTTPS
  • 08:54 _joe_: rolling restart of parsoid in eqiad
  • 07:54 banyek: disabling puppet on labsdb1009, labsdb1010, labsdb1011 (T183983)
  • 07:54 banyek: disabling puppet on labsdb1009, labsdb1010, labsdb1011
  • 07:00 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@ab6cb74] (maps1004): Update kartotherian to latest (T205462) (duration: 00m 16s)
  • 07:00 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@ab6cb74] (maps1004): Update kartotherian to latest (T205462)
  • 06:39 mholloway-shell@deploy1001: Finished deploy [tilerator/deploy@22f90ee] (maps1004): Update tilerator to latest (T205462) (duration: 00m 19s)
  • 06:39 mholloway-shell@deploy1001: Started deploy [tilerator/deploy@22f90ee] (maps1004): Update tilerator to latest (T205462)
  • 05:35 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1103:3312 (duration: 00m 56s)
  • 05:19 marostegui: Stop replication on dbstore1002 and db1103:3312 in sync
  • 05:19 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1103:3312 (duration: 01m 01s)
  • 05:19 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@07cbfb4]: Update mobileapps to a1fa41b (duration: 03m 18s)
  • 05:15 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@07cbfb4]: Update mobileapps to a1fa41b
  • 05:07 marostegui: Deploy schema change on s1 codfw master - T203709
  • 03:21 onimisionipe: restarting inplace reindexing of enwiki and viwiki at codfw - T204362

2018-09-30

  • 12:41 ariel@deploy1001: Finished deploy [dumps/dumps@26aaee6]: make location of MWScript.php configurable (duration: 00m 03s)
  • 12:41 ariel@deploy1001: Started deploy [dumps/dumps@26aaee6]: make location of MWScript.php configurable

2018-09-29

  • 13:01 gtirloni: tools-mail cleaned frozen messages in exim queue
  • 01:13 krinkle@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/ORES/includes/ORESService.php: T205651 - I1beaea (duration: 00m 59s)

2018-09-28

  • 20:57 mutante: analytics1003 - unmounted and remounted /mnt/hdfs after Icinga alerts that it was not accessible - commands from https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration#Fixing_HDFS_mount_at_/mnt/hdfs - like it happened before on stat1004 and others (T182342)
  • 19:18 mutante: phab1001 (Phabricator), scheduled downtime, reboot for maintenance
  • 19:03 mutante: phab2001 - scheduled downtime, rebooting for kernel
  • 16:25 XioNoX: add HKBN BGP sessions to esams and eqsin
  • 16:05 jynus: compressing tables at db1116:3317, stopping replication
  • 16:01 Amir1: ladsgroup@mwmaint2001:~$ mwscript extensions/CentralAuth/maintenance/deleteLocalPasswords.php --wiki=fawiki --prefix (T201009)
  • 15:51 Amir1: ladsgroup@mwmaint2001:~$ mwscript extensions/CentralAuth/maintenance/deleteLocalPasswords.php --delete on mediawiki.org and testwiki
  • 15:10 XioNoX: activate Equinix peering sessions on cr4-ulsfo
  • 14:29 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1086, db1104 (duration: 00m 55s)
  • 14:22 banyek: converting dewiki.flaggedimages to TokuDB on host dbstore1002 (T205544)
  • 13:27 moritzm: rebooting tungsten for kernel security update
  • 13:15 mobrovac@deploy1001: Finished deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints, take #4 (duration: 06m 36s)
  • 13:09 mobrovac@deploy1001: Started deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints, take #4
  • 13:08 mobrovac@deploy1001: Finished deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints, take #3 (duration: 03m 39s)
  • 13:05 mobrovac@deploy1001: Started deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints, take #3
  • 13:05 mobrovac@deploy1001: Finished deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints, take #2 (duration: 04m 57s)
  • 13:00 mobrovac@deploy1001: Started deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints, take #2
  • 13:00 mobrovac@deploy1001: Finished deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints (duration: 10m 55s)
  • 12:49 mobrovac@deploy1001: Started deploy [restbase/deploy@7caf4d8]: Update metrics top endpoints
  • 12:42 arturo: downtime cloudcontrol1003.wikimedia.org for 2H (tests related to T203177)
  • 12:39 banyek: converting wikidatawiki.text to TokuDB on host dbstore1002 (T205544)
  • 12:28 arturo: downtime cloudcontrol1004.wikimedia.org for 2H (tests related to T203177)
  • 12:01 moritzm: installing php security updates on einsteinium (icinga.wikimedia.org)
  • 11:38 arturo: add prometheus-openstack-exporter 0.0.8-2 to reprepro (T203177)
  • 11:21 moritzm: installing php5 security updates
  • 11:01 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@961aa5a]: Update mobileapps to 38271fa (duration: 03m 05s)
  • 10:58 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@961aa5a]: Update mobileapps to 38271fa
  • 10:55 banyek: converting dewiki.flaggedtemplates to TokuDB on host dbstore1002 (T205544)
  • 10:09 moritzm: reimaging mw2150 to test router ACLs on cumin1001
  • 09:45 jynus: stop db1086 and db1104 for cloning to db1116
  • 09:41 moritzm: installing ca-certificates updates on trusty/stretch
  • 09:34 banyek: converting wikishared.cx_corpora to TokuDB on host dbstore1002 (T205544)
  • 09:30 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1086, db1104 (duration: 00m 57s)
  • 08:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 (duration: 00m 54s)
  • 08:11 marostegui: Deploy schema change on s3 eqiad, this will generate lag - T203709
  • 07:40 marostegui: Stop replication in sync on dbstore1002 and db1078
  • 07:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 (duration: 00m 55s)
  • 07:27 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2055 (duration: 00m 56s)
  • 07:23 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc (duration: 02m 43s)
  • 07:20 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc
  • 07:03 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc
  • 06:30 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc (duration: 01m 57s)
  • 06:28 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc
  • 06:27 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc (duration: 06m 40s)
  • 06:20 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc
  • 06:20 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc (duration: 00m 44s)
  • 06:19 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@bf09080]: Update mobileapps to 7878ffc
  • 06:07 marostegui: Deploy schema change on db2055 - T203709
  • 06:07 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2055 (duration: 00m 55s)
  • 05:54 marostegui: Deploy schema change on s7 eqiad, this will generate lag - T203709
  • 05:47 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1094 (duration: 00m 55s)
  • 05:29 marostegui: Stop replication in sync on db1094 and dbstore1002:s7
  • 05:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1094 (duration: 00m 59s)

2018-09-27

  • 23:29 thcipriani@deploy1001: Synchronized wmf-config/WikibaseSearchSettings.php: SWAT: Enable phrase search config T163642 (duration: 00m 56s)
  • 23:08 stephanebisson: Finished mwscript extensions/ORES/maintenance/BackfillPageTriageQueue.php --wiki enwiki (T203286)
  • 21:04 ejegg: updated fundraising CiviCRM from 43ca105873 to e7a620a00c
  • 20:09 arlolra@deploy1001: Finished deploy [parsoid/deploy@0272096]: Updating Parsoid to ff6ffb5 (duration: 08m 05s)
  • 20:03 XioNoX: update pfw3-codfw/eqiad firewall rules - T205574
  • 20:01 arlolra@deploy1001: Started deploy [parsoid/deploy@0272096]: Updating Parsoid to ff6ffb5
  • 19:31 bawolff: deploy related to T194204
  • 19:29 XioNoX: update all cr1/2-eqiad to include cumin1001 - T205513
  • 19:10 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove "metawiki" from "wgForceUIMsgAsContentMsg" T205633 (duration: 00m 56s)
  • 18:59 XioNoX: update all MR routers to include cumin1001 - T205513
  • 18:49 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/WikimediaEvents/modules/wikibase/ext.wikimediaEvents.completionClicks.js: SWAT: Ignore clicks with empty search string T205301 (duration: 00m 56s)
  • 18:43 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Page-previews on Wikivoyage T203981 (duration: 00m 57s)
  • 18:17 stephanebisson: Starting mwscript extensions/ORES/maintenance/BackfillPageTriageQueue.php --wiki enwiki (T203286)
  • 18:14 arlolra: Updated Parsoid to af3a920 (T198511, T163438, T108776, T205334, T114413)
  • 18:04 arlolra@deploy1001: Finished deploy [parsoid/deploy@6a2c25c]: Updating Parsoid to af3a920 (duration: 10m 43s)
  • 17:54 arlolra@deploy1001: Started deploy [parsoid/deploy@6a2c25c]: Updating Parsoid to af3a920
  • 17:50 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@a0054ba]: Update mobileapps to 0d6c2b7 (duration: 03m 21s)
  • 17:47 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@a0054ba]: Update mobileapps to 0d6c2b7
  • 17:27 XioNoX: reboot asw2-a-eqiad (not in prod) - T201145
  • 17:27 ladsgroup@deploy1001: Finished deploy [ores/deploy@a717199]: Send metrics for non-major responses (duration: 23m 55s)
  • 17:03 ppchelko@deploy1001: Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 5, feeds.... (duration: 06m 32s)
  • 17:03 ladsgroup@deploy1001: Started deploy [ores/deploy@a717199]: Send metrics for non-major responses
  • 17:03 cmjohnson1: swapping failed disk slot 7 db1069
  • 16:57 ppchelko@deploy1001: Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 5, feeds....
  • 16:57 ppchelko@deploy1001: Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 4, feeds.... (duration: 04m 46s)
  • 16:52 ppchelko@deploy1001: Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 4, feeds....
  • 16:51 ppchelko@deploy1001: Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 3, feeds (duration: 03m 59s)
  • 16:47 ppchelko@deploy1001: Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 3, feeds
  • 16:47 ppchelko@deploy1001: Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 2, feeds (duration: 03m 49s)
  • 16:43 ppchelko@deploy1001: Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 2, feeds
  • 16:43 ppchelko@deploy1001: Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 (duration: 03m 06s)
  • 16:39 ppchelko@deploy1001: Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040
  • 16:34 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T204791 (duration: 00m 57s)
  • 16:29 ppchelko@deploy1001: Finished deploy [restbase/deploy@0f11d5d]: Canary on 2001 for content negotiations T128040 (duration: 04m 17s)
  • 16:25 ppchelko@deploy1001: Started deploy [restbase/deploy@0f11d5d]: Canary on 2001 for content negotiations T128040
  • 15:55 reedy@deploy1001: Synchronized php-1.32.0-wmf.23/autoload.php: new mtx script (duration: 00m 56s)
  • 15:31 reedy@deploy1001: Synchronized php-1.32.0-wmf.23/maintenance/: add new mtx script (duration: 00m 58s)
  • 15:19 cmjohnson1: swapping out failed disk slot 3 rdb1004
  • 15:10 reedy@deploy1001: Synchronized wmf-config/extension-list: Bye Bye Education Program, Bye Bye. T125618 (duration: 00m 55s)
  • 15:08 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bye Bye Education Program, Bye Bye. T125618 (duration: 00m 56s)
  • 15:07 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Bye Bye Education Program, Bye Bye. T125618 (duration: 00m 58s)
  • 15:07 XioNoX: add peering with Telin in esams
  • 15:04 arturo: T196507 2h downtime cloudvirt1019 in icinga
  • 15:02 moritzm: rebooting labtestvirt2003 for microcode tests
  • 14:48 cmjohnson1: disabling checks on cloudvirt1019 to replace raid controller cable T196507
  • 14:29 banyek: converting srwiki.pagelinks to TokuDB on host dbstore1002 (T205544)
  • 13:49 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T205514: revert: depooling db1104, adding db1109 as temporary api host for s8 (duration: 00m 56s)
  • 13:46 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T205514: revert: depooling db1104, adding db1109 as temporary api host for s8 (duration: 00m 55s)
  • 13:44 moritzm: installing ca-certificates updates for jessie/stretch
  • 13:41 banyek: repooling db1104 (T205514)
  • 13:34 moritzm: installing postgres security updates on labsdb1004
  • 13:17 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.23
  • 13:13 godog: repair /dev/sdh1 on ms-be2042 - T199198
  • 12:24 gehel: reboot of wdqs2004-2006 for kernel upgrade
  • 12:11 moritzm: installing libapache2-mod-perl2 security updates
  • 11:45 zeljkof: EU SWAT finished
  • 11:44 zfilipin@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/ContentTranslation/: SWAT: Use numerical option when setting CX version preference (T205493) (duration: 00m 57s)
  • 10:50 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: NOOP phpdoc comment changes pt2/2 (duration: 00m 56s)
  • 10:48 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: NOOP phpdoc comment changes pt1/2 (duration: 00m 56s)
  • 10:41 godog: stop icinga-wm temporarily
  • 09:54 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T205514: depooling db1104, adding db1109 as temporary api host for s8 (duration: 00m 56s)
  • 09:35 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: RejectParserCacheValue Hook to purge wikidatawiki to 2018-09-19 (duration: 00m 56s)
  • 09:15 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: RejectParserCacheValue Hook to purge wikidatawiki to 2018-09-15 (duration: 00m 57s)
  • 09:04 moritzm: rebooting ores1* for kernel security updates
  • 08:56 banyek: stopping replication & mariadb on db1104 and db1092 as db1092 is getting recloned from db1104 (T205514)
  • 08:54 godog: test formatting sde on ms-be1040 with crc=0 - T199198
  • 08:48 marostegui: Deploy schema change on labswiki (db1073 master) - T203709
  • 08:39 marostegui: Deploy schema change on labtestwiki - T203709
  • 08:30 banyek: upgrading db1104 (kernel-mariadb) and rebooting it (T205514)
  • 07:58 banyek: rebooting db1108 for kernel & mariadb upgrade (T205288)
  • 07:55 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: RejectParserCacheValue Hook to purge wikidatawiki to 2018-09-12 (duration: 00m 55s)
  • 07:54 addshore@deploy1001: sync-file aborted: RejectParserCacheValue Hook to purge wikidatawiki to 2018-09-08 (duration: 00m 01s)
  • 07:53 jynus: enabling puppet back on all db hosts
  • 07:45 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: RejectParserCacheValue Hook to purge wikidatawiki to 2018-09-08 (duration: 00m 55s)
  • 07:30 godog: test formatting sdd on ms-be1040 with crc=0 - T199198
  • 07:29 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: RejectParserCacheValue Hook to purge wikidatawiki to 2018-09-06 (duration: 00m 57s)
  • 07:27 godog: test formatting sdc on ms-be1040 with crc=0 - T199198
  • 07:26 jynus: enabling puppet on all core eqiad hosts
  • 07:08 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 (duration: 00m 55s)
  • 07:05 jynus: disabling puppet on all databases
  • 07:00 marostegui: Stop replication in sync on db1089 and dbstore1002:s1
  • 06:27 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 (duration: 00m 56s)
  • 06:17 marostegui: Deploy schema change on dbstore2002:3311
  • 05:41 marostegui: Deploy schema change on s1 eqiad master, lag will be generated - T203709
  • 05:40 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2072 (duration: 00m 57s)
  • 05:26 marostegui: Drop wikiuser on dbstore1002
  • 05:21 marostegui: Deploy schema change on db2072
  • 05:21 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2072 (duration: 01m 00s)
  • 03:04 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Thu Sep 27 03:04:40 UTC 2018 (duration 10m 51s)
  • 02:53 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.23) (duration: 07m 16s)
  • 02:41 onimisionipe: starting inplace reindexing of viwiki and commonswiki - T204362
  • 02:35 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.22) (duration: 13m 25s)
  • 01:14 mutante: repair /dev/sdn1 on ms-be0241 - T199198
  • 01:14 mutante: repair /dev/sde1 on ms-be0240 - T199198
  • 00:22 krinkle@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/FeaturedFeeds: T205573 (duration: 00m 59s)

2018-09-26

  • 23:23 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/WikimediaEvents/modules/all/ext.wikimediaEvents.readingDepth.js: SWAT: Ensure Minerva has initialised before loading and executing ReadingDepth T204144 (duration: 00m 57s)
  • 23:14 eileen: process control - re-enable omnirepair - process-control config revision is b8549d3344
  • 22:48 eileen: civicrm revision changed from 018ff6325f to 43ca105873, config revision is 3b785f0af4
  • 22:28 XioNoX: enable BGP to transits on cr3-ulsfo
  • 21:20 thcipriani: restarting releases jenkins for updates
  • 21:03 gtirloni: tools-mail deleted frozen messages from exim queue
  • 19:54 mutante: icinga1001 - temp. disabling puppet, remove --purge all icinga packages, rm -rf /etc/nagios and /etc/icinga, let puppet recreate everything now that it should not mess with user/group on stretch (T202782)
  • 19:42 krinkle@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/Gadgets/includes/SpecialGadgets.php: (no justification provided) (duration: 00m 57s)
  • 19:25 shdubsh: manually installing python3-jinja2 on puppetmaster1001 to test naggen2 python3 upgrade
  • 18:01 joal@deploy1001: Finished deploy [analytics/aqs/deploy@39b909e]: Deploy Wikistats2 top-metrics updates (duration: 11m 58s)
  • 17:49 joal@deploy1001: Started deploy [analytics/aqs/deploy@39b909e]: Deploy Wikistats2 top-metrics updates
  • 17:22 jforrester@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/3D/src/ThreeDThumbnailImage.php: Hot-deploy I5bb4b699a fix for T205554 train-blocker (duration: 00m 56s)
  • 16:47 jforrester@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/ORES/includes/SpecialORESModels.php: SWAT T205228 fix Iddb9c1f9e (duration: 00m 56s)
  • 16:36 XioNoX: shutting down cr1/2-ulsfo for DC move
  • 16:36 robh: shutting down all ulsfo servers for relocation
  • 16:35 jforrester@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/CentralAuth/includes/CentralAuthHooks.php: SWAT deploy I65ae2f05e (duration: 00m 56s)
  • 16:34 jforrester@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/CentralAuth/includes/CentralAuthHooks.php: SWAT deploy I65ae2f05e (duration: 00m 57s)
  • 16:18 godog: test formatting sdd on ms-be2040 with crc=0 - T199198
  • 16:10 zfilipin@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.23 (duration: 00m 54s)
  • 16:09 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.23
  • 16:07 bblack: downtimed ipsec alerts on cp[12]xxx for ulsfo outage
  • 15:55 addshore@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/WikibaseLexeme: form statement groups: ids without lexeme part T196226 (duration: 01m 01s)
  • 15:54 XioNoX: ulsfo X-connect hot cut scheduled in 5 min
  • 15:53 addshore@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/WikibaseLexeme: form statement groups: ids without lexeme part T196226 (duration: 01m 06s)
  • 15:29 pmiazga@deploy1001: Synchronized php-1.32.0-wmf.23/skins/MinervaNeue/includes/skins/SkinMinerva.php: SWAT: Create $returntoquery variable properly (T205449) (duration: 00m 56s)
  • 15:02 gehel: rolling restart of wdqs-test and wdqs-internal for JVM + kernel upgrade
  • 14:48 reedy@deploy1001: Synchronized php-1.32.0-wmf.23/includes/specials/SpecialMostlinkedcategories.php: T205469 (duration: 00m 55s)
  • 14:46 reedy@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/CirrusSearch/: T205473 (duration: 01m 06s)
  • 13:45 marostegui: Stop replication on dbstore1002:s4 for maintenance
  • 13:38 elukey: reboot an-master1001 to clear out an issue with systemd@fsck (Hadoop master, failover to an-master1002 included)
  • 13:34 marostegui: Stop replication on s4 eqiad master for maintenance, lag will be generated
  • 13:32 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2071 (duration: 00m 55s)
  • 13:12 marostegui: Deploy schema change on db2071 - T203709
  • 13:12 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 00m 56s)
  • 13:03 moritzm: installing tomcat8 security updates
  • 12:33 marostegui: Deploy schema change on s4 eqiad, will generate lag - T203709
  • 11:43 zeljkof: EU SWAT finished
  • 11:42 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Lift account creation IP cap for 2018-09-26 (T205529) (duration: 00m 56s)
  • 11:30 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor in Project namespace on srwiki (T205206) (duration: 00m 57s)
  • 11:11 volans: disabled puppet on einsteinium to test ircecho failure, I'll re-enable it in max ~30m
  • 10:30 volans: started ircecho on einsteinium
  • 10:27 volans: stop ircecho on einsteinium to register the nickname
  • 10:26 moritzm: rebooting mw1270-mw1290 for kernel security updates
  • 10:13 _joe_: reenabling puppet on the MediaWiki hosts
  • 10:08 moritzm: rebooting conf1006 for kernel security update
  • 09:51 moritzm: rebooting conf1005 for kernel security update
  • 09:46 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2092 (duration: 00m 56s)
  • 09:45 moritzm: rebooting conf1004 for kernel security update
  • 09:29 jynus: restarting icinga at einsteinium, unresponsive to commands
  • 09:06 marostegui: Deploy schema change on db2092 - T203709
  • 09:06 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2092 (duration: 00m 56s)
  • 08:58 _joe_: disabling puppet on all hosts with a MediaWiki setup before merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/453093
  • 08:35 onimisionipe: restarting blazegraph and updater on wdqs2003
  • 08:24 hashar: Restarting CI Jenkins on contint1001 [#2]
  • 08:22 elukey: start eventlogging_sync on db1108 and the mysql kafka consumers on eventlog1002 after db1107 maintenance
  • 08:21 moritzm: rebooting mw1250-mw1269 for kernel security updates
  • 08:17 banyek: db1107 upgrade finished (T205288)
  • 08:15 banyek: db1107 upgrade finished
  • 08:14 hashar: Restarting CI Jenkins on contint1001
  • 08:04 elukey: stop eventlogging_sync on db1108 and the mysql kafka consumers on eventlog1002 as prep step for db1107 maintenance
  • 08:04 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092, server crashed - T205514 (duration: 00m 56s)
  • 08:00 banyek: downtiming db1107 for upgrade (kernel & mariadb)
  • 07:50 marostegui: Hard reset db1092, server crashed - T205514
  • 07:48 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set testwikidatawiki CacheEpoch to 2018-09-19 T205330 (duration: 00m 56s)
  • 07:47 elukey: reboot an-master1002 as an attempt to clear out some systemd@fsck alarms
  • 07:46 godog: repair /dev/sdd1 on ms-be1043 - T199198
  • 07:41 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2085:3311 (duration: 00m 56s)
  • 07:36 banyek: stopping s2 replication on dbstore2002 (compressing tables)
  • 07:34 godog: powercycle ms-be2030 - no console
  • 07:31 jynus: restart db1115
  • 07:29 godog: repair /dev/sdh1 on ms-be2041 - T199198
  • 07:28 jynus: stopping mariadb at db1115
  • 07:19 volans: restarted ircecho on einsteinium "icinga-wm quit (Reason: Remote host closed the connection)"
  • 05:29 marostegui: Deploy schema change on db2085:3311
  • 05:28 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2085:3311 (duration: 00m 56s)
  • 05:20 marostegui: Deploy schema change on s8 eqiad, this will generate lag - T203709
  • 05:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1088, db1098:3316 and db1098:3317 (duration: 01m 06s)
  • 04:17 tstarling@deploy1001: Synchronized php-1.32.0-wmf.23/includes/specials/SpecialExport.php: enable export logging channel (attempt 2) (duration: 00m 51s)
  • 04:09 tstarling@deploy1001: Synchronized php-1.32.0-wmf.23/includes/specials/SpecialExport.php: enable export logging channel (duration: 00m 55s)
  • 03:48 eileen: process-control config revision is 3b785f0af4 - disable omnimail repair job
  • 03:42 krinkle@deploy1001: Synchronized php-1.32.0-wmf.23/includes/page/ImageHistoryPseudoPager.php: T204796 - I17455fef0d8 (duration: 00m 58s)
  • 03:21 tstarling@deploy1001: Synchronized wmf-config/InitialiseSettings.php: add export logging channel for g 461647 (duration: 00m 57s)
  • 03:06 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Wed Sep 26 03:06:46 UTC 2018 (duration 10m 54s)
  • 02:55 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.23) (duration: 05m 10s)
  • 02:32 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.22) (duration: 12m 42s)

2018-09-25

  • 23:59 maxsem@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/GlobalPreferences/: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/GlobalPreferences/+/462822/ (duration: 00m 56s)
  • 23:53 eileen: civicrm revision changed from ca49aed673 to 018ff6325f, config revision is e71b8fdbdf
  • 23:49 maxsem@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/UploadWizard/: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/UploadWizard/+/462781/ (duration: 00m 56s)
  • 23:46 maxsem@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/UploadWizard/: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/UploadWizard/+/462780/ (duration: 00m 56s)
  • 23:39 maxsem@deploy1001: Synchronized php-1.32.0-wmf.22/skins/MinervaNeue/: https://gerrit.wikimedia.org/r/#/c/mediawiki/skins/MinervaNeue/+/462705/ (duration: 00m 57s)
  • 23:29 maxsem@deploy1001: Synchronized wmf-config/WikibaseSearchSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/462347/ (duration: 00m 57s)
  • 22:55 ppchelko@deploy1001: Finished deploy [restbase/deploy@0fa695e] (dev-cluster): Deploying content negotiation to dev cluster for ab testing (duration: 02m 53s)
  • 22:52 ppchelko@deploy1001: Started deploy [restbase/deploy@0fa695e] (dev-cluster): Deploying content negotiation to dev cluster for ab testing
  • 19:17 catrope@deploy1001: Synchronized php-1.32.0-wmf.23/extensions/ORES/maintenance/BackfillPageTriageQueue.php: I3f1ae92d8645 (duration: 00m 58s)
  • 17:49 otto@deploy1001: Finished deploy [analytics/refinery@ce8f0b3]: Deploying refinery-source 0.0.75 for ConfigHelper Refine - T203804 (duration: 10m 22s)
  • 17:39 otto@deploy1001: Started deploy [analytics/refinery@ce8f0b3]: Deploying refinery-source 0.0.75 for ConfigHelper Refine - T203804
  • 17:12 XioNoX: depool ulsfo for DC move
  • 16:59 gehel: forcing puppet run on all elastic nodes (including logstash) to recover prometheus metric exporter
  • 15:14 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Turn off EducationProgram T188411 T125618 (duration: 00m 57s)
  • 15:02 zfilipin@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.23
  • 14:50 marostegui: Stop MySQL on db1098 s6 and s7 for upgrade
  • 14:49 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3317 (duration: 00m 57s)
  • 14:39 zfilipin@deploy1001: Finished scap: testwiki to php-1.32.0-wmf.23 and rebuild l10n cache (duration: 20m 50s)
  • 14:38 moritzm: rebooting mw1240-mw1258 for kernel security updates
  • 14:18 zfilipin@deploy1001: Started scap: testwiki to php-1.32.0-wmf.23 and rebuild l10n cache
  • 14:15 zfilipin@deploy1001: Pruned MediaWiki: 1.32.0-wmf.19 [keeping static files] (duration: 01m 29s)
  • 14:12 zfilipin@deploy1001: Pruned MediaWiki: 1.32.0-wmf.18 (duration: 03m 02s)
  • 14:10 marostegui: Upgrade db1088
  • 14:05 marostegui: Stop db1088 and db1098:3316 in sync
  • 14:03 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098:3316 (duration: 00m 50s)
  • 14:02 zfilipin@deploy1001: Pruned MediaWiki: 1.32.0-wmf.16 (duration: 08m 27s)
  • 13:29 marostegui: Deploy schema change on s2 eqiad - this might generate lag on s2 eqiad - T203709
  • 13:13 marostegui: Deploy schema change on s6 eqiad - this might generate lag on s6 eqiad - T203709
  • 13:07 marostegui: Deploy schema change on s5 eqiad - this might generate lag on s5 eqiad - T203709
  • 12:51 moritzm: rebooting mw1221-mw1239 for kernel security updates
  • 12:42 moritzm: installing chromium security updates on proton* (tested the new Chromium version in deployment-prep)
  • 12:36 raynor: EU SWAT finished
  • 12:34 pmiazga@deploy1001: Finished scap: php-1.32.0-wmf.22/skins/MinervaNeue/includes SWAT: Minerva A/B tests are not subject to HTML caching time (T205355) (duration: 16m 51s)
  • 12:34 godog: repair sde on ms-be2042 - T199198
  • 12:17 pmiazga@deploy1001: Started scap: php-1.32.0-wmf.22/skins/MinervaNeue/includes SWAT: Minerva A/B tests are not subject to HTML caching time (T205355)
  • 12:05 pmiazga@deploy1001: Finished scap: php-1.32.0-wmf.22/skins/MinervaNeue/resources/skins.minerva.scripts/pageIssues.js SWAT: It should be possible to opt into new page issues treatment via query string parameter (T204746) (duration: 17m 39s)
  • 12:01 elukey: end of the maintenance to swap Hadoop masters from analytics100[1,2] to an-master100[1,2] - T203635
  • 11:47 pmiazga@deploy1001: Started scap: php-1.32.0-wmf.22/skins/MinervaNeue/resources/skins.minerva.scripts/pageIssues.js SWAT: It should be possible to opt into new page issues treatment via query string parameter (T204746)
  • 11:20 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase sampling ratio for ReadingDepth (T205176) (duration: 00m 50s)
  • 10:53 onimisionipe: starting inplace reindex of eqiad enwiki - T204362
  • 09:48 banyek: upgrading & rebooting dbproxy1006
  • 09:38 Amir1: ladsgroup@mwmaint2001:~$ foreachwikiindblist all populateChangeTagDef.php --set-user-tags-only
  • 09:34 banyek: rebooting again dbproxy1004, 1005, 1007, 1009
  • 09:25 banyek: upgrade and reboot dbproxy1008
  • 09:18 banyek: upgrade and reboot dbproxy1007
  • 08:54 banyek: upgrade & reboot dbproxy1005
  • 08:52 moritzm: installing libtirpc security updates
  • 08:49 mobrovac@deploy1001: Finished deploy [restbase/deploy@40b81a8]: Do not dynamically generate Parsoid content if TID is provided, take #2 (duration: 12m 19s)
  • 08:39 banyek: upgrade & reboot dbproxy1009
  • 08:38 moritzm: installing twitter-bootstrap3 security updates
  • 08:37 mobrovac@deploy1001: Started deploy [restbase/deploy@40b81a8]: Do not dynamically generate Parsoid content if TID is provided, take #2
  • 08:37 mobrovac@deploy1001: Finished deploy [restbase/deploy@40b81a8]: Do not dynamically generate Parsoid content if TID is provided - T204880 (duration: 12m 00s)
  • 08:25 mobrovac@deploy1001: Started deploy [restbase/deploy@40b81a8]: Do not dynamically generate Parsoid content if TID is provided - T204880
  • 08:24 moritzm: installing libapache2-mod-perl2 security updates
  • 08:23 banyek: upgrading & rebooting dbproxy1004
  • 08:21 mobrovac@deploy1001: Finished deploy [restbase/deploy@40b81a8] (dev-cluster): Do not dynamically generate Parsoid content if TID is provided (duration: 02m 52s)
  • 08:18 mobrovac@deploy1001: Started deploy [restbase/deploy@40b81a8] (dev-cluster): Do not dynamically generate Parsoid content if TID is provided
  • 08:03 moritzm: installing reportbug DLA update on jessie hosts
  • 08:03 elukey: start of the maintenance to swap Hadoop masters from analytics100[1,2] to an-master100[1,2] - T203635
  • 07:54 marostegui: Deploy schema change on s4 codfw - T204006
  • 07:20 godog: repair sdm sdi on ms-be2043 - T199198
  • 07:13 marostegui: Deploy schema change on s3 eqiad master (db1075), might generate lag on s3 eqiad - T204006
  • 07:00 marostegui: Deploy schema change on labswiki (wikitech) m5 master db1073 - T204006
  • 06:58 marostegui: Deploy schema change on labtestwiki - T204006
  • 06:52 marostegui: Deploy schema change on s1 eqiad master (db1067), might generate lag on s1 eqiad - T204006
  • 06:49 marostegui: Deploy schema change on s2 eqiad master (db1066), might generate lag on s2 eqiad - T204006
  • 06:46 marostegui: Deploy schema change on s7 eqiad master (db1062), might generate lag on s7 eqiad - T204006
  • 06:42 marostegui: Deploy schema change on s4 eqiad master (db1068), might generate lag on s4 eqiad - T204006
  • 06:11 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2038 (duration: 00m 49s)
  • 06:05 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2038 (duration: 00m 49s)
  • 05:59 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2059 (duration: 00m 49s)
  • 05:51 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2059 (duration: 00m 49s)
  • 05:50 marostegui@deploy1001: sync-file aborted: Repool db2059 (duration: 00m 09s)
  • 05:45 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 50s)
  • 05:37 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 50s)
  • 05:32 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2075 (duration: 00m 51s)
  • 03:02 krinkle@deploy1001: Synchronized php-1.32.0-wmf.22/includes/specials/SpecialLog.php: T201411 - Ie1a9a8 (duration: 00m 52s)
  • 02:30 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Tue Sep 25 02:30:02 UTC 2018 (duration 10m 42s)
  • 02:19 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.22) (duration: 07m 45s)
  • 01:54 ejegg: updated fundraising CiviCRM from b0223f56b0 to ca49aed673
  • 00:38 mutante: mwmaint1002 - created /var/run/nutcracker dir and fixed permissions on it, then started nutcracker with systemctl. this fixed icinga alerts T201343

2018-09-24

  • 23:21 maxsem@deploy1001: Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/461715/ (duration: 00m 51s)
  • 22:54 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove DisableAccount (duration: 00m 52s)
  • 22:52 reedy@deploy1001: Synchronized wmf-config/extension-list: Remove DisableAccount (duration: 00m 49s)
  • 22:43 reedy@deploy1001: Synchronized wmf-config/CommonSettings.php: Bye bye DisableAccount T106067 (duration: 00m 51s)
  • 21:23 imarlier@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable NavTiming oversampling from Asian countries, used during Singapore data center rollout (T204365) (duration: 00m 50s)
  • 20:57 SMalyshev: Started fulltext reindex for wikidatawiki
  • 20:28 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@4be131b]: Update mobileapps to badb463 (T187098 T195838) (duration: 04m 31s)
  • 20:23 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@4be131b]: Update mobileapps to badb463 (T187098 T195838)
  • 19:36 gehel: resetting failed units on elasticsearch codfw
  • 19:02 gjg@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT - [config] (gerrit 459365) Remove mhs.ox.ac.uk from $wgCopyUploadsDomains (T203904) (duration: 00m 50s)
  • 18:57 gjg@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [config] (gerrit 462011) Set wgRestrictDisplayTitle = false for pflwiki (T205055) (duration: 00m 48s)
  • 18:49 gjg@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [config] (gerrit 462271) Search by default on the Reconstruction namespace at frwikt (T205198) (duration: 00m 50s)
  • 18:44 gjg@deploy1001: Synchronized wmf-config/throttle.php: [config] (gerrit 462014) New users on IP for edit-a-thon (October 11) (T204829) (duration: 00m 49s)
  • 18:37 gjg@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [config] (gerrit 461240) Allow wikitech bureaucrats to promote to interface-admin, but uh, only wikitech (duration: 00m 50s)
  • 18:21 gjg@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT [config] (gerrit 462507) Increase Schema:CitationUsagePageLoad population size (T191086) (duration: 00m 50s)
  • 18:14 gjg@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT [config] (gerrit 462045) Enable Extension:NewUserMessage on kn.wikisource (T204405) (duration: 00m 50s)
  • 17:59 gehel: restarting updater on wdqs*
  • 17:50 ejegg: updated standalone SmashPig deployment from a2bc0b9fa5 to 82f9d49c23
  • 17:39 XioNoX: powerdown and move bast4002 (not in prod)
  • 17:18 ejegg: updated SmashPig standalone from 8500e75d9d to a2bc0b9fa5
  • 17:15 gehel@deploy1001: Finished deploy [wdqs/wdqs@195ea0e]: new version of wdqs GUI and updater (duration: 13m 15s)
  • 17:02 gehel@deploy1001: Started deploy [wdqs/wdqs@195ea0e]: new version of wdqs GUI and updater
  • 16:59 gehel@deploy1001: Finished deploy [wdqs/wdqs@195ea0e]: new version of wdqs GUI and updater (wdqs1009 only) (duration: 00m 31s)
  • 16:58 gehel@deploy1001: Started deploy [wdqs/wdqs@195ea0e]: new version of wdqs GUI and updater (wdqs1009 only)
  • 15:55 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1064 (duration: 00m 50s)
  • 15:52 moritzm: rebooting rdb1009/rdb1010 for kernel security update
  • 15:46 marostegui: Deploy schema change on s8 eqiad master with replication - might generate lag - T204006
  • 15:23 jynus: stop and upgrade db1064 (x1)
  • 15:18 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: depool db1064 (duration: 00m 50s)
  • 15:04 moritzm: rebooting rdb1008 for kernel security update
  • 14:51 moritzm: rebooting rdb1007 for kernel security update
  • 14:30 bblack: upgrade gdnsd 2.99.42 -> 2.99.1729 on authdns2001
  • 14:28 moritzm: rebooting rdb1006 for kernel security update
  • 14:28 bblack: upgrade gdnsd 2.99.42 -> 2.99.1729 on authdns1001
  • 14:20 moritzm: rebooting rdb1005 for kernel security update
  • 14:18 jynus: stop and upgrade es1017 (es3 eqiad master)- it may create some temporary lag on es3
  • 13:59 jynus: stop and upgrade es1015 (es2 eqiad master)- it may create some temporary lag on es2
  • 13:52 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2089 (duration: 00m 49s)
  • 13:43 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2089 (duration: 00m 50s)
  • 13:34 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Setup to db hosts for api-s5 (duration: 00m 50s)
  • 13:23 moritzm: rebooting rdb1004 for kernel security update
  • 13:12 moritzm: rebooting rdb1003 for kernel security update
  • 13:03 gehel: start isolating maps1004 for reimage to stretch - T195285
  • 11:39 elukey: reboot an-master100[1,2] as part of the pre-checks before the hadoop master daemons swap - T203635
  • 11:21 moritzm: installing texlive-bin security updates
  • 10:44 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
  • 10:43 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 09:42 jynus: stop and upgrade db1066 (s2 eqiad master)- it may create some temporary lag on s2
  • 09:30 godog: upgrade / roll restart thumbor in eqiad / codfw - T20871 T198370
  • 09:03 jynus: stop and upgrade db1067 (s1 eqiad master)- it may create some temporary lag on s1
  • 07:55 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Mon Sep 24 07:55:18 UTC 2018 (duration 10m 50s)
  • 07:44 bawolff@deploy1001: scap sync-l10n completed (1.32.0-wmf.22) (duration: 07m 44s)
  • 07:31 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2062 (duration: 00m 46s)
  • 07:25 bawolff_: manually running l10n update (T205238)
  • 07:09 moritzm: installing libarchive-zip-perl security updates
  • 07:05 volans: wiping netbox DB to re-import it cleanly from racktables - T199083
  • 06:58 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2062 (duration: 00m 50s)
  • 06:57 godog: repair sde on ms-be2041 - T199198
  • 06:57 godog: repair sdn on ms-be1041 - T199198
  • 06:54 volans@deploy1001: Finished deploy [netbox/deploy@5e70423]: Cherry pick of custom fields fix (duration: 00m 05s)
  • 06:54 volans@deploy1001: Started deploy [netbox/deploy@5e70423]: Cherry pick of custom fields fix
  • 05:48 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2088:3311 (duration: 00m 50s)
  • 05:28 marostegui: Deploy schema change on db2088:3311 - T203709
  • 05:27 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2088:3311 (duration: 00m 50s)
  • 04:15 kartik@deploy1001: Finished deploy [cxserver/deploy@3e2d668]: Update cxserver to d913793 (T203551, T203780, T202716, T203947) (duration: 04m 20s)
  • 04:11 kartik@deploy1001: Started deploy [cxserver/deploy@3e2d668]: Update cxserver to d913793 (T203551, T203780, T202716, T203947)
  • 02:48 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Mon Sep 24 02:48:16 UTC 2018 (duration 10m 49s)
  • 02:37 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.22) (duration: 16m 02s)

2018-09-23

  • 21:41 krinkle@deploy1001: Synchronized wmf-config/profiler.php: (no justification provided) (duration: 00m 52s)
  • 04:32 krinkle@deploy1001: Synchronized php-1.32.0-wmf.22/includes/user/User.php: T202149 - Ic0c25f66f23f (duration: 00m 53s)

2018-09-22

  • 18:29 krinkle@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/MultimediaViewer/resources/: I0954c4, T205162 (duration: 00m 56s)

2018-09-21

  • 23:47 mutante: mwmaint1002 - rsyncing home dirs over from mwmaint1001 which synced them from terbium. still large old files from terbium here, like a NASA video :) (T201343)
  • 22:40 mutante: mwmaint1002 - scap pull
  • 22:13 mutante: mwmaint1002 - re-enabling puppet, now that it has mcrouter certs, that works (T201343)
  • 22:13 mutante: mwmaint1002 - re-enabling puppet, now that it has mcrouter certs, that works (201343)
  • 22:12 mutante: how to generate mcrouter certs, needed when you add any new mw server: https://wikitech.wikimedia.org/wiki/Mcrouter#Generate_certs_for_a_new_host
  • 19:10 XioNoX: shutdown cr3/4-ulsfo (not in prod)
  • 18:19 bawolff: deploy patch related to T197279 T204825
  • 18:08 bblack: upgrade multatuli to gdnsd-2.99.1729-beta
  • 18:06 bblack: uploaded 2.99.1729-beta-1+wmf1 to stretch-wikimedia
  • 17:59 mutante: *.toolserver.org also moved from eqiad to eqiad-r region in cloud vps, which gave it new IP addresses
  • 17:58 mutante: toolserver.org and subdomains (wiki.toolserver, status.toolserver, stable.toolserver) legacy URLs have been switched to new stretch backend, away from trusty
  • 16:37 XioNoX: route ns2-v6 to multatuli.wikimedia.org on cr1/2-esams
  • 16:33 mutante: puppetmaster1001: cergen --base-path /srv/private/modules/secret/secrets/mcrouter/ --generate /etc/cergen/mcrouter.manifests.d
  • 16:32 mutante: puppetmaster1001: mcrouter_generate_certs --generate
  • 16:29 mutante: puppetmaster: running mcrouter_generate_certs to add an mcrouter cert for mwmaint1002 (T201343) https://wikitech.wikimedia.org/wiki/Mcrouter#Generate_certs_for_a_new_host
  • 15:39 bblack: upgrading gdnsd to 2.99.42 on authdns2001
  • 15:16 XioNoX: route ns2 to multatuli.wikimedia.org on cr1/2-esams
  • 13:35 ladsgroup@deploy1001: Finished deploy [ores/deploy@7b987a7]: Pass the real IP (T205087) (duration: 24m 10s)
  • 13:11 ladsgroup@deploy1001: Started deploy [ores/deploy@7b987a7]: Pass the real IP (T205087)
  • 13:01 volans: wiping netbox DB to re-import it cleanly from racktables - T199083
  • 12:43 banyek: adding wmf-pt-kill_2.2.20-1+wmf1 package for jessie
  • 12:31 banyek: adding wmf-pt-kill_2.2.20-1+wmf1 package for stretch
  • 12:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1006 (duration: 00m 49s)
  • 12:13 marostegui: Stop MySQL on pc1006 for kernel upgrade
  • 12:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1006 (duration: 00m 50s)
  • 07:17 moritzm: uploaded debdeploy 0.0.99.6 to apt.wikimedia.org
  • 07:00 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1005 (duration: 00m 50s)
  • 06:51 marostegui: Stop MySQL on pc1005 for kernel upgrade
  • 06:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1005 (duration: 00m 51s)

2018-09-20

  • 23:49 ejegg: updated fundraising CiviCRM from 7f58670e88 to b0223f56b0
  • 23:06 ejegg: updated payments-wiki from e7d373fa71 to 1af8f8c8c0
  • 21:47 ejegg: updated payments-wiki from 43923fadd9 to e7d373fa71
  • 21:26 mutante: releases1001 - rm /usr/local/sbin/sync-srv-org-wikimedia-releases (bromine remnants)
  • 21:26 mutante: releases1001 - deleting cronjob remnants for releases from bromine
  • 21:17 mutante: releases2001 - manually rsync releases files from releases1001
  • 21:16 hashar: 1.32.0-wmf.22 is fully deployed. A quick summary and words of thanks are at https://phabricator.wikimedia.org/T191068#4604040
  • 20:55 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: Comment clean-up no-op (duration: 00m 50s)
  • 20:54 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Comment clean-up no-op (duration: 00m 51s)
  • 20:22 SMalyshev: Started wikidata reindex again, hopefully better luck this time
  • 20:07 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Bye bye tidy config! (duration: 00m 50s)
  • 20:00 gehel: elasticsearch codfw cluster restart for new systemd unit completed
  • 19:57 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable DisableAccount T106067 (duration: 00m 51s)
  • 19:50 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Temporarily re-enable DisableAccount on wikis where it was previously enabled (duration: 00m 51s)
  • 19:36 reedy@deploy1001: Pruned MediaWiki: 1.32.0-wmf.15 (duration: 03m 16s)
  • 19:32 reedy@deploy1001: Pruned MediaWiki: 1.32.0-wmf.14 (duration: 02m 05s)
  • 19:22 tzatziki: Add email to account Abalg~commonswiki
  • 19:22 reedy@deploy1001: clean aborted: Pruned MediaWiki: 1.32.0-wmf.14 (duration: 00m 30s)
  • 19:21 reedy@deploy1001: Pruned MediaWiki: 1.32.0-wmf.14 [keeping static files] (duration: 02m 03s)
  • 19:19 bblack: reboot multatuli
  • 19:13 reedy@deploy1001: clean aborted: Pruned MediaWiki: 1.32.0-wmf.4 (duration: 01m 03s)
  • 19:12 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/CentralNotice/resources/subscribing/ext.centralNotice.display.js: SWAT: Add performance mark for when banner is inserted T195840 (duration: 00m 51s)
  • 19:06 reedy@deploy1001: Pruned MediaWiki: 1.32.0-wmf.4 [keeping static files] (duration: 01m 58s)
  • 19:02 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Set $wgSiteMatrixNonGlobalSites global for SiteMatrix List $wgSiteMatrixNonGlobalSites as global (duration: 00m 52s)
  • 18:54 reedy@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/DisableAccount/: maintenance script (duration: 00m 52s)
  • 18:53 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: Revert on canaries: SWAT: Set $wgSiteMatrixNonGlobalSites global for SiteMatrix (duration: 00m 53s)
  • 18:45 thcipriani@deploy1001: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 18:43 SMalyshev: Initiating in-place reindex for wikidatawiki (T147505)
  • 18:31 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/PageTriage/modules: SWAT: Correctly sync the form when afc_state === "all" T204629 Use same api params for list and stats on page load T204629 (duration: 00m 54s)
  • 18:21 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable logging for CitationUsage and CitationUsagePageLoad T191086 (duration: 00m 51s)
  • 17:59 bblack: reinstalling multatuli
  • 17:35 thcipriani@deploy1001: Finished scap: Noop (hopefully) test of php7.0 (duration: 03m 07s)
  • 17:31 thcipriani@deploy1001: Started scap: Noop (hopefully) test of php7.0
  • 17:24 ladsgroup@deploy1001: Finished deploy [ores/deploy@ee2d28b]: Returning 429 instead of 408 in case of too many requests (T204956) (duration: 21m 15s)
  • 17:24 godog: repair sdk on ms-be2043 - T199198
  • 17:24 godog: upload scap 3.8.6-1 - T204383
  • 17:03 ladsgroup@deploy1001: Started deploy [ores/deploy@ee2d28b]: Returning 429 instead of 408 in case of too many requests (T204956)
  • 15:56 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting MCR migration stage to write-both/read-new on testwiki (T198309) (duration: 00m 59s)
  • 15:17 moritzm: installing glib2.0 security updates for trusty
  • 15:15 stephanebisson: Starting mwscript extensions/PageTriage/maintenance/populateDraftQueue.php --wiki enwiki (T203184)
  • 15:14 stephanebisson: Finished mwscript extensions/PageTriage/maintenance/DeleteAfcStates.php --wiki enwiki (T203184)
  • 15:11 stephanebisson: Starting mwscript extensions/PageTriage/maintenance/DeleteAfcStates.php --wiki enwiki (T203184)
  • 15:05 hashar: 1.32.0-wmf.22 on group2 seems fine so far \o/
  • 15:05 moritzm: installing bind9 security updates (client-side tools and libraries)
  • 14:48 godog: repair sdn on ms-be2040 - T199198
  • 14:44 gehel: reduce replication factor to 2 on cassandra maps eqiad - T194966
  • 14:38 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.22
  • 14:28 hashar@deploy1001: Synchronized typos: Dummy sync to verify list of canaries for T204907 (duration: 00m 59s)
  • 14:27 herron: maps1001:~# tune2fs -m0 /dev/mapper/maps1001--vg-data
  • 14:22 marostegui: Deploy schema change on s5 eqiad master with replication T204006
  • 14:15 moritzm: rebooting rdb1001 for kernel security update
  • 14:15 bblack: upgrade authdns1001 gdnsd 2.99.9 -> 2.99.42
  • 14:12 bblack: upload gdnsd-2.99.42-beta to stretch-wikimedia
  • 14:11 herron: removing unused wiki-mail.wikimedia.org cname (gerrit 143762)
  • 14:05 moritzm: rebooting rdb1002 for kernel security update
  • 13:27 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Tune codfw database weights (duration: 00m 58s)
  • 13:20 gtirloni: T204667 reimage labtestnet2003
  • 12:14 gehel: starting elasticsearch codfw cluster restart for new systemd unit
  • 12:07 zeljkof: EU SWAT finished
  • 12:01 moritzm: rebooting mc1036 for kernel security update
  • 11:48 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Create new namespaces in zhwikiversity (T201675) (duration: 00m 57s)
  • 11:46 moritzm: rebooting mc1035 for kernel security update
  • 11:38 moritzm: rebooting mc1034 for kernel security update
  • 11:35 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Introduce engineer user group on Czech Wikipedia (T203000) (duration: 00m 58s)
  • 11:28 moritzm: rebooting mc1033 for kernel security update
  • 11:19 dcausse@deploy1001: Synchronized ./docroot/search.wikimedia.org/index.php: search.wikimedia.org should properly handle multivalue separation char (0x1F) (duration: 00m 58s)
  • 11:15 jynus: rebooting es200X hosts for upgrade
  • 11:07 moritzm: rebooting mc1032 for kernel security update
  • 11:03 gehel: elasticsearch eqiad cluster restart for new systemd unit completed
  • 10:58 moritzm: rebooting mc1031 for kernel security update
  • 10:43 moritzm: rebooting mc1030 for kernel security update
  • 10:30 moritzm: rebooting mc1029 for kernel security update
  • 10:16 moritzm: rebooting mc1028 for kernel security update
  • 10:05 moritzm: rebooting mc1027 for kernel security update
  • 09:50 moritzm: rebooting mc1026 for kernel security update
  • 09:37 moritzm: rebooting mc1025 for kernel security update
  • 09:26 marostegui: Change passwords for wikiuser on dbstore1002 - T200801
  • 09:24 moritzm: rebooting mc1024 for kernel security update
  • 08:57 moritzm: rebooting mc1023 for kernel security update
  • 08:44 moritzm: rebooting mc1022 for kernel security update
  • 08:06 banyek: replication stopped and tables being compressed for s2 on dbstore2002
  • 08:01 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2066 (duration: 00m 57s)
  • 07:53 marostegui: Deploy schema change on db2066
  • 07:53 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2066 (duration: 00m 57s)
  • 07:46 moritzm: upgrading intel-microcode on trusty systems to 3.20180807a
  • 07:31 godog: repair sdl on ms-be2042 - T199198
  • 07:25 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool pc1004 (duration: 00m 57s)
  • 07:15 marostegui: Stop MySQL on pc1004 for kernel upgrade
  • 07:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool pc1004 (duration: 00m 57s)
  • 06:50 eileen: process-control config revision is 3069f85578
  • 06:42 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2070 (duration: 00m 58s)
  • 06:25 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@58f9ed3]: Reset the Kafka connections
  • 06:20 mobrovac@deploy1001: Synchronized rpc/RunSingleJob.php: Have RunSingleJob send the X-Readonly header - T204154 (duration: 00m 58s)
  • 06:09 marostegui: Deploy schema change on db2070 - T203709
  • 06:07 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2070 (duration: 00m 58s)
  • 05:54 marostegui: Deploy schema change on s3:mediawikiwiki for echo tables codfw - T51593
  • 05:49 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Resume reads on db2034 (duration: 00m 57s)
  • 05:34 marostegui: Deploy schema change on db2034 (x1 master)
  • 05:33 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Stop reads on db2034 (duration: 00m 57s)
  • 05:28 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2069 after alter table (duration: 00m 57s)
  • 05:12 marostegui: Deploy schema change on db2069
  • 05:11 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2069 for alter table (duration: 01m 00s)
  • 03:06 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Thu Sep 20 03:06:52 UTC 2018 (duration 10m 47s)
  • 02:56 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.22) (duration: 13m 54s)
  • 02:23 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 09m 26s)
  • 01:27 krinkle@deploy1001: Synchronized php-1.32.0-wmf.22/resources/src/startup/: I6c77b25856 (duration: 00m 55s)
  • 01:14 krinkle@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/GlobalPreferences/includes/GlobalPreferencesFactory.php: I52434a523f60e, T204864 (duration: 00m 58s)
  • 00:52 krinkle@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/3D/modules/: I9e718957a497, T204621 (duration: 00m 58s)
  • 00:49 mutante: temp disabled puppet on mw2*, deployed gerrit 461393, confirmed noop, re-enabled puppet on mw2*

2018-09-19

  • 23:21 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove obsolete $wgPopupsBetaFeature, part 2 (T203589) (duration: 00m 56s)
  • 23:21 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: Remove obsolete $wgPopupsBetaFeature, part 1 (T203589) (duration: 00m 56s)
  • 23:17 catrope@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/PageTriage/: PageTriage fixes for T203184 (duration: 00m 58s)
  • 23:14 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable Page issues A/B test on lvwiki (T204609) (duration: 00m 58s)
  • 22:00 XioNoX: merging icinga check_bfd
  • 21:24 jforrester@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/VisualEditor/includes/SpecialCollabPad.php: T204873 Fix special page override hot deploy (duration: 00m 57s)
  • 20:26 ladsgroup@deploy1001: Finished deploy [ores/deploy@76fe25a]: ORES doesn't block hammering IPs (T204862) (duration: 21m 13s)
  • 20:24 XioNoX: repool ulsfo
  • 20:21 ejegg: updated fundraising CiviCRM from 2d9360cfac to 7f58670e88
  • 20:11 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@55d0b34]: Update mobileapps to a224e99 Ia80abe02490 (duration: 03m 29s)
  • 20:07 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@55d0b34]: Update mobileapps to a224e99 Ia80abe02490
  • 20:05 ladsgroup@deploy1001: Started deploy [ores/deploy@76fe25a]: ORES doesn't block hammering IPs (T204862)
  • 19:41 hashar: web request 60 second timeout when deploying is filed as https://phabricator.wikimedia.org/T204871
  • 19:30 hashar: while promoting group1 to 1.32.0-wmf.22 a lot of web requests timed out at 60 seconds, roughly from 19:24 to 19:28. That is no longer occurring | T191068
  • 19:25 hashar@deploy1001: Synchronized php: group1 wikis to 1.32.0-wmf.22 (duration: 00m 55s)
  • 19:24 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.22
  • 19:16 hashar@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/Translate/tag/PageTranslationHooks.php: Avoid warnings and errors caused by x-pagetranslation-tag - T204797 (duration: 00m 58s)
  • 18:43 gehel: starting elasticsearch eqiad cluster restart for new systemd unit
  • 18:33 chasemp: add security_team_bot to acl*security_team in phab
  • 17:52 robh: no errors in notebook1003 SEL
  • 17:51 robh: notebook1003 unresponsive to icinga checks and serial console, rebooting
  • 17:31 mutante: notebook1003 - starting failed nagios-nrpe-server
  • 17:20 cmjohnson1: powering off mw1254 to reseat DIMM T204491
  • 17:03 mobrovac@deploy1001: Finished deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket, take #4 (duration: 09m 03s)
  • 16:54 mobrovac@deploy1001: Started deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket, take #4
  • 16:54 mobrovac@deploy1001: Finished deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket, take #3 (duration: 05m 28s)
  • 16:52 gehel: rolling restart of elasticsearch / logstash for new systemd unit completed - new systemd unit is elasticsearch_5@production-logstash-eqiad
  • 16:50 sbisson@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/Echo/includes/api/ApiCrossWiki.php: T204758 (duration: 00m 57s)
  • 16:49 mobrovac@deploy1001: Started deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket, take #3
  • 16:48 mobrovac@deploy1001: Finished deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket (duration: 05m 43s)
  • 16:42 mobrovac@deploy1001: Started deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket
  • 16:42 mobrovac@deploy1001: Finished deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket (duration: 03m 38s)
  • 16:38 mobrovac@deploy1001: Started deploy [restbase/deploy@b35727e]: Bug fix: return the stored headers in the key rev value bucket
  • 16:38 mobrovac@deploy1001: Finished deploy [restbase/deploy@b35727e] (dev-cluster): Bug fix: return the stored headers in the key rev value bucket (duration: 03m 01s)
  • 16:35 mobrovac@deploy1001: Started deploy [restbase/deploy@b35727e] (dev-cluster): Bug fix: return the stored headers in the key rev value bucket
  • 15:39 XioNoX: rebooting cr4-ulsfo for upgrade (not in prod)
  • 15:24 XioNoX: rebooting cr3-ulsfo for upgrade (not in prod)
  • 15:23 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Reduce db2057 load (duration: 00m 57s)
  • 14:57 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2033 after alter table (duration: 00m 57s)
  • 14:46 herron: updating MX bulk_smtp helo_data (gerrit 461193)
  • 14:36 marostegui: Deploy alter table on db2033 (x1) enwiki
  • 14:35 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2033 for alter table (duration: 00m 59s)
  • 13:39 marostegui: Deploy schema change on dbstore2002:x1
  • 13:25 marostegui: Deploy schema change on x1 eqiad master on enwiki (this will generate some lag on x1 eqiad) - T51593
  • 13:03 arturo: T203177 add initial prometheus-openstack-exporter package to reprepro (v0.0.8-1)
  • 12:50 marostegui: Deploy schema change on s3 eqiad master on mediawikiwiki (this will generate some lag on s3 eqiad) - T51593
  • 12:18 gehel: rolling restart of elasticsearch / logstash for new systemd unit
  • 12:09 gehel: rolling restart of relforge for new systemd unit
  • 12:07 gehel: stopping puppet on all elasticsearch servers to deploy new systemd unit - https://gerrit.wikimedia.org/r/c/operations/puppet/+/440498
  • 12:00 moritzm: reimaging mw1298 (spare host) to test reimages from cumin2001
  • 10:26 marostegui: Deploy schema change on x1:testwiki - T51593
  • 10:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 57s)
  • 10:11 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: Fully pool db2041 back into production (duration: 00m 57s)
  • 10:07 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T203565: Weight adjust for S1 API hosts (duration: 00m 57s)
  • 10:03 jynus: fixing db2033 grants
  • 09:41 marostegui: Upgrade MariaDB and kernel on db1110
  • 09:40 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2084:3315 db2053 db1096:3316 (duration: 00m 58s)
  • 09:39 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db2084:3315 db2053 db1096:3316 (duration: 00m 59s)
  • 09:26 godog: repair sdf on ms-be2043 - T199198
  • 09:14 moritzm: rebooting mc1021 for kernel security update
  • 09:04 moritzm: rebooting mc1020 for kernel security update
  • 09:02 moritzm: updating intel-microcode on Debian jessie/stretch to 3.20180807a.1
  • 08:39 moritzm: rebooting mc1019 for kernel security update
  • 08:35 marostegui: Stop MySQL on db1110 for alert testing - T200509
  • 08:04 moritzm: bounced ferm service on elastic2004
  • 07:53 marostegui: Stop MySQL on db1110 for alert testing - T200509
  • 07:45 marostegui: Stop MySQL on db1096:3316 for alert testing - T200509
  • 07:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db2084:3315 db2053 db1110 db1096:3316 db1110 (duration: 00m 57s)
  • 07:30 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 db2053 db1110 db1096:3316 db1110 (duration: 00m 57s)
  • 07:26 moritzm: reimaging mw2245 (spare host) to test reimages from cumin2001 (router policies have been updated)
  • 07:25 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 (duration: 00m 57s)
  • 07:12 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 58s)
  • 07:12 marostegui: Disable puppet on databases to test new alerts - T200509 https://phabricator.wikimedia.org/T172489
  • 07:10 jynus: restart db1109 for upgrade
  • 06:56 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2041 with low load (duration: 00m 54s)
  • 06:00 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2036 (duration: 00m 58s)
  • 05:53 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2036 (duration: 00m 57s)
  • 05:49 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2050 (duration: 00m 58s)
  • 05:41 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2050 (duration: 00m 57s)
  • 05:37 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2074 (duration: 00m 57s)
  • 05:31 marostegui: Drop echo tables from s3 db2074 - T153638
  • 05:31 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2074 (duration: 00m 57s)
  • 05:25 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2057 (duration: 00m 58s)
  • 05:19 marostegui: Drop echo tables from s7:kowiki - T153638
  • 05:16 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2057 (duration: 02m 13s)
  • 02:34 twentyafterfour@deploy1001: Finished scap: SWAT: full sync to update l10n for https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikimediaMessages/+/461230/ (duration: 61m 20s)
  • 01:33 twentyafterfour@deploy1001: Started scap: SWAT: full sync to update l10n for https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/WikimediaMessages/+/461230/
  • 01:28 twentyafterfour@deploy1001: Synchronized php-1.32.0-wmf.22/resources/src/mediawiki.util.js: SWAT: sync https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/461227/ (duration: 01m 00s)

2018-09-18

  • 23:21 twentyafterfour@deploy1001: Synchronized wmf-config/: SWAT: tested on mwdebug1001, now syncing wmf-config settings to the whole cluster (duration: 00m 59s)
  • 23:15 twentyafterfour: SWAT: deploying 3 patches for ebernhardson: 77deae12f 7009fe473 and 12cc420fc
  • 22:10 bblack: depooled ulsfo, some unknown router issue
  • 21:42 bblack: authdns1001 seems stable/fine running gdnsd-2.99.9-beta so far. If issues crop up later, don't hesitate to (a) downgrade back to stretch-backports gdnsd-2.3.0-1~bpo9+1 or (b) call me!
  • 21:25 bblack: authdns1001: testing gdnsd version update (2.99.9-beta)
  • 20:58 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set ActorTableSchemaMigrationStage back to OLD on test wikis, mediawikiwiki (T188327, T204669) (duration: 00m 57s)
  • 20:57 catrope@deploy1001: sync-file aborted: Set ActorTableSchemaMigrationStage back to OLD on test wikis, mediawikiwiki (duration: 00m 10s)
  • 20:05 hashar@deploy1001: Synchronized php-1.32.0-wmf.22/extensions/OAuth/frontend/specialpages/SpecialMWOAuthListConsumers.php: Fix escapeForHtml method name - T204757 (duration: 00m 58s)
  • 19:56 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@58f9ed3]: Log errors for HTTP error T203929 (duration: 00m 49s)
  • 19:55 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@58f9ed3]: Log errors for HTTP error T203929
  • 19:43 bblack: uploaded gdnsd-2.99.9-beta1-1+wmf1 to reprepro for stretch-wikimedia
  • 19:20 hashar@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.22
  • 19:18 ppchelko@deploy1001: Finished deploy [restbase/deploy@4c3128f] (dev-cluster): Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints, take 2, feed timing out (duration: 06m 32s)
  • 19:11 ppchelko@deploy1001: Started deploy [restbase/deploy@4c3128f] (dev-cluster): Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints, take 2, feed timing out
  • 18:49 ppchelko@deploy1001: Finished deploy [restbase/deploy@4c3128f]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints, take 2, feed timing out (duration: 06m 14s)
  • 18:43 ppchelko@deploy1001: Started deploy [restbase/deploy@4c3128f]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints, take 2, feed timing out
  • 18:40 ppchelko@deploy1001: Finished deploy [restbase/deploy@4c3128f]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints, take 2, feed timing out (duration: 07m 58s)
  • 18:32 ppchelko@deploy1001: Started deploy [restbase/deploy@4c3128f]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints, take 2, feed timing out
  • 18:31 ppchelko@deploy1001: Finished deploy [restbase/deploy@4c3128f]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints (duration: 08m 55s)
  • 18:22 ppchelko@deploy1001: Started deploy [restbase/deploy@4c3128f]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements, don't monitor PDF endpoints
  • 18:08 ppchelko@deploy1001: Finished deploy [restbase/deploy@55100d4]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements (duration: 03m 37s)
  • 18:04 ppchelko@deploy1001: Started deploy [restbase/deploy@55100d4]: Remove restrictions table T203835, Split 25% PDF traffic to Proton T186748, Metrics endpoints improvements
  • 17:29 mutante: scandium - move from role(spare) to role(parsoid_testing), making it equal to ruthenium (T201366)
  • 16:45 addshore@deploy1001: Synchronized php-1.32.0-wmf.22/includes/watcheditem/WatchedItemStore.php: WatchedItemStore::countVisitingWatchersMultiple() fix T204729 (duration: 00m 57s)
  • 16:43 addshore@deploy1001: Synchronized php-1.32.0-wmf.20/includes/watcheditem/WatchedItemStore.php: WatchedItemStore::countVisitingWatchersMultiple() fix T204729 (duration: 00m 59s)
  • 16:34 XioNoX: delete `filter common-infrastructure4` on cr1/2-eqiad, unused/obsolete after T198623
  • 16:33 mutante: mwmaint2001 - nagios-nrpe-server had status 'failed' and caused all NRPE Icinga checks to fail but recovered after simply starting it again
  • 16:32 mutante: radon - re-enabled disabled puppet without reason (decom) T202040
  • 16:32 mutante: mwmaint2001 - starting nagios-nrpe-server
  • 16:22 XioNoX: delete `term cumin` from cr1/2-eqiad analytics filter (already permitted by established-tcp term)
  • 16:02 moritzm: installing policykit-1 security updates on jessie
  • 15:55 elukey@deploy1001: Finished deploy [analytics/refinery@1a6235a]: Fix cron scripts from the NYC offsite (duration: 09m 32s)
  • 15:53 XioNoX: add cumin2001 to mr* security policies - T204730
  • 15:47 moritzm: installing spice security updates
  • 15:46 elukey@deploy1001: Started deploy [analytics/refinery@1a6235a]: Fix cron scripts from the NYC offsite
  • 15:26 XioNoX: add cumin2001 to labs-in4 on cr1/2-eqiad
  • 15:23 hashar@deploy1001: Finished scap: Sync again testwiki to php-1.32.0-wmf.20, I might have screwed up l10ncache (duration: 06m 31s)
  • 15:20 hashar: hashar scap for testwiki is actually php-1.32.0-wmf.22
  • 15:19 mepps: updated process-control to c03857f6a6
  • 15:17 hashar@deploy1001: Started scap: Sync again testwiki to php-1.32.0-wmf.20, I might have screwed up l10ncache
  • 15:15 mepps: updated process-control to f9ab984c5d
  • 15:07 hashar@deploy1001: Finished scap: testwiki to php-1.32.0-wmf.20 (duration: 35m 46s)
  • 14:31 hashar@deploy1001: Started scap: testwiki to php-1.32.0-wmf.20
  • 14:19 marostegui: Drop T153638_echo_XXX tables from db1123
  • 14:02 marostegui: Drop T153638_echo_XXX tables from db1095:3313 - T153638
  • 13:58 papaul: upgrading BIOS on ms-be2030
  • 13:57 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T203565: Repool db1114 and db1119 (duration: 00m 49s)
  • 13:56 godog: poweroff ms-be2030 - T204567
  • 13:48 moritzm: updating intel-microcode on Debian jessie/stretch to 3.20180807a.1
  • 13:43 moritzm: installing redis security updates on maps* servers
  • 13:32 godog: repair sdd on ms-be2041 - T199198
  • 13:31 marostegui: Rename echo_XXX tables on db1095 and db1123
  • 13:26 moritzm: reimaging mw2245 (spare host) to test reimages from cumin2001
  • 13:24 marostegui: Drop T153638_echo_XXX tables from db1078 - T153638
  • 13:21 hashar: Cutting branches wmf/1.32.0-wmf.22 | T191068
  • 13:10 marostegui: Drop T153638_echo_XXX tables from db1077 - T153638
  • 12:12 moritzm: rebooting labnodepool1001 for kernel security update
  • 10:36 jynus: stopping dbstore2002:s2 mariadb instance
  • 10:31 jynus: stop db2041 to clone to dbstore2002
  • 10:25 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2041 (duration: 00m 49s)
  • 10:24 marostegui: Rename echo_XXX tables on db1077 and db1078 - T153638
  • 10:12 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Tune the s3 and s5 database loads (duration: 00m 49s)
  • 10:09 marostegui: Drop T153638_echo_XXX tables from dbstore2002:3313 - T153638
  • 10:06 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1105 (duration: 00m 49s)
  • 10:04 hashar: ci: updating Quibble Jenkins jobs to 0.0.26
  • 10:01 marostegui: Rename echo_XXX tables on dbstore2002:3313 - T153638
  • 09:37 marostegui: Drop T153638_echo_XXX tables from db2057 - T153638
  • 09:10 jynus: stop mariadb (both instances) and restart db1105 for upgrade
  • 08:54 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully depool db1105 (duration: 00m 49s)
  • 08:45 banyek: stop replication & stop mysql on db1119 (preparing to clone db1114)
  • 08:02 godog: bounce rsyslog on wezen/lithium, tls listener was timing out in icinga
  • 08:02 jynus: stop and restart db1105 for upgrade
  • 07:53 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1105 (duration: 00m 50s)
  • 06:55 moritzm: installing zsh security updates
  • 06:21 marostegui: Drop tmp_2 and tmp_3 index from wikidatawiki.recentchanges on dbstore2001, db2079, db2082,db2083 - T202764
  • 06:09 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1013 after kernel and mariadb upgrade (duration: 00m 49s)
  • 05:56 marostegui: Stop MySQL on es1013 to upgrade mariadb & kernel
  • 05:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool es1013 for kernel and mariadb upgrade (duration: 00m 49s)
  • 05:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1012 after kernel and mariadb upgrade (duration: 00m 50s)
  • 05:35 marostegui: Stop MySQL on es1012 to upgrade mariadb & kernel
  • 05:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool es1012 for kernel and mariadb upgrade (duration: 00m 49s)
  • 05:28 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2081 - T202764 (duration: 00m 49s)
  • 05:22 marostegui: Drop tmp_2 and tmp_3 index from wikidatawiki.recentchanges on db2081 - T202764
  • 05:22 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2081 - T202764 (duration: 00m 49s)
  • 05:14 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2080 - T202764 (duration: 00m 49s)
  • 05:08 marostegui: Drop tmp_2 and tmp_3 index from wikidatawiki.recentchanges on db2080 - T202764
  • 05:08 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2080 - T202764 (duration: 00m 51s)
  • 03:01 ejegg: restarted fundraising CiviCRM jobs
  • 02:46 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Tue Sep 18 02:46:09 UTC 2018 (duration 10m 49s)
  • 02:43 eileen_: civicrm revision changed from b0d7df4e60 to 2d9360cfac
  • 02:37 ejegg: disabled fundraising CiviCRM jobs for 5.6 update
  • 02:35 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 13m 58s)

2018-09-17

  • 23:32 mutante: ms-be2042 - repairing xfs - (T199198)
  • 21:44 andrew@deploy1001: Finished deploy [horizon/deploy@3124052]: Fighting with scap more (duration: 01m 50s)
  • 21:42 andrew@deploy1001: Started deploy [horizon/deploy@3124052]: Fighting with scap more
  • 21:40 andrew@deploy1001: Finished deploy [horizon/deploy@3124052]: Fighting with scap (duration: 00m 10s)
  • 21:40 andrew@deploy1001: Started deploy [horizon/deploy@3124052]: Fighting with scap
  • 21:32 andrew@deploy1001: Finished deploy [horizon/deploy@3124052]: Cleaning up from some by-hand hacks (duration: 00m 10s)
  • 21:32 andrew@deploy1001: Started deploy [horizon/deploy@3124052]: Cleaning up from some by-hand hacks
  • 20:59 andrew@deploy1001: Finished deploy [horizon/deploy@3124052]: Disable unneeded network panels (duration: 03m 34s)
  • 20:56 andrew@deploy1001: Started deploy [horizon/deploy@3124052]: Disable unneeded network panels
  • 20:49 foks: add email address to User:Lanhiaze
  • 20:44 ladsgroup@deploy1001: Finished deploy [ores/deploy@ae96071]: PoolCounter support: Let's get the party started (T160692) (duration: 28m 19s)
  • 20:24 bsitzmann@deploy1001: Finished deploy [mobileapps/deploy@e0b7158]: Update mobileapps to d56e4cf (duration: 04m 16s)
  • 20:19 bsitzmann@deploy1001: Started deploy [mobileapps/deploy@e0b7158]: Update mobileapps to d56e4cf
  • 20:15 ladsgroup@deploy1001: Started deploy [ores/deploy@ae96071]: PoolCounter support: Let's get the party started (T160692)
  • 20:10 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 03m 26s)
  • 18:56 stephanebisson: Starting mwscript extensions/PageTriage/maintenance/populateDraftQueue.php --wiki enwiki
  • 18:55 stephanebisson: Stopped mwscript extensions/PageTriage/maintenance/populateDraftQueue.php --wiki enwiki
  • 18:46 stephanebisson: Starting mwscript extensions/PageTriage/maintenance/populateDraftQueue.php --wiki enwiki
  • 18:40 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T203184 (duration: 00m 50s)
  • 18:35 sbisson@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Score/includes/Score.php: T203560 (duration: 00m 50s)
  • 18:14 sbisson@deploy1001: Synchronized static/images/project-logos: SWAT (duration: 00m 50s)
  • 17:22 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@6091de6]: Regular WDQS deployment (duration: 11m 45s)
  • 17:10 smalyshev@deploy1001: Started deploy [wdqs/wdqs@6091de6]: Regular WDQS deployment
  • 16:36 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting wgCommentTableSchemaMigrationStage = WRITE_NEW on test wikis, mw.org (T166733) (duration: 00m 50s)
  • 16:18 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting wgActorTableSchemaMigrationStage = WRITE_BOTH on test wikis, mw.org (T188327) (duration: 00m 50s)
  • 15:22 cmjohnson1: db1069 replacing disk in slot 7
  • 15:20 jynus: stop db1061 (may generate s6 lag) for hw maintenance
  • 15:19 jynus: db1061 (may generate s6 lag) for hw maintenance
  • 14:31 banyek: upgrading (kernel & mariadb) db1119
  • 14:27 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T203565: depooling db1119 (duration: 00m 49s)
  • 13:40 godog: repair sdl on ms-be2041 - T199198
  • 13:39 moritzm: removed intel-microcode 3.20180703 from apt.wikimedia.org/jessie-wikimedia (superseded by new release shipped via security.debian.org)
  • 13:30 moritzm: removed intel-microcode 3.20180703 from apt.wikimedia.org/stretch-wikimedia (superseded by new release shipped via security.debian.org)
  • 12:58 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T203565: depooling db1114 (duration: 00m 49s)
  • 12:53 banyek: depooling db1114
  • 12:08 jynus: stop and reimage db1071 (may generate s8 lag)
  • 12:02 zeljkof: EU SWAT finished
  • 12:01 zfilipin@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/CongressLookup/: SWAT: Senator John McCain deceased (T203611) Add Sen. Jon Kyl (T203611) (duration: 00m 51s)
  • 11:47 moritzm: installing ghostscript security updates on stretch
  • 11:37 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WikidataPageBanner for glwiki (T199713) (duration: 00m 50s)
  • 11:29 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable FileExport on Korean Wikipedia (T204399) (duration: 00m 50s)
  • 11:18 gilles@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T204478 Disable thumbnail prerendering on private wikis (duration: 00m 49s)
  • 11:17 moritzm: installing curl security updates on trusty (Debian already fixed)
  • 10:40 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 10:40 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 10:37 addshore: starting my maint script again
  • 10:31 addshore: stopped my maint script
  • 10:28 addshore: START addshore@mwmaint2001:~$ mwscript refreshLinks.php --wiki wikidatawiki --namespace 146 --e 56031711 54387042 # T198301 T195302
  • 09:50 marostegui: Deploy schema change on db1062 - T203548
  • 09:28 marostegui: Deploy schema change on s7 eqiad master (db1062) - T187089
  • 09:16 jynus: stop and reimage db1068 (may generate s4 lag)
  • 08:43 moritzm: installing postgresql security updates on maps*
  • 08:29 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for enwiki event (T204243) (duration: 00m 49s)
  • 08:07 jynus: stop and reimage db1062 (may generate s7 lag)
  • 07:46 moritzm: repooled wtp1043 (T196886)
  • 07:31 godog: repair sdj on ms-be2040 - T199198
  • 07:29 godog: force power on for ms-be2030
  • 07:24 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1092 and db1099:3318 - T201011 (duration: 00m 49s)
  • 07:24 marostegui: Deploy schema change on s8 - T201011
  • 07:18 marostegui: Stop replication in sync on db1092 and db1099:3318 - T201011
  • 07:18 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092 and db1099:3318 - T201011 (duration: 00m 49s)
  • 07:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1094 and db1098:3317 - T201011 (duration: 00m 49s)
  • 06:53 marostegui: Stop replication in sync on db1094 and db1098:3317 - T201011
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1094 and db1098:3317 - T201011 (duration: 01m 04s)
  • 06:33 marostegui: Deploy schema change on s7 (frwiktionary,metawiki) - T201011
  • 06:31 marostegui: Deploy schema change on s4 - T201011
  • 06:04 marostegui: Deploy schema change on s3 - T201011
  • 05:47 marostegui: Deploy schema change on s3:testwikidatawiki directly on the master (db2043) - T201011
  • 05:36 marostegui: Deploy schema change on s3:testwiki directly on the master (db2043) - T201011
  • 05:28 marostegui: Deploy schema change on s1 eqiad master (db1067) - T67448 T114117 T51191
  • 02:46 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Mon Sep 17 02:46:18 UTC 2018 (duration 10m 52s)
  • 02:35 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 14m 20s)

2018-09-15

  • 11:57 joal@deploy1001: Finished deploy [analytics/refinery@f4d1f24]: Deploying cluster to prevent failures from missing misc data (duration: 15m 19s)
  • 11:42 joal@deploy1001: Started deploy [analytics/refinery@f4d1f24]: Deploying cluster to prevent failures from missing misc data
  • 04:53 SMalyshev: re-pooled wdqs2003
  • 01:08 SMalyshev: depooled wdqs2003 to let it catch up

2018-09-14

  • 21:19 andrew@deploy1001: Finished deploy [horizon/deploy@56340cd]: Fix proxy creation in neutron regions (duration: 03m 31s)
  • 21:16 andrew@deploy1001: Started deploy [horizon/deploy@56340cd]: Fix proxy creation in neutron regions
  • 21:03 mutante: ACKed memory error alert on wtp2011 - existing ticket but fresh alert popped up 9h ago (T200678)
  • 20:12 ejegg: updated payments-wiki from 44b9409104 to 43923fadd9
  • 15:57 jynus: set thread_pool_size to 64 at db2047
  • 14:42 ema: lvs3002: restart pybal to remove misc-web T164609
  • 14:28 ema: lvs2002: restart pybal to remove misc-web T164609
  • 14:23 ema: lvs1002: restart pybal to remove misc-web T164609
  • 14:11 jynus: shutting down db1062 for dc maintenance T204302
  • 13:58 jynus: stop and restart db1118 for upgrade
  • 12:51 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T204127: Weight Adjust db2068 (duration: 00m 50s)
  • 09:47 banyek: operation stopping db1070 in preparation for reimage
  • 08:45 marostegui: Deploy schema change on s7 eqiad master (db1062) - T89737
  • 08:34 mobrovac@deploy1001: Started restart [cpjobqueue/deploy@32a81be]: (no justification provided)
  • 08:29 moritzm: rebooting acamar for kernel tests
  • 08:27 elukey: reboot kafka100[1-3] (eventbus eqiad) for kernel upgrades
  • 08:20 jynus: stopping and restarting db1069 for upgrade (x1 eqiad master)
  • 08:15 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T204127: Weight Adjust db2068 (duration: 00m 50s)
  • 07:49 marostegui: Deploy schema change on s4 eqiad master (db1068) - T67448 T114117 T51191
  • 07:40 moritzm: rebooting mwmaint1001 for kernel security update
  • 07:28 jynus: stopping db1062 mariadb in preparation for reimage
  • 07:11 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T204127: Weight Adjust db2068 (duration: 00m 50s)
  • 05:34 marostegui: Deploy schema change on s3 eqiad master - T187089
  • 05:29 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2050 - T189101 (duration: 00m 49s)
  • 05:17 marostegui: Stop replication in sync on db1075 and db2050 - T189101
  • 05:16 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2050 - T189101 (duration: 00m 49s)
  • 05:10 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Increase weight for db2054 - T204127 (duration: 00m 50s)
  • 05:06 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Increase weight for db2068 - T204127 (duration: 00m 52s)

2018-09-13

  • 21:55 urandom: running fstrim --all on restbase1013 -- T89584
  • 21:51 urandom: running fstrim --all on restbase1008 -- T89584
  • 21:35 urandom: running fstrim --all on restbase1011 -- T89584
  • 21:33 urandom: running fstrim --all on restbase1007 -- T89584
  • 19:31 herron: shutting down old puppet compiler instances in puppet3-diffs project T191438
  • 19:24 herron: moving puppet-compiler.wmlabs.org proxy from puppet3-diffs to puppet-diffs project T191438
  • 18:40 cmjohnson1: updating f/w on cloudvirt1019, disabled icinga checks
  • 17:42 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/resources/src/mediawiki.widgets/mw.widgets.CheckMatrixWidget.js: I1f92479bf1, T203325 (duration: 00m 50s)
  • 17:41 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/includes/widget/CheckMatrixWidget.php: I1f92479bf1, T203325 (duration: 00m 49s)
  • 17:40 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/includes/htmlform/fields/HTMLCheckMatrix.php: I1f92479bf1, T203325 (duration: 00m 51s)
  • 17:38 XioNoX: enable cr2:xe-4/0/0 (to asw-a) - T203719
  • 17:31 XioNoX: disable cr2:xe-4/0/0 (to asw-a) for optics replacement (round 2, 1st one didn't clear the errors, need to do the other side) - T203719
  • 17:29 mutante: radium - removing tor package, clearing systemd failed units, to clear Icinga alerts from this host that is to be decomed
  • 17:29 XioNoX: enable cr2:xe-4/0/0 (to asw-a) for optics replacement - T203719
  • 17:25 XioNoX: disable cr2:xe-4/0/0 (to asw-a) for optics replacement - T203719
  • 16:45 godog: repair sde on ms-be1041 - T199198
  • 15:26 jynus: restarting again db1075 for proper kernel upgrade
  • 15:25 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Increase weight for db2054 - T204127 (duration: 00m 49s)
  • 15:15 herron: repool mx1001 — upgrade to stretch complete T175361
  • 15:02 akosiaris: upload blubber_0.5.0-1 to apt.wikimedia.org/{stretch,jessie}-wikimedia/main T203121
  • 14:47 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Increase weight for db2054 - T204127 (duration: 00m 50s)
  • 14:36 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1011 - https://phabricator.wikimedia.org/T174047
  • 14:35 bstorm_: switch labstore1004 and labstore1005 to using the cfq scheduler on the DRBD volumes
  • 14:32 moritzm: rebooting dns1002 for kernel security update
  • 13:59 herron: depool mx1001 and relay queued messages to mx2001 for upgrade to stretch T175361
  • 13:51 moritzm: rebooting dns1001 for kernel security update
  • 13:28 andrew@deploy1001: Finished deploy [horizon/deploy@c9c7a56]: Improvements for VM creation in eqiad1, T167293 (take two) (duration: 04m 10s)
  • 13:24 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Increase weight for db2054 - T204127 (duration: 00m 49s)
  • 13:23 andrew@deploy1001: Started deploy [horizon/deploy@c9c7a56]: Improvements for VM creation in eqiad1, T167293 (take two)
  • 13:20 andrew@deploy1001: Finished deploy [horizon/deploy@12aa2d3]: Improvements for VM creation in eqiad1, T167293 (duration: 00m 13s)
  • 13:20 andrew@deploy1001: Started deploy [horizon/deploy@12aa2d3]: Improvements for VM creation in eqiad1, T167293
  • 13:18 marostegui: Deploy schema change on s2 eqiad master (db1066) - T89737
  • 13:07 marostegui: Deploy schema change on s8 eqiad master (db1071) - T187089
  • 13:03 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2085:3318 - T189101 (duration: 00m 49s)
  • 12:55 marostegui: Stop db1071 (s8 eqiad master) and db2085:3318 in sync - T189101
  • 12:54 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2085:3318 - T189101 (duration: 00m 49s)
  • 12:45 marostegui: Reload haproxy on dbproxy1011 to repool labsdb1009 - https://phabricator.wikimedia.org/T174047
  • 11:47 moritzm: rebooting dns recursors in codfw for kernel security update
  • 11:38 jynus: stopping eqiad s3 replication and shutting down db1075 in preparation for reimage T148507
  • 10:49 banyek: shutting down db2068 and dbstore2001:s7 for cloning
  • 10:46 moritzm: uploaded debdeploy 0.0.99.5-1+deb9u1 to apt.wikimedia.org/stretch-wikimedia
  • 10:33 marostegui: Deploy schema change on s4 eqiad master (db1068) - T187089
  • 10:33 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Slowly repool db2054 - T204127 (duration: 00m 50s)
  • 09:57 marostegui: Deploy schema change on s4 eqiad master (db1068) - T89737
  • 09:54 godog: repair sdd on ms-be2043 - T199198
  • 09:44 marostegui: Enable GTID on eqiad masters - T189107
  • 09:41 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wtp2020.codfw.wmnet
  • 09:34 moritzm: rebooting dns recursors in esams for kernel security update
  • 09:24 marostegui: Reload haproxy on dbproxy1011 to depool labsdb1009 - https://phabricator.wikimedia.org/T174047
  • 09:08 marostegui: Deploy schema change on s8 eqiad master (db1071) - T89737
  • 08:59 hashar: Bumped CI jobs based on Debian Stretch to use Chrome 69 and Firefox 60 | T203902
  • 08:51 moritzm: rebooting dns recursors in ulsfo for kernel security update
  • 08:49 marostegui: Deploy schema change on s5 eqiad master (db1070) - T89737
  • 08:41 moritzm: removed labvirt1019/labvirt1020 from debmonitor (T204004)
  • 08:36 marostegui: Deploy schema change on s6 eqiad master (db1061) - T89737
  • 08:29 moritzm: installing curl security updates
  • 08:21 marostegui: Deploy schema change on s6 eqiad master (db1061) - T187089
  • 08:16 marostegui: Stop replication on s4 eqiad master (db1068) and deploy a schema change - this will generate lag on s4 eqiad - T144010
  • 08:08 marostegui: Deploy schema change on s5 eqiad master - T187089
  • 08:01 moritzm: rebooting dns recursors in eqsin for kernel security update
  • 08:01 marostegui: Disconnect replication eqiad -> codfw on s1-s8, x1, es2, es3 - T189107
  • 07:47 elukey: reboot kafka10[12-23] (old analytics cluster) for kernel upgrades
  • 07:46 elukey: execute apt-get autoremove on notebook* to remove old nginx packages (not used anymore)
  • 07:25 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Tune db weights even more (duration: 00m 50s)
  • 06:56 elukey: reboot notebook100[3,4] for kernel upgrades - T203165
  • 06:42 elukey: reboot stat100[4-6] for kernel upgrades - T203165
  • 05:11 marostegui: Stop MySQL on db2054 and dbstore2001:3317 to clone db2054 - T204127
  • 02:46 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Thu Sep 13 02:46:16 UTC 2018 (duration 10m 46s)
  • 02:35 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 14m 03s)
  • 02:24 ejegg: restarted fundraising CiviCRM jobs
  • 01:48 ejegg: updated CiviCRM from 2a3583cca1 to b0d7df4e60
  • 01:45 ejegg: disabled fundraising CiviCRM jobs
  • 00:00 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@7e5e537]: Deploy Blazegraph & Updater for T202765 and T203646 handling (duration: 23m 45s)

2018-09-12

  • 23:36 smalyshev@deploy1001: Started deploy [wdqs/wdqs@7e5e537]: Deploy Blazegraph & Updater for T202765 and T203646 handling
  • 22:41 Trey314159: reindexing Esperanto wikis on elastic@codfw and elastic@eqiad complete (T203005)
  • 19:50 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Decrease db2061 load (duration: 00m 50s)
  • 19:13 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Decrease db2065 load (duration: 00m 49s)
  • 18:24 Trey314159: reindexing Esperanto wikis on elastic@codfw and elastic@eqiad (T203005)
  • 17:59 otto@deploy1001: Finished deploy [analytics/refinery@407da92]: Deploying refinery-source 0.0.74 jars for T203804 (duration: 06m 11s)
  • 17:53 otto@deploy1001: Started deploy [analytics/refinery@407da92]: Deploying refinery-source 0.0.74 jars for T203804
  • 17:52 otto@deploy1001: Finished deploy [analytics/refinery@407da92]: Deploying refinery-source 0.0.74 jars for T203804 (duration: 01m 41s)
  • 17:50 otto@deploy1001: Started deploy [analytics/refinery@407da92]: Deploying refinery-source 0.0.74 jars for T203804
  • 17:38 akosiaris: restart icinga T196336
  • 16:53 ottomata: restarting eventlogging processors to pick up change to mysql whitelist
  • 16:52 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Decrease weight for db2065 (duration: 00m 49s)
  • 16:48 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=wtp2020.codfw.wmnet
  • 16:39 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: cirrussearch: eqiad -> local (codfw) (duration: 00m 50s)
  • 16:32 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Reduce db2065, db2061 weights (duration: 00m 48s)
  • 16:14 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Decrease weight for db2049 (duration: 00m 49s)
  • 16:06 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Reduce db2056 weight (duration: 00m 49s)
  • 15:56 akosiaris: switch s*-master DNS records to codfw
  • 15:51 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Decrease weight for db2049 (duration: 00m 50s)
  • 15:38 krinkle@deploy1001: Synchronized wmf-config/CommonSettings.php: 43b98e414 (duration: 00m 50s)
  • 15:32 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Decrease weight for db2070 (duration: 00m 49s)
  • 15:24 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-parsoid (exit_code=0) (volans@sarin)
  • 15:22 oblivian@sarin: conftool action : set/weight=15; selector: dc=codfw,cluster=api_appserver,service=apache2,name=mw21.*
  • 15:19 oblivian@sarin: conftool action : set/weight=20; selector: dc=codfw,cluster=api_appserver,service=apache2,name=mw22[2-9].*
  • 15:17 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Tweak weights in s1 and s2 (duration: 00m 50s)
  • 15:15 _joe_: repool mw2141
  • 15:14 _joe_: depooling mw2141 for investigation
  • 15:08 START: - Cookbook sre.switchdc.mediawiki.08-restart-parsoid (volans@sarin)
  • 15:05 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) (volans@sarin)
  • 15:05 START: - Cookbook sre.switchdc.mediawiki.08-restore-ttl (volans@sarin)
  • 15:05 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) (volans@sarin)
  • 15:03 START: - Cookbook sre.switchdc.mediawiki.08-start-maintenance (volans@sarin)
  • 14:59 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) (volans@sarin)
  • 14:58 START: - Cookbook sre.switchdc.mediawiki.08-update-tendril (volans@sarin)
  • 14:52 END: (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) (volans@sarin)
  • 14:51 MediaWiki: read-only period ends at: 2018-09-12 14:51:58.936291 (volans@sarin)
  • 14:51 START: - Cookbook sre.switchdc.mediawiki.07-set-readwrite (volans@sarin)
  • 14:51 END: (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) (volans@sarin)
  • 14:51 START: - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (volans@sarin)
  • 14:50 END: (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0) (volans@sarin)
  • 14:50 START: - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (volans@sarin)
  • 14:50 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-traffic (exit_code=0) (volans@sarin)
  • 14:48 START: - Cookbook sre.switchdc.mediawiki.04-switch-traffic (volans@sarin)
  • 14:47 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) (volans@sarin)
  • 14:47 START: - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (volans@sarin)
  • 14:47 END: (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) (volans@sarin)
  • 14:46 START: - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (volans@sarin)
  • 14:44 END: (FAIL) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=99) (volans@sarin)
  • 14:44 MediaWiki: read-only period starts at: 2018-09-12 14:44:24.536913 (volans@sarin)
  • 14:44 START: - Cookbook sre.switchdc.mediawiki.02-set-readonly (volans@sarin)
  • 14:41 END: (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) (volans@sarin)
  • 14:41 START: - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (volans@sarin)
  • 14:40 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-wipe-and-warmup-caches (exit_code=0) (volans@sarin)
  • 14:34 jynus: stopping mariadb at db2054
  • 14:29 START: - Cookbook sre.switchdc.mediawiki.00-wipe-and-warmup-caches (volans@sarin)
  • 14:13 jynus: starting replication on db2068
  • 14:07 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) (volans@sarin)
  • 14:07 START: - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (volans@sarin)
  • 14:06 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) (volans@sarin)
  • 14:06 START: - Cookbook sre.switchdc.mediawiki.00-disable-puppet (volans@sarin)
  • 14:04 volans: starting switchover meny "cookbook sre.switchdc.mediawiki eqiad codfw" on sarin - T203777
  • 14:03 jynus: stopping db2068
  • 13:59 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2054 and db2068 looks like they are stuck (duration: 00m 50s)
  • 13:52 ema: restart varnish-be on cp1081 due to mbox lag and 503s
  • 13:24 marostegui: Reload haproxy on dbproxy1010 to repool labsdb1010 - T174047
  • 13:11 otto@deploy1001: Finished deploy [eventlogging/analytics@5c6fab6]: Support loading plugins in eventlogging-processor - T203596 (duration: 00m 05s)
  • 13:11 otto@deploy1001: Started deploy [eventlogging/analytics@5c6fab6]: Support loading plugins in eventlogging-processor - T203596
  • 13:08 jynus@deploy1001: Synchronized wmf-config/CommonSettings.php: Disabling CentralNotice translations (duration: 00m 50s)
  • 12:37 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 with original load (duration: 00m 50s)
  • 12:23 marostegui: Reload haproxy on dbproxy1010 to depool labsdb1010 - https://phabricator.wikimedia.org/T174047
  • 12:17 godog: repair sdm / sdj on ms-be2042 - T199198
  • 12:07 elukey: delete meitnerium.wikimedia.org's ganeti VM (decommissioned) - T203087
  • 11:15 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 with low load (duration: 00m 50s)
  • 10:56 moritzm: uploaded poolcounter 1.0.4+deb9u1 to apt.wikimedia.org (T199876)
  • 10:54 jynus: restart db1110 for upgrade
  • 10:32 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1121, depool db1110 (duration: 00m 50s)
  • 09:57 jynus: restart db1121 for upgrade
  • 09:46 marostegui: Drop users: connect, status and fabmigrate from dbstore1002 - T200801
  • 09:07 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1098 (s6 and s7) and depool db1121 (duration: 00m 49s)
  • 08:36 godog: repair sdc on ms-be2041 - T199198
  • 08:05 jynus: stopping db1098 (both db instances) for maintenance
  • 08:00 mobrovac@deploy1001: Started restart [proton/deploy@ecb9a0e]: (no justification provided)
  • 07:53 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1098 (duration: 00m 50s)
  • 07:45 addshore@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Store/: REVERT: Debug logging for T97368 gerrit:459905 (duration: 00m 51s)
  • 07:38 addshore@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Store/: Debug logging for T97368 gerrit:459905 (duration: 00m 53s)
  • 06:57 mobrovac@deploy1001: Finished deploy [proton/deploy@ecb9a0e]: Update to Puppeteer v1.7.0 and fix browser connection abort handling - T181623 (duration: 00m 58s)
  • 06:56 mobrovac@deploy1001: Started deploy [proton/deploy@ecb9a0e]: Update to Puppeteer v1.7.0 and fix browser connection abort handling - T181623
  • 06:35 moritzm: installing confuse security updates
  • 02:38 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Wed Sep 12 02:38:53 UTC 2018 (duration 10m 44s)
  • 02:28 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 08m 30s)

2018-09-11

  • 22:06 legoktm: legoktm@mwmaint1001:~$ mwscript deleteEqualMessages.php --wiki fixcopyrightwiki --delete --lang-code='*'
  • 21:56 legoktm@deploy1001: Finished scap: i18n updates for fixcopyright (duration: 32m 21s)
  • 21:23 legoktm@deploy1001: Started scap: i18n updates for fixcopyright
  • 21:22 legoktm@deploy1001: Synchronized php-1.32.0-wmf.20/skins/EUCopyrightCampaignSkin/: add og:image meta tag - https://gerrit.wikimedia.org/r/459836 (duration: 00m 51s)
  • 21:03 mutante: restarted apache on mwdebug1002, running puppet
  • 19:38 ema: switch all services to codfw only
  • 19:31 ema: switch restbase to active/active
  • 19:20 ema: depool eqiad from edge traffic
  • 19:06 ema: route esams via codfw
  • 17:10 XioNoX: delete BGP sessions with old AS10089 router on cr1-eqsin
  • 16:53 godog: repair sdd on ms-be1043 - T199198
  • 16:27 mutante: added gtirloni to acl*sre-team on Phabricator (T203489)
  • 16:17 godog: correction, sdk1 on ms-be1041 - T199198
  • 16:16 godog: repair sdd1 on ms-be1043 - T199198
  • 15:06 godog: serve switch originals and thumbs from codfw only
  • 15:00 godog: begin switching swift to codfw
  • 14:40 END: (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0) (akosiaris@sarin)
  • 14:40 START: - Cookbook sre.switchdc.services.02-restore-ttl (akosiaris@sarin)
  • 14:38 END: (PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0) (akosiaris@sarin)
  • 14:38 Switching: services parsoid, restbase, restbase-async, mobileapps, apertium, citoid, cxserver, eventstreams, graphoid, mathoid, proton, pdfrender, recommendation-api, zotero, eventbus, ores, wdqs, wdqs-internal: eqiad => codfw (akosiaris@sarin)
  • 14:38 START: - Cookbook sre.switchdc.services.01-switch-dc (akosiaris@sarin)
  • 14:38 END: (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0) (akosiaris@sarin)
  • 14:32 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (akosiaris@sarin)
  • 14:31 END: (FAIL) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=99) (akosiaris@sarin)
  • 14:31 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (akosiaris@sarin)
  • 14:31 END: (FAIL) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=99) (akosiaris@sarin)
  • 14:31 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (akosiaris@sarin)
  • 13:21 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) (volans@sarin)
  • 13:21 START: - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (volans@sarin)
  • 13:14 END: (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) (volans@sarin)
  • 13:14 START: - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (volans@sarin)
  • 13:12 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) (volans@sarin)
  • 13:12 START: - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (volans@sarin)
  • 13:08 volans: performing some additional switchdc live test
  • 13:02 volans: upgraded spicerack to version 0.0.8 on sarin/neodymium - T199079
  • 12:28 gehel: restarting tilerator on maps1* (eqiad) - heap memory exceeded
  • 12:09 moritzm: installing jq security updates on trusty
  • 12:01 dereckson@deploy1001: Synchronized wmf-config/throttle.php: Update Informatika SZŠ Chomutov throttle rule (T203909) (duration: 00m 50s)
  • 12:00 dereckson@deploy1001: sync-file aborted: Update Informatika SZŠ Chomutov throttle rule (duration: 00m 04s)
  • 11:49 volans: uploaded spicerack_0.0.8-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079
  • 11:37 moritzm: restarting hhvm on mw1261-mw1265 to pick up curl security updates
  • 11:25 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set category collation to uca-et-u-kn on Estonian-language wikis (T202977) (duration: 00m 50s)
  • 10:37 marostegui: Disable GTID on all codfw masters (sX, x1, esX) (not in db2040 as it is not enabled there) T189107
  • 10:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1096:3315, db1100 (duration: 00m 49s)
  • 10:30 tgr@deploy1001: Finished scap: T204018 update i18n on fixcopyrightwiki (duration: 31m 01s)
  • 10:27 marostegui: db1096:3315 and db1100 were test pages - NO MORE TEST PAGES ARE EXPECTED FROM NOW ON - T200509
  • 10:16 marostegui: Stop replication on db2075 to test the paging (should not page)
  • 10:14 marostegui: Stop replication on db1100 to test the paging
  • 10:03 marostegui: Stop replication on db2084:3315 for alert testing
  • 09:59 tgr@deploy1001: Started scap: T204018 update i18n on fixcopyrightwiki
  • 09:54 marostegui: Stop replication on db1096:3315 for paging testing
  • 09:25 moritzm: installing curl security updates
  • 08:39 godog: repair xfs on sdh/sdc on ms-be2040 - T199198
  • 08:27 marostegui: Stop replication on db1100 for new alert testing (this should generate a page) T200509
  • 08:25 jynus: restarting replication on db2034 after testing dc switch replication sync step
  • 08:14 marostegui: Stop replication on db1096:3315 for new alert testing (this should generate a page) T200509
  • 08:13 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1096:3315, db1100 (duration: 00m 49s)
  • 08:13 jynus: stopping replication on db2034 to test dc switch replication sync step
  • 08:12 marostegui@deploy1001: sync-file aborted: Depool db1096:3315, db1100 (duration: 00m 08s)
  • 08:05 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2084:3315 and db2075 (duration: 00m 49s)
  • 07:49 marostegui: Stop replication on db2075 for alert testing T200509
  • 07:33 marostegui: Stop replication on db2084:3315 for alert testing T200509
  • 07:27 marostegui: Disable puppet on all the DBs for alert testing - https://phabricator.wikimedia.org/T200509
  • 05:57 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2084:3315 and db2075 (duration: 00m 51s)
  • 02:45 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Tue Sep 11 02:45:31 UTC 2018 (duration 10m 44s)
  • 02:44 Krinkle: krinkle@mwmaint1001$ mwscript deleteEqualMessages.php --wiki fixcopyrightwiki --delete --lang-code='*'
  • 02:34 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 13m 27s)
  • 01:00 Krinkle: krinkle@mwmaint1001$ mwscript deleteEqualMessages.php --wiki fixcopyrightwiki --delete --lang-code='*'

2018-09-10

  • 22:56 andrewbogott: rebooting/reimaging labvirt1019 and 1020 for T204003
  • 22:43 ejegg: updated CiviCRM from 52873b5dcc to 2a3583cca1
  • 22:00 ejegg: updated payments-wiki from 05d796e844 to 44b9409104
  • 21:24 bawolff@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/OpenStackManager/special/SpecialNovaSudoer.php: T203885 (duration: 00m 50s)
  • 20:29 volans: running warmup script against codfw appservers
  • 19:03 brion: stopped requeueTranscodes.php job on mwmaint1001.eqiad pending dc switch
  • 17:42 XioNoX: trunk cloud-instances1-b-eqiad to cloudnet1003/4:eth1 - T202636
  • 17:03 elukey: stop ircecho to avoid excessive spamming
  • 16:37 elukey: rolling restart of aqs on aqs* to pick new druid backend settings
  • 15:24 gehel: reducing elasticsearch low watermark to 75% on cirrus / eqiad cluster
  • 15:07 elukey: reboot analytics100[1,2] for kernel security upgrades
  • 14:32 elukey: reboot analytics1003 for kernel + openjdk-8 upgrades
  • 14:28 anomie@deploy1001: Synchronized php-1.32.0-wmf.20/includes/parser/ParserOutput.php: Backport for T203716 (duration: 00m 50s)
  • 14:26 hashar: Switching CI Jenkins mail server from mx1001 to localhost | T203607
  • 13:45 marostegui: Drop user metrics and wikilytics from dbstore1002
  • 12:23 moritzm: restarting hhvm on mw1261-mw1265 to pick up lcms security update
  • 12:09 moritzm: installing lcms2 security updates
  • 11:57 moritzm: installing chromium on proton* (tested on deployment-prep with the new release)
  • 11:44 volans: completed execution of "cookbook sre.switchdc.mediawiki --live-test codfw eqiad" - T199073
  • 11:27 END: (FAIL) - Cookbook sre.switchdc.mediawiki.08-restart-parsoid (exit_code=99) (volans@sarin)
  • 11:21 START: - Cookbook sre.switchdc.mediawiki.08-restart-parsoid (volans@sarin)
  • 11:19 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0) (volans@sarin)
  • 11:18 START: - Cookbook sre.switchdc.mediawiki.08-update-tendril (volans@sarin)
  • 11:16 ladsgroup@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/CentralAuth/maintenance/deleteLocalPasswords.php: SWAT: Fix typo (T201009) (duration: 00m 50s)
  • 11:14 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) (volans@sarin)
  • 11:13 START: - Cookbook sre.switchdc.mediawiki.08-start-maintenance (volans@sarin)
  • 11:11 END: (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0) (volans@sarin)
  • 11:11 START: - Cookbook sre.switchdc.mediawiki.08-restore-ttl (volans@sarin)
  • 11:09 END: (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) (volans@sarin)
  • 11:09 [DRY-RUN]: MediaWiki read-only period ends at: 2018-09-10 11:09:40.587760 (volans@sarin)
  • 11:09 START: - Cookbook sre.switchdc.mediawiki.07-set-readwrite (volans@sarin)
  • 11:08 moritzm: uploaded a co-installable 4.14 kernel to apt.wikimedia.org (to be used for installer tests)
  • 11:06 ladsgroup@deploy1001: Synchronized wmf-config/CommonSettings.php: Add ['null'] (T201009) (duration: 00m 50s)
  • 10:57 END: (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) (volans@sarin)
  • 10:57 START: - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (volans@sarin)
  • 10:52 END: (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0) (volans@sarin)
  • 10:52 START: - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (volans@sarin)
  • 10:50 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-traffic (exit_code=0) (volans@sarin)
  • 10:48 START: - Cookbook sre.switchdc.mediawiki.04-switch-traffic (volans@sarin)
  • 10:41 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 10:40 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
  • 10:38 END: (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) (volans@sarin)
  • 10:38 START: - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (volans@sarin)
  • 10:29 END: (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) (volans@sarin)
  • 10:28 START: - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (volans@sarin)
  • 10:27 moritzm: uploaded linux-meta 1.20+deb9u2 for apt.wikimedia.org/stretch-wikimedia
  • 10:25 END: (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) (volans@sarin)
  • 10:25 [DRY-RUN]: MediaWiki read-only period starts at: 2018-09-10 10:25:20.558408 (volans@sarin)
  • 10:25 START: - Cookbook sre.switchdc.mediawiki.02-set-readonly (volans@sarin)
  • 10:12 END: (FAIL) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=99) (volans@sarin)
  • 10:12 START: - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (volans@sarin)
  • 10:02 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-wipe-and-warmup-caches (exit_code=0) (volans@sarin)
  • 09:53 START: - Cookbook sre.switchdc.mediawiki.00-wipe-and-warmup-caches (volans@sarin)
  • 09:36 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) (volans@sarin)
  • 09:36 START: - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (volans@sarin)
  • 09:32 END: (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0) (volans@sarin)
  • 09:31 START: - Cookbook sre.switchdc.mediawiki.00-disable-puppet (volans@sarin)
  • 09:30 volans: starting execution of "cookbook sre.switchdc.mediawiki --live-test codfw eqiad" - T199073
  • 08:22 marostegui: Drop users metric and wikilytics from core databases
  • 08:04 marostegui: Drop unused root grants from core servers
  • 07:46 moritzm: installing ghostscript security updates
  • 07:18 volans: restarted pdfrender on scb2004 - T174916
  • 07:04 oblivian@deploy1001: Synchronized wmf-config/throttle.php: Deploy throttle rule for Czech School T203909 (duration: 00m 51s)
  • 02:51 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Mon Sep 10 02:51:00 UTC 2018 (duration 10m 52s)
  • 02:40 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 13m 48s)
  • 00:46 tstarling@deploy1001: Synchronized wmf-config/set-time-limit.php: (no justification provided) (duration: 00m 49s)
  • 00:12 tstarling@deploy1001: Synchronized w/infinite-loop.php: Testing for T97192 (duration: 00m 48s)
  • 00:07 tstarling@deploy1001: Synchronized wmf-config/PhpAutoPrepend.php: T97192 (duration: 00m 49s)
  • 00:04 tstarling@deploy1001: Synchronized wmf-config/set-time-limit.php: T97192 (duration: 00m 52s)

2018-09-08

  • 09:45 gtirloni: tools restarted cron and truncated /var/log/exim4/paniclog (T196137)
  • 04:22 krinkle@deploy1001: Synchronized multiversion/: Ia27a8f7ed612f (duration: 00m 49s)
  • 04:16 krinkle@deploy1001: Synchronized wmf-config/profiler.php: Ia27a8f7ed612f (duration: 00m 54s)
  • 01:10 mutante: also rsyncing /var/lib/tor-instances/ data for second instance and restarting service (T196701)
  • 00:53 mutante: radium - stopping rsync.service
  • 00:27 mutante: torrelay1001 - reset internal state (sighup) with "arm" and pressing x twice
  • 00:18 mutante: to watch what is happening on torrelay1001 - sudo -u debian-tor arm - if asked for password it's in passwords::tor in private
  • 00:16 mutante: tor relay switched over from radium to torrelay1001, fixed /var/lib/tor permissions, restarted service, flipped DNS CNAME (5M TTL), traffic can be seen with "arm", monitoring all green (T196701)

2018-09-07

  • 23:26 mutante: ms-be2042 - repairing /dev/sdj1 (T199198)
  • 23:25 mutante: ms-be2041 - repairing /dev/sdh1 (T199198)
  • 23:23 mutante: ms-be1041 - repairing xfs per https://wikitech.wikimedia.org/wiki/Swift/How_To#Repair_xfs_free_blocks_counter_corruption (T199198)
  • 22:17 mutante: gerrit - restarting for config change to move log files to /var/log/gerrit/
  • 22:16 mutante: - cobalt (gerrit) - applying change to move log file location, manually moved logs to /var/log/gerrit, remove old log dir, let puppet re-create it, like on gerrit2001
  • 21:31 mutante: gerrit2001, moving gerrit logfiles to /var/log/gerrit, removing old gerrit logdir, letting puppet re-create it as symlink
  • 18:20 mutante: LDAP: correction, 'monipe' replaced with 'onimisionipe' in wmf group (T202708)
  • 18:12 mutante: LDAP: added user 'monipe' to group 'wmf' (T202708)
  • 18:02 legoktm@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/EUCopyrightCampaign/: Update MEPs - https://gerrit.wikimedia.org/r/458628 (for real this time) (duration: 00m 50s)
  • 17:52 legoktm@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/EUCopyrightCampaign/: Update MEPs - https://gerrit.wikimedia.org/r/458628 (duration: 00m 50s)
  • 17:45 XioNoX: apply firewall changes on pfw3-eqiad - T203793
  • 17:40 XioNoX: apply firewall changes on pfw3-codfw - T203793
  • 16:42 XioNoX: explicitly permit install1002/2002:80 in filter labs-in4 on cr1/2-eqiad - T190424
  • 14:56 moritzm: uploaded linux-meta 1.20+deb9u1 to apt.wikimedia.org/stretch-wikimedia (provides a new meta package for Linux 4.14)
  • 14:29 moritzm: installing PHP security updates on krypton
  • 14:27 moritzm: installing libtirpc security updates on trusty
  • 12:35 elukey: reboot kafka200[2,3] (eventbus codfw) for kernel + openjdk-8 upgrades
  • 10:03 Amir1: ladsgroup@mwmaint1001:~$ mwscript extensions/CentralAuth/maintenance/deleteLocalPasswords.php --wiki=fawiki --user Ladsgroup --prefix (T201009)
  • 09:38 hashar@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/UniversalLanguageSelector: Revert "Simplify by using native JavaScript instead of jQuery" - T203750 (duration: 00m 55s)
  • 08:49 ema: passive checks awol on einsteinium, restarting icinga -- T196336
  • 08:45 jynus: reloading apache with bad config for tendril for testing (small downtime)
  • 08:33 marostegui: Rebooting haproxies to pick up new config after all the tests - T201021
  • 08:16 banyek: generating false alert about https auth on dbmonitor1001
  • 07:24 moritzm: rebooting mw2270-mw2290 for kernel security updates
  • 06:51 moritzm: rebooting mw2240-mw2269 for kernel security updates
  • 06:49 moritzm: rebooting mw2240-mw2269 for kernel security updates
  • 04:58 marostegui: Disable puppet on dbproxy1006 for logging testing - T201021
  • 00:21 thcipriani@deploy1001: Finished scap: SWAT: Add italian translation T203297 Improve German translation German translation: Replace "Vertreter" by "EU-Abgeordnete" (duration: 34m 38s)

2018-09-06

  • 23:46 thcipriani@deploy1001: Started scap: SWAT: Add italian translation T203297 Improve German translation German translation: Replace "Vertreter" by "EU-Abgeordnete"
  • 23:43 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Score/includes/Score.php: SWAT: Add checks for length of generated audio files T203560 (duration: 00m 58s)
  • 23:16 ejegg: updated CiviCRM from 7918924e73 to 52873b5dcc
  • 22:16 ebernhardson: restart mjolnir-kafka-bulk-daemon service to pick up earlier deploy.
  • 21:43 RoanKattouw: Deleting all user_properties rows with up_property='pagetriage-lastuse' (T202175)
  • 21:32 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/includes/MovePage.php: T203661 - I9ebdcbc566b (duration: 00m 57s)
  • 21:19 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/includes/exception/MWExceptionHandler.php: I3f35a519b50ae (duration: 00m 58s)
  • 21:02 ppchelko@deploy1001: Finished deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 5 (duration: 04m 01s)
  • 20:58 ppchelko@deploy1001: Started deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 5
  • 20:56 ppchelko@deploy1001: Finished deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 4 (duration: 04m 47s)
  • 20:51 ppchelko@deploy1001: Started deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 4
  • 20:51 ppchelko@deploy1001: Finished deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 3 (duration: 07m 16s)
  • 20:44 ppchelko@deploy1001: Started deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 3
  • 20:43 ppchelko@deploy1001: Finished deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 2 (duration: 09m 26s)
  • 20:34 ppchelko@deploy1001: Started deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190, take 2
  • 20:34 ppchelko@deploy1001: Finished deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190 (duration: 03m 56s)
  • 20:30 ppchelko@deploy1001: Started deploy [restbase/deploy@53bc0a6]: Revert bumping Parsoid content-type filter T194190
  • 18:27 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@d3e2c23]: repair msearch daemon cli args (duration: 03m 56s)
  • 18:23 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@d3e2c23]: repair msearch daemon cli args
  • 18:20 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Flow/: Ia0112ae62e6b - T203647 (duration: 01m 02s)
  • 18:12 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/includes/: I31a97d0168 - T203583 (duration: 01m 13s)
  • 17:56 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Canary machines back to wmf.20
  • 17:45 thcipriani@deploy1001: scap failed: average error rate on 6/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/2cc7028226a539553178454fc2f14459 for details)
  • 17:34 volans: uploaded spicerack_0.0.7-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079
  • 17:20 jynus_: dropping old logs from dbproxy1003 T201021
  • 17:15 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@735235b]: re-try bump to master (duration: 12m 46s)
  • 17:02 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@735235b]: re-try bump to master
  • 17:00 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@735235b]: new cli flags for msearch daemon, bump kafka-python dep to 1.4.x (duration: 16m 19s)
  • 16:44 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@735235b]: new cli flags for msearch daemon, bump kafka-python dep to 1.4.x
  • 16:44 ebernhardson@deploy1001: deploy aborted: new cli flags for msearch daemon, bump kafka-python dep to 1.4.x (duration: 00m 07s)
  • 16:43 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@735235b]: new cli flags for msearch daemon, bump kafka-python dep to 1.4.x
  • 16:40 volans: upgraded spicerack to version 0.0.6 on sarin/neodymium - T199079
  • 16:38 volans: uploaded spicerack_0.0.6-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079
  • 15:41 END: (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0) (switchdc/oblivian@neodymium)
  • 15:41 START: - Cookbook sre.switchdc.services.02-restore-ttl (switchdc/oblivian@neodymium)
  • 15:37 END: (PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0) (switchdc/oblivian@neodymium)
  • 15:37 Switching: services pdfrender: eqiad => codfw (switchdc/oblivian@neodymium)
  • 15:37 START: - Cookbook sre.switchdc.services.01-switch-dc (switchdc/oblivian@neodymium)
  • 15:36 END: (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0) (switchdc/oblivian@neodymium)
  • 15:31 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (switchdc/oblivian@neodymium)
  • 15:30 END: (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0) (switchdc/oblivian@neodymium)
  • 15:30 START: - Cookbook sre.switchdc.services.02-restore-ttl (switchdc/oblivian@neodymium)
  • 15:30 END: (PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0) (switchdc/oblivian@neodymium)
  • 15:30 Switching: services pdfrender: codfw => eqiad (switchdc/oblivian@neodymium)
  • 15:30 START: - Cookbook sre.switchdc.services.01-switch-dc (switchdc/oblivian@neodymium)
  • 15:04 END: (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0) (switchdc/oblivian@neodymium)
  • 14:59 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (switchdc/oblivian@neodymium)
  • 14:52 END: (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0) (switchdc/oblivian@neodymium)
  • 14:52 START: - Cookbook sre.switchdc.services.02-restore-ttl (switchdc/oblivian@neodymium)
  • 14:52 END: (PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0) (switchdc/oblivian@neodymium)
  • 14:52 Switching: services pdfrender: eqiad => codfw (switchdc/oblivian@neodymium)
  • 14:51 START: - Cookbook sre.switchdc.services.01-switch-dc (switchdc/oblivian@neodymium)
  • 14:51 END: (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0) (switchdc/oblivian@neodymium)
  • 14:46 START: - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (switchdc/oblivian@neodymium)
  • 14:38 elukey: reboot kafka2001 (eventbus codfw host) for kernel + openjdk-8 upgrades
  • 13:46 moritzm: reboots of mc hosts in codfw completed
  • 13:45 elukey: reboot kafka100[2-6] for kernel + openjdk-8 upgrades
  • 13:24 elukey: reboot kafka-jumbo1001 for openjdk-8 + kernel security upgrades
  • 13:08 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.32.0-wmf.20
  • 13:03 hashar: all wikis to 1.32.0-wmf.20 | T191066
  • 12:51 ema: trafficserver 7.1.3+ds-4wm3 uploaded to stretch-wikimedia T199720
  • 10:22 moritzm: rebooting mc2* hosts for kernel security update
  • 09:33 moritzm: disabling puppet on install1002 for some d-i tests
  • 09:11 oblivian@deploy1001: Synchronized wmf-config/mc.php: Fixing memcached configuration for labstestwiki T203479 (duration: 00m 56s)
  • 08:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1120 (duration: 00m 58s)
  • 08:40 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1120 (duration: 00m 57s)
  • 08:32 marostegui: Enable replication codfw -> eqiad on es2,es3 - T189107
  • 08:25 marostegui: Enable replication codfw -> eqiad on s7,s8,x1 - T189107
  • 08:17 marostegui: Enable replication codfw -> eqiad on s1,s3,s4 - T189107
  • 08:09 godog: bounce ircecho on einsteinium, stuck and not on irc
  • 08:09 hashar: rebooting contint2001 for kernel security update
  • 08:06 godog: repair sde1 on ms-be2042 - T199198
  • 08:06 marostegui: Enable replication codfw -> eqiad on s5,s6,s2 - T189107
  • 08:01 moritzm: rebooting contint1001 for kernel security update
  • 07:58 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Fix DB configuration in preparation for dc switchover (duration: 00m 57s)
  • 07:54 hashar: Upgraded packages on contint1001 and contint2001
  • 07:24 _joe_: rolling restart of eqiad HHVM appservers
  • 07:10 moritzm: run decomission_appserver on mw2213 (T203434)
  • 04:38 TimStarling: on mwdebug1001 restarting hhvm after probably breaking it by trying to attach with gdb
  • 04:08 tstarling@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Quiz/Question.php: T203628 (duration: 00m 56s)
  • 04:06 tstarling@deploy1001: Synchronized php-1.32.0-wmf.19/extensions/Quiz/Question.php: (no justification provided) (duration: 00m 57s)
  • 03:17 legoktm@deploy1001: Synchronized wmf-config/CommonSettings.php: Have canonical Main Page URL be the domain root for fixcopyrightwiki (duration: 00m 57s)
  • 02:47 legoktm@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/UniversalLanguageSelector/: ULS fixes for Accept-Language stuff (duration: 01m 44s)
  • 02:38 legoktm@deploy1001: Finished scap: EUCopyrightCampaign updates (duration: 62m 43s)
  • 01:36 legoktm@deploy1001: Started scap: EUCopyrightCampaign updates

2018-09-05

  • 23:37 jforrester@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Score: SWAT clean-up following messy deployment and undeployment of patch (duration: 00m 57s)
  • 23:27 jforrester@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Score/includes/Score.php: SWAT Fix error for T203560 (duration: 00m 54s)
  • 23:25 jforrester@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/EducationProgram/includes/pagers/StudentPager.php: SWAT Revert fix for T203577 (duration: 00m 56s)
  • 23:22 jforrester@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Score/includes/Score.php: SWAT Fix error on malformed MIDI files T203560 (duration: 01m 04s)
  • 23:14 jforrester@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/EducationProgram/includes/pagers/StudentPager.php: SWAT Fix spammy log errors T203577 (duration: 00m 58s)
  • 22:17 XenoRyet: updated payments-wiki from e749025c1a to 05d796e844
  • 21:54 volans: uploaded spicerack_0.0.5-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079
  • 20:59 XioNoX: clear bgp neighbor 80.249.209.209 on cr2-esams (max prefix limit)
  • 20:28 arlolra: Updated Parsoid to 740b3a4 (T198400, T202819)
  • 20:19 arlolra@deploy1001: Finished deploy [parsoid/deploy@f3ef0c8]: Updating Parsoid to 740b3a4 (duration: 09m 57s)
  • 20:17 krinkle@deploy1001: Synchronized php-1.32.0-wmf.19/extensions/Kartographer/: Ie5744d45b - T203427 (duration: 00m 57s)
  • 20:14 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Kartographer/: Ie5744d45b - T203427 (duration: 00m 58s)
  • 20:12 legoktm: legoktm@deploy1001:~$ cat /home/legoktm/fixcopyright | mwscript purgeList.php --wiki=aawiki
  • 20:11 legoktm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Tweak $wgSitename for fixcopyrightwiki (duration: 00m 58s)
  • 20:09 arlolra@deploy1001: Started deploy [parsoid/deploy@f3ef0c8]: Updating Parsoid to 740b3a4
  • 19:49 legoktm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Set $wgSitename and $wgULSLanguageDetection on fixcopyrightwiki (duration: 00m 57s)
  • 19:46 legoktm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable EUCopyrightCampaign extensions and SkinPerPage for fixcopyrightwiki (duration: 00m 56s)
  • 19:42 legoktm@deploy1001: Finished scap: build l10n for EUCopyrightCampaign and SkinPerPage (duration: 64m 26s)
  • 18:39 moritzm: fixed package state on auth2001
  • 18:38 legoktm@deploy1001: Started scap: build l10n for EUCopyrightCampaign and SkinPerPage
  • 18:30 mutante: cp1080 - pooling again after T203194 appears fixed
  • 18:04 XioNoX: upgrade asw2-a-eqiad (not in production)
  • 17:47 ppchelko@deploy1001: Finished deploy [restbase/deploy@b768926]: Bump expected Parsoid version and release new metrics endpoints, take 3 (duration: 06m 40s)
  • 17:42 cmjohnson1: swapping DAC cable cp1080
  • 17:40 ppchelko@deploy1001: Started deploy [restbase/deploy@b768926]: Bump expected Parsoid version and release new metrics endpoints, take 3
  • 17:39 ppchelko@deploy1001: Finished deploy [restbase/deploy@b768926]: Bump expected Parsoid version and release new metrics endpoints, take 2 (duration: 04m 02s)
  • 17:35 ppchelko@deploy1001: Started deploy [restbase/deploy@b768926]: Bump expected Parsoid version and release new metrics endpoints, take 2
  • 17:35 ppchelko@deploy1001: Finished deploy [restbase/deploy@b768926]: Bump expected Parsoid version and release new metrics endpoints (duration: 13m 24s)
  • 17:22 ppchelko@deploy1001: Started deploy [restbase/deploy@b768926]: Bump expected Parsoid version and release new metrics endpoints
  • 17:14 krinkle@deploy1001: Synchronized wmf-config/InitialiseSettings.php: I73d2ce30 - T191086 (duration: 00m 57s)
  • 16:45 cmjohnson1: swapping ethernet cable for stat1006
  • 16:42 gehel@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=wdqs,name=wdqs1003.eqiad.wmnet
  • 16:42 gehel: shutting down wdqs1003 for new SSD and reimage - T202780
  • 16:38 stephanebisson: Finished extensions/PageTriage/maintenance/FixNominatedForDeletion.php --wiki enwiki
  • 16:38 stephanebisson: Starting extensions/PageTriage/maintenance/FixNominatedForDeletion.php --wiki enwiki
  • 16:35 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/PageTriage/modules/ext.pageTriage.models/ext.pageTriage.article.js: SWAT: Fix CopyPatrol links for drafts T203284 (duration: 00m 57s)
  • 16:31 mutante: LDAP: removed user 'albe' from groups 'wmde' and 'nda' (T203561)
  • 16:30 thcipriani@deploy1001: Synchronized php-1.32.0-wmf.19/extensions/PageTriage/maintenance/FixNominatedForDeletion.php: SWAT: Maintenance: fix ptrp_deleted T202582 (duration: 00m 57s)
  • 16:23 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable logging for Schema:CitationUsage at 100% T191086 (duration: 01m 23s)
  • 14:50 elukey@deploy1001: Finished deploy [analytics/refinery@77a5a83]: small changes to pageview whitelist and scripts (duration: 08m 40s)
  • 14:42 elukey@deploy1001: Started deploy [analytics/refinery@77a5a83]: small changes to pageview whitelist and scripts
  • 14:21 akosiaris: switchover the deployment server back to deploy1001
  • 14:11 gehel@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=codfw,cluster=wdqs,name=wdqs2002.codfw.wmnet
  • 14:10 gehel: shutting down wdqs2002 for new SSD and reimage - T202777
  • 14:07 elukey: reboot druid* hosts for kernel + openjdk-8 upgrades
  • 13:44 godog: repair sdn1 on ms-be2041 - T199198
  • 13:40 ottomata: reimaging thorium to debian stretch (this will cause an announced {stats,analytics}.wm.org downtime!) - T192641
  • 13:39 herron: updating phabricator mail smtp-host to localhost T196916
  • 13:20 moritzm: rebooting tegmen for kernel security update
  • 13:10 moritzm: installing php5 security updates on jessie
  • 13:09 _joe_: depooling/repooling restbase, mathoid in codfw for switchover pre-flight testing
  • 13:08 hashar@deploy2001: Synchronized php: group1 wikis to 1.32.0-wmf.20 (duration: 01m 07s)
  • 13:07 hashar@deploy2001: rebuilt and synchronized wikiversions files: group1 wikis to 1.32.0-wmf.20
  • 12:29 stephanebisson: Finished extensions/PageTriage/maintenance/FixNominatedForDeletion.php --wiki testwiki
  • 12:29 stephanebisson: Starting extensions/PageTriage/maintenance/FixNominatedForDeletion.php --wiki testwiki
  • 11:20 moritzm: rebooting mwmaint2001 for kernel security update
  • 11:12 moritzm: rearmed keyholder on deploy1001
  • 11:08 moritzm: rebooting deploy1001 for kernel security update
  • 10:43 akosiaris@deploy2001: Synchronized wmf-config/ProductionServices.php: (no justification provided) (duration: 04m 57s)
  • 10:43 marostegui: Stop MySQL and reboot db2093 for kernel upgrade
  • 10:33 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 00m 05s)
  • 10:33 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:28 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 00m 14s)
  • 10:28 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:24 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 00m 04s)
  • 10:24 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:22 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 00m 49s)
  • 10:22 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:21 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 00m 04s)
  • 10:21 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:21 akosiaris@deploy2001: deploy aborted: (no justification provided) (duration: 00m 00s)
  • 10:20 moritzm: installing bind9 security updates (client-side tools and libraries)
  • 10:16 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 00m 52s)
  • 10:15 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:12 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 01m 39s)
  • 10:10 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:09 akosiaris@deploy2001: Finished deploy [servermon/servermon@c474a6b]: (no justification provided) (duration: 01m 44s)
  • 10:08 akosiaris@deploy2001: Started deploy [servermon/servermon@c474a6b]: (no justification provided)
  • 10:04 godog: repair sdh on ms-be1043 - T199198
  • 10:04 akosiaris: switchover the deployment server as a test for the switchover next week
  • 10:00 volans: upgraded spicerack to version 0.0.4 on sarin/neodymium - T199079
  • 09:59 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2213.codfw.wmnet
  • 09:52 moritzm: installing java security updates on druid*
  • 09:47 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 but with small api load due to ongoing issues (duration: 00m 56s)
  • 09:16 volans: uploaded spicerack_0.0.4-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079
  • 09:12 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Set all individual codfw sections in read-write, codfw globally still in ro (duration: 00m 56s)
  • 09:07 elukey: rebooting druid1001 for kernel + openjdk-8 upgrades
  • 08:52 oblivian@puppetmaster1001: conftool action : edit; selector: name=ReadOnly,scope=codfw
  • 08:51 oblivian@puppetmaster1001: conftool action : edit; selector: name=ReadOnly,scope=codfw
  • 08:44 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Set s5 section in read-write, codfw should be still in ro (duration: 00m 58s)
  • 08:37 marostegui: Drop partitions from db2040 (s7 master) for metawiki.pagelinks - T203548
  • 08:22 elukey: reboot aqs100[5-9] for kernel + openjdk-8 upgrades
  • 07:49 ema: upload vhtcpd 0.1.2-1 to stretch-wikimedia T199720
  • 07:48 elukey: re-enable puppet on mc1035 - memcache unit refreshed, mw cache shard wiped - T203429
  • 07:27 kartik@deploy1001: Finished deploy [cxserver/deploy@f341eec]: Update cxserver to 81d1a97 (T202933, T202283, T189438) (duration: 04m 03s)
  • 07:24 moritzm: rebooting video scalers/job runners in codfw for kernel security update
  • 07:23 kartik@deploy1001: Started deploy [cxserver/deploy@f341eec]: Update cxserver to 81d1a97 (T202933, T202283, T189438)
  • 07:02 elukey: restart oozie on analytics1003 to pick up new smtp settings
  • 06:57 moritzm: installing java security updates on aqs*
  • 06:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1077 (duration: 00m 57s)
  • 06:40 marostegui: Deploy schema change on s3 master (db1075)
  • 05:13 marostegui: Deploy schema change on db1077 with replication, this will generate lag on labsdb:s3
  • 05:12 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1077 (duration: 01m 09s)
  • 05:06 marostegui: Deploy schema change on s7 primary master (db1062)
  • 03:42 eileen: civicrm revision changed from 7bde3d0ab2 to 7918924e73, config revision is 0b227269a8
  • 03:18 l10nupdate@deploy1001: ResourceLoader cache refresh completed at Wed Sep 5 03:18:07 UTC 2018 (duration 10m 12s)
  • 03:07 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.20) (duration: 16m 17s)
  • 02:33 l10nupdate@deploy1001: scap sync-l10n completed (1.32.0-wmf.19) (duration: 13m 15s)
  • 01:42 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/includes/resourceloader/: I8e8d3a2cd2cc - T201686 (duration: 01m 31s)

2018-09-04

  • 23:44 ejegg: updated civicrm from 1330df3064 to 7bde3d0ab2
  • 23:34 catrope@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/PageTriage/maintenance/: Add new maintenance script to fix deleted flag in PageTriage (T202582) (duration: 00m 58s)
  • 23:30 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable TemplateWizard on testwiki and test2wiki (T202545) (duration: 00m 58s)
  • 23:22 RoanKattouw: Ran namespaceDupes.php --fix on plwiktionary and plwiki
  • 23:21 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Modify gender namespaces on plwiktionary (T202347) (duration: 00m 57s)
  • 22:22 XioNoX: add BGP sessions to AS64096 in eqord
  • 22:19 XioNoX: add BGP sessions to AS8220 in eqiad + eqsin
  • 22:19 Jeff_Green: authdns-update to deploy DNS changes removing thulium
  • 21:28 tzatziki: deleted archived file
  • 21:19 legoktm@deploy1001: Synchronized wmf-config/: Enable EUCopyrightCampaign extensions/skin on beta cluster (3/3) (duration: 00m 58s)
  • 21:18 legoktm@deploy1001: Synchronized wmf-config/CommonSettings.php: Enable EUCopyrightCampaign extensions/skin on beta cluster (2/3) (duration: 00m 57s)
  • 21:16 legoktm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable EUCopyrightCampaign extensions/skin on beta cluster (1/3) (duration: 00m 57s)
  • 21:05 krinkle@deploy1001: Synchronized php-1.32.0-wmf.20/resources/src/: resourceloader: startup and mediawiki.base improvements and fixes (duration: 00m 57s)
  • 21:03 krinkle@deploy1001: Synchronized php-1.32.0-wmf.19/extensions/FlaggedRevs/frontend/FlaggablePageView.php: I6dce0c (duration: 00m 58s)
  • 21:00 krinkle@deploy1001: Synchronized php-1.32.0-wmf.19/resources/src/startup/: mw.loader improvements and fixes (duration: 00m 58s)
  • 20:12 ejegg: rolled CiviCRM back to 1330df3064
  • 20:05 ejegg: updated CiviCRM from 1330df3064 to 7bde3d0ab2
  • 19:26 gehel: rolling restart of elasticsearch / cirrus / eqiad for various updates and data directory migration completed - T198351
  • 18:38 XioNoX: change internal NAT for 208.80.155.12 - T203475
  • 18:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 (duration: 00m 57s)
  • 16:31 elukey: reboot aqs1004 again for kernel + openjdk-8 upgrades (now available since the root partition is not read only anymore)
  • 16:26 thcipriani@deploy1001: Synchronized README: noop sync file - test scap 3.8.5-1 (duration: 00m 54s)
  • 16:24 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: noop wikiversions sync for T198640
  • 16:21 godog: upload scap 3.8.5-1 - T203271
  • 16:18 fdans@deploy1001: Finished deploy [analytics/refinery@2c4ec7a]: deploying refinery to update pageview def (duration: 09m 57s)
  • 16:08 fdans@deploy1001: Started deploy [analytics/refinery@2c4ec7a]: deploying refinery to update pageview def
  • 15:30 elukey: reboot aqs1004 after running fsck on root (was read-only)
  • 15:13 papaul: shutting down backup2001 to disable 1GB NIC
  • 15:08 marostegui: Deploy schema change on db1078
  • 15:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 (duration: 00m 56s)
  • 15:02 marostegui: Deploy schema change on db1095:3313
  • 15:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 (duration: 00m 57s)
  • 14:45 addshore@deploy1001: Synchronized php-1.32.0-wmf.20/extensions/Wikibase: Track new ItemId formatter usages (duration: 01m 17s)
  • 14:14 bstorm_: Removing subtree_check from project nfs on labstore1004/5
  • 14:02 jynus: stopping replication and running partitioning on logging on db1088:3311 T189107
  • 14:00 hashar@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.32.0-wmf.20
  • 13:45 hashar@deploy1001: Finished scap: testwiki to php-1.32.0-wmf.20 and rebuild l10n cache - T191066 (duration: 62m 55s)
  • 13:07 hashar: 1.32.0-wmf.20 is still syncing for testwiki due to l10ncache generation
  • 13:06 elukey: restart memcached on mc1035 with -v option - T203429
  • 12:57 moritzm: reimaging backup2001
  • 12:43 hashar@deploy1001: Started scap: testwiki to php-1.32.0-wmf.20 and rebuild l10n cache - T191066
  • 12:42 hashar@deploy1001: Pruned MediaWiki: 1.32.0-wmf.18 [keeping static files] (duration: 06m 58s)
  • 12:35 hashar: scap clean 1.32.0-wmf.18 | T191066
  • 12:32 hashar: Applied security patches for 1.32.0-wmf.20 | T191066
  • 12:24 hashar: scap prep 1.32.0-wmf.20 # T191066
  • 12:07 marostegui: Deploy schema change on db1123
  • 12:07 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 (duration: 00m 48s)
  • 11:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1090:3317 (duration: 00m 48s)
  • 11:50 hashar: Cutting branch 1.32.0-wmf.20 | T191066
  • 11:47 hashar: Preparing train deploy 1.32.0-wmf.20 | T191066
  • 11:35 hashar: European SWAT completed
  • 11:34 hashar@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Wikidata: Use new item ID formatter for Q1-Q100000 - T201835 (duration: 00m 49s)
  • 11:19 hashar@deploy1001: Synchronized php-1.32.0-wmf.19/extensions/DismissableSiteNotice: Revert "Use session storage instead of cookies for site notices" - T199274 (duration: 00m 50s)
  • 11:11 jynus: stopping replication and running partitioning on logging on db1085:3311 T189107
  • 11:10 hashar@deploy1001: Synchronized static/images/project-logos: Update logos for the Russian Wikisource - T203343 (duration: 00m 49s)
  • 11:06 hashar@deploy1001: Synchronized wmf-config/throttle.php: Two throttle rules for SMEX editathon - T203392 (duration: 00m 51s)
  • 11:05 hashar@deploy1001: sync-file aborted: (no justification provided) (duration: 00m 01s)
  • 10:29 marostegui: Deploy schema change on dbstore1002:s3
  • 10:23 marostegui: Deploy schema change on s3 codfw masters (this will generate lag on s3 codfw)
  • 10:17 elukey: restart ircecho on einsteinium to force it to re-join #wikimedia-analytics
  • 09:57 _joe_: restarted pdfrender on scb1001
  • 09:40 jynus: deploying latest event schedulers to all core db hosts
  • 09:32 elukey: restart ircecho as an attempt to see if icinga-vm re-joins the analytics chan
  • 09:24 jynus: stop, upgrade and run analyze on db1114
  • 09:23 marostegui: Deploy schema change on db1090:3317
  • 09:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1090:3317 (duration: 00m 49s)
  • 09:12 akosiaris: depool pdfrender in eqiad, hopefully codfw will be better equipped to handle the load