Server admin log/Archive 25

From Wikitech

September 30

  • 23:29 logmsgbot: spage Synchronized wmf-config/InitialiseSettings.php: Create log group for Echo (duration: 00m 11s)
  • 23:27 logmsgbot: spage Synchronized php-1.25wmf1/extensions/Echo: Echo no-op (change reverted) (duration: 00m 09s)
  • 22:55 ori: re-enabling puppet on mw1019
  • 22:36 ori: disabling puppet on mw1019 to enable debug logging in apache
  • 22:09 mutante: removing linne from DNS - was already shut down about 24 hours before
  • 21:57 K4-713: updated prod civicrm to 477a5107a0c93ceac5214
  • 21:44 ori: Spike of bitter irony from Nemo_bis on #wikimedia-operations starting 21:43 UTC
  • 21:33 logmsgbot: ori Synchronized php-1.25wmf1/languages/Language.php: I672c699c (2/2) (duration: 00m 03s)
  • 21:33 logmsgbot: ori Synchronized php-1.25wmf1/includes/specialpage/SpecialPageFactory.php: I672c699c (1/2) (duration: 00m 07s)
  • 21:23 Nemo_bis: widespread reproducible 503 errors on wikidata and elsewhere
  • 20:55 andrewbogott: powering down virt0, just to see what breaks
  • 20:48 andrewbogott: shutting down pdns on virt0
  • 20:48 andrewbogott: shutting down opendj on virt (temporary, a preview of tomorrow)
  • 18:50 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 16s)
  • 18:49 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 15s)
  • 18:41 mutante: pc1001-1003 - can't generate tmp files for percona monitoring checks -> puppet fail
  • 18:24 mutante: killing silver from icinga and puppet
  • 18:23 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 16s)
  • 18:21 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.25wmf1
  • 18:05 logmsgbot: ori Synchronized wmf-config/HHVMRequestInit.php: (no message) (duration: 00m 07s)
  • 18:04 K4-713: re-enabled all queue consumers
  • 17:56 ejegg: updated civicrm from e83c999f39e6ae847d9b48e38c8c825fc10d1635 to b6c350f620c8dc1f3410de179c19cbcbdeb62270
  • 17:19 K4-713: disabled qc jobs and TY mail send for pending civi deploy
  • 15:45 hashar: Updating our Jenkins job builder fork 686265a..ee80dbc (no job changed)
  • 15:42 bblack: rebooting mexia
  • 15:33 logmsgbot: demon Synchronized docroot/bits/favicon/wikipedia.ico: Favicons are my favorite icons, especially when they're only 18% of the size of the original (duration: 00m 04s)
  • 15:16 logmsgbot: demon Synchronized php-1.25wmf1/extensions/Wikidata: (no message) (duration: 00m 11s)
  • 15:14 logmsgbot: demon Synchronized php-1.25wmf1/extensions/VisualEditor: (no message) (duration: 00m 08s)
  • 15:12 akosiaris: merging https://gerrit.wikimedia.org/r/#/c/163735/1, changing the LDAP master from sanger to ldap-mirror for inbound mail
  • 15:12 andrewbogott: running sync-common on virt1000
  • 15:12 logmsgbot: demon Synchronized visualeditor.dblist: (no message) (duration: 00m 04s)
  • 15:11 logmsgbot: demon Synchronized visualeditor-default.dblist: (no message) (duration: 00m 04s)
  • 15:06 logmsgbot: demon Synchronized wmf-config/Wikibase.php: (no message) (duration: 00m 04s)
  • 15:06 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
  • 14:26 _joe_: restarted apache on mw1196, lots of apc errors
  • 14:22 logmsgbot: oblivian gracefulled all apaches
  • 12:10 mark: Stopped exim daemon on mchenry
  • 09:41 godog: removed obsolete /etc/puppet/hiera from strontium and palladium, /etc/puppet/hieradata is the new location
  • 09:24 godog: reboot ms-be2001 as a test
  • 04:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 30 04:18:48 UTC 2014 (duration 18m 47s)
  • 03:17 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-30 03:17:15+00:00
  • 02:41 logmsgbot: ori Synchronized 503.html: Ia88b306ef: Make the 503 error page consistent with other 5xx error pages (duration: 00m 08s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-30 02:34:07+00:00
  • 01:00 Krinkle: Jenkins connection seemed in order with integration-slave1007 and 8, but disconnecting and relaunching the slave agents immediately resulted in them getting jobs assigned. Cause unknown, problem resolved for now.
  • 00:58 Krinkle: integration-slave1007 and integration-slave1008 have not gotten any jobs in the past 24h. integration-slave1006 however has gotten loads of action. Investigating load balancing issue.
  • 00:24 mutante: linne - shutting down, revoking puppet cert, salt key, puppet/icinga ...
  • 00:12 logmsgbot: maxsem Synchronized w/skins-1.5: (no message) (duration: 00m 03s)
  • 00:12 MaxSem: https://gerrit.wikimedia.org/r/#/c/162520/ broke stuff, reverted
  • 00:10 logmsgbot: maxsem Synchronized live-1.5: (no message) (duration: 00m 03s)

September 29

  • 23:59 logmsgbot: maxsem Synchronized w: https://gerrit.wikimedia.org/r/#/c/162520/ (duration: 00m 03s)
  • 23:58 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/163773/ (duration: 00m 03s)
  • 23:50 logmsgbot: maxsem Synchronized php-1.25wmf1/extensions/MultimediaViewer/: second try... (duration: 00m 04s)
  • 23:42 logmsgbot: maxsem Finished scap: SWATting a bunch of stuff (duration: 18m 44s)
  • 23:32 andrewbogott: stopped apache, nova-scheduler, keystone, puppetmaster on virt0
  • 23:31 bd808: /var/lib/jenkins-slave/tmpfs 100% full on lanthanum.eqiad.wmnet
  • 23:26 andrewbogott: disabling puppet on virt0 so I can kill off services one by one...
  • 23:23 logmsgbot: maxsem Started scap: SWATting a bunch of stuff
  • 23:17 logmsgbot: maxsem Synchronized docroot/: <mutante> he killed the dolphin (duration: 00m 06s)
  • 23:13 Reedy: dist-upgraded logstash1001 and reboot
  • 22:47 Reedy: dist-upgrade logstash1002 and reboot
  • 22:36 Reedy: dist-upgrade on logstash1003 and rebooting
  • 22:34 Reedy: restarted elasticsearch on logstash1003 post java upgrades
  • 22:30 Reedy: packages upgraded on logstash1002
  • 22:28 mutante: silver - shutting down, wait with wiping it for a few days, just in case
  • 22:28 Reedy: packages upgraded on logstash1001
  • 22:24 Reedy: elasticsearch upgraded to 1.3.2 on logstash1003
  • 22:18 andrewbogott: renaming labs-ns1 to labs-ns0 and labs-ns2 to labs-ns1
  • 22:02 Reedy: elasticsearch upgraded to 1.3.2 on logstash1002
  • 22:01 mutante: silver - revoke puppet cert, salt-key, stopping services, disable monitoring
  • 21:58 mutante: stopping udp2log-vumi on silver - not needed anymore per Yuvipanda
  • 21:12 Reedy: elasticsearch upgraded to 1.3.2 on logstash1001
  • 20:50 bd808: Ran sync-common on tmh1002.eqiad.wmnet for cscott's failed sync-dir there
  • 20:49 bd808: Ran sync-common on tmh1001.eqiad.wmnet for cscott's failed sync-dir there
  • 20:29 logmsgbot: cscott Synchronized wmf-config: Switch default PDF renderer to OCG (duration: 00m 15s)
  • 20:04 subbu: deployed Parsoid version deed30b2
  • 19:41 ottomata: restarted varnishkafka on cp3019 to troubleshoot drerrs
  • 19:26 Reedy: doing rolling upgrade of elasticsearch on logstash100[1-3]
  • 17:59 cscott: updated OCG to version 89d8f29a24295b05d0643abe976fea83b56575c9
  • 17:58 logmsgbot: ori Synchronized php-1.24wmf22/includes/password/Pbkdf2Password.php: I3b0a1de69: Test for string in Pbkdf2Password::crypt() (duration: 00m 05s)
  • 17:47 bblack: stopped powerdns and disabled puppet on virt1000 to prevent further cache pollution w/ bad data in public caches
  • 15:57 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT enable collection extension svwikiversity (duration: 00m 06s)
  • 15:53 hashar: Zuul jobs reregistered
  • 15:46 hashar: Zuul lost all Jenkins jobs :(
  • 15:24 logmsgbot: manybubbles Synchronized php-1.24wmf22/extensions/UploadWizard/: SWAT update UploadWizard (duration: 00m 05s)
  • 15:17 logmsgbot: manybubbles Synchronized php-1.25wmf1/extensions/Wikidata/: SWAT update wikidata to fix hhvm issues. (duration: 00m 14s)
  • 15:05 logmsgbot: manybubbles Synchronized wmf-config/wikitech.php: SWAT sync wikitech file - is a noop I believe (duration: 00m 05s)
  • 15:02 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT fix config type of flow. (duration: 00m 06s)
  • 13:32 hashar: Restarted Zuul
  • 13:27 hashar: Zuul: tweaking configuration files 162584
  • 09:31 godog: deployed new swift ring to eqiad-prod
  • 08:21 hashar: Restarting Jenkins to have a plugin installed/loaded properly
  • 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep 29 03:25:11 UTC 2014 (duration 25m 10s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-29 02:26:50+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-29 02:15:34+00:00
  • 02:14 bblack: restarting squid on carbon (webproxy)

September 28

  • 23:27 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: switch db1042 load groups to db1056 (duration: 00m 06s)
  • 23:17 springle: powercycle db1042
  • 23:15 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1042, locked up (duration: 00m 07s)
  • 23:12 bblack: restarted apache on mw1123 + mw1196
  • 23:11 bblack: test
  • 23:11 bblack: restarted apache on mw1123 + mw1196
  • 20:28 ori: Puppet failures appear to be caused by apt-get timeouts
  • 10:09 _joe_: updated bash (again) across the whole cluster
  • 03:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep 28 03:24:20 UTC 2014 (duration 24m 19s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-28 02:28:14+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-28 02:17:01+00:00

September 27

  • 18:35 logmsgbot: ori Synchronized php-1.25wmf1/extensions/Scribunto/engines/LuaSandbox/Engine.php: Capture traces for bug 71045 (duration: 00m 13s)
  • 18:35 logmsgbot: ori Synchronized php-1.24wmf22/extensions/Scribunto/engines/LuaSandbox/Engine.php: Capture traces for bug 71045 (duration: 00m 17s)
  • 04:03 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 27 04:03:25 UTC 2014 (duration 3m 24s)
  • 02:46 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-27 02:46:53+00:00
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-27 02:25:05+00:00

September 26

  • 23:26 mutante: switched noc.wikimedia.org to terbium, behind misc-web
  • 22:01 bd808: sudo apache2ctl graceful on logstash100[123] for ldap revert
  • 22:00 bd808: running puppet on logstash100[123] to revert ldap change
  • 21:56 bd808: sudo apache2ctl graceful on logstash100[123] for ldap change
  • 21:35 andrewbogott: gracefulled apache on neon
  • 21:21 mutante: graceful'ed apache on neon
  • 20:45 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Disable HHVM beta-feature on wikidatawiki (duration: 00m 06s)
  • 19:58 awight: update CRM from 25159fcfc29921b08de86f12121fb292139be09d to 3e42bac8cb7f58f5e504946f4944c69ca5553e60
  • 19:42 mutante: removing root's public_html from fenari - backup kept just in case
  • 19:15 AaronS: Deployed security patches to CentralAuth
  • 19:09 Krinkle: git-deploy: Deploying integration/slave-scripts 08147c42ea42e1a5eca1d29
  • 19:08 logmsgbot: aaron Synchronized php-1.25wmf1/extensions/CentralAuth: (no message) (duration: 00m 07s)
  • 19:06 logmsgbot: aaron Synchronized php-1.24wmf22/extensions/CentralAuth: (no message) (duration: 00m 08s)
  • 18:15 Nemo_bis: untruncated: andrewbogott> ldap is broken for gerrit, should be working elsewhere
  • 18:14 legoktm: ldap is broken
  • 18:09 K4-713: re-enabled donations queue consumer
  • 17:50 awight: CRM queue consumer disabled
  • 17:43 andrewbogott: upgraded libgnutls26 on ytterbium
  • 17:35 andrewbogott: "git reset --hard origin" to remove that terrible hotfix on palladium and strontium.
  • 17:28 awight: CRM jobs reenabled
  • 17:22 hoo: Manually ran rebuildEntityPerPage.php for Wikidata
  • 17:16 andrewbogott: hotfixing /var/lib/git/operations/puppet in hopes of fixing gerrit so I don't have to hotfix no more
  • 17:08 awight: updated crm from 06c9546f9b68f6ecbaaf510944418aa52f9ed0fb to 25159fcfc29921b08de86f12121fb292139be09d
  • 17:02 awight: disabling CRM jobs for deployment...
  • 15:29 andrewbogott: puppet is now moving all labs instances to new ldap servers: ldap-eqiad and ldap-codfw
  • 15:02 cscott: documented how I'm going to clear the OCG queues at https://wikitech.wikimedia.org/wiki/OCG#Pruning_the_queue
  • 14:36 bblack: address for ns1 switched in our local dns data - https://gerrit.wikimedia.org/r/163164
  • 13:57 hoo: Manually declared the global rename Secretary -> VlsergeyBot done after it twice timed out on page moves on ruwiki
  • 13:39 akosiaris: moved mathoid to low-traffic lvs servers@eqiad
  • 12:48 cscott: cleared OCG caches again when I woke up to buy me more time to investigate the issue properly.
  • 08:44 awight: rollback: revision for civicrm locked to 06c9546f9b68f6ecbaaf510944418aa52f9ed0fb
  • 08:30 _joe_: updated hhvm on mw1053, kicked the jr a couple of times, working again now
  • 08:29 awight: large_donation schema migration 7000
  • 08:28 awight: skip over wmf_civicrm schema migration 7022 -- *why* did I make that unsafe
  • 08:24 awight: fundraising_code_update: revision for civicrm changed from 06c9546f9b68f6ecbaaf510944418aa52f9ed0fb to 5aca00fd4573f0fe8f385baa7238172f6ae54438
  • 08:19 awight: disabling CRM jobs during deployment
  • 08:09 cscott: cleared OCG queues and cache to quiet icinga; will try to get to the root cause tomorrow.
  • 07:41 hashar: Updated our Jenkins Job Builder fork 2d74b16..686265a
  • 07:06 logmsgbot: ori Synchronized php-1.24wmf22/extensions/WikimediaEvents: Update WikimediaEvents for 791e14cfc1d (duration: 00m 05s)
  • 06:53 logmsgbot: ori Synchronized php-1.25wmf1/extensions/WikimediaEvents: Update WikimediaEvents for 0e087daea5 (duration: 00m 07s)
  • 06:41 cscott: updated OCG to version f3a6c1cbba118d4a5e1aa019937dc50159fc823d
  • 04:43 _joe_: updating bash, USN-2363
  • 04:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 26 04:10:12 UTC 2014 (duration 10m 11s)
  • 03:09 logmsgbot: LocalisationUpdate completed (1.25wmf1) at 2014-09-26 03:09:47+00:00
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-26 02:36:45+00:00
  • 00:14 awight: turning off Civi jobs before deployment

September 25

  • 23:31 logmsgbot: maxsem Synchronized php-1.25wmf1/skins/Vector/: https://gerrit.wikimedia.org/r/#/c/163021/ (duration: 00m 03s)
  • 23:15 logmsgbot: maxsem Synchronized php-1.25wmf1/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/162971/ (duration: 00m 04s)
  • 23:12 logmsgbot: maxsem Synchronized php-1.25wmf1/includes/resourceloader/ResourceLoaderSiteModule.php: https://gerrit.wikimedia.org/r/#/c/163024/ (duration: 00m 03s)
  • 23:10 logmsgbot: maxsem Synchronized php-1.25wmf1/includes/api/ApiQueryAllUsers.php: https://gerrit.wikimedia.org/r/#/c/163027/ (duration: 00m 03s)
  • 23:08 logmsgbot: maxsem Synchronized php-1.24wmf22/includes/api/ApiQueryAllUsers.php: https://gerrit.wikimedia.org/r/#/c/163026/ (duration: 00m 03s)
  • 23:02 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/163048 (duration: 00m 03s)
  • 22:58 logmsgbot: ori Synchronized php-1.24wmf22/extensions/Wikidata: Update Wikidata for I0acd2096d21b (duration: 00m 11s)
  • 21:41 mutante: powercycling mw1053
  • 20:36 mutante: no !log
  • 20:36 legoktm: manually migrated "NickK" to a global account
  • 20:29 mutante: repooled mw1051
  • 19:49 bd808: Restarted logstash on logstash1001. udp2log events were not being recorded.
  • 19:30 logmsgbot: reedy Synchronized php-1.25wmf1/: (no message) (duration: 00m 46s)
  • 19:24 logmsgbot: reedy Synchronized php-1.24wmf22/resources/src/mediawiki.ui/components/buttons.less: (no message) (duration: 00m 14s)
  • 19:22 bblack: ntp work done on hosts
  • 19:18 logmsgbot: reedy Synchronized php-1.25wmf1/: (no message) (duration: 00m 55s)
  • 19:17 logmsgbot: reedy Synchronized php-1.24wmf22/extensions/CentralAuth/: (no message) (duration: 00m 14s)
  • 18:55 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:47 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.25wmf1
  • 18:08 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf22
  • 17:20 logmsgbot: reedy Finished scap: testwiki to 1.25wmf1 and build l10n cache (duration: 28m 36s)
  • 16:52 logmsgbot: reedy Started scap: testwiki to 1.25wmf1 and build l10n cache
  • 16:41 Reedy: Purged php-1.24wmf9
  • 16:38 logmsgbot: reedy Purged l10n cache for 1.24wmf20
  • 15:31 bblack: testing ntpd changes on acamar, achernar, chromium, hydrogen, nescio, and baham (puppet-agent disabled)
  • 15:19 logmsgbot: mattflaschen Synchronized wmf-config/CommonSettings.php: Extend GettingStarted bucketing period end date to Sept. 28 (duration: 00m 07s)
  • 12:36 godog: update bash on elastic1014 analytics1021 elastic1013
  • 11:33 _joe_: gracefully reloaded apache on mw1139 and mw1199, apc issues
  • 11:29 logmsgbot: aude Synchronized php-1.24wmf22/extensions/Wikidata/extensions/Wikibase/lib/config/WikibaseLib.default.php: fix apc issues (duration: 00m 06s)
  • 11:03 _joe_: updated bash on elastic1007
  • 10:57 godog: upgraded bash on labsdb1003
  • 10:31 Nemo_bis: SAL is here
  • 09:22 godog: graphite temporarily down, fix incoming
  • 06:13 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062 (duration: 00m 07s)
  • 03:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 25 03:58:02 UTC 2014 (duration 58m 1s)
  • 03:02 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-25 03:02:46+00:00
  • 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-25 02:32:56+00:00
  • 02:08 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Use Debian-packaged texvc on Trusty app servers (duration: 00m 04s)
  • 01:39 ori: gracefuling apaches
  • 00:55 mutante: icinga - manually deleted duplicate host labs-ns1 to fix icinga config and reloads

September 24

  • 23:21 ejegg: Updated paymentswiki from 3ac5dd1c3fade37b6f3a4879aef8ea71b3bbbf08 to 83464deed3b66da655ca5d1086852237c4793b71
  • 23:17 logmsgbot: catrope Synchronized php-1.24wmf22/extensions/VisualEditor: SWAT (duration: 00m 04s)
  • 23:14 logmsgbot: catrope Synchronized php-1.24wmf22/resources/lib/oojs-ui/: SWAT (duration: 00m 05s)
  • 23:12 greg-g: restarted jouncebot, he wasn't announcing deploy windows
  • 23:00 mutante: OCG - scheduled downtime/disabled notifications for LVS check
  • 22:44 andrewbogott: salted a bash update on labs instances, which turned out to be updated already.
  • 22:09 cscott: icinga VS HTTP IPv4 on ocg.svc.eqiad.wmnet test is most likely due to `du -s` of a 6G cache directory, not critical. timeouts can be increased to quiet it. i will look into adding a -quick parameter or some such tomorrow to make the health check faster.
  • 20:56 cscott: updated OCG to version 48acb8a2031863e35fad9960e48af60a3618def9
  • 20:43 logmsgbot: aaron Synchronized php-1.24wmf22/includes/cache/bloom: ad8a7a761d5f3bd086bbd6c88870e83c701e59e3 (duration: 00m 04s)
  • 20:00 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 19:47 logmsgbot: yurik Synchronized php-1.24wmf22/extensions/ZeroBanner/: Updating to master (duration: 01m 10s)
  • 19:46 logmsgbot: yurik Synchronized php-1.24wmf21/extensions/ZeroBanner/: Updating to master (duration: 01m 07s)
  • 19:14 logmsgbot: yurik Finished scap: updating Graph, JsonConfig, ZeroBanner & ZeroPortal to master for 21 & 22 (duration: 07m 46s)
  • 19:07 logmsgbot: yurik Started scap: updating Graph, JsonConfig, ZeroBanner & ZeroPortal to master for 21 & 22
  • 18:55 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: Updating interwiki cache (duration: 00m 14s)
  • 18:53 logmsgbot: reedy Synchronized php-1.24wmf22/extensions/WikimediaMaintenance: (no message) (duration: 00m 14s)
  • 17:13 manybubbles: lowered the throttle on Elasticsearch index transfer speed from one node to another because I hate excitement
  • 15:38 Nemo_bis: cscott> i'm working on the OCG health issue above. i'll let you know when i know what's going on. icinga-wm> PROBLEM - OCG health on ocg1002 is CRITICAL
  • 15:37 logmsgbot: demon Synchronized php-1.24wmf22/extensions/CentralAuth: (no message) (duration: 00m 05s)
  • 15:21 logmsgbot: demon Synchronized php-1.24wmf22/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php: (no message) (duration: 00m 05s)
  • 15:01 logmsgbot: demon Synchronized wmf-config/Wikibase.php: (no message) (duration: 00m 06s)
  • 14:57 Jeff_Green: restarted service ocg on ocg1001
  • 14:40 manybubbles: finished deployment - load spikes look to be gone. yay
  • 14:22 logmsgbot: manybubbles Synchronized php-1.24wmf21/extensions/CirrusSearch/: Switch implementation of Cirrus link counting jobs to hopefully lower overall load. (duration: 00m 04s)
  • 14:21 logmsgbot: manybubbles Synchronized wmf-config: More cirrus config to lower load (duration: 00m 04s)
  • 14:17 logmsgbot: manybubbles Synchronized wmf-config: Cirrus config to lower load (duration: 00m 04s)
  • 14:14 logmsgbot: manybubbles Synchronized php-1.24wmf22/extensions/CirrusSearch/: Switch implementation of Cirrus link counting jobs to hopefully lower overall load. (duration: 00m 06s)
  • 14:08 manybubbles: starting deployment to lower cirrus load spikes
  • 13:19 manybubbles: *disabled*
  • 13:17 manybubbles: disable row awareness on Cirrus's elasticsearch cluster - might help balance load better. too much load was on one row
  • 13:04 hashar: Zuul processing queue again
  • 13:00 hashar: Jenkins: disconnecting Gearman client from Zuul and reconnecting
  • 12:59 hashar: Zuul / Jenkins stuck
  • 09:33 hashar_: Jenkins switched mwext-UploadWizard-qunit back to Zuul cloner by applying pending change 161459
  • 09:33 hashar_: restarting zuul-merger
  • 09:32 hashar_: restarting zuul
  • 09:19 hashar_: Upgrading Zuul to f0e3688: cherry-pick of https://review.openstack.org/#/c/123437/1, which fixes bug 71133 (Zuul cloner: fails on extension jobs against a wmf branch)
  • 05:41 legoktm: ran script to back populate bug 70620 on metawiki (/home/legoktm/ca/populateBug70620.php on terbium)
  • 04:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 24 04:29:53 UTC 2014 (duration 29m 52s)
  • 03:34 logmsgbot: tstarling Finished scap: (no message) (duration: 12m 09s)
  • 03:22 logmsgbot: tstarling Started scap: (no message)
  • 03:21 logmsgbot: tstarling scap failed: RuntimeError scap requires SSH agent forwarding (duration: 00m 00s)
  • 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-24 03:12:54+00:00
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-24 02:39:39+00:00
  • 02:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1062 (duration: 00m 06s)
  • 01:25 mutante: tridge - shutting down

September 23

  • 23:47 logmsgbot: maxsem Synchronized php-1.24wmf22/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
  • 23:15 logmsgbot: maxsem Synchronized wmf-config/CommonSettings.php: fail! (duration: 00m 04s)
  • 23:12 logmsgbot: maxsem Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/162297/ (duration: 00m 03s)
  • 23:06 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/MassMessage/: https://gerrit.wikimedia.org/r/#/c/161002/ (duration: 00m 03s)
  • 22:04 logmsgbot: aaron Synchronized php-1.24wmf22/includes/jobqueue/JobRunner.php: f23f1ad35f02f6a17c9b5842aa6d8c152a273639 (duration: 00m 04s)
  • 21:54 logmsgbot: ebernhardson Finished scap: Bump flow submodule (and change an i18n message) in 1.24wmf21 and 1.24wmf22 (duration: 28m 14s)
  • 21:25 logmsgbot: ebernhardson Started scap: Bump flow submodule (and change an i18n message) in 1.24wmf21 and 1.24wmf22
  • 20:24 cscott: updated OCG to version 1cf9281ec3e01d6cbb27053de9f2423582fcc156
  • 19:38 mutante: stopped etherpad, added repairPad.js, attempted repair of pad 'WRN201409', started etherpad
  • 18:30 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:28 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 16s)
  • 18:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf22
  • 16:59 logmsgbot: aaron Synchronized wmf-config/jobqueue-eqiad.php: Removed redundant config due to new job runner (duration: 00m 05s)
  • 16:29 _joe_: manually created /srv/mediawiki bind mount on searchidx1001; moved old contents to /a/mediawiki-stale, to avoid filling the disk
  • 15:33 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT remove C and MW namespace aliases from ckbwiki (duration: 00m 07s)
  • 15:24 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT add *.beeldbank.cultureelerfgoed.nl to upload list (duration: 00m 04s)
  • 15:16 logmsgbot: manybubbles Synchronized php-1.24wmf21/extensions/CirrusSearch/: SWAT update Cirrus for better error handling (duration: 00m 04s)
  • 15:08 logmsgbot: manybubbles Synchronized php-1.24wmf22/extensions/CirrusSearch/: SWAT deploy cirrus backports (duration: 00m 05s)
  • 13:48 akosiaris: change url-downloader ip to point to the new one
  • 13:01 logmsgbot: manybubbles Synchronized wmf-config/: Throttle cirrus jobs some more. (duration: 00m 04s)
  • 12:24 logmsgbot: manybubbles Synchronized wmf-config/: Some new cirrus config (duration: 00m 07s)
  • 09:16 godog: deployed codfw-prod swift ring to palladium
  • 04:49 logmsgbot: tstarling Synchronized php-1.24wmf21/languages/Language.php: profiling (duration: 00m 05s)
  • 04:10 logmsgbot: tstarling Synchronized php-1.24wmf21/languages/Language.php: profiling (duration: 00m 05s)
  • 03:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 23 03:42:29 UTC 2014 (duration 42m 28s)
  • 03:29 logmsgbot: tstarling Synchronized wmf-config/CommonSettings.php: fix profiling (duration: 00m 07s)
  • 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-23 02:43:48+00:00
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-23 02:30:38+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-23 02:17:31+00:00
  • 00:26 mutante: tridge - revoking puppet cert, deleting salt key, decom ...

September 22

  • 23:49 logmsgbot: ebernhardson Synchronized php-1.24wmf22/extensions/LiquidThreads/: Bump LiquidThreads submodule in 1.24wmf22 (duration: 00m 06s)
  • 23:48 logmsgbot: ebernhardson Synchronized php-1.24wmf22/extensions/UploadWizard/: Bump UploadWizard submodule in 1.24wmf22 (duration: 00m 04s)
  • 23:46 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/LiquidThreads/: Bump LQT submodule in 1.24wmf21 (duration: 00m 04s)
  • 23:35 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/UploadWizard/: sync UploadWizard in 1.24wmf21 (duration: 00m 07s)
  • 23:32 logmsgbot: ebernhardson Synchronized php-1.24wmf21/includes/rcfeed/MachineReadableRCFeedFormatter.php: Use safe attribute accessor for RecentChange (duration: 00m 04s)
  • 23:30 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/UploadWizard/: Bump UploadWizard submodule in php-1.24wmf21 (duration: 00m 04s)
  • 23:30 logmsgbot: ebernhardson Synchronized php-1.24wmf21/extensions/Flow/: Bump flow submodule in php-1.24wmf21 (duration: 00m 06s)
  • 23:17 logmsgbot: ebernhardson Synchronized wmf-config/InitialiseSettings.php: Set wgUploadNavigationUrl for eowiki (duration: 00m 05s)
  • 21:04 bd808: production-logstash-eqiad healed by restarting elasticsearch on logstash1002 after OOM + split brain
  • 20:54 bd808: split brain on logstash1002 preceded by java OOM for elasticsearch
  • 20:52 bd808: logstash1002 went split brain from rest of logstash elastic search cluster. restarting
  • 20:24 subbu: deployed Parsoid ff9476f9
  • 19:31 hashar: Jenkins is broken for extensions patches proposed against the wmf branches bug 71133
  • 18:32 Krinkle: lanthanum tmpfs filled up again, purged manually (bug 71128)
  • 17:22 ori: updated HHVM on beta cluster to 3.3.0-20140918+wmf1
  • 17:00 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Push Cirrus' non-content enwiki shards apart (no-op) (duration: 00m 04s)
  • 15:52 godog: reboot ms-be2001 into PXE to test a re-install
  • 15:07 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Graph extension on mediawiki.org gerrit:161908 (duration: 00m 09s)
  • 15:02 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add securepoll-create-poll right to sysop on testwiki gerrit:161653 (duration: 00m 09s)
  • 15:01 logmsgbot: anomie Synchronized wmf-config/CommonSettings.php: SWAT: Add REL1_24 as branch in ExtensionDistributor gerrit:161666 (duration: 00m 10s)
  • 14:12 hashar: Jenkins deleted job mediawiki-core-lint , replaced by mediawiki-core-phplint
  • 12:10 apergos: shutdown of db1050 to install trusty
  • 10:04 hashar: Jenkins back and fully operational
  • 09:55 hashar: restarting jenkins
  • 09:37 hashar_: Jenkins: deleting old mediawiki extensions jobs (rm -fR /var/lib/jenkins/jobs/*testextensions-master). They are no longer triggered and are superseded by the *-testextension jobs.
  • 03:36 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep 22 03:36:40 UTC 2014 (duration 36m 39s)
  • 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-22 02:41:29+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-22 02:29:09+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-22 02:16:20+00:00

September 21

  • 22:43 ori: ms-be1008 overloaded starting 18:00:24 UTC, syslog says "BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:2196]". machine became unresponsive at 21:35, coinciding with a spike of 5xxs, lasting until Coren powercycled it at 22:10.
  • 03:37 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep 21 03:37:31 UTC 2014 (duration 37m 30s)
  • 03:16 springle: labsdb1001 mysqld restarted in gdb; crash loop with a labs user's table
  • 02:46 logmsgbot: ori Synchronized wmf-config/throttle.php: I7bb42b49a: Increase account creation throttle on enwiki for Cochrane colloquium. (duration: 00m 07s)
  • 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-21 02:41:36+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-21 02:29:51+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-21 02:16:56+00:00

September 20

  • 22:28 Krinkle: Reloading Zuul to deploy I0170766cfc06b8e6
  • 20:30 andrewbogott: rebooting virt1006 to make good and sure it doesn't spontaneously re-enter the compute pool
  • 20:29 andrewbogott_afk: moved all VMs off of virt1006, disabled compute service
  • 03:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 20 03:46:00 UTC 2014 (duration 45m 59s)
  • 02:46 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-20 02:46:05+00:00
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-20 02:33:34+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-20 02:19:34+00:00

September 19

  • 22:16 RoanKattouw: Restarting Jenkins
  • 21:57 logmsgbot: spage Synchronized php-1.24wmf21/extensions/Flow/modules/new/components/flow-board.js: Flow bug 71054 backport (duration: 00m 04s)
  • 20:50 ori: restarted HHVM and cleared bytecode cache on all HHVM app servers
  • 20:47 _joe_: restarted hhvm on mw1018, cleaning the cache as well
  • 20:25 ori: Deployed Ic71064e08 (type hint fix for Wikidata) to wmf21/22.
  • 19:09 bblack: restarted hhvm on mw1021
  • 18:59 _joe_: rolling restart of hhvm servers
  • 18:22 bblack: restarting hhvm on mw1020 (again!)
  • 18:19 hashar: Jenkins: reverting job mwext-VisualEditor-qunit to previous state (i.e. without Zuul cloner)
  • 18:17 bblack: restarting hhvm on mw1020
  • 17:57 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I3e1bd5e4bb: Don't manipulate the environment to determine TZ offset (Bug: 71036) (duration: 00m 13s)
  • 17:30 bblack: turned down apache prefork procs on fenari to reduce swapping
  • 17:16 ottomata: initiating controlled shutdown of kafka broker analytics1021 to test some kafkatee weirdness, as well as a potential kafka/zookeeper bug
  • 17:07 bblack: restarting apache on fenari
  • 16:21 bblack: restarted hhvm on mw1019 + 1021
  • 14:57 hashar: Jenkins friday deploy: migrate all MediaWiki extension qunit jobs to Zuul cloner.
  • 14:37 akosiaris: initiated rsync of tridge data that is to be kept to nas1001-a
  • 13:56 springle: killing any sleeping connection on enwiki db slaves to make room
  • 13:56 mark: Stopped jobrunners on mw1001-1003
  • 12:36 springle: temporarily disable log fsync on enwiki slaves
  • 12:14 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1072 with ReadAheadNone (duration: 00m 09s)
  • 11:32 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1072. seems more susceptible to replag; find out why. (duration: 00m 10s)
  • 09:14 _joe_: restarted hhvm on mw1053, stuck to 100% cpu since last restart (activating stats)
  • 05:01 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 19 05:01:54 UTC 2014 (duration 1m 52s)
  • 03:45 logmsgbot: LocalisationUpdate completed (1.24wmf22) at 2014-09-19 03:45:33+00:00
  • 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-19 03:11:43+00:00
  • 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-19 02:38:25+00:00
  • 00:43 cscott: updated OCG to version ce16f7adb60d7c77409e2e11ba0e5d6cce6955d5

September 18

  • 23:55 logmsgbot: ori Started scap: Add HHVM as a beta feature
  • 23:54 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I2466f6b6e: Add HHVM to beta feature whitelist (duration: 00m 08s)
  • 23:52 logmsgbot: ori Synchronized php-1.24wmf22/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 06s)
  • 23:51 logmsgbot: ori Synchronized php-1.24wmf21/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 06s)
  • 23:25 logmsgbot: catrope Synchronized php-1.24wmf22/resources/lib/oojs-ui/: oojs-ui bugfixes (duration: 00m 06s)
  • 23:13 logmsgbot: catrope Synchronized php-1.24wmf22/extensions/VisualEditor/: SWAT (duration: 00m 08s)
  • 23:04 logmsgbot: catrope Synchronized php-1.24wmf21/extensions/UploadWizard/: SWAT (duration: 00m 08s)
  • 19:57 Jeff_Green: iridium.wm.o exim conf checked, puppet reenabled
  • 19:54 Jeff_Green: magnesium.wm.o exim conf checked, puppet reenabled
  • 19:50 Jeff_Green: sodium.wm.o exim conf checked, puppet reenabled
  • 19:48 logmsgbot: reedy Synchronized php-1.24wmf22/extensions/Flow/: (no message) (duration: 00m 16s)
  • 19:45 Jeff_Green: iodine.wm.o exim conf checked, puppet reenabled
  • 19:44 Jeff_Green: polonium.wm.o exim conf checked, puppet reenabled
  • 19:35 Jeff_Green: lead.wm.o exim conf checked, puppet reenabled
  • 19:22 logmsgbot: reedy Synchronized php-1.24wmf22: (no message) (duration: 00m 57s)
  • 19:16 Jeff_Green: disabling puppet on polonium, lead, sodium, iridium, magnesium, and iodine to monitor rollout of https://gerrit.wikimedia.org/r/155753
  • 19:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: rest of group0 to 1.24wmf22
  • 19:01 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf21
  • 18:59 bblack: restarting apache on fenari
  • 18:49 logmsgbot: reedy Finished scap: testwiki to 1.24wmf22 and build l10n cache (duration: 30m 23s)
  • 18:44 Jeff_Green: testing exim configuration change on lead.wm.o
  • 18:18 logmsgbot: reedy Started scap: testwiki to 1.24wmf22 and build l10n cache
  • 17:49 logmsgbot: reedy Started scap: testwiki to 1.24wmf22 and build l10n cache
  • 17:08 cmjohnson1: replacing failed disk es1005
  • 17:05 logmsgbot: yurik Finished scap: (no message) (duration: 23m 26s)
  • 16:43 yurikR: yurik scaping zero - partner needs an l10n message asap
  • 16:42 logmsgbot: yurik Started scap: (no message)
  • 15:38 hashar: restarting Zuul just to be safe
  • 15:06 logmsgbot: anomie Synchronized php-1.24wmf21/resources/src/mediawiki.action/mediawiki.action.view.redirectPage.css: SWAT: mediawiki.action.view.redirectPage: Correct a CSS selector gerrit:161239 (duration: 00m 23s)
  • 15:01 logmsgbot: anomie Synchronized php-1.24wmf21/extensions/Wikidata/: SWAT: Update Wikidata to fix broken xml api output gerrit:161232 (duration: 00m 38s)
  • 11:40 apergos: forgot to log this earlier: manually started salt minion on radon, elastic1015, searchidx1001, it wasn't running there
  • 09:00 godog: updated authdns to 0c2225d
  • 08:56 springle: xtrabackup clone db1016 to db2010
  • 07:48 godog: re-enabled icinga notifications for ms-be1001
  • 07:09 bblack: removing pybal cfg "eqiad/misc_web_https" (unused now, https://gerrit.wikimedia.org/r/161183)
  • 06:53 bblack: removing pybal cfg "esams/wikimedialbsecure" (unused, points at maerlant)
  • 06:47 bblack: removing pybal symlink "$site/ipv6", also unused (old ipv6 protoproxying)
  • 06:45 bblack: removing pybal symlink "$site/text-varnish", seems to be a remnant no longer in use
  • 04:20 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 18 04:20:56 UTC 2014 (duration 20m 55s)
  • 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-18 03:09:44+00:00
  • 02:53 logmsgbot: yurik Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 53s)
  • 02:52 yurikR: yurik Fixing graph ext namespace name - otherwise get screen of WMF death on graph: ns visits
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-18 02:36:46+00:00
  • 00:32 logmsgbot: marktraceur Finished scap: [SWAT] Move things out of assets/ and into resources/assets/ (duration: 35m 28s)

September 17

  • 23:57 logmsgbot: marktraceur Started scap: [SWAT] Move things out of assets/ and into resources/assets/
  • 23:47 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: [SWAT] Enable Graph on metawiki and labswiki (duration: 00m 10s)
  • 23:42 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/Graph/: [SWAT] Update Graph to master (duration: 00m 08s)
  • 23:41 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/Graph/: [SWAT] Update Graph to master (duration: 00m 07s)
  • 23:35 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/MultimediaViewer/: [SWAT] Fix reuse dropdown message weirdness (duration: 00m 07s)
  • 23:29 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/MultimediaViewer/: [SWAT] Fix reuse dropdown message weirdness (duration: 00m 08s)
  • 23:10 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/UploadWizard/: [SWAT] Fix EventLogging schema declarations for UploadWizard (duration: 00m 11s)
  • 21:41 mutante: fixing updates on planet feeds - file permissions
  • 21:11 manybubbles: restarting the rebuild of cirrus's enwiki index now that I've found the reason it wasn't working before - the new index was putting too many shards on an already full node and overwhelming it. silly allocation algorithm! that's a bad idea!
  • 21:07 logmsgbot: yurik Synchronized php-1.24wmf21/extensions/ZeroPortal/: (no message) (duration: 01m 05s)
  • 20:19 godog: rebooting ms-be1006
  • 19:00 Krinkle: jenkins-slave tmpfs on lanthanum was filling up (> 500MB). I purged tmp dbs for old jobs. We should get these purged automatically and also increase the size as 500MB is too little.
  • 18:59 robh: disabled icinga alerts for ms-be1001, rebooting it to look at its raid bios settings for codfw deployment mirroring
  • 18:47 logmsgbot: yurik Synchronized php-1.24wmf20/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 39s)
  • 18:43 logmsgbot: yurik Synchronized php-1.24wmf21/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 35s)
  • 18:40 logmsgbot: yurik Synchronized wmf-config/: private wikis login/logout page names, zeroportal impersonator acct (duration: 01m 06s)
  • 18:23 mutante: phabricator - made aklapper an admin
  • 17:26 logmsgbot: andrew rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 17:23 logmsgbot: andrew Synchronized wikiversions.json: (no message) (duration: 00m 05s)
  • 17:04 manybubbles: cirrus brownout looks just about fixed. So! My plan for periodically explicitly merging deletes has some problems.....
  • 16:42 gwicke: restarted parsoid on wtp102{2,3,4}
  • 16:31 manybubbles: just going to make this clear - the current cirrus brownout doesn't seem to be affecting my queries but we're getting hit with pool counter full events - sadness. it's not caused by switching cirrus to ruwiki's primary backend - it's caused by me attempting to perform index maintenance activities.
  • 16:23 akosiaris: restarted node on wtp boxes except wtp1022,wtp1023,wtp1024
  • 16:23 manybubbles: caused cirrus brownout by executing a force merge for enwiki's general index. ooops
  • 16:06 logmsgbot: manybubbles Synchronized wmf-config/: set cirrus as primary search backend for ruwiki and make permanent some settings set on the fly (duration: 00m 06s)
  • 15:57 manybubbles: manually pushed apart ruwiki and nlwiki's shards as well - might help - updated commit to reflect that
  • 15:42 manybubbles: gerrit change to lock that into place is https://gerrit.wikimedia.org/r/#/c/160974/ and I'll deploy it in my window in 15 minutes.
  • 15:41 manybubbles: manually forcing the shards of Cirrus's commonswiki file index apart from one another in an attempt to lower the consistently high load on elastic1013
  • 15:34 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Set wgMetaNamespace for labswiki (duration: 00m 14s)
  • 14:54 springle: db1062 out of action for bug hunt https://mariadb.atlassian.net/browse/MDEV-6751
  • 14:48 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: (no message) (duration: 00m 16s)
  • 14:45 godog: restarted apache2 on magnesium, validate removal of ssl certs
  • 13:38 hashar: Zuul upgraded successfully apparently.
  • 13:33 hashar: stopping zuul for upgrade
  • 13:29 hashar: upgrading Zuul to 2.0.0.286.gb1811ab
  • 12:20 hashar: upgrading jenkins 1.565.1 -> 1.565.2
  • 09:53 akosiaris: stopped apache2 on fenari, it was leaking memory, puppet restarted it, need to kill this machine ASAP
  • 09:52 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s1 db1061 (duration: 00m 08s)
  • 06:55 springle: xtrabackup clone db1061 to db2016
  • 06:52 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool s1 db1061 for codfw cloning (duration: 00m 07s)
  • 06:27 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s7 db1039 (duration: 00m 08s)
  • 04:34 logmsgbot: tstarling Synchronized docroot/bits: (no message) (duration: 00m 10s)
  • 04:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 17 04:32:17 UTC 2014 (duration 32m 16s)
  • 03:17 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-17 03:17:38+00:00
  • 03:07 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s6 db1015 (duration: 01m 41s)
  • 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-17 02:43:02+00:00
  • 02:21 springle: xtrabackup clone db1048 to db2012
  • 02:15 springle: xtrabackup clone db1046 to db2011
  • 02:00 springle: xtrabackup clone db1016 to db2010
  • 01:54 springle: xtrabackup clone db1031 to db2009
  • 01:33 springle: xtrabackup clone db1039 to db2029
  • 01:33 springle: xtrabackup clone db1015 to db2028
  • 01:29 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool s6 db1015 and s7 db1039 (duration: 00m 20s)
  • 01:15 Reedy: updateCollation on shwiki done
  • 00:59 Reedy: running `mwscript updateCollation.php --wiki=shwiki --previous-collation=uppercase` in screen on tin
  • 00:58 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: shwiki collation (duration: 00m 16s)
  • 00:53 Reedy: updateCollation on etwiki done
  • 00:52 Reedy: updateCollation on etwiktionary done
  • 00:48 Reedy: running `mwscript updateCollation.php --wiki=etwiktionary --previous-collation=uppercase` in screen on tin
  • 00:47 Reedy: etwikisource collation updated (9918 rows)
  • 00:47 Reedy: etwikiquote collation updated (706 rows)
  • 00:46 Reedy: etwikimedia collation updated (121 rows)
  • 00:46 Reedy: etwikibooks collation updated (280 rows)
  • 00:45 Reedy: running `mwscript updateCollation.php --wiki=etwiki --previous-collation=uppercase` in screen on tin
  • 00:45 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: et collations (duration: 00m 15s)
  • 00:43 Reedy: updateCollation on frwikiversity done
  • 00:42 Reedy: running `mwscript updateCollation.php --wiki=frwikiversity --previous-collation=uppercase` in screen on tin
  • 00:42 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: frwikiversity collation (duration: 00m 17s)
  • 00:40 Reedy: updateCollation on skwiki done
  • 00:26 Reedy: Running `mwscript updateCollation.php --wiki=skwiki --previous-collation=uppercase` in screen on tin
  • 00:25 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: skwiki collation (duration: 00m 15s)
  • 00:18 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 15s)

September 16

  • 23:22 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/VisualEditor/: (no message) (duration: 00m 04s)
  • 23:21 logmsgbot: maxsem Synchronized php-1.24wmf20/extensions/VisualEditor/: (no message) (duration: 00m 04s)
  • 23:16 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/Wikidata: (no message) (duration: 00m 24s)
  • 23:15 MaxSem: Wikidata submodule in wmf21 was in the middle of rebase - reset and updating to a newer submodule commit
  • 23:12 logmsgbot: maxsem Synchronized php-1.24wmf20/extensions/Wikidata: (no message) (duration: 00m 17s)
  • 23:07 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/GettingStarted: https://gerrit.wikimedia.org/r/#/c/160084/ (duration: 00m 08s)
  • 21:36 Jeff_Green: SPF record deployed for donate.wikimedia.org
  • 21:01 logmsgbot: ejegg Synchronized php-1.24wmf20/extensions/CentralNotice/modules/ext.centralNotice.bannerController/bannerController.js: (no message) (duration: 00m 06s)
  • 19:38 csteipp: deployed patches for bugs 70469 and 70672
  • 19:17 logmsgbot: catrope Synchronized php-1.24wmf21/extensions/VisualEditor/: Revert IE hacks so Firefox will stop corrupting non-Latin characters (duration: 00m 06s)
  • 19:15 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/VisualEditor/: (no message) (duration: 00m 09s)
  • 18:32 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 18:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf21
  • 17:03 logmsgbot: bd808 Finished scap: No code change scap to test scap internal update (duration: 18m 06s)
  • 16:45 logmsgbot: bd808 Started scap: No code change scap to test scap internal update
  • 16:43 bd808|deploy: Updated scap to 663f137 (Check php syntax with parallel `php -l`)
  • 16:42 bd808|deploy: Trebuchet sync for scap reporting failure from osmium.eqiad.wmnet, mw1053.eqiad.wmnet, searchidx1001.eqiad.wmnet, fenari.wikimedia.org, and mw1110.eqiad.wmnet
  • 16:41 bd808|deploy: Trebuchet update for scap reporting failure from osmium.eqiad.wmnet, searchidx1001.eqiad.wmnet, fenari.wikimedia.org and mw1110.eqiad.wmnet
  • 16:00 _joe_: mw1018 and mw1021 in the hhvm appservers pool
  • 15:35 logmsgbot: reedy Synchronized docroot and w: Update symlinks to use /srv/mediawiki (duration: 00m 16s)
  • 15:34 hashar: Jenkins: deleting /srv/ssd/jenkins-slave/workspace/*testextensions-master on gallium and lanthanum.
  • 15:25 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 03s)
  • 15:23 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 19s)
  • 15:20 manybubbles: SWAT complete
  • 15:16 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/VisualEditor/: swat update for wmf20 (duration: 00m 25s)
  • 15:13 hashar: Jenkins: mediawiki extensions phpunit jobs should pass more or less until the CI system is sent into orbit and dies out horribly. in such a case ping me / phone.
  • 15:08 logmsgbot: manybubbles Synchronized php-1.24wmf21/extensions/VisualEditor/: SWAT visual editor update wmf21 (duration: 00m 07s)
  • 14:52 ottomata: set vm.dirty_expire_centisecs to 10000 (was 30000) on analytics1021 to experiment with paging and kafka-zookeeper timeouts
  • 14:36 godog: stopped htcp-purger on ms1004 RT #8358
  • 14:32 godog: silenced ms-be1014 until tomorrow, pending forced reboot
  • 14:28 hashar: Jenkins: breaking continuous integration for MediaWiki repositories. Extensions are now tested with mediawiki/vendor, and mediawiki/core is checked out to the patch branch if it exists. 160656
  • 14:20 akosiaris_: restarted apache on fenari, it was leaking memory, situation back to normal, cause unknown yet
  • 14:12 akosiaris_: stopped apache on fenari. It was in swap, investigating
  • 12:35 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool s2 db1054, s3 db1027, s4 db1056, s5 db1037 (duration: 00m 10s)
  • 12:26 godog: reboot ms-be1014, xfs issues
  • 12:22 godog: temporarily chgrp wikidev /var/log/hhvm/error.log on mw1018
  • 12:21 logmsgbot: reedy Synchronized php-1.24wmf20/LocalSettings.php: Fix path to be /srv based (duration: 00m 32s)
  • 11:25 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 35s)
  • 11:12 logmsgbot: reedy Purged l10n cache for 1.24wmf19
  • 11:12 logmsgbot: reedy Purged l10n cache for 1.24wmf18
  • 11:10 logmsgbot: reedy Purged l10n cache for 1.24wmf15
  • 09:21 _joe_: reimaging mw1018 and mw1021 w HAT: removing from pybal, etc.
  • 06:29 springle: xtrabackup clone db1037 to db2023
  • 05:31 springle: xtrabackup clone db1056 to db2019
  • 04:01 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 16 04:01:05 UTC 2014 (duration 1m 4s)
  • 03:11 springle: xtrabackup clone db1027 to db2018
  • 03:04 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-16 03:04:46+00:00
  • 02:53 springle: xtrabackup clone db1054 to db2017
  • 02:50 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool s2 db1054, s3 db1027, s4 db1056, s5 db1037 for codfw cloning (duration: 01m 12s)
  • 02:39 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1036, depool db1002 (duration: 00m 07s)
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-16 02:31:16+00:00

September 15

  • 23:32 logmsgbot: maxsem Synchronized php-1.24wmf21/resources/: SWAT: https://gerrit.wikimedia.org/r/#/c/160488/1 https://gerrit.wikimedia.org/r/#/c/160543/ (duration: 00m 06s)
  • 23:26 bblack: restarting lvs1001 for HT disable + kernel upgrade
  • 23:19 logmsgbot: maxsem Synchronized php-1.24wmf21/extensions/VisualEditor/: SWAT: https://gerrit.wikimedia.org/r/#/c/160554/ (duration: 00m 07s)
  • 23:12 bblack: restarting lvs1002 for HT disable + kernel upgrade
  • 23:07 Krinkle: Running sample job on integration-slave1006 and warming up npmjs.org cache
  • 22:56 Krinkle: Running sample job on integration-slave1008 and warming up npmjs.org cache
  • 22:49 Krinkle: Running sample job on integration-slave1007 and warming up npmjs.org cache
  • 22:48 Krinkle: Pooling the newly setup Trusty-based Jenkins slaves (integration-slave1006, integration-slave1007 and integration-slave1008)
  • 22:42 bblack: dropping static routes for 2620:0:861:ed1a::[d,f,10,11] -> lvs1005 from cr[12]-eqiad (only 11 is of any consequence, misc-web-lb, and they're advertised by bgp and this is preventing failover to lvs1002)
  • 21:28 cscott: updated OCG to version 188a3c221d927bd0601ef5e1b0c0f4a9d1cdbd31
  • 20:46 subbu: deployed Parsoid version b845bff9
  • 18:49 logmsgbot: ejegg Synchronized php-1.24wmf20/extensions/CentralNotice/: Update CentralNotice to remove jquery.json dependency (duration: 00m 23s)
  • 18:46 hoo: Sync to tmh100[12] failed, according to awight
  • 18:44 logmsgbot: ejegg Synchronized php-1.24wmf21/extensions/CentralNotice/: Update CentralNotice to remove jquery.json dependency (duration: 00m 09s)
  • 18:43 manybubbles: performance tests show cirrus should handle jawiki with no problem but if load spirals out of control and I'm not around then revert https://gerrit.wikimedia.org/r/#/c/160465/
  • 18:40 hoo: Local part of the global rename of Gnumarcoo => .avgas fatally timed out on itwiki. This needs to be fixed by hand.
  • 18:40 manybubbles: Setting Cirrus as jawiki's primary search backend went well but Japan is mostly asleep. If Elasticsearch load takes a turn for the worse in four or five hours then we'll know how it went.
  • 17:14 bd808: Restarted elasticsearch on logstash1003; 2014-09-14T09:33:57Z java.lang.OutOfMemoryError
  • 17:09 _joe_: killing salt-call on all mediawiki hosts
  • 17:06 bd808: Restarted elasticsearch on logstash1001; 2014-09-15T06:12:09Z java.lang.OutOfMemoryError
  • 17:04 bblack: using salt to kill salt-minion everywhere...
  • 17:02 bd808: Restarted logstash on logstash1001. I hoped this would fix the dashboards, but it looks like the backing elasticsearch cluster is too sad for them to work at the moment.
  • 16:55 bd808: Restarted hung elasticsearch service on logstash1002
  • 16:15 manybubbles: jawiki now has cirrus as primary. we're back to where we were before the great cascading failure of two months ago
  • 16:13 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
  • 15:29 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/MultimediaViewer/: [SWAT] Several backports for metrics and bugfixes in Media Viewer (duration: 00m 07s)
  • 15:27 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/MultimediaViewer/: [SWAT] Several backports for metrics and bugfixes in Media Viewer (duration: 00m 07s)
  • 15:18 logmsgbot: marktraceur Synchronized php-1.24wmf21/extensions/GeoCrumbs/GeoCrumbs.class.php: [SWAT] Handle return value NULL of GeoCrumbs::getParserCache (duration: 00m 07s)
  • 15:17 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/GeoCrumbs/GeoCrumbs.class.php: [SWAT] Handle return value NULL of GeoCrumbs::getParserCache (duration: 00m 07s)
  • 15:06 logmsgbot: marktraceur Synchronized wmf-config/: [SWAT] Remove 'renameuser' right from bureaucrats on CentralAuth wikis (duration: 00m 09s)
  • 14:54 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Bump wikibase memcached key for test.wikidata, test, test2 (duration: 00m 16s)
  • 14:54 hashar: Updated Jenkins Job Builder fork: e5c0c61..2d74b16
  • 14:50 logmsgbot: aude Finished scap: Put test.wikidata back on mw1.24-wmf19 extension branch (duration: 37m 27s)
  • 14:43 manybubbles: restarting the enwiki cirrus reindex process - it crashed over the weekend. why you crash and leave error message "1". "1" is not a useful error message.
  • 14:13 logmsgbot: aude Started scap: Put test.wikidata back on mw1.24-wmf19 extension branch
  • 13:03 _joe_: fenari is swapping hard, restarting apache which was eating up all the RAM
  • 09:20 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: *.scienceimage.csiro.au to the wgCopyUploadsDomains 159999 bug 70771 (duration: 00m 06s)
  • 09:15 hashar: Jenkins: apt-get upgrade on prod slaves (updates php5 / libc / jdk 7)
  • 03:09 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1036 (duration: 00m 09s)
  • 02:03 logmsgbot: LocalisationUpdate failed: mwversionsinuse returned empty list
  • 01:47 logmsgbot: hoo Synchronized wmf-config/liquidthreads.php: Remove global $path (duration: 00m 07s)
  • 01:47 logmsgbot: hoo Synchronized wmf-config/flaggedrevs.php: Remove global $path (duration: 00m 10s)

September 14

  • 20:37 ori_: enabling puppet on mw1053
  • 20:11 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062, locked up (duration: 00m 09s)
  • 13:24 _joe_: stopped puppet and the JR on mw1053
  • 12:42 hoo: Ran sync-common on mw1053 to stop "Unrecognized job type 'ChangeNotification'." exceptions
  • 11:14 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1005 (duration: 00m 07s)
  • 10:37 springle: restart es1005
  • 09:56 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1007, depool es1005 (duration: 00m 10s)
  • 02:01 logmsgbot: LocalisationUpdate failed: mwversionsinuse returned empty list
  • 00:45 ori_: fenari appears to still have twemproxy (in addition to nutcracker); decom'ing.
  • 00:29 ori_: restarting apache2 on fenari

September 13

  • 04:42 legoktm: global rename for Trevor Parscal (WMF) unstuck itself, yay
  • 04:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 13 04:22:04 UTC 2014 (duration 22m 3s)
  • 03:51 legoktm: global rename for Trevor Parscal --> Trevor Parscal (WMF) looks stuck on metawiki and mswiki, in queued state for both but showJobs.php says the jobs are active and claimed
  • 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-13 03:11:40+00:00
  • 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-13 02:38:26+00:00
  • 01:45 logmsgbot: ori Synchronized php-1.24wmf21/extensions/Flow: Update flow for I4da934dfe (duration: 00m 06s)
  • 01:45 logmsgbot: ori Synchronized php-1.24wmf20/extensions/Flow: Update flow for I4da934dfe (duration: 00m 06s)
  • 01:41 logmsgbot: ori Synchronized php-1.24wmf20/extensions/Flow: Update flow for I4da934dfe (duration: 00m 08s)

September 12

  • 21:26 csteipp: deployed fixes for bugs 70620, 69008
  • 20:37 logmsgbot: mattflaschen Synchronized php-1.24wmf21/extensions/GettingStarted/: Deploy to fix GettingStarted bucketing for users with null registration date (duration: 00m 05s)
  • 20:37 logmsgbot: mattflaschen Synchronized php-1.24wmf20/extensions/GettingStarted/: Deploy to fix GettingStarted bucketing for users with null registration date (duration: 00m 07s)
  • 19:34 legoktm: running migratePass0.php across all CentralAuth wikis
  • 17:43 logmsgbot: ori updated /a/common to I4e4187285: Rename some constants to clarify their meaning and purpose
  • 14:52 manybubbles: rebuilding enwiki's Cirrus index for more performance testing. Please be faster now. k?
  • 08:37 _joe_: rolling restart of pybal finished. Adding note on Fenari
  • 08:19 _joe_: reactivated puppet on all lvs hosts, esams almost done, pending eqiad
  • 08:06 _joe_: new pybal conf applied in all of ulsfo
  • 07:39 _joe_: changing pybal config place; stopping puppet on all loadbalancers
  • 04:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 12 04:27:17 UTC 2014 (duration 27m 16s)
  • 03:15 logmsgbot: LocalisationUpdate completed (1.24wmf21) at 2014-09-12 03:15:57+00:00
  • 03:08 logmsgbot: mattflaschen Finished scap: One last CSS fix (wrapping issue for error state) for GettingStarted A/B test (duration: 24m 38s)
  • 02:43 logmsgbot: mattflaschen Started scap: One last CSS fix (wrapping issue for error state) for GettingStarted A/B test
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-12 02:39:35+00:00
  • 01:33 logmsgbot: mattflaschen Synchronized php-1.24wmf21/extensions/GettingStarted/: CSS tweaks for GettingStarted A/B test (duration: 00m 07s)
  • 01:32 logmsgbot: mattflaschen Synchronized php-1.24wmf20/extensions/GettingStarted/: CSS tweaks for GettingStarted A/B test (duration: 00m 21s)
  • 01:29 logmsgbot: ori Synchronized wmf-config/wikitech.php: Ia5b81076e: Update path reference for /srv/mediawiki (duration: 00m 04s)
  • 01:28 logmsgbot: ori updated /a/common to Ia5b81076e: Update path reference for /srv/mediawiki
  • 01:19 ori: manually migrated /u/l/a/common-local to /srv/mediawiki on virt1000
  • 00:36 logmsgbot: ori Synchronized php-1.24wmf21/extensions/Wikidata: Update Wikidata to tip of master for I23b7eb54b8e (Bug: 70747) (duration: 00m 08s)
  • 00:12 logmsgbot: esanders Synchronized php-1.24wmf21/resources/lib/oojs-ui/: (no message) (duration: 00m 03s)
  • 00:12 logmsgbot: esanders Synchronized php-1.24wmf21/extensions/MultimediaViewer/: (no message) (duration: 00m 07s)
  • 00:00 logmsgbot: esanders Finished scap: SWAT deploy (duration: 28m 39s)

September 11

  • 23:31 logmsgbot: esanders Started scap: SWAT deploy
  • 23:29 logmsgbot: mattflaschen Finished scap: Deploy new GettingStarted recommendations A/B test (duration: 99m 34s)
  • 23:15 logmsgbot: esanders scap failed: LockFailedError Failed to lock /var/lock/scap: [Errno 11] Resource temporarily unavailable (duration: 00m 00s)
  • 23:00 mutante: restarting icinga-wm for config change
  • 21:49 logmsgbot: mattflaschen Started scap: Deploy new GettingStarted recommendations A/B test
  • 21:14 Krinkle: Stopping/starting zuul
  • 21:08 andrewbogott: restarting zuul on gallium
  • 20:58 andrewbogott: restarted jenkins, maybe
  • 20:56 ori: graceful'd apache on mw1053, missed it earlier
  • 20:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I1f3234746: Revert Scribunto: double the Lua CPU limit on the job runners (duration: 00m 05s)
  • 20:48 logmsgbot: ori updated /a/common to I1f3234746: Revert "Scribunto: double the Lua CPU limit on the job runners"
  • 20:42 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 20:15 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
  • 20:15 andrewbogott: syncing virt1000, again in hopes of moving to wmf20
  • 20:08 logmsgbot: reedy Synchronized php-1.24wmf21/extensions/Wikidata/: (no message) (duration: 00m 17s)
  • 19:58 Reedy: Running sync-common on mw1024
  • 19:52 Reedy: Running manual sync-common on mw1138
  • 19:51 logmsgbot: reedy Synchronized wmf-config/: Fix Zero settings (duration: 00m 15s)
  • 19:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf21
  • 19:44 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf20
  • 19:20 mutante: graceful'ed apache on mw1143
  • 19:16 Reedy: running sync-common on mw1143
  • 19:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 19:02 bd808: Restarted elasticsearch on logstash1003 -- Java OOM error in logs and not recovering shards
  • 18:54 ori: graceful'd all apaches
  • 18:51 ori: graceful'd apache on mw1047, mw1151, mw1137, mw1146 and mw1076
  • 18:46 logmsgbot: ori Synchronized php-1.24wmf19/includes/WebStart.php: (no message) (duration: 00m 06s)
  • 18:45 logmsgbot: ori Synchronized php-1.24wmf19/includes/profiler/Profiler.php: (no message) (duration: 00m 07s)
  • 18:17 logmsgbot: reedy Started scap: testwiki to 1.24wmf21 and build l10n cache take 3
  • 18:16 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.Nd45X2RONi" --verbose' returned non-zero exit status 1 (duration: 01m 18s)
  • 18:14 logmsgbot: reedy Started scap: testwiki to 1.24wmf21 and build l10n cache
  • 18:13 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.IH8przTNHs" ' returned non-zero exit status 1 (duration: 04m 59s)
  • 18:08 logmsgbot: reedy Started scap: testwiki to 1.24wmf21 and build l10n cache
  • 18:02 manybubbles: raised logging on Elasticsearch cluster temporarily to get more information about merging - a process super important to keeping the index up to date in "real time"
  • 17:20 logmsgbot: ori updated /a/common to I0bda3deab: Replace remaining references to /u/l/a/common
  • 17:18 logmsgbot: ori updated /a/common to I37b0a8338: Get rid of MULTIVER_CDB_DIR_{APACHE,HOME}
  • 16:57 andrewbogott: sync-common on virt1000 -- with any luck this will upgrade us to wmf20
  • 16:56 logmsgbot: andrew rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 16:53 logmsgbot: bd808 Finished scap: Preparing to move wikitech to 1.24wmf20 (second try) (duration: 24m 25s)
  • 16:46 andrewbogott: apache graceful on mw1039
  • 16:33 bd808|deploy: andrewbogott did apache graceful on mw1120 to stop wikidata APC logspam
  • 16:29 logmsgbot: bd808 Started scap: Preparing to move wikitech to 1.24wmf20 (second try)
  • 16:22 logmsgbot: andrew Finished scap: Preparing to move wikitech to 1.24wmf20 (duration: 06m 45s)
  • 16:19 bd808: Restarted logstash on logstash1001. Log empty and events not being stored in elasticsearch
  • 16:15 logmsgbot: andrew Started scap: Preparing to move wikitech to 1.24wmf20
  • 15:45 bblack: icinga config is correct now, back to normal puppet updates
  • 15:24 bblack: restarted icinga, manually removed some labsy things that were broken in config and temporarily disabled puppet :p
  • 14:44 _joe_: php upgrade finished
  • 14:23 _joe_: upgrading php across the cluster: libapache2-mod-php5 php5-cli php-pear php5 php5-common php5-curl php5-dev php5-intl php5-mysql php5-xmlrpc
  • 13:04 akosiaris: uploaded php5_5.3.10-1ubuntu3.14+wmf1 on apt.wikimedia.org
  • 10:00 _joe_: enabled puppet on mw1053
  • 09:38 _joe_: gracefulling mw1200 mw1196 and mw1186 as they have APC issues
  • 09:21 _joe_: upgrading hhvm and hhvm-luasandbox across the production cluster
  • 09:00 akosiaris: upgrading php5 to 5.3.10-1ubuntu3.14+wmf1 on mw1212
  • 08:34 _joe_: updating php-pear php5 php5-cli php5-common php5-curl php5-dev php5-intl php5-mysql php5-xmlrpc libapache2-mod-php5 on mw1018, see USN 2344-1
  • 03:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 11 03:41:03 UTC 2014 (duration 41m 2s)
  • 02:49 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-11 02:49:26+00:00
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-11 02:36:37+00:00
  • 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-11 02:23:29+00:00
  • 00:28 mutante: graceful'ed Apaches on mw1171, mw1187
  • 00:25 logmsgbot: ori Synchronized wmf-config: Id607bf36d: Update remaining references to /u/l/a/common-local (duration: 00m 03s)
  • 00:25 logmsgbot: ori Synchronized multiversion: Id607bf36d: Update remaining references to /u/l/a/common-local (duration: 00m 04s)
  • 00:22 logmsgbot: ori Synchronized docroot and w: Id607bf36d: Update remaining references to /u/l/a/common-local (duration: 00m 04s)
  • 00:07 logmsgbot: ori updated /a/common to Id607bf36d: Update remaining references to /u/l/a/common-local

September 10

  • 23:44 mutante: graceful'ed mw1202 apache
  • 23:29 mutante: deleted labstore1003.eqiad.wmnet.org from puppet stored resource db, fixes puppet runs on hosts with ssh host key collection
  • 23:26 logmsgbot: oblivian gracefulled all apaches
  • 23:22 logmsgbot: maxsem Synchronized php-1.24wmf20/includes/specialpage/SpecialPageFactory.php: https://gerrit.wikimedia.org/r/#/c/159526/ (duration: 00m 03s)
  • 23:22 logmsgbot: maxsem Synchronized php-1.24wmf19/includes/specialpage/SpecialPageFactory.php: https://gerrit.wikimedia.org/r/#/c/159526/ (duration: 00m 03s)
  • 23:21 logmsgbot: maxsem Synchronized php-1.24wmf20/extensions/CentralAuth/: (no message) (duration: 00m 03s)
  • 23:21 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/CentralAuth/: (no message) (duration: 00m 04s)
  • 23:19 logmsgbot: maxsem Synchronized php-1.24wmf10/resources/: https://gerrit.wikimedia.org/r/#/c/159513/ (duration: 00m 05s)
  • 22:52 mutante: labstore1003 - (earlier) revoked salt and puppet key and signed new after hostname fix - same salt-minion puppet errors that happen after reinstalls
  • 19:52 Reedy: Created Echo tables on extension1 for cawikimedia
  • 19:51 RobH: puppet disabled on carbon (install server) for a livehack test of config setting
  • 18:51 yurikR: yurik CommonSettings.php - zerowiki perm changes
  • 18:51 logmsgbot: yurik Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 05s)
  • 18:26 logmsgbot: yurik Synchronized php-1.24wmf20/extensions/ZeroBanner: (no message) (duration: 01m 09s)
  • 18:22 logmsgbot: yurik Synchronized php-1.24wmf19/extensions/ZeroBanner: (no message) (duration: 01m 11s)
  • 18:00 manybubbles: cirrus index rebuild for test2wiki went well - doing the rest of group0
  • 17:35 manybubbles: rebuilding cirrus index for test2wiki to test that some performance enhancements don't break anything. test2wiki is too small to see any gain from the enhancements though.
  • 17:25 Reedy: mw1126, mw1116, mw1122, mw1146, mw1121, mw1136, mw1114, mw1068 have been gracefulled
  • 17:10 bd808: Restarted logstash on logstash1001
  • 16:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: nlwiki cirrus (duration: 00m 04s)
  • 15:44 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 15s)
  • 15:02 logmsgbot: demon Synchronized wmf-config/wikitech.php: no-op (duration: 00m 06s)
  • 09:13 godog: rolling restart swift-proxy on ms-fe1*
  • 04:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 10 04:17:36 UTC 2014 (duration 17m 35s)
  • 03:08 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-10 03:07:59+00:00
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-10 02:36:00+00:00
  • 02:28 ori: updated salt key for iridium and restarted salt-minion
  • 02:18 mutante: started salt-minion on iridium

September 9

  • 23:15 Krinkle: Reloading Zuul to deploy I26bc21ed2938e97e7ed6f6b
  • 23:15 logmsgbot: demon Synchronized php-1.24wmf20/extensions/CirrusSearch: Various fixes for things (duration: 00m 05s)
  • 23:00 mutante: added wikimedia.org to search in resolv.conf on terbium
  • 22:42 logmsgbot: ebernhardson Synchronized wmf-config/InitialiseSettings.php: Deploy config change I158e7c6852 (duration: 00m 04s)
  • 22:23 Krinkle: Reloading Zuul to deploy I27024680c74ca0130
  • 22:21 logmsgbot: ebernhardson Finished scap: Bump Echo and Flow versions in 1.24wmf19 (duration: 31m 25s)
  • 21:49 logmsgbot: ebernhardson Started scap: Bump Echo and Flow versions in 1.24wmf19
  • 20:42 akosiaris: service gmetad restart on nickel.wikimedia.org due to ganglia web not working
  • 20:15 cscott: updated OCG to version c9a2b4cf2502479eeabed07ab2de728695d96e46
  • 19:05 mutante: killed jgonera's screen session on stat1002 - puppet failed to deactivate otherwise
  • 18:46 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:32 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 18:31 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Add cawikimedia
  • 18:28 logmsgbot: reedy Synchronized multiversion/: (no message) (duration: 00m 14s)
  • 18:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf20
  • 16:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: eswiki getting cirrus (duration: 00m 04s)
  • 15:32 bblack: deploying large DNS change https://gerrit.wikimedia.org/r/#/c/158382/ - be on the lookout for any related fallout from here...
  • 15:27 marktraceur: [SCAP] Deployed fix for oojs class names at James_F's behest, sorry for lack of message.
  • 15:26 logmsgbot: marktraceur Synchronized php-1.24wmf20/extensions/MobileFrontend/less/modules/editor/VisualEditorOverlay.less: (no message) (duration: 00m 07s)
  • 15:08 logmsgbot: marktraceur Synchronized php-1.24wmf20/tests/phpunit/includes/changes/OldChangesListTest.php: [SWAT] Fix undefined argument (css classes) in OldChangesList. (duration: 00m 07s)
  • 15:06 logmsgbot: marktraceur Synchronized php-1.24wmf20/includes/changes/OldChangesList.php: [SWAT] Fix undefined argument (css classes) in OldChangesList. (duration: 00m 07s)
  • 11:29 _joe_: git.wikimedia.org works now, no action needed
  • 11:26 MatmaRex: git.wikimedia.org is down: Error: 503, Service Unavailable
  • 10:04 _joe_: also re-enabling puppet
  • 10:02 _joe_: manually restarting apache on mw1178, mw1192, mw1163, mw1130, mw1018 as they started with the wrong pidfile before my fix
  • 09:24 _joe_: disabling puppet on appservers
  • 08:55 godog: launched "iptables" on tin to check current rules and it loaded iptables modules, logging for future reference
  • 08:10 _joe_: re-enabling puppet on appservers and imagescalers, change is good
  • 08:08 _joe_: restarted apache2 on mw1018
  • 08:06 _joe_: stopping apache on mw1018 for inspection
  • 07:36 _joe_: that was on appservers
  • 07:36 _joe_: disabling puppet, releasing a potentially harmful apache change
  • 04:56 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 9 04:56:25 UTC 2014 (duration 56m 24s)
  • 03:44 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-09 03:44:07+00:00
  • 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-09 03:11:27+00:00
  • 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-09 02:38:38+00:00
  • 01:02 logmsgbot: ebernhardson Synchronized php-1.24wmf20/extensions/Flow/includes/Content/BoardContentHandler.php: Sync BoardContentHandler.php for Flow in 1.24wmf20 (duration: 00m 04s)
  • 00:22 mutante: re-enabled mw1070 in pybal
  • 00:19 logmsgbot: ebernhardson Finished scap: Repeat SWAT scap deployment due to possible sync-common failure (duration: 38m 50s)

September 8

  • 23:59 ori: restarted rsync on mw1070 to unblock scap
  • 23:40 logmsgbot: ebernhardson Started scap: Repeat SWAT scap deployment due to possible sync-common failure
  • 23:39 logmsgbot: ebernhardson Finished scap: SWAT deploy updates to Flow, Echo and Thanks (duration: 24m 00s)
  • 23:34 mutante: disabled mw1070 in pybal because it refused sync
  • 23:31 ebernhardson: scap failed to connect to mw1070. Repeated message: rsync: failed to connect to mw1070.eqiad.wmnet (10.64.16.50): Connection refused (111)
  • 23:15 logmsgbot: ebernhardson Started scap: SWAT deploy updates to Flow, Echo and Thanks
  • 23:02 logmsgbot: ebernhardson Synchronized wmf-config/InitialiseSettings.php: gerrit:159089 Enable $wgContentHandlerUseDB on mediawikiwiki, testwiki, & test2wiki (duration: 00m 05s)
  • 20:14 subbu: deployed Parsoid ce108cb5
  • 18:01 logmsgbot: demon Synchronized php-1.24wmf19/extensions/Wikidata: Updating Wikidata to f1d2110 (duration: 00m 09s)
  • 17:19 mutante: disabled notifications for puppet freshness on neon
  • 16:19 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: svwiki: Cirrus as primary (duration: 00m 04s)
  • 15:42 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/Wikidata/: SWAT update wikidata to fix add links widget (duration: 00m 06s)
  • 15:32 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/LiquidThreads/: SWAT update liquidthreads to fix some missing images (duration: 00m 04s)
  • 15:28 manybubbles: 15:13:53 Synchronized php-1.24wmf19/extensions/WikiLove/: SWAT fix for WikiLove (duration: 00m 04s)
  • 15:28 manybubbles: this is the missing log:
  • 15:27 manybubbles: sync logging was down so it missed some syncing I just did.
  • 15:25 logmsgbot: manybubbles Synchronized php-1.24wmf20/extensions/WikiLove/: (no message) (duration: 00m 05s)
  • 15:20 logmsgbot: manybubbles Synchronized wmf-config: SWAT another cirrus setting update (duration: 00m 04s)
  • 15:10 logmsgbot: manybubbles Synchronized wmf-config: SWAT finish updating Cirrus settings (duration: 00m 05s)
  • 15:10 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT update some cirrus settings (duration: 00m 04s)
  • 15:10 cmjohnson1: shutting down neon for memory upgrade
  • 14:41 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1072 (duration: 00m 09s)
  • 12:23 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1073, depool db1072 (duration: 00m 06s)
  • 11:07 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1073 (duration: 00m 09s)
  • 10:55 _joe_: re-enabled puppet, the change results in a no-op as expected
  • 10:42 _joe_: disabling puppet on all appservers while updating apache config.
  • 04:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: move enwiki api traffic to db1051/db1066 (duration: 00m 09s)
  • 03:37 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep 8 03:36:13 UTC 2014 (duration 36m 12s)
  • 02:45 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-08 02:44:47+00:00
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-08 02:32:37+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-08 02:19:52+00:00

September 7

  • 23:35 Tim: upgrading liblua everywhere
  • 20:36 ori: mw1017: upgraded HHVM from 3.3-dev+20140728+wmf5 to 3.3-dev+20140728+wmf6
  • 15:12 apergos: manually changed /etc/hosts entry on analytics1004 from having "analyticas1004.eqiad.wmnet" to "analytics1004.eqiad.wmnet"
  • 06:15 godog: powercycle ms-be1005, not even responsive on console
  • 03:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Sep 7 03:29:51 UTC 2014 (duration 29m 50s)
  • 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-07 02:42:12+00:00
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-07 02:30:15+00:00
  • 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-07 02:17:44+00:00

September 6

  • 03:43 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Sep 6 03:42:22 UTC 2014 (duration 42m 21s)
  • 02:51 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-06 02:50:35+00:00
  • 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-06 02:37:41+00:00
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-06 02:24:35+00:00

September 5

  • 23:28 logmsgbot: kaldari Synchronized wmf-config/mobile-labs.php: enabling wikigrok on beta labs (en only) (duration: 00m 03s)
  • 23:28 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings-labs.php: enabling wikigrok on beta labs (en only) (duration: 00m 04s)
  • 23:27 logmsgbot: kaldari updated /a/common to Iec209bde0: Map config var for $wgMFEnableWikiGrok
  • 22:25 logmsgbot: kaldari Synchronized wmf-config/InitialiseSettings-labs.php: enabling wikigrok on beta labs (en only) (duration: 00m 05s)
  • 22:25 logmsgbot: kaldari updated /a/common to I6039956eb: Enable Wikigrok prototype for beta labs (enwiki only)
  • 22:24 awight: Deleted Light User and Merkle roles from the CRM
  • 20:20 RobH: coms folks still accessing blog data on holmium, powering back up
  • 20:18 bblack: restarted cp1056 bits cache and re-enabled in pybal
  • 18:34 mark: Depooled cp1056 for testing
  • 17:50 logmsgbot: ori Synchronized docroot and w: Iaa7518613: Fix spelling in symlink (duration: 00m 15s)
  • 17:45 logmsgbot: ori Synchronized docroot and w: I55a01a712: Fix relative symlinks for bits/static-master (duration: 00m 13s)
  • 13:00 Jeff_Green: lutetium dist-upgrade and reboot
  • 12:04 legoktm: running extensions/GlobalCssJs/removeOldManualUserPages.php for m:GlobalCssJs
  • 07:59 springle: dump es1007 to db1004, tokudb external storage page compression test. ok to kill in emergency
  • 06:40 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool es1007 (duration: 00m 07s)
  • 04:36 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Sep 5 04:35:11 UTC 2014 (duration 35m 10s)
  • 04:27 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1002 (duration: 00m 06s)
  • 04:01 springle: reboot es1002, fs check
  • 03:47 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-05 03:46:28+00:00
  • 03:46 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool es1002 for upgrade (duration: 00m 07s)
  • 03:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1062 and db1068 (duration: 02m 06s)
  • 03:10 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-05 03:09:20+00:00
  • 02:56 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062 and db1068 for upgrade (duration: 00m 56s)
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-05 02:38:32+00:00
  • 01:01 manybubbles: applied same elasticsearch configuration to dewiki, eswiki, zhwiki, and frwiki
  • 00:18 manybubbles: configured elasticsearch to force enwiki's content shards to stay off of the same nodes. That ought to help performance.

September 4

  • 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/VisualEditor/: (no message) (duration: 00m 05s)
  • 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/ZeroPortal/: (no message) (duration: 00m 04s)
  • 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/extensions/Flow/: (no message) (duration: 00m 05s)
  • 23:34 logmsgbot: catrope Synchronized php-1.24wmf20/includes/specials/: (no message) (duration: 00m 04s)
  • 23:31 logmsgbot: catrope Synchronized php-1.24wmf19/extensions/ZeroPortal: (no message) (duration: 00m 05s)
  • 23:30 logmsgbot: catrope Synchronized php-1.24wmf19/extensions/Flow: (no message) (duration: 00m 05s)
  • 22:52 logmsgbot: reedy Finished scap: consistency (duration: 20m 44s)
  • 22:31 logmsgbot: reedy Started scap: consistency
  • 22:28 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 4 22:27:45 UTC 2014 (duration 54m 38s)
  • 21:54 bd808: sync-dir failure was really on osmium, not mw1161; confusing error messages are confusing
  • 21:50 bd808: Running sync-common on mw1161 to try and reproduce error seen during sync-file
  • 21:43 logmsgbot: spage Synchronized wmf-config/InitialiseSettings.php: Enable Flow on pages, including frwiki and hewiki (duration: 00m 09s)
  • 21:40 logmsgbot: spage updated /a/common to Ib0aaa60f0: Enable Flow on several pages
  • 21:08 logmsgbot: LocalisationUpdate completed (1.24wmf20) at 2014-09-04 21:07:10+00:00
  • 20:56 MaxSem: Running cleanupPageProps.php from terbium, now for realz
  • 20:42 mutante: restarting icinga-wm, making it join #wikidata for custom output
  • 20:16 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-04 20:15:35+00:00
  • 19:56 Reedy: mw1088 and mw1100 rsync errors during the manual l10n update
  • 19:25 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-04 19:23:57+00:00
  • 18:32 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf20
  • 18:26 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf19
  • 18:11 logmsgbot: reedy Synchronized php-1.24wmf19: (no message) (duration: 00m 55s)
  • 18:10 logmsgbot: reedy Synchronized php-1.24wmf20: (no message) (duration: 00m 35s)
  • 18:09 logmsgbot: reedy Finished scap: testwiki to 1.24wmf20 and build l10n cache (duration: 41m 33s)
  • 18:05 mutante: restarting service gitblit on antimony
  • 17:48 RobH: correction, simply suppressing alerts for the host in icinga is the better move, as the host isn't reclaimed yet, so not removing holmium from puppetstoreddb
  • 17:46 RobH: stopping puppet on holmium and removing it from puppetstoreddb so it doesn't show in icinga once updated
  • 17:45 RobH: shutting down holmium, as blog has migrated for a month now. Not yet wiping system, please leave for me (robh)
  • 17:27 logmsgbot: reedy Started scap: testwiki to 1.24wmf20 and build l10n cache
  • 16:44 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 17s)
  • 16:36 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 15:53 logmsgbot: andrew Synchronized private/WikitechPrivateSettings.php: (no message) (duration: 00m 01s)
  • 15:40 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 15:18 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: plwiki gets Cirrus (duration: 00m 06s)
  • 14:56 bd808: ori updated scap to 773f95f (change deploy_dir to /srv/mediawiki) ~15 hours ago
  • 08:16 _joe_: running sync-common on mw1017, trying to debug the hhvm bad state
  • 06:37 godog: clear slowlog on elastic1004
  • 05:25 jeremyb: temp hack fix deployed for morebots (here and labs, not the other instances)
  • 04:47 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1035, warm up (duration: 00m 08s)
  • 04:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Sep 4 04:31:28 UTC 2014 (duration 31m 27s)
  • 03:43 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-04 03:42:34+00:00
  • 03:13 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-04 03:11:58+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-04 02:40:45+00:00
  • 01:08 mutante: production wants project name?
  • 01:02 andrewbogott: the SAL still works, but the bot fails to acknowledge. Something to do with a change on wikitech
  • 00:59 andrewbogott: testing the log
  • 00:43 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)

September 3

  • 23:52 logmsgbot: reedy Synchronized php-1.24wmf15/includes/EditPage.php: (no message) (duration: 00m 14s)
  • 23:43 logmsgbot: ori Synchronized docroot and w: (no message) (duration: 00m 05s)
  • 23:28 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 23:04 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/157855/ https://gerrit.wikimedia.org/r/#/c/158265/ (duration: 00m 04s)
  • 21:09 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Disable GlobalUsage on labswiki (duration: 00m 15s)
  • 20:59 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 20:47 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 15s)
  • 20:45 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 15s)
  • 20:38 logmsgbot: andrew Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 05s)
  • 20:38 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 04s)
  • 20:37 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 04s)
  • 20:16 subbu: deployed Parsoid version 78e55c6b (deploy repo sha c0761179)
  • 18:52 logmsgbot: yurik Synchronized wmf-config: enabling graph ext on zerowiki & collabwiki (duration: 01m 06s)
  • 18:51 MaxSem: Running sync-common on mw1163
  • 18:48 logmsgbot: yurik Synchronized php-1.24wmf18/extensions/Graph/: (no message) (duration: 01m 09s)
  • 18:47 logmsgbot: yurik Synchronized php-1.24wmf19/extensions/Graph/: (no message) (duration: 01m 05s)
  • 16:52 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 03s)
  • 16:52 logmsgbot: andrew Synchronized private/WikitechPrivateLdapSettings.php: (no message) (duration: 00m 03s)
  • 16:51 logmsgbot: andrew Synchronized private/WikitechPrivateSettings.php: (no message) (duration: 00m 05s)
  • 16:51 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 03s)
  • 16:19 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 03s)
  • 16:18 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 04s)
  • 16:16 logmsgbot: andrew Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 05s)
  • 15:41 _joe_: mw1020 correctly reimaged, putting it in the hhvm pool
  • 15:27 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - Update another cirrus config - this time maybe it will work (duration: 00m 05s)
  • 15:12 manybubbles: deployed throttling for Cirrus job named cirrusSearchLinksUpdate - it handles updating the index when a transcluded page changes - we'll have to check on the backlog over the next few hours/days to see if it stabilizes
  • 15:11 logmsgbot: manybubbles Synchronized php-1.24wmf19/extensions/Wikidata/: (no message) (duration: 00m 07s)
  • 15:07 manybubbles: mw1020 gets WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! during sync-dir call
  • 15:07 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy cirrus config changes - make sure to get mw1020 (duration: 00m 04s)
  • 15:05 manybubbles: https://gerrit.wikimedia.org/r/#/c/157861/ didn't work as expected - dropped everything out of using the all field......
  • 15:03 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy cirrus config changes (duration: 00m 06s)
  • 14:53 cmjohnson1: running sync-common on mw1178
  • 14:52 cmjohnson1: adding mw1178 back to pybal
  • 12:42 _joe_: typo: mw1020, not mw1120
  • 12:41 _joe_: mw1120: remove from pybal, schedule downtime, reimage to HAT
  • 11:23 godog: run gmond on elastic1002 manually to debug ES collector issues
  • 11:17 godog: run gmond on elastic1001 manually to debug ES collector issues
  • 07:55 _joe_: re-enabling mw1192, what we were seeing was probably load and not anything else
  • 06:56 ori: restarted memcached on virt1000 due to cache pollution from migration (different memc drivers w/different encoding)
  • 04:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Sep 3 04:53:50 UTC 2014 (duration 53m 49s)
  • 03:51 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-03 03:50:17+00:00
  • 03:17 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-03 03:16:37+00:00
  • 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-09-03 02:42:24+00:00
  • 00:18 mutante: deleted PDF files older than 3d and a huge 1G one on ocg1001 in reaction to monitoring complaints
  • 00:00 logmsgbot: ori Synchronized php-1.24wmf18/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 03s)

September 2

  • 23:49 logmsgbot: ori Synchronized php-1.24wmf19/extensions/WikimediaEvents: Update WikimediaEvents for cherry-picks (duration: 00m 03s)
  • 23:32 logmsgbot: reedy Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 13s)
  • 23:22 logmsgbot: catrope Synchronized php-1.24wmf19/includes/OutputPage.php: 5094c0d9c (duration: 00m 05s)
  • 23:14 Krinkle: Running extensions/GlobalCssJs/removeOldManualUserPages.php per m:GlobalCssJs
  • 22:49 logmsgbot: reedy Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 14s)
  • 22:37 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 13s)
  • 22:26 logmsgbot: reedy Synchronized wmf-config/wikitech.php: (no message) (duration: 00m 13s)
  • 22:10 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
  • 21:57 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 21s)
  • 21:55 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 24s)
  • 21:51 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 25s)
  • 21:45 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 23s)
  • 21:37 logmsgbot: reedy Synchronized wmf-config/db-eqiad.php: Wikitech db (duration: 00m 22s)
  • 21:34 logmsgbot: bd808 Finished scap: no-op scap to build l10n for wikitech (duration: 55m 48s)
  • 20:39 logmsgbot: bd808 Started scap: no-op scap to build l10n for wikitech
  • 20:35 logmsgbot: bd808 Synchronized wmf-config/wikitech.php: eebc99a Require before instatiate (duration: 00m 04s)
  • 20:31 logmsgbot: bd808 Synchronized private/PrivateSettings.php: Absolute path for WikitechPrivateSettings.php (duration: 00m 05s)
  • 20:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 20:05 MaxSem: Running cleanupPageProps.php everywhere
  • 19:51 MaxSem: Running cleanupPageProps.php on mw.org and meta
  • 19:14 logmsgbot: reedy Synchronized wmf-config/Wikibase.php: Bump epoch (duration: 00m 14s)
  • 19:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf19, added labswiki too
  • 18:56 logmsgbot: bd808 Synchronized fishbowl.dblist: Add labswiki (wikitech) (duration: 00m 05s)
  • 18:25 logmsgbot: andrew Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 08s)
  • 17:49 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 14s)
  • 17:39 logmsgbot: andrew Synchronized multiversion/MWMultiVersion.php: (no message) (duration: 00m 04s)
  • 17:14 logmsgbot: andrew Finished scap: Deploying wikitech config (duration: 33m 03s)
  • 17:01 bd808: Fetched f711ea7 to /a/common on tin; not syncing because of in-process scap.
  • 16:41 logmsgbot: andrew Started scap: Deploying wikitech config
  • 16:21 logmsgbot: andrew scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="cawikibooks" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.SCRILhxGxO" ' returned non-zero exit status 1 (duration: 01m 17s)
  • 16:20 logmsgbot: andrew Started scap: Deploying wikitech config
  • 16:17 ottomata: installing newer version of webstatscollector on oxygen and gadolinium, restarting filter process on oxygen
  • 16:08 logmsgbot: andrew Synchronized /a/common/private/WikitechPrivateSettings.php: (no message) (duration: 00m 04s)
  • 16:07 logmsgbot: andrew Synchronized /a/common/private/PrivateSettings.php: (no message) (duration: 00m 03s)
  • 16:03 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Commons gets Cirrus as primary (duration: 00m 04s)
  • 15:44 godog: bring mw1114 -> mw1131 to weight 15
  • 15:21 logmsgbot: marktraceur Synchronized wmf-config/: [SCAP] SpecialCite is now CiteThisPage (duration: 00m 07s)
  • 15:17 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: [SCAP] Enable the TemplateData GUI editor on Norwegian Wikipedia (duration: 00m 07s)
  • 15:14 logmsgbot: marktraceur updated /a/common to Ia1758b21e: depool db1035 for upgrade, move s3 vslow/dump to db1019
  • 15:06 logmsgbot: marktraceur Synchronized php-1.24wmf19/includes/EditPage.php: [SCAP] Revert "Toolbar: Only show on WikiText pages" (duration: 00m 07s)
  • 15:05 logmsgbot: marktraceur Synchronized php-1.24wmf18/includes/EditPage.php: [SCAP] Revert "Toolbar: Only show on WikiText pages" (duration: 00m 08s)
  • 12:36 godog: increase weight to 15 for mw1132 -> mw1148
  • 10:00 _joe_: depooling mw1192, high CPU temperatures; we may need to check fan status
  • 07:20 _joe_: powercycling mw1192, blank console, unresponsive
  • 07:02 springle: removed all-but-latest large slow logs on elastic1004 and elastic1014
  • 06:22 springle: removed txt files filling up db1047 /tmp, looked like analytics SELECT INTO OUTFILE, dated mid-August
  • 05:58 springle: dump s3 db1035 to db1069:3313
  • 05:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1067, warm up (duration: 00m 08s)
  • 04:59 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1067 (duration: 00m 07s)
  • 03:35 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1035 (duration: 00m 07s)
  • 03:16 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Sep 2 03:15:22 UTC 2014 (duration 15m 21s)
  • 02:53 springle: restarted dbstore1002 mysqld for upgrade
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-02 02:25:36+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-02 02:14:30+00:00

September 1

  • 23:00 Krinkle: Running extensions/GlobalCssJs/removeOldManualUserPages.php per m:GlobalCssJs
  • 21:50 ori: disabled gerrit account Caothu9669; spam
  • 19:12 Reedy: Deleted php-1.24wmf[6-8] from apaches via dsh
  • 19:01 logmsgbot: reedy Purged l10n cache for 1.24wmf13
  • 19:01 logmsgbot: reedy Purged l10n cache for 1.24wmf14
  • 19:00 logmsgbot: reedy Purged l10n cache for 1.24wmf15
  • 18:59 logmsgbot: reedy Purged l10n cache for 1.24wmf16
  • 18:58 logmsgbot: reedy Purged l10n cache for 1.24wmf17
  • 16:44 ottomata: removed some large slow query logs from elastic* nodes, need to look into this...
  • 12:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1044, take 2 (duration: 00m 06s)
  • 12:04 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1044 (duration: 00m 06s)
  • 11:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: pool db1044, warm up (duration: 00m 06s)
  • 09:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1027 (duration: 00m 07s)
  • 07:27 godog: deploy latest ring to swift eqiad-prod
  • 07:11 godog: powercycle ms-be1010 "cpu soft lockup" on console
  • 05:28 springle: xtrabackup clone db1027 to db1044
  • 05:26 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1027 while cloning (duration: 00m 07s)
  • 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Sep 1 03:14:22 UTC 2014 (duration 14m 21s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-09-01 02:27:39+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-09-01 02:16:18+00:00

August 31

  • 19:56 hashar: Jenkins updated HHVM to (3.3-dev+20140728+wmf5) over (3.3-dev+20140728+wmf4)
  • 14:38 bblack: restarted apache on strontium
  • 14:34 bblack: restarted apache on tungsten, machine is overloaded
  • 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 31 03:14:41 UTC 2014 (duration 14m 40s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-08-31 02:28:42+00:00
  • 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-31 02:17:24+00:00
  • 02:00 ori: Stopped HHVM jobrunner and disabled Puppet on mw1053 due to bug 70177.

August 30

  • 08:18 godog: restart mailman on sodium, pending https://gerrit.wikimedia.org/r/#/c/156766/
  • 08:01 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1071, warm up (duration: 00m 07s)
  • 06:33 jgage: analytics1021 back in service after election
  • 06:14 jgage: upgraded & rebooted analytics1021
  • 06:01 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: return db1037 to normal load (duration: 00m 06s)
  • 05:03 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool es1001 (duration: 00m 06s)
  • 04:06 springle: upgrade es1001 to mariadb 10
  • 03:56 springle: xtrabackup clone db1037 to db1071
  • 03:55 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: reduce db1037 load while cloning (duration: 00m 06s)
  • 03:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: pool db1073, depool db1071 (duration: 00m 07s)
  • 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 30 03:17:57 UTC 2014 (duration 17m 56s)
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-08-30 02:32:06+00:00
  • 02:21 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-30 02:20:39+00:00

August 29

  • 21:15 logmsgbot: ori Synchronized wmf-config: I812c0bb6c: Scrap unused Twemproxy config files (duration: 00m 04s)
  • 21:12 logmsgbot: ori updated /a/common to I812c0bb6c: Scrap unused Twemproxy config files
  • 21:03 mutante: restarted uwsgi on tungsten
  • 20:49 mutante: tungsten extremely busy, graphite down, logging in since 5 minutes :p
  • 20:48 mutante: powercycling ms-be1006 - BUG: soft lockup - CPU#0 stuck ...
  • 19:19 mutante: installing package upgrades on iron, bast1001
  • 18:41 logmsgbot: aaron Synchronized php-1.24wmf18/maintenance/findMissingFiles.php: 994d4a556a070156fd04fb4951492f10696cc63c (duration: 00m 03s)
  • 18:30 logmsgbot: ori Synchronized php-1.24wmf19/resources/src/mediawiki.action/mediawiki.action.view.redirect.js: I19221a25a: mediawiki.action.view.redirect: Work around a IE 10+ HTML5 history API bug (duration: 00m 06s)
  • 18:30 logmsgbot: ori Synchronized php-1.24wmf18/resources/src/mediawiki.action/mediawiki.action.view.redirect.js: I19221a25a: mediawiki.action.view.redirect: Work around a IE 10+ HTML5 history API bug (duration: 00m 07s)
  • 15:36 hashar_: Jenkins: pooled a new slave 10.68.16.162 as wikidata-jenkins3 on behalf of addshore / wmde
  • 15:04 _joe_: shutting down mw1163, filed RT 8243 for repair.
  • 14:54 _joe_: re-enabled mw1130
  • 14:41 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable Wikibase badges css, follow up from last night deploy (duration: 00m 06s)
  • 14:22 _joe_: syncing mw1130
  • 14:06 _joe_: disable mw1130 from the api pool while it gets resynced
  • 12:30 Krinkle: Running extensions/GlobalCssJs/removeOldManualUserPages.php per m:GlobalCssJs
  • 11:06 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1070 (duration: 00m 09s)
  • 08:04 hashar: Jenkins: in the jenkins-job-builder-config branch 'cloudbees' has been merged in 'master'. Unifying CI and browser tests jobs! \O/
  • 07:05 _joe_: re-enabling puppet on the jobrunner, to check if the luasandbox fix works
  • 06:33 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: return db1056 to normal load (duration: 00m 06s)
  • 04:14 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 29 04:13:03 UTC 2014 (duration 13m 2s)
  • 03:31 springle: xtrabackup clone db1056 to db1070
  • 03:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: reduce db1056 load while cloning (duration: 00m 06s)
  • 03:15 logmsgbot: LocalisationUpdate completed (1.24wmf19) at 2014-08-29 03:10:26+00:00
  • 02:48 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1070. pool db1072. (duration: 00m 07s)
  • 02:38 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-29 02:37:20+00:00
  • 01:49 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1070. pool db1072. (duration: 00m 06s)
  • 01:27 godog: repool ms-fe1002
  • 01:06 cmjohnson1: shutting down ms-fe1002 to relocate racks
  • 01:04 godog: depool ms-fe1002
  • 01:02 godog: repool ms-fe1001
  • 00:57 logmsgbot: ori Synchronized php-1.24wmf18/extensions/WikimediaEvents: Ib44fe0898: Inject 'wgPoweredByHHVM' JS config var if powered by HHVM (duration: 00m 03s)
  • 00:56 logmsgbot: ori Synchronized php-1.24wmf19/extensions/WikimediaEvents: Ib44fe0898: Inject 'wgPoweredByHHVM' JS config var if powered by HHVM (duration: 00m 04s)
  • 00:38 cmjohnson1: shutting down ms-fe1001 for rack relocation
  • 00:34 godog: depool ms-fe1001
  • 00:32 godog: repool ms-fe1004
  • 00:27 mutante: restarting gmetad on nickel
  • 00:04 cmjohnson1: shutting down ms-fe1004 to relocate racks

August 28

  • 23:58 godog: depool ms-fe1004
  • 23:51 godog: repooling ms-fe1003
  • 23:40 logmsgbot: maxsem Synchronized php-1.24wmf19/maintenance/: https://gerrit.wikimedia.org/r/#/c/156979/ (duration: 00m 04s)
  • 23:39 logmsgbot: maxsem Synchronized php-1.24wmf19/includes/: https://gerrit.wikimedia.org/r/#/c/156979/ (duration: 00m 06s)
  • 23:38 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/156994/ (duration: 00m 05s)
  • 23:36 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/Echo: https://gerrit.wikimedia.org/r/#/c/157008/ (duration: 00m 04s)
  • 23:35 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/Thanks/: https://gerrit.wikimedia.org/r/#/c/156898/ (duration: 00m 04s)
  • 23:34 logmsgbot: maxsem Synchronized php-1.24wmf19/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/156968/ (duration: 00m 05s)
  • 23:30 logmsgbot: maxsem Synchronized php-1.24wmf18/extensions/GlobalCssJs/: https://gerrit.wikimedia.org/r/#/c/157009/ (duration: 00m 04s)
  • 23:27 K4-713: Updated fraud filters on payments
  • 22:52 logmsgbot: aaron Synchronized php-1.24wmf18/maintenance/findMissingFiles.php: (no message) (duration: 00m 07s)
  • 22:15 mutante: restarted tools.morebots production instance - can i log now?
  • 22:13 cmjohnson1: ms-fe1003 down for relocation
  • 22:13 mutante: test
  • 21:15 robh: bast2001.wikimedia.org now online in codfw.
  • 21:15 robh: I never admin logged when install2001.wikimedia.org went online the other day, oops.
  • 21:15 ori: last sync was of Iac37a2369: resourceloader: Don't register raw modules client-side
  • 21:14 logmsgbot: ori Synchronized php-1.24wmf18/includes/resourceloader/ResourceLoaderStartUpModule.php: (no message) (duration: 00m 03s)
  • 20:57 logmsgbot: krinkle Synchronized php-1.24wmf19/includes/resourceloader/ResourceLoaderStartUpModule.php: fd5b963458c19 (duration: 00m 06s)
  • 20:33 ottomata: shutting down elastic1016
  • 20:16 ottomata: temporarily disable puppet on gadolinium
  • 19:36 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 19:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf19
  • 19:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf18
  • 19:07 logmsgbot: reedy Finished scap: testwiki to 1.24wmf19 (duration: 43m 00s)
  • 18:25 godog: install build-essential and fakeroot on tin
  • 18:24 logmsgbot: reedy Started scap: testwiki to 1.24wmf19
  • 17:26 logmsgbot: aaron Synchronized rpc: 9564e93ecd4953126d91b99d7728f63401a4dc86 (duration: 00m 07s)
  • 17:13 ^d: elastic: excluded the elastic1016 node from shard allocation, shards draining so we can take it down for disk testing
  • 16:01 ottomata: restarted webstats-collector on gadolinium
  • 13:18 mark: Reactivated cr2-eqiad AS3257 transit link
  • 10:44 springle: xtrabackup clone db1051 to db1073
  • 10:18 godog: restarting mailman on sodium
  • 08:52 godog: restarted apache on mw1134
  • 08:03 godog: killed stray mailman processes on sodium (no pid file) and restarted mailman
  • 06:11 springle: xtrabackup clone db1051 to db1072
  • 06:09 springle: restarted morebots
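
The 17:13 entry above describes excluding elastic1016 from shard allocation so that its shards drain before shutdown. The log does not say how this was done. A minimal sketch, assuming it went through the Elasticsearch cluster settings API (`PUT /_cluster/settings`) with a transient `cluster.routing.allocation.exclude._name` filter; the helper name `exclude_node_settings` is hypothetical, and the choice of `_name` rather than `_ip` or `_host` is also an assumption:

```python
import json


def exclude_node_settings(node_name: str) -> dict:
    """Build a cluster-settings body that excludes one node from allocation.

    Hypothetical helper: it only builds the request body and sends nothing.
    Matching on the node name (rather than IP or host) is an assumption.
    """
    return {
        "transient": {
            # Shards move off any node matching this filter.
            "cluster.routing.allocation.exclude._name": node_name
        }
    }


body = exclude_node_settings("elastic1016")
# The body would be sent with PUT /_cluster/settings; here it is only printed.
print(json.dumps(body, indent=2))
```

A transient setting is dropped on a full cluster restart, which suits a temporary drain for disk testing. Setting the same key to an empty string would lift the exclusion afterwards.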

August 26

  • 21:04 hashar: Updating our Jenkins Job Builder fork 0268581..e5c0c61 . Will let us define variables in 'default' section and override them when invoking a job template ( https://review.openstack.org/#/c/100020/ )
  • 19:58 bd808: Ran sync-common on mw1053.eqiad.wmnet to recover from failure during last scap
  • 19:48 logmsgbot: aude Finished scap: Update new messages for Wikibase (duration: 07m 16s)
  • 19:41 logmsgbot: aude Started scap: Update new messages for Wikibase
  • 19:39 logmsgbot: aude Synchronized wmf-config/Wikibase.php: add Wikibase badges css setting (duration: 00m 10s)
  • 19:25 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable new serialization format for wikidata (duration: 00m 08s)
  • 19:10 logmsgbot: reedy Synchronized php-1.24wmf18/extensions/Echo/: (no message) (duration: 00m 14s)
  • 19:05 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable otherprojects sidebar beta feature (duration: 00m 15s)
  • 18:54 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf18
  • 18:53 logmsgbot: reedy Synchronized php-1.24wmf18/extensions/MassMessage: (no message) (duration: 00m 14s)
  • 18:52 logmsgbot: reedy Synchronized php-1.24wmf17/extensions/MassMessage: (no message) (duration: 00m 16s)
  • 18:19 jgage: Failover from analytics1010-eqiad-wmnet to analytics1004-eqiad-wmnet successful
  • 17:47 logmsgbot: bd808 Synchronized private/PrivateSettings.php: Syncing file rather than symlink (duration: 00m 04s)
  • 17:36 bd808: mw1010.eqiad.wmnet was out of sync too. I suspect there is something wrong with the fanout update step in scap
  • 17:26 bd808: /usr/local/apache/common-local out of date on mw1161.eqiad.wmnet; updated via sync-common
  • 17:25 bd808: sync-* not updating terbium properly; sync-common from terbium manually got several config changes; maybe a problem with mw1161.eqiad.wmnet rsync mirror
  • 17:14 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 04s)
  • 17:11 logmsgbot: demon Synchronized wmf-config/PrivateSettings.php: adjust swift auth url for cirrus (duration: 00m 04s)
  • 17:05 cmjohnson: swapping failed disk labsdb1003 slot 1
  • 16:42 bd808: Ran sync-common on osmium to verify that it now rebuilds l10n cache by default (and it does!)
  • 16:36 legoktm: running removeOldManualUserPages.php (GlobalCssJs) for users who requested it
  • 16:29 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Again, with feeling (duration: 00m 04s)
  • 16:26 logmsgbot: bd808 Finished scap: no-op scap to test scap code update (duration: 13m 31s)
  • 16:20 bd808|DEPLOY: Rsync sloooow to fenari "16:18:52 fenari INFO - Finished rsync common (duration: 04m 38s)"
  • 16:12 logmsgbot: bd808 Started scap: no-op scap to test scap code update
  • 16:07 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 04s)
  • 16:07 bd808|DEPLOY: Updated scap to 116027f (Make sync-common update l10n cdb files by default)
  • 15:05 logmsgbot: anomie Synchronized wmf-config: SWAT: Enable GlobalCssJs on all CentralAuth wikis minus loginwiki gerrit:154432 (duration: 00m 09s)
  • 13:32 hashar: Jenkins mediawiki-core-qunit job has been switched to Zuul cloner and pass! :-D
  • 13:29 _joe_: re-enabling puppet, change aborted as not all sites are served via hhvm on the hhvm appservers (true story). Will re-do once all configs are in their place
  • 13:12 _joe_: disabling puppet on all appservers while deploying an apache change
  • 12:48 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: db1054 to normal load (duration: 00m 06s)
  • 12:33 hashar: Jenkins reverted mediawiki-core-qunit to use Zuul cloner 156268. Gotta play with it on a new job name since it does not work out of the box as expected.
  • 12:12 hashar: Jenkins migrating mediawiki-core-qunit to use Zuul cloner 156268
  • 12:03 akosiaris: disable puppet on labsdb1006 for planet osm import
  • 11:53 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: pool db1054, warm up (duration: 00m 08s)
  • 09:04 godog: reboot ms-be1011, unresponsive on network and console
  • 08:28 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1036 (duration: 00m 06s)
  • 05:41 springle: xtrabackup clone db1036 to db1054
  • 05:39 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1036 while cloning (duration: 00m 06s)
  • 05:28 springle: upgrade & restart db1054, fs check
  • 04:48 logmsgbot: demon Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 06s)
  • 04:27 springle: labsdb1002 back up
  • 04:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 26 04:06:34 UTC 2014 (duration 6m 33s)
  • 03:23 ^d: restarting elasticsearch on elastic1001, elastic1003 and elastic1008. icinga may complain briefly.
  • 03:11 springle: filesystem issues on labsdb1002. stopped mysqld
  • 03:05 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-26 03:04:18+00:00
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-26 02:33:00+00:00

August 25

  • 23:58 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Add 'movefile' to 'eliminator' user group on jawiki (duration: 00m 03s)
  • 23:53 logmsgbot: maxsem Finished scap: SWAT: CentralNotice update (duration: 29m 58s)
  • 23:23 logmsgbot: maxsem Started scap: SWAT: CentralNotice update
  • 23:19 logmsgbot: maxsem Synchronized php-1.24wmf17/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/156188/ (duration: 00m 05s)
  • 23:17 logmsgbot: maxsem Synchronized php-1.24wmf18/extensions/CentralNotice/: https://gerrit.wikimedia.org/r/#/c/156188/ (duration: 00m 04s)
  • 23:15 logmsgbot: maxsem Synchronized php-1.24wmf18/includes/htmlform/HTMLCheckField.php: https://gerrit.wikimedia.org/r/#/c/156015/ (duration: 00m 05s)
  • 20:06 subbu: deployed parsoid version 5b5a5ed5
  • 17:24 godog: reboot ms-be1004 to pick up kernel upgrade
  • 17:13 godog: rebooting ms-be1002 to pick up updated kernel
  • 16:54 ottomata: stopping puppet on cp3021. Testing an increase of kafka.queue.buffering.max.ms in order to avoid dropping messages during broker metadata change (e.g. leader elections)
  • 16:48 hashar: Jenkins pooled in a new slave wdjenkins-node1 that will be used to run Wikidata jenkins jobs. Work in progress with addshore. It is not running jobs yet.
  • 16:47 godog: reboot ms-be1011, xfsaild errors in dmesg
  • 16:25 hashar: Jenkins: disconnecting and reconnecting Gearman plugin from https://integration.wikimedia.org/ci/configure
  • 16:06 andrewbogott: wikitech deployment finished. Note that the OpenStackManager submodule is off of the MediaWiki branch because… the whole submodule setup there is a bit broken on account of a git bug that uses absolute paths to manage submodules.
  • 16:01 andrewbogott: deploying tiny OpenStackManager upgrade on wikitech
  • 15:58 ottomata: enabled elasticsearch shard allocation row awareness (via rest api)
  • 12:45 hashar: hard stopped/restarted Zuul (workflow config error)
  • 12:27 hashar: restarting zuul
  • 10:15 mark: setup cross-confederation BGP sessions from AS65001 (eqiad) to AS65002 (codfw)
  • 05:35 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: 156076 - Remove centralnotice-admin right assignments on 3 wikis - Basically a noop (duration: 00m 06s)
  • 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 25 03:14:13 UTC 2014 (duration 14m 12s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-25 02:25:58+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-25 02:14:14+00:00

August 24

  • 23:17 ^d: slow indexing log going pretty bonanzas on elastic101[35]. Probably others too? Filling /var/log.
  • 12:02 mark: Removed IPv6 subnet 2620:0:860:2::/64 from cr2-pmtpa:irb.101
  • 03:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 24 03:17:46 UTC 2014 (duration 17m 45s)
  • 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-24 02:31:13+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-24 02:18:52+00:00

August 23

  • 11:33 mark: Manually removed IPv6 addresses from fenari
  • 11:23 mark: Deactivated IPv6 router-advertisement on cr2-pmtpa
  • 11:21 mark: Manually removed IPv6 address from mchenry
  • 10:36 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1004. pool db1053. (duration: 00m 07s)
  • 03:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 23 03:06:13 UTC 2014 (duration 6m 12s)
  • 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-23 02:21:59+00:00
  • 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-23 02:17:28+00:00
  • 01:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1056 (duration: 00m 06s)
  • 00:29 ori: disabled puppet on osmium again to debug a leak; please don't re-enable

August 22

  • 18:10 logmsgbot: ori updated /a/common to I338d72a47: Do not define MEDIAWIKI before loading WebStart.php
  • 17:43 ottomata: moving sqstat udp2log filter from analytics1003 to analytics1026, reqstats might blip for a sec...
  • 17:41 ori: nuking /srv/deployment/rcstream on rcs1002 to verify trebuchet package provider reprovisions it
  • 15:54 springle: xtrabackup db1056 to db1053
  • 15:53 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1056 while cloning (duration: 00m 07s)
  • 15:33 ^d: elastic1008: fixed /etc/hosts to point to actual IP instead of loopback
  • 15:18 springle: upgrade & restart db1053, fs check
  • 15:08 bd808: Still no apache2.log on fluorine or in logstash. Log seems to be available on fenari.
  • 14:51 springle: switched s1 sanitarium and labsdb replication to db1069:3311 mariadb 10
  • 14:39 mark: Removed IPv6 subnets 2620:0:860:1::/64 (squid subnet) and 2620:0:860:3::/64 (sandbox subnet) from cr2-pmtpa configuration
  • 04:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 22 04:10:47 UTC 2014 (duration 10m 46s)
  • 03:19 logmsgbot: LocalisationUpdate completed (1.24wmf18) at 2014-08-22 03:18:44+00:00
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-22 02:32:11+00:00
  • 00:03 logmsgbot: ori Finished scap: SWAT: d3de89777, 7abfe0d5e7, 8ec9853c32b, 476e9e90bd01 (duration: 06m 29s)

August 21

  • 23:57 logmsgbot: ori Started scap: SWAT: d3de89777, 7abfe0d5e7, 8ec9853c32b, 476e9e90bd01
  • 21:58 logmsgbot: ori Synchronized php-1.24wmf17/resources/src/mediawiki/mediawiki.js: I8d27442d1: Workaround for bug introduced by Icf6ede09b (duration: 00m 03s)
  • 21:57 manybubbles: performing elasticsearch upgrade on elastic1015
  • 21:02 logmsgbot: ori Synchronized php-1.24wmf17/resources/src/mediawiki/mediawiki.util.js: Touch resources/src/mediawiki/mediawiki.util.js (duration: 00m 06s)
  • 20:44 godog: rolling restart of swift-proxy on ms-fe1*
  • 20:11 godog: restarted swift-proxy on ms-fe1001
  • 19:55 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
  • 19:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf18
  • 19:46 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf17
  • 19:44 logmsgbot: reedy Finished scap: testwiki to 1.24wmf18 (duration: 34m 01s)
  • 19:31 mutante: disabled mw1178 in pybal
  • 19:27 godog: restarted memcached on ms-fe1004
  • 19:23 reedy|webirc: mw1178 returned [255]: ssh: connect to host mw1178 port 22: Connection timed out
  • 19:23 reedy|webirc: mw1019 returned [127]: bash: sync-common: command not found
  • 19:09 logmsgbot: reedy Started scap: testwiki to 1.24wmf18
  • 18:28 manybubbles: *victim*
  • 18:27 manybubbles: trying to recover from weird Elasticsearch upgrade failure by redoing the upgrade on one node while also blowing away the data directory during the upgrade. elastic1005, you are my first victem.
  • 17:28 cmjohnson1: removing mw1130 from pybal
  • 14:53 hashar: Jenkins: updated PHP CodeSniffer MediaWiki standard on all slaves.
  • 14:36 hashar_: Jenkins: updating mediawiki code sniffer repo bf82117..bc4e590
  • 10:02 hashar: Jenkins installed plugin Throttle Concurrent Builds.
  • 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Aug 21 03:20:47 UTC 2014 (duration 20m 46s)
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-21 02:34:56+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-21 02:19:26+00:00
  • 00:08 MatmaRex: (manybubbles contd.) …a single node going down but I expect the cluster to stay "yellow" during the process- no alerts.
  • 00:07 manybubbles: bd808 needs to plan a logstash upgrade soon - let it be logged
  • 00:05 manybubbles: if anyone is reading the SAL for fun or sees an error in Elasticsearch cluster in the next 24 hours - we're performing an elasticsearch upgrade. We've set it up this time so it's super slow and boring. So boring I'm going to sleep through it. If you see more than transient complaining from icinga about elasticsearch you can call me/have someone with access to the contact list call me. I expect icinga to complain about a
  • 00:00 manybubbles: unattended rolling restart of Elasticsearch cluster is going just fine - adding the 30 minute sleep between servers and turning down the replication rate makes it pretty boring.
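
The 00:00 entry above mentions an unattended rolling restart with a 30 minute sleep between servers. The script itself is not in the log. A minimal sketch of that pacing only, in which `rolling_restart`, the `restart` callback and the injectable `sleep` are all hypothetical names, and the replication-rate change mentioned in the entry is left out:

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    hosts: Iterable[str],
    restart: Callable[[str], None],
    pause_seconds: int = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Restart hosts one at a time, pausing between consecutive hosts.

    `restart` is a caller-supplied action (hypothetical). `sleep` can be
    replaced so the pacing is testable without waiting.
    """
    host_list = list(hosts)
    done: List[str] = []
    for index, host in enumerate(host_list):
        restart(host)
        done.append(host)
        # No pause after the last host: nothing is left to protect.
        if index < len(host_list) - 1:
            sleep(pause_seconds)
    return done


# Dry run with stand-ins: record the calls instead of touching any server.
restarted: List[str] = []
pauses: List[float] = []
rolling_restart(
    ["elastic1001", "elastic1002", "elastic1003"],
    restart=restarted.append,
    sleep=pauses.append,
)
print(restarted, pauses)
```

The pause gives the cluster time to recover shards from one node before the next goes down, which matches the entry's expectation of a "yellow" cluster with no alerts.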

August 20

  • 23:07 awight: stopping the Thank You job
  • 22:50 ori: disabled puppet on osmium to debug memory leak
  • 21:46 logmsgbot: marktraceur Synchronized php-1.24wmf17/extensions/MultimediaViewer/: Add disable-by-default option to MultimediaViewer (duration: 00m 07s)
  • 21:09 logmsgbot: marktraceur Synchronized wmf-config: Turn off Media Viewer for logged-in users at Commons. (duration: 00m 07s)
  • 21:06 logmsgbot: marktraceur updated /a/common to I226bd1468: Add item-redirect to OAuth permissions
  • 19:50 hashar: Restarting Zuul to prettify build results bug 66095
  • 19:48 logmsgbot: awight Synchronized php-1.24wmf17/extensions/CentralNotice: push CentralNotice updates, including new hide cookie format (duration: 00m 05s)
  • 19:47 logmsgbot: awight Synchronized php-1.24wmf16/extensions/CentralNotice: push CentralNotice updates, including new hide cookie format (duration: 00m 04s)
  • 19:46 logmsgbot: awight Synchronized php-1.24wmf16/extensions/CentralNotice: push CentralNotice updates, including new hide cookie format (duration: 00m 07s)
  • 16:11 manybubbles: elastic1001 upgrade went well - upgrading elastic1002 now
  • 15:48 hashar: dns: Jenkins will now complain whenever you attempt to send tabs in any file of operations/dns.git bug 69478
  • 15:17 manybubbles: manually lowered elasticsearch recovery speeds to stem off high load caused by healing the restart of elastic1001 - we were slowing down enough that we were filling the pool counter
  • 15:05 logmsgbot: anomie Synchronized wmf-config/CommonSettings.php: SWAT: Add item-redirect to OAuth permissions gerrit:155257 (duration: 00m 09s)
  • 15:01 logmsgbot: anomie Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/: SWAT: Touch files on advice of Wikidata folks (duration: 00m 09s)
  • 15:01 logmsgbot: anomie Synchronized wmf-config/Wikibase.php: SWAT: Fix config for specialSiteLinkGroups in Wikibase gerrit:155218 (duration: 00m 09s)
  • 14:49 manybubbles: installing elasticsearch 1.3.2 on elasticsearch1001 only right now as a test
  • 14:47 manybubbles: upgrading elasticsearch plugins on all elasticsearch servers in preparation to upgrade to elasticsearch 1.3 - if we roll back we'll have to redeploy the plugins
  • 14:10 ottomata: changing group ownership and permissions on raw webrequest data in hdfs. Users now must be in the analytics-privatedata-users group to access.
  • 13:47 manybubbles: experimenting with lowering merge factor on enwiki's Cirrus index - should improve query performance at the cost of more background tasks in the Elasticsearch cluster
  • 13:36 ottomata: disabling puppet on analytics1027 temporarily
  • 13:10 godog: reboot ms-be1003, xfs errors/panics
  • 12:03 logmsgbot: ori updated /a/common to Ic3fe1ef83: Update all symlinks to /apache
  • 11:36 hashar: Updating Jenkins Job Builder fork 666e953..0268581
  • 11:06 hashar_: mw1019 is missing sync-common causing sync issues.
  • 11:06 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: new domain www.veikkos-archiv.com to wgCopyUploadsDomains 155239 bug 69777 (duration: 00m 03s)
  • 11:05 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: new domain www.veikkos-archiv.com to wgCopyUploadsDomains 155239 bug 69777 (duration: 00m 03s)
  • 10:33 logmsgbot: ori Synchronized w/touch.php: Ic9d8837b1: Canonicalize some remaining references to /apache symlink (duration: 00m 05s)
  • 10:33 logmsgbot: ori Synchronized w/mobilelanding.php: Ic9d8837b1: Canonicalize some remaining references to /apache symlink (duration: 00m 05s)
  • 10:26 logmsgbot: ori updated /a/common to Ic9d8837b1: Canonicalize some remaining references to /apache symlink
  • 10:16 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Id2d5cfa4c: Canonicalize path to $wgSiteMatrixFile (duration: 00m 06s)
  • 09:40 godog: uploaded hhvm_3.3-dev+20140728+wmf5 to carbon
  • 09:27 hashar: restarted Jenkins Gearman plugin.
  • 04:06 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Aug 20 04:05:41 UTC 2014 (duration 5m 40s)
  • 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-20 03:11:08+00:00
  • 02:40 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-20 02:39:40+00:00

August 19

  • 23:17 logmsgbot: catrope Synchronized php-1.24wmf17/extensions/MobileFrontend: (no message) (duration: 00m 04s)
  • 23:14 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Set wmgWikibaseSiteGroup for wikinews (duration: 00m 05s)
  • 22:59 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Fix badges css on Wikidata (duration: 00m 11s)
  • 22:30 logmsgbot: aude Finished scap: Update Wikidata, WikimediaMessages and ZeroBanner (duration: 22m 02s)
  • 22:08 logmsgbot: aude Started scap: Update Wikidata, WikimediaMessages and ZeroBanner
  • 22:03 logmsgbot: aude Synchronized php-1.24wmf17/extensions/ZeroBanner: Update, per yurik (duration: 00m 18s)
  • 21:22 logmsgbot: aude Synchronized wikidataclient.dblist: Enable Wikibase on Wikinews (duration: 00m 08s)
  • 21:21 logmsgbot: aude Synchronized wmf-config: Config changes to enable Wikibase on Wikinews (duration: 00m 14s)
  • 21:12 aude: added and populated sites table for wikinews
  • 21:05 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Fix badges css and populateSitesTable script in Wikibase (duration: 00m 14s)
  • 20:26 RoanKattouw: Restarting Jenkins, it seems to be stuck
  • 19:58 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: enabling Wikidata to also be a client, e.g. use lua (duration: 00m 09s)
  • 19:58 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enabling Wikidata to also be a client, e.g. use lua (duration: 00m 12s)
  • 19:53 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/resources/: (no message) (duration: 00m 11s)
  • 19:48 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/modules/SitesModule.php: (no message) (duration: 00m 09s)
  • 19:48 logmsgbot: aude Synchronized wmf-config/Wikibase.php: fix config for special site links on Wikidata (duration: 00m 11s)
  • 19:37 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/modules/SitesModule.php: (no message) (duration: 00m 11s)
  • 19:26 logmsgbot: aude Synchronized wmf-config/Wikibase.php: allow adding site links to Wikidata (non-entity) pages on Wikidata (duration: 00m 08s)
  • 19:21 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable item redirects on Wikidata (duration: 00m 08s)
  • 19:16 logmsgbot: aude Synchronized wmf-config/Wikibase.php: enable badges on Wikidata (duration: 00m 08s)
  • 19:06 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: group1 back to wmf17
  • 19:05 logmsgbot: demon Synchronized php-1.24wmf17/extensions/Wikidata/extensions/Wikibase/lib/includes/changes/EntityChange.php: (no message) (duration: 00m 05s)
  • 18:33 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: all group1 back to wmf16 until WB patch comes
  • 18:22 andrewbogott: added virt1009 to the eqiad virt cluster
  • 18:17 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki back to wmf16
  • 18:13 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to wmf17
  • 15:17 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: CopyUploadDomains for Commons gerrit:154718 (duration: 00m 12s)
  • 15:15 logmsgbot: anomie Synchronized commonsuploads.dblist: SWAT: Remove emlwiki from commonsuploads.dblist gerrit:154714 (duration: 00m 09s)
  • 15:13 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Affiliate namespace on chapcomwiki gerrit:154713 (for real this time) (duration: 00m 09s)
  • 15:12 mark: Completed network migration of BGP confederation renumbering: AS65002 -> AS65001, AS65003 -> AS65004, old AS65001 (pmtpa) is part of eqiad for its remaining lifetime
  • 15:12 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add Affiliate namespace on chapcomwiki gerrit:154713 (duration: 00m 09s)
  • 15:10 logmsgbot: anomie Synchronized php-1.24wmf16/includes/parser/Parser.php: SWAT: Fix URL protocol detection regex for file link= parameter gerrit:154844 (duration: 00m 09s)
  • 15:04 logmsgbot: anomie Synchronized php-1.24wmf17/includes/parser/Parser.php: SWAT: Fix URL protocol detection regex for file link= parameter gerrit:154845 (duration: 00m 09s)
  • 14:50 ottomata: starting stat1003 upgrade to trusty
  • 14:37 logmsgbot: demon updated /a/common to I035cebe20: Configure swift-backed snapshots for Cirrus in beta
  • 14:05 logmsgbot: demon Synchronized wmf-config/CirrusSearch-labs.php: beta swift config, no-op (duration: 00m 04s)
  • 13:39 hashar_: Jenkins upgrading hhvm on the Trusty Jenkins slave integration-slave1006-trusty : Unpacking hhvm (3.3-dev+20140728+wmf4) over (3.3-dev+20140728+wmf3)
  • 13:12 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: Fix $wgRestrictionLevels ordering bug 69640 (duration: 00m 04s)
  • 10:19 hashar: Jenkins: bringing back irc bot wmf-insecte in #wikimedia-qa. Will be used to notify failures/fixes of the beta cluster jenkins jobs
  • 09:58 godog: depool mw1019 from appservers, testing trusty+hhvm reinstall RT #8153
  • 07:39 bblack: strontium ok, icinga-wm back
  • 07:17 hashar: Jenkins: manually cleared out a tmpfs partition on lanthanum.eqiad.wmnet which was causing all MediaWiki / extensions jobs to fail completely. bug 69731. We need disk space monitoring which is bug 69733.
  • 07:09 bblack: ... and strontium passenger is failing to start up correctly again. icinga-wm disabled to avoid spam
  • 07:07 bblack: restarted apache2 service on strontium/palladium, expect another small spike of puppet fail->ok
  • 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 19 03:20:21 UTC 2014 (duration 20m 20s)
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-19 02:36:21+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-19 02:15:14+00:00

August 18

  • 23:03 andrewbogott: isolated virt1006, re-enabling puppet on virt1000 and virt1006
  • 22:36 andrewbogott: disabling puppet on virt1000 and virt1006 while I try to convince the scheduler to overlook virt1006
  • 22:01 bblack: done futzing w/ puppetmasters+neon, all agents enabled and bot back online
  • 21:28 hashar: Zuul processing again. Definitely need to write doc about how to unstick it
  • 21:02 hashar: Zuul / Jenkins stalled again :-/
  • 19:35 bblack: testing new passenger perf params on strontium/palladium. agents on those two and icinga-wm still disabled
  • 19:04 bblack: restarted service apache2 on strontium - passenger for puppet master was dead again
  • 17:00 andrewbogott: added a (yuvi-built) python-txstatsd package to trusty on Carbon.
  • 16:37 bd808: deployment-prep Restarted Apache and HHVM on deployment-mediawiki02 to pick up removal of /etc/php5/conf.d/mail.ini
  • 16:26 logmsgbot: yurik Synchronized php-1.24wmf17/extensions: Syncing JsonConfig,ZeroPortal,ZeroBanner (duration: 01m 13s)
  • 16:22 logmsgbot: yurik Synchronized php-1.24wmf16/extensions: Syncing JsonConfig,ZeroPortal,ZeroBanner (duration: 01m 22s)
  • 16:18 legoktm: migrateAccount.php finished, 2014-08-18 15:42:12 processed 1528652 usernames (22.9/sec), 10 (0.0%) fully migrated, 7938 (0.5%) partially migrated
  • 16:05 hashar: Jenkins tox based jobs are now runnable in parallel 154834
  • 15:36 manybubbles: swat complete
  • 15:29 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - enable cirrus optimization - weighted all fields - on group0 wikis (duration: 00m 07s)
  • 15:29 logmsgbot: manybubbles Synchronized wmf-config/CirrusSearch-common.php: SWAT - drop unused Cirrus parameter (duration: 00m 05s)
  • 15:25 logmsgbot: manybubbles Synchronized php-1.24wmf16/extensions/CentralAuth: SWAT - two centralauth fixes (duration: 00m 05s)
  • 15:22 bblack: resuming slowly wiping varnish caches for mmap update (49 hosts to go), expect small 5xx spikes every ~1.5 hrs for the next few days
  • 15:22 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - noop - sync files adding bouncehandler to betalabs (duration: 00m 04s)
  • 15:19 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - create portal/portal talk namespaces on kowikisource (duration: 00m 04s)
  • 15:18 logmsgbot: manybubbles Synchronized php-1.24wmf17/extensions/CentralAuth/: SWAT - two centralauth fixes (duration: 00m 04s)
  • 15:13 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - create eliminator role on viwiki (duration: 00m 05s)
  • 15:11 logmsgbot: manybubbles Synchronized php-1.24wmf17/extensions/Wikidata/: (no message) (duration: 00m 07s)
  • 15:08 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - Add global-renamer group to metawiki (duration: 00m 04s)
  • 14:50 hashar: Jenkins: reverting PHP CodeSniffer upgrade 154825. We are back to 1.4.7. The previous patch had some issues.
  • 14:42 hashar: Jenkins: upgrading PHP Codesniffer from 1.4.7 to 1.4.8 (thanks to addshore 154053)
  • 14:39 bd808: No apache2.log in fluorine:/a/mw-log; Last file in /a/mw-log/archive is apache2.log-20140816.gz
  • 14:31 bd808: Restarted logstash on logstash1001; event volume was lower than expected
  • 13:49 hashar: restarting zuul. Got stuck again.
  • 13:29 hashar_: Restarted Zuul, some items were stuck in queue. Retrigger your jobs (revote +2 / new patchset / 'recheck' comment)
  • 13:23 logmsgbot: reedy Synchronized php-1.24wmf17/extensions/ExtensionDistributor: Unbreak ExtensionDistributor (duration: 00m 13s)
  • 13:18 hashar: Zuul stuck, looking.
  • 13:06 Reedy: Large amount of incoming traffic to bast1001 is me uploading files
  • 12:11 godog: rebalanced swift object ring in eqiad
  • 09:34 godog: reenabled puppet on neon and started ircecho
  • 09:23 godog: stop ircecho again on neon, disable puppet on neon
  • 09:11 godog: restarted apache2 on strontium
  • 08:58 godog: stopped ircecho on neon while diagnosing puppet failure
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 18 03:12:27 UTC 2014 (duration 12m 26s)
  • 03:06 hoo: Ran sync-common on mw1053 to stop "Unrecognized job type 'ChangeNotification'." exceptions
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-18 02:30:17+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-18 02:18:52+00:00

August 17

  • 21:07 legoktm: running migrateAccount.php without --safe or --auto on terbium for bug 69291
  • 18:45 hashar: Zuul upgraded
  • 18:41 hashar: Upgrading Zuul to latest version (that is not a friday after all)
  • 09:22 springle: ongoing schema change wikidatawiki & testwikidatawiki wb_entity_per_page.epp_redirect_target. osc_host.sh processes on terbium ok to kill in emergency
  • 04:34 ottomata: restarted udp2log on oxygen
  • 03:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 17 03:04:22 UTC 2014 (duration 4m 21s)
  • 02:49 springle: killed stuff on labsdb1002 using all disk for temp tables. investigating
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-17 02:23:08+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-17 02:13:35+00:00

August 16

  • 18:12 bblack: (amssq33: and yes, removing from fe/be cache pools)
  • 18:11 bblack: powering off amssq33, it's clipping network traffic at peak times due to bad ethernet connection negotiated down to 100Mbps (see existing RT 7933 in esams queue)
  • 18:02 bblack: ms-be1006: syslog indicates it started generating repeated "BUG: soft lockup" 10 minutes before dying, in XFS kernel code again...
  • 17:55 bblack: rebooting ms-be1006, ping-dead in icinga for 23m, console was unresponsive
  • 17:37 bblack: restarted apache2 on palladium... looks like something went horribly wrong with its puppet of itself that somehow killed off puppetmaster service?
  • 03:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 16 03:06:29 UTC 2014 (duration 6m 28s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-16 02:26:02+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-16 02:16:00+00:00

August 15

  • 20:59 logmsgbot: kaldari Synchronized php-1.24wmf16/extensions/MobileFrontend/less: fixing iOS search bug (duration: 00m 05s)
  • 17:58 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Enable redirects on test.wikidata (duration: 00m 07s)
  • 15:53 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Update test.wikidata (duration: 00m 07s)
  • 15:50 logmsgbot: aude Synchronized php-1.24wmf17/extensions/Wikidata: Fix database error and snak value display on test wikidata (duration: 00m 09s)
  • 15:00 ori: re-enabled puppet on mw1017
  • 13:33 ori: disabling puppet on mw1017 to test rsyslog config
  • 03:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 15 03:50:23 UTC 2014 (duration 50m 22s)
  • 03:04 logmsgbot: LocalisationUpdate completed (1.24wmf17) at 2014-08-15 03:03:49+00:00
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-15 02:33:21+00:00
  • 00:24 logmsgbot: ori Finished scap: SWAT: cherry picks for TMH and Echo (duration: 14m 38s)
  • 00:09 logmsgbot: ori Started scap: SWAT: cherry picks for TMH and Echo

August 14

  • 23:24 logmsgbot: aude Synchronized wmf-config/Wikibase.php: Bump cache epoch and add badges setting on test.wikidata (duration: 00m 32s)
  • 23:13 logmsgbot: aude Finished scap: Update branch for test.wikidata (duration: 16m 48s)
  • 22:57 logmsgbot: aude Started scap: Update branch for test.wikidata
  • 22:26 logmsgbot: aaron Synchronized php-1.24wmf16/includes/DefaultSettings.php: 67bf481ce1644ff194d7565107d9b8ffe11bf4b7 (duration: 00m 07s)
  • 22:23 logmsgbot: aaron Synchronized wmf-config/CommonSettings.php: Increased wgParsoidCacheUpdateTitlesPerJob to 12 to lower the backlog (duration: 00m 07s)
  • 22:14 logmsgbot: aude scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="test2wiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.kFlVQdKnM2" ' returned non-zero exit status 255 (duration: 00m 40s)
  • 22:13 logmsgbot: aude Started scap: Update branch for test.wikidata
  • 21:49 logmsgbot: reedy Synchronized php-1.24wmf17/includes/context/RequestContext.php: (no message) (duration: 00m 15s)
  • 21:10 godog: restarted hhvm on mw1053
  • 20:47 _joe|away: stopping puppet, jobrunner on mw1053; HHVM is eating memory like godzilla
  • 19:29 bblack: puppeting labmon1001, etc
  • 18:57 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:55 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 14s)
  • 18:26 mutante: stopped ircecho on neon temporarily
  • 18:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf17
  • 18:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf16
  • 17:45 AaronSchulz: /srv/deployment/jobrunner updated to 795baf3ca4ce8308597dd74e5242aa5bfbbe961d
  • 17:39 logmsgbot: aaron Synchronized rpc: 6c0ece687bb6ff3fec0ca7e80a587525ebf18a70 (duration: 00m 08s)
  • 16:52 _joe_: uploaded new hhvm package 3.3-dev+20140728+wmf4
  • 16:23 logmsgbot: reedy Synchronized php-1.24wmf17/extensions/CentralAuth/: (no message) (duration: 00m 13s)
  • 16:23 logmsgbot: reedy Synchronized php-1.24wmf16/extensions/CentralAuth/: (no message) (duration: 00m 14s)
  • 15:49 Reedy: Running sync-common on mw1053
  • 15:48 logmsgbot: reedy Finished scap: testwiki to 1.24wmf17 (duration: 33m 13s)
  • 15:47 Jeff_Green: adjust wiki-mail._domainkey DNS record to allow sending from 'wiki*@' addresses, instead of just wiki@
  • 15:23 _joe_: powercycling mw1053, which looks like the victim of hhvm-induced ooms
  • 15:15 logmsgbot: reedy Started scap: testwiki to 1.24wmf17
  • 14:01 _joe_: puppet re-enabled on the appserver
  • 12:38 _joe_: stopping puppet on appservers while deploying a delicate change.
  • 12:12 manybubbles|away: cirrus index rebuilds are still proceeding without issue. Going to continue to let them run and keep half an eye on them. enwiki is nearly done. Commons and wikidata are done. Many of group1 are done - we're up to eswiktionary now - but there are many to go.
  • 09:30 _joe_: the hhvm jobrunner is back in production, seems healthy, see https://logstash.wikimedia.org/#/dashboard/elasticsearch/hhvm_jobrunner
  • 08:09 _joe_: reactivated the jobrunner on mw1053, with promising results. Puppetization pending (in ~ 1 hour)
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Aug 14 03:11:33 UTC 2014 (duration 11m 32s)
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-14 02:29:52+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-14 02:16:34+00:00

August 13

  • 21:58 manybubbles: cirrus index rebuild is proceeding without trouble - I'm going to let it continue over night.
  • 21:46 andrewbogott: re-enabled puppetmaster on virt1000; apache changes seem stable now.
  • 21:18 _joe_: stopped puppet on virt1000, our fail
  • 13:23 springle: killed a mass of SpecialWhatLinksHere queries on enwiki
  • 12:51 manybubbles: restarting rebuilding Cirrus indexes to pick up weighted all field
  • 10:35 godog: bump swift weights for ms-be1013 ms-be1014 ms-be1015 to 2500
  • 08:38 hashar: gallium removing some sun-java6* packages coming from old lucid era
  • 07:47 hashar: upgrading Java on contint servers gallium and lanthanum , restarting Jenkins related process
  • 04:03 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Aug 13 04:02:23 UTC 2014 (duration 2m 22s)
  • 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-13 03:11:38+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-13 02:40:36+00:00

August 12

  • 23:34 logmsgbot: hoo Synchronized tests/multiversion/MWMultiVersionTest.php: (no message) (duration: 00m 11s)
  • 23:32 logmsgbot: hoo Synchronized php-1.24wmf16/skins/Vector/skinStyles/mediawiki.special.preferences.less: Fix missing tab images on Special:Preferences (duration: 00m 10s)
  • 23:26 hoo: Had to abort scap on mw1053 (which is depooled) manually
  • 23:26 logmsgbot: hoo Finished scap: Update WikimediaMessages (superprotect messages for wmf16) (duration: 46m 16s)
  • 22:40 logmsgbot: hoo Started scap: Update WikimediaMessages (superprotect messages for wmf16)
  • 22:21 logmsgbot: hoo Synchronized php-1.24wmf16/extensions/ProofreadPage/: Fix JS error while editing (duration: 00m 10s)
  • 19:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf16
  • 19:06 logmsgbot: reedy Synchronized php-1.24wmf16/includes/specials/SpecialRecentchangeslinked.php: Fix FR bug (duration: 00m 14s)
  • 17:55 AaronSchulz: populateBacklinkNamespace.php finished on all wikis
  • 17:13 springle: restart mysqld on labsdb1002, upgrade to mariadb 10.0.13 for bugfix
  • 16:57 Jeff_Green: removed aluminium.wikimedia.org from production
  • 16:50 springle: restart mysqld on labsdb1001, upgrade to mariadb 10.0.13 for bugfix
  • 15:08 bblack: flipping ulsfo traffic back to ulsfo
  • 11:51 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: Set siteGroup for testwikidata (duration: 00m 11s)
  • 11:21 hashar: Jenkins: clearing up some obsolete symbolic links under gallium.wikimedia.org:/var/lib/jenkins/jobs/*/builds/ Running in a screen as user jenkins
  • 05:01 springle: rsync ~1TB labsdb1001 to labsdb1003, throttled ~25MB/s
  • 04:25 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: s2: repool db1009. s3: repool db1035. (duration: 00m 06s)
  • 03:45 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: s2: depool db1009. repool db1018. adjust db1036 load. (duration: 00m 07s)
  • 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 12 03:14:34 UTC 2014 (duration 14m 33s)
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-12 02:32:09+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-12 02:18:36+00:00

August 11

  • 22:33 awight: update CRM schema to wmf_civicrm:7021
  • 21:47 andrewbogott: removed the old puppet-freshness check which should have no effect but may instead produce a torrent of alert spam https://gerrit.wikimedia.org/r/#/c/142560/
  • 04:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 11 03:59:17 UTC 2014 (duration 59m 16s)
  • 03:05 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-11 03:04:15+00:00
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-11 02:33:02+00:00

August 10

  • 23:47 logmsgbot: ori Synchronized php-1.24wmf16/extensions/MassMessage/includes: Revert MassMessage to 9884fbb50a (duration: 00m 06s)
  • 23:36 logmsgbot: ori Synchronized php-1.24wmf16/extensions/MassMessage: Update MassMessage for I840c98dca: Fix MassMessage::getMessengerUser() after Password API changes (duration: 00m 06s)
  • 22:59 logmsgbot: csteipp Finished scap: Deploy Ibe28a69c9fbab00b81c53b1643df722a3f1fbf19 at Eriks request (duration: 26m 13s)
  • 22:33 logmsgbot: csteipp Started scap: Deploy Ibe28a69c9fbab00b81c53b1643df722a3f1fbf19 at Eriks request
  • 16:25 logmsgbot: reedy Finished scap: Rebuild l10n cache for WikimediaMessages (duration: 22m 12s)
  • 16:02 logmsgbot: reedy Started scap: Rebuild l10n cache for WikimediaMessages
  • 15:01 logmsgbot: hoo Synchronized php-1.24wmf16/extensions/CentralAuth/: (no message) (duration: 00m 25s)
  • 15:00 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/CentralAuth/: (no message) (duration: 00m 25s)
  • 13:53 Reedy: Grant staff "superprotect" right per Robla/Erik request
  • 13:02 logmsgbot: tstarling Synchronized wmf-config/InitialiseSettings.php: Idfa21125 (duration: 00m 05s)
  • 13:02 logmsgbot: tstarling Synchronized wmf-config/CommonSettings.php: Idfa21125 (duration: 00m 06s)
  • 12:08 mutante: re-enabling puppet and services on tarin
  • 11:57 mutante: tarin - stopping poolcounterd, gmond,.. (Tampa, should really not be in use)
  • 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 10 03:14:52 UTC 2014 (duration 14m 51s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-10 02:33:15+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-10 02:19:53+00:00

August 9

  • 15:22 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I313b09ffc: Don't require native CDB support to load {interwiki,trustedxff}.cdb (duration: 00m 05s)
  • 14:25 Reedy: Removed <= MediaWiki 1.24wmf5
  • 13:29 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: move s4 api traffic back to db1042 (duration: 00m 06s)
  • 11:32 mutante: added Ryan Lane to NDA LDAP group
  • 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 9 03:20:07 UTC 2014 (duration 20m 6s)
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-09 02:36:52+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-09 02:19:03+00:00

August 8

  • 21:24 Reedy: mw1130 seems to be dead (unresponsive to ping)
  • 21:21 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: (no message) (duration: 01m 04s)
  • 21:06 awight: deployed crm default/settings.php
  • 17:07 mutante: jenkins/puppet-compiler - granting new LDAP group "nda" the same rights already given to matanya (and wmde even has more)
  • 16:06 bblack: datacenter traffic mapping back to normal, varnish fix/wipe/restart/etc work on pause for the weekend in a stable state
  • 16:02 andrewbogott: merging https://gerrit.wikimedia.org/r/#/c/150273/ which affects every puppet log everywhere...
  • 14:22 mutante: RT - reverted permission change for access requests requestors per robh
  • 13:50 mutante: RT - granted permission to show ticket summary for role requestor in queue access-requests
  • 12:49 akosiaris: uploaded ruby-jsduck 5.3.4-1wmftrusty1 and ruby-rkelly-remix 0.0.6-1trusty1 on apt.wikimedia.org
  • 12:33 ori: testwiki up, judgement poor
  • 12:28 hashar: Jenkins: somehow the ArtifactDeployer plugin got upgraded on Aug 7th 20:57 UTC despite it being broken bug 69197. Attempting manual downgrade
  • 12:13 hashar: reloading Jenkins
  • 12:07 akosiaris: ifconfig br0 0.0.0.0 on platinum to get rid of the IP on that interface and have facter work more reliably. This does not matter right now as it is an evaluation machine but logging it for completeness
  • 12:03 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 11:32 _joe_: rebooting mw1017
  • 11:29 akosiaris: mw1130 has broken disk
  • 11:09 ori: running rsync-common on mw1017
  • 11:02 logmsgbot: hoo Synchronized php-1.24wmf16/extensions/CentralAuth/: Another shot towards bug 39996 (duration: 01m 04s)
  • 11:01 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/CentralAuth/: Another shot towards bug 39996 (duration: 01m 04s)
  • 09:29 _joe_: reimaging mw1017 aka testwiki.
  • 06:03 springle: ongoing schema changes: rev_content_model, rev_content_format. on terbium, osc_host.sh processes ok to kill in emergency
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 8 03:12:21 UTC 2014 (duration 12m 20s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-08 02:28:39+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-08 02:16:13+00:00

August 7

  • 19:19 jgage: rebooting analytics1021 for kernel upgrade
  • 18:55 bblack: starting the process of fixing upload cache sizes, there will be periodic slim 5xx spikes...
  • 16:31 Jeff_Green: temporarily disabling icinga notifications for ocg100[123] ocg service check
  • 16:09 logmsgbot: krinkle Synchronized php-1.24wmf16/extensions/GlobalCssJs/GlobalCssJs.hooks.php: 4bbf4e0ed92f9a09 (duration: 00m 05s)
  • 15:48 mutante: zirconium - attempt to fix apache site setup manually
  • 15:46 logmsgbot: reedy Synchronized wmf-config/extension-list-labs: (no message) (duration: 00m 13s)
  • 15:38 logmsgbot: reedy Synchronized php-1.24wmf16/maintenance/findMissingFiles.php: (no message) (duration: 00m 20s)
  • 15:37 logmsgbot: reedy Synchronized php-1.24wmf15/maintenance/findMissingFiles.php: (no message) (duration: 00m 17s)
  • 15:12 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
  • 14:43 akosiaris: uploaded varnish_3.0.5plus~x-wm7trusty1 on apt.wikimedia.org (for usage in trusty labs machines, notably cxserver)
  • 14:23 mutante: shutting down elastic1018
  • 14:12 ^d: elastic1018: blacklisted from shard allocation since it's dead
  • 14:05 mutante: depooled elastic1018 - service wasn't running and signs of broken hardware (SSD)
  • 13:57 mark: Temporarily set max connections to swift from cp1049 backend varnish from 1000 to 2000
  • 13:56 mutante: starting elasticsearch on elastic1018
  • 12:23 hashar: Zuul upgraded labs branch to match production (i.e. have same version of Zuul cloner)
  • 12:20 hashar: restarting Zuul
  • 11:25 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: I53f76a35ac - No longer allow voyage 'crats to usermerge (duration: 00m 15s)
  • 11:13 akosiaris: removed laner@wikimedia.org entirely. It pointed to rlane@wikimedia.org which no longer exists
  • 11:11 akosiaris: removed rlane from root@wikimedia.org and usability@wikimedia.org
  • 10:45 mutante: iron, bast1001 - installed package upgrades
  • 09:13 hashar: Jenkins: pooling a new Jenkins slave using Trusty integration-slave1006-trusty [10.68.17.223] with 4 CPU. Copy pasted from 1004-trusty
  • 08:32 hashar: Jenkins: switching job from https://github.com/wmf-analytics/libcidr/ to https://gerrit.wikimedia.org/r/analytics/libcidr
  • 07:44 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: move s4 api traffic to db1056 (duration: 00m 07s)
  • 07:39 mark: Set OSPF metric 1000 on cr2-eqiad:xe-5/2/2 (GTT link)
  • 05:39 springle: labsdb1002 restart
  • 03:48 springle: labsdb1001 restart
  • 03:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Aug 7 03:08:49 UTC 2014 (duration 8m 48s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-07 02:27:52+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-07 02:15:45+00:00

August 6

  • 21:33 hashar: Jenkins: moved mediawiki-core-regression-hhvm-master to run on Trusty instance
  • 20:26 hashar: Jenkins: downgraded ansicolor plugin from 0.4 to 0.3.1. Some colors.js function emits ANSI codes to reset the color, which are not properly understood
  • 20:06 hashar: I have broken Zuul/Jenkins :-]
  • 18:53 hashar: Jenkins slow startup is bug 69197
  • 18:50 hashar: restarting jenkins
  • 18:49 hashar: Stopping Jenkins. Reverting upgrade of artifact deployer plugin
  • 18:10 mutante: puppet-catalog-compiler says to "wait while Jenkins is getting ready to work"
  • 17:20 hashar: Jenkins is processing jobs again, the UI will take a bunch of hours to load though due to some issue when initializing
  • 17:14 hashar: killed Jenkins
  • 17:12 _joe_: stopped the jobrunner on mw1053, was running in fcgi mode unpuppetized and with a broken vhost. Fixed it, it started spawning exceptions. DO NOT enable puppet again
  • 17:02 ^d: jenkins restarted, was stuck
  • 15:52 hashar: Restarted Zuul and Zuul-merger on gallium to tweak logging settings 152118
  • 11:30 logmsgbot: hoo Synchronized wmf-config/CommonSettings.php: Grant 'centralauth-rename' to 'steward' (duration: 00m 24s)
  • 11:26 logmsgbot: demon Synchronized wmf-config/abusefilter.php: (no message) (duration: 00m 19s)
  • 10:10 hashar: Jenkins web interface is back up
  • 09:54 logmsgbot: demon Synchronized wmf-config/abusefilter.php: abuse filter settings for fawiki (duration: 00m 21s)
  • 07:33 hashar: restarting Jenkins. It apparently likes to parse the whole history on reload, so aborting that.
  • 07:13 hashar: Upgrading Jenkins plugin and restarting.
  • 07:04 hashar: upgrading Jenkins to latest LTS
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Aug 6 03:10:06 UTC 2014 (duration 10m 5s)
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-06 02:29:00+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-06 02:16:38+00:00

August 5

  • 15:07 logmsgbot: root gracefulled all apaches
  • 15:03 logmsgbot: root gracefulled all apaches
  • 12:30 hasharEat: Upgrading python-gear on gallium and restarting zuul and zuul-merger
  • 12:26 akosiaris: uploaded python-gear_0.5.5-1 on apt.wikimedia.org
  • 03:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Aug 5 03:08:30 UTC 2014 (duration 8m 29s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-05 02:27:58+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-05 02:16:52+00:00
  • 00:41 springle: ongoing schema changes: ar_content_model, ar_content_format. on terbium, osc_host.sh processes ok to kill in emergency

August 4

  • 22:58 bblack: rebooting ms-be1012
  • 21:47 ottomata: reenabling puppet on analytics1027
  • 21:46 jgage: all kafka brokers upgraded to 0.8.1.1 and data replicated: done
  • 20:37 ottomata: stopping puppet on analytics1027 to temporarily disable camus cron job
  • 19:07 ottomata: starting upgrade of kafka cluster
  • 19:02 logmsgbot: maxsem Synchronized php-1.24wmf16/includes/User.php: https://gerrit.wikimedia.org/r/#/c/151691/ (duration: 00m 06s)
  • 18:57 jgage: beginning kafka upgrade: disabling puppet on brokers
  • 13:17 apergos: stopped labs rsync job from dataset1001, mount of labstore1003 was borked, removed 90GB of stuff on /mnt/data (= /) filesystem, restarted nfsd on dataset1001, dumps back to going
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Aug 4 03:11:03 UTC 2014 (duration 11m 2s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-04 02:27:58+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-04 02:16:46+00:00

August 3

  • 03:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Aug 3 03:28:56 UTC 2014 (duration 28m 55s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-03 02:27:44+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-03 02:16:39+00:00

August 2

  • 15:28 godog: reboot ms-be1008, stuck on xfs errors and most processes in D state
  • 14:10 Krinkle: Restarting Zuul
  • 14:08 hashar: Jenkins / Zuul stuck bug 69045
  • 14:00 Krinkle: Restarting Jenkins in an attempt to unstick the clogged Zuul pipeline for gallium
  • 04:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Aug 2 04:20:45 UTC 2014 (duration 20m 44s)
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-02 02:32:36+00:00
  • 02:21 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-02 02:20:02+00:00
  • 01:49 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1018, replag (duration: 00m 06s)
  • 00:43 Krinkle: Restarting Jenkins on gallium because the pipeline is clogged

August 1

  • 20:25 andrewbogott: shortened the logrotate interval on vanadium; disk space critical should resolve soon
  • 18:10 logmsgbot: csteipp Synchronized php-1.24wmf16/extensions/CentralAuth: Fix for bug 69007 - logins failing for old style hashes (duration: 00m 06s)
  • 17:32 AaronSchulz: Restarted maintenance/populateBacklinkNamespace.php on enwiki
  • 17:31 logmsgbot: aaron Synchronized php-1.24wmf15/maintenance/populateBacklinkNamespace.php: e1cea29342f964cd9a720310185b09ca41eb1a4a (duration: 00m 04s)
  • 17:16 akosiaris: upgraded etherpad-lite on zirconium to 1.4.0-2. Uploaded etherpad-lite_1.4.0-2 on apt.wikimedia.org
  • 17:11 logmsgbot: aaron Synchronized php-1.24wmf15/includes: d218d86dff90a5f0110353c492bd2e8ddaf35497 (duration: 00m 08s)
  • 17:09 logmsgbot: aaron Synchronized php-1.24wmf16/includes: f1a8ff7f802b57cc9f452d47c4c762a185ed93c2 (duration: 00m 06s)
  • 15:48 logmsgbot: reedy Synchronized php-1.24wmf16/includes/specials/SpecialRecentchangeslinked.php: (no message) (duration: 00m 14s)
  • 12:07 apergos: powercycled dataset1001, inaccessible via mgmt console, only visible message was 'mnt.nfs failed'
  • 09:10 _joe_: apache mediawiki::web train finished its run. re-enabling puppet on all appservers
  • 07:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Aug 1 07:46:04 UTC 2014 (duration 46m 3s)
  • 07:24 _joe_: stopping puppet on appservers to deploy a potentially dangerous case
  • 05:16 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: Move enwiki api traffic away from lagging slaves (duration: 00m 07s)
  • 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf16) at 2014-08-01 03:11:14+00:00
  • 02:40 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-08-01 02:38:56+00:00
  • 00:52 logmsgbot: catrope Synchronized php-1.24wmf16/extensions/VisualEditor/lib/ve/modules/ve/ui/inspectors/ve.ui.CommentInspector.js: Fix typo in class name (duration: 00m 10s)

July 31

  • 23:23 logmsgbot: mwalker Synchronized php-1.24wmf16: Updating core and Flow for SWAT (duration: 00m 53s)
  • 23:05 logmsgbot: mwalker Synchronized wmf-config: Updating configuration for 150145 (duration: 00m 05s)
  • 21:17 RobH: blog.wikimedia.org cname changed to migrate over to wp servers
  • 20:22 AaronSchulz: Started populateBacklinkNamespace.php on s1-s3,s5-s7 (commons already running)
  • 20:13 cscott: updated OCG to version d2919c59eb09e09fc87777696411a070620aef45
  • 19:40 hashar: Jenkins built its first hhvm extension \O/ https://integration.wikimedia.org/ci/job/php-FastStringSearch-hhvm-build/2/console
  • 19:24 Coren_away: labsdb1005 had to blow away the postgres slave: was using all the space on / because DB at wrong spot (should have been /srv/postgres)
  • 18:40 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 18:27 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 18:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf16
  • 18:02 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf15
  • 17:47 logmsgbot: aaron Synchronized wmf-config/CommonSettings.php: Increased "htmlCacheUpdate" throttle limit (duration: 00m 07s)
  • 17:46 logmsgbot: reedy Finished scap: testwiki to 1.24wmf16 and build l10n cache (duration: 22m 35s)
  • 17:23 logmsgbot: reedy Started scap: testwiki to 1.24wmf16 and build l10n cache
  • 14:57 bblack: added labstore1003 to filter labs-in4 terms allow-labstore-(udp|tcp)4 on cr[12]-eqiad
  • 14:33 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Allow sysops and 'crats on wikimania2014wiki to grant confirmed (duration: 00m 15s)
  • 14:12 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages to 1.24wmf15
  • 14:12 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
  • 14:10 logmsgbot: reedy Finished scap: Rebuild 1.24wmf15 l10n cache for WikimediaMessages updates (duration: 22m 40s)
  • 14:05 bblack: removed labs-in4 and labs-in6 filters on vlan 1117 (labs-hosts1-a-eqiad) on cr[12]-eqiad
  • 13:47 logmsgbot: reedy Started scap: Rebuild 1.24wmf15 l10n cache for WikimediaMessages updates
  • 13:44 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/RelatedSites/: (no message) (duration: 00m 15s)
  • 13:44 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/WikimediaMessages: (no message) (duration: 00m 14s)
  • 12:10 hashar: stopping Jenkins and restarting it
  • 12:04 hashar: reloading Jenkins configuration
  • 11:37 hashar: Jenkins: upgrading almost all jobs to use a new label 'UbuntuPrecise' bug 68340 150785
  • 10:49 hashar: Jenkins: attempting to poll a Trusty slave (integration-slave1004-trusty [10.68.17.148] with label UbuntuTrusty).
  • 10:32 hashar: Jenkins: tweaking jobs labels, that might eventually screw up Zuul/Jenkins entirely.
  • 08:43 _joe_: start rolling reload of nginx to catch up with the new ssl config
  • 06:50 springle: labsdb1001 migration complete, should be all systems go
  • 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 31 03:18:07 UTC 2014 (duration 18m 6s)
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-31 02:35:29+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-31 02:19:17+00:00
  • 02:06 springle: labsdb1001 migrating to mariadb 10, expect read-only and downtime, see labs-l

July 30

  • 23:27 logmsgbot: maxsem Synchronized php-1.24wmf15/extensions/MwEmbedSupport/: (no message) (duration: 00m 03s)
  • 23:27 logmsgbot: maxsem Synchronized php-1.24wmf15/extensions/Wikidata/: (no message) (duration: 00m 08s)
  • 23:26 logmsgbot: maxsem Synchronized php-1.24wmf15/extensions/SyntaxHighlight_GeSHi/: (no message) (duration: 00m 05s)
  • 23:23 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/Wikidata: (no message) (duration: 00m 11s)
  • 23:13 logmsgbot: maxsem Synchronized wmf-config: (no message) (duration: 00m 05s)
  • 21:04 AaronSchulz: Started populateBacklinkNamespace.php on wikidata and commons
  • 21:02 bblack: turned icinga email/sms back on
  • 20:24 bblack: icinga back online again
  • 19:57 bblack: shutting off icinga to make some optimizations
  • 19:20 bblack: icinga is now substantially back online. email/sms still disabled for now, and downtimes/acks need to be re-added for known issues
  • 19:06 logmsgbot: csteipp Synchronized php-1.24wmf14/includes/: (no message) (duration: 00m 05s)
  • 19:04 logmsgbot: csteipp Synchronized php-1.24wmf15/includes/: (no message) (duration: 00m 07s)
  • 18:59 bblack: icinga coming back up again for the first time, expect random strangeness to be ignored
  • 18:46 bblack: temporarily hard-disabling email/sms from icinga via 'mv /usr/bin/mail /usr/bin/mail-disabled' on neon to prevent icinga spam on next startup attempt
  • 17:55 bblack: stopping icinga service for now while working out other details
  • 17:25 tacotuesday: repooled elastic1018 and elastic1019 as well
  • 17:21 Coren: labmon1001 rebooting (final check for proper raid+lvm autodetection)
  • 17:08 bblack: working on bringing up new neon install (first puppet run, etc)
  • 17:01 Coren: labmon1001 rebooting (partitioning changes on primary disks)
  • 16:53 tacotuesday: elastic1017 repooled, shards allocating
  • 16:13 bd808: scap and dologmsg from tin won't work until neon is back up and running tcpircbot
  • 16:07 bd808|deploy: Synchronized touch: no-op sync to test scap update (duration: 00m 05s)
  • 16:06 bd808|deploy: scap announce failed -- timeout connecting to tcpircbot on neon.wikimedia.org
  • 16:04 bd808|deploy: Updated scap to 4871208 (rely on $PATH for scap scripts)
  • 15:21 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/Wikidata/extensions/Wikibase/lib/resources/wikibase.js: touch (duration: 00m 20s)
  • 15:17 hashar: upgrading php5 on jenkins slaves
  • 15:07 cmjohnson1: shutting down neon
  • 14:46 logmsgbot: demon Synchronized wmf-config/CirrusSearch-production.php: (no message) (duration: 00m 04s)
  • 14:35 logmsgbot: demon Synchronized wmf-config/PrivateSettings.php: Swift config for Cirrus (duration: 00m 08s)
  • 14:30 godog: rolling restart of ms-fe* to pick up search backup user
  • 14:17 bblack: rebooting neon again, trying to fix the disk situation
  • 14:11 Coren: reinstalling labmon1001 -> change disk partitioning scheme
  • 13:50 springle: neon read-only fs. fsck + reboot
  • 13:16 manybubbles: rebuilding Cirrus index for commons to pick up weighted all field
  • 11:17 _joe_: enabling puppet on all mw* servers
  • 11:15 _joe_: re-enabling puppet on mw1019, last bunch of tests, then re-enabling globally
  • 10:58 _joe_: re-enabling puppet on mw1018, testwiki upgraded to the new config and looks fine
  • 09:25 godog: set weight for ms-be1014 and ms-be1015 to 2300
  • 08:58 _joe_: stopping puppet on the appservers, in preparation for releasing change 148099
  • 08:30 _joe_: powercycling neon, doesn't respond to requests, ssh hangs, console dark
  • 06:41 springle: labsdb1001 work in progress; it may misbehave. see labs-l for updates
  • 04:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 30 04:27:56 UTC 2014 (duration 27m 55s)
  • 03:39 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-30 03:38:28+00:00
  • 02:51 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-30 02:50:14+00:00
  • 01:47 bblack: ip addr del for cp4017's ip6_mapped addr on cp4018 (no idea why it was there...)

July 29

  • 23:37 logmsgbot: catrope Finished scap: SWAT updates for wmf15, I'm lazy (duration: 07m 02s)
  • 23:30 AaronSchulz: Updated /srv/jobrunner to d2298139ea22bf8e48de066a73f28024b140ea33
  • 23:30 logmsgbot: catrope Started scap: SWAT updates for wmf15, I'm lazy
  • 23:28 logmsgbot: catrope Synchronized php-1.24wmf14/extensions/VisualEditor: (no message) (duration: 00m 05s)
  • 23:28 logmsgbot: catrope Synchronized php-1.24wmf14/extensions/MobileFrontend: (no message) (duration: 00m 05s)
  • 23:18 logmsgbot: catrope Synchronized wmf-config/: Do not put OCG in sidebar (duration: 00m 04s)
  • 23:11 logmsgbot: catrope Synchronized wmf-config/: Enable TemplateData GUI on nlwiki (duration: 00m 05s)
  • 23:10 bblack: took OCG service IP out of downtime in icinga, it's live
  • 23:06 logmsgbot: mwalker Synchronized wmf-config: Enabling OCG in production (duration: 00m 04s)
  • 23:05 logmsgbot: aaron Synchronized rpc: 0df032d957155aa475d99e2b887ba98b9a4c32fd (duration: 00m 07s)
  • 23:04 logmsgbot: cscott Synchronized wmf-config: (no message) (duration: 00m 12s)
  • 23:03 logmsgbot: cscott updated /a/common to Iae1ac79d5: Enable OCG in production
  • 22:55 cscott: updated OCG to version aeb8623d6ebe41ae7c7e36c57844bd9ea8e6d595
  • 22:50 RoanKattouw: Fixed ownership of slot0/cache on wikitech (virt1000), was root:root but should have been www-data:www-data
  • 22:24 RoanKattouw: Updated lib/ve submodule inside extensions/VisualEditor on virt1000; wikitechwiki was running a Frankenstein version of VE that was part yesterday's code, part code from April
  • 21:47 logmsgbot: ori Synchronized rpc/RunJobs.php: Ia62e9158f: Added a streamlined RunJobs that can be used by redisJobService (2/2) (duration: 00m 03s)
  • 21:47 logmsgbot: ori Synchronized multiversion: Ia62e9158f: Added a streamlined RunJobs that can be used by redisJobService (1/2) (duration: 00m 03s)
  • 21:44 Reedy: cleared bottuzzu@itwiki watchlist
  • 21:32 spagewmf: spage ran `mwscript namespaceDupes.php --wiki=enwiki --prefix Topic`, 5 pages renamed
  • 21:22 logmsgbot: spage Synchronized wmf-config/InitialiseSettings.php: Enable Flow on Wikimania testing page (duration: 00m 13s)
  • 21:22 logmsgbot: ori updated /a/common to Ia62e9158f: Added a streamlined RunJobs that can be used by redisJobService
  • 21:18 logmsgbot: spage updated /a/common to I3b4622e27: Wikivoyages back to 1.24wmf14
  • 20:54 logmsgbot: aaron Synchronized php-1.24wmf14/includes/media: b45248509c07acb8146d6e735ef68dff193ac290 (duration: 00m 07s)
  • 19:46 Krinkle: Reloading Zuul to deploy I7f80ee0b85d29791b7
  • 19:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages back to 1.24wmf14
  • 19:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages back to 1.24wmf15...
  • 19:14 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
  • 19:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikivoyages back to 1.24wmf14
  • 18:43 cmjohnson1: power cycling virt1009
  • 18:29 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf15
  • 18:28 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/Wikidata/extensions/Wikibase/lib/config/WikibaseLib.default.php: touch (duration: 00m 16s)
  • 18:26 bblack: removed "filter { input labs6-in; }" from ae3.1119 (labs-support1-c-eqiad) on cr[12]-eqiad
  • 17:52 logmsgbot: aaron Synchronized php-1.24wmf15/includes/media: 76459cebd9cfbb33e9845f7acd8b8c1382cdae61 (duration: 00m 08s)
  • 16:56 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Bump $wgCacheEpoch for testwikidata (duration: 00m 08s)
  • 16:52 logmsgbot: hoo Synchronized php-1.24wmf15/extensions/Wikidata/: Touch JS (duration: 00m 10s)
  • 16:52 logmsgbot: hoo Synchronized php-1.24wmf14/extensions/Wikidata/: Touch JS (duration: 00m 11s)
  • 16:50 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Only declare "special" sitegroups for testwikidata (duration: 00m 07s)
  • 16:48 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Only declare "special" sitegroups for testwikidata (duration: 00m 08s)
  • 16:47 logmsgbot: hoo Finished scap: Updating Wikidata with various changes for testwikidata and a client bug fix. (duration: 27m 27s)
  • 16:37 cmjohnson1: replacing defective disk virt1009
  • 16:20 logmsgbot: hoo Started scap: Updating Wikidata with various changes for testwikidata and a client bug fix.
  • 16:10 logmsgbot: hoo Synchronized wmf-config/Wikibase.php: Make testwikidata use the "special" sitelink group. Preparations for submodule updates. (duration: 00m 08s)
  • 16:10 bd808: logstash log event volume up after restart
  • 16:09 bd808: restarted logstash on logstash1001.eqiad.wmnet; log volume looked to be down from expected levels
  • 16:08 _joe_: reenabled puppet on mw1053
  • 16:03 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings.php: Enable Wikibase other projects links per default for ruwiki (duration: 00m 07s)
  • 15:13 manybubbles: building cirrus indexes for group0 wikis in place to turn on the weighted all field we'll use for performance improvements later
  • 15:06 logmsgbot: manybubbles Synchronized wmf-config: SWAT - deploy cirrussearch all field stage 2 part 2 (duration: 00m 04s)
  • 15:06 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - deploy cirrussearch all field stage 2 part 1 (duration: 00m 04s)
  • 13:54 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: added Universiteits Museum Utrecht to the wgCopyUploadsDomains array 150163 (duration: 00m 04s)
  • 13:38 ottomata: restarted gmetad on nickel, seems to have brought ganglia back up
  • 11:30 _joe_: upgrading packages on mw1053, for testing hhvm with pcre-jit enabled
  • 10:35 _joe_: puppet re-enabled on the appservers
  • 10:29 _joe_: temporarily stopping puppet on appservers, releasing a potentially dangerous puppet change
  • 09:10 _joe_: stopping jobrunner on mw1053, disabling puppet as well - running tests
  • 09:02 hashar: restarted zuul-server and zuul-merger on gallium (new version though that is a noop)
  • 09:00 hashar_: Zuul bumping Zuul cloner from patchset 21 to patchset 23. Deploying with tag wmf-deploy-2014-07-29-1
  • 07:51 akosiaris: uploaded PHP 5.3.10-1ubuntu3.13+wmf1 on apt.wikimedia.org. Puppet will upgrade it across the fleet within 20 mins
  • 03:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 29 03:47:39 UTC 2014 (duration 47m 38s)
  • 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-29 03:10:31+00:00
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-29 02:35:18+00:00
  • 00:44 logmsgbot: aaron Synchronized php-1.24wmf15/maintenance/runJobs.php: fcfa3153e53dc70e6cd190a087e7bd577fe380fb (duration: 00m 03s)
  • 00:27 logmsgbot: aaron Synchronized php-1.24wmf15/maintenance: f754c239ce93fc5f2db19e93f4fe8a1d1ba7bc27 (duration: 00m 04s)
  • 00:27 logmsgbot: aaron Synchronized php-1.24wmf15/includes: f754c239ce93fc5f2db19e93f4fe8a1d1ba7bc27 (duration: 00m 06s)

July 28

  • 23:58 logmsgbot: ori Finished scap: I42c07b64: Update MobileFrontend (duration: 17m 37s)
  • 23:41 logmsgbot: ori Started scap: I42c07b64: Update MobileFrontend
  • 23:33 logmsgbot: ori Synchronized php-1.24wmf15/extensions/VisualEditor: Update VisualEditor to I944f8fbfa (duration: 00m 04s)
  • 23:25 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I369dbad6e: Allow crats to add/remove petitiondata group on foundationWiki (duration: 00m 04s)
  • 23:21 AaronS: Updated /srv/jobrunner to 0bb0ad62dd9240e0f67b2ded4519f125de13dfbc
  • 23:12 mutante: temp. disabled puppet on neon and ircecho
  • 23:06 mutante: graceful apache on palladium
  • 21:12 hashar: Gerrit: allowed JenkinsBot to submit patches on wikimedia/bots (and thus on all child repositories)
  • 20:50 hashar: operations/puppet.git manifests should no longer have leading tabs I69ddc7
  • 20:08 bblack: intermittent 5xx are most likely varnish restarts off and on rest of today
  • 19:51 hashar: Zuul: stopped / started process to clear up obsolete changes stuck in queue
  • 19:47 hashar: Jenkins/Zuul lost connection somehow. Disabled/Reenabled gearman client in Jenkins
  • 19:44 hashar: Jenkins: updated qunit jobs to roam on both gallium and lanthanum (were previously tied to run only on gallium)
  • 19:42 ottomata: restarted varnishkafka on some esams hosts that have old misconfigured vk processes
  • 19:13 ottomata: restarting varnishkafka on amssq31
  • 19:08 ottomata: restarting varnishkafka on cp3013
  • 17:46 logmsgbot: aaron Synchronized php-1.24wmf14/includes/jobqueue/JobQueueFederated.php: 87e7bfceb795d065d6157ac8ce3381a7814000b5 (duration: 00m 03s)
  • 17:38 logmsgbot: aaron Synchronized php-1.24wmf15/includes/jobqueue/JobQueueFederated.php: 12ce1dc1ec46b06d1160e142ddfaf8dcb1c9f131 (duration: 00m 04s)
  • 16:30 andrewbogott: updated wikitech to 1.24wmf15; turned on OAuth
  • 16:05 Nemo_bis: andrewbogott> Nikerabbit: I'm upgrading it [wikitech wiki], it'll be flaky for a bit
  • 16:00 manybubbles: done with SWAT
  • 15:57 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/VisualEditor/: SWAT - fix visual editor bug - Changes made after reviewing changes are not sent (when caching is enabled) (duration: 00m 07s)
  • 15:46 logmsgbot: manybubbles Synchronized php-1.24wmf15/extensions/VisualEditor/: SWAT - fix visual editor bug - Changes made after reviewing changes are not sent (when caching is enabled) (duration: 00m 08s)
  • 15:41 hoo: Removed all right holders from closed and inaccessible ukwikimedia (bug 68737)
  • 15:39 logmsgbot: manybubbles Synchronized php-1.24wmf15/includes/specials/SpecialRevisiondelete.php: SWAT - fix fatal on revision delete (duration: 00m 08s)
  • 15:33 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: SWAT load Mantle before MobileFrontend (duration: 00m 07s)
  • 15:31 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/Echo/: SWAT fix bad variable name in echo (duration: 00m 08s)
  • 15:23 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - update some permissions on eswiki (duration: 00m 08s)
  • 15:17 logmsgbot: manybubbles Synchronized php-1.24wmf15/extensions/Echo/: SWAT - fix incorrect variable name (duration: 00m 08s)
  • 15:14 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - add import sources to bhwiki (duration: 00m 08s)
  • 15:10 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow/: SWAT update fundraising to fix botched deploy
  • 12:28 hashar: Upgrading our Jenkins Job Builder fork ( d833015..666e953 )
  • 03:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 28 02:59:35 UTC 2014 (duration 59m 34s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-28 02:25:34+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-28 02:15:00+00:00

July 27

  • 05:24 springle: mysqldump s6 dbstore1002 to dbstore1001
  • 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 27 02:58:15 UTC 2014 (duration 58m 14s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-27 02:24:10+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-27 02:13:44+00:00

July 26

  • 21:29 hashar: restarting Zuul to clear up some stalled changes.
  • 02:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 26 02:57:52 UTC 2014 (duration 57m 50s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-26 02:25:46+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-26 02:15:31+00:00

July 25

  • 22:52 mutante: Bugzilla - upgraded to 4.4.5
  • 22:41 mutante: ocg - deleted old log dirs
  • 19:28 hashar: Jenkins : disabling gearman plugin and reenabling it (just uncheck/save/check a box in https://integration.wikimedia.org/ci/configure )
  • 19:25 hashar: zuul@gallium:/etc/zuul/wikimedia$ echo status|nc -q 3 localhost 4730|wc -l ... Yields: 0, which means jobs are no longer registered for some reason.
  • 19:24 hashar: Jenkins stalled again yeahhhhh
  • 16:59 mutante: powercycled ms-be1010 - unresponsive to ssh, nothing on mgmt
  • 16:28 MaxSem: Updating PageImages data for mainspace on Commons from terbium
  • 13:09 _joe_: re-enabling puppet, test run on the test host was fine.
  • 13:03 _joe_: stopping puppet on all appservers - will reactivate after testing
  • 11:26 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 13s)
  • 10:50 hashar: contint: manually cleared /tmp on the 3 labs jenkins slaves.
  • 10:46 hashar: integration-slave1001.eqiad.wmflabs is out of disk space ( / /dev/vda1)
  • 07:29 springle: shutdown tantalum per mwalker request
  • 04:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 25 04:18:45 UTC 2014 (duration 18m 44s)
  • 03:31 logmsgbot: LocalisationUpdate completed (1.24wmf15) at 2014-07-25 03:30:33+00:00
  • 02:48 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-25 02:47:17+00:00
  • 01:21 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ic29ae11fa: On Labs, disable LuaSandbox's profiling feature to isolate bug 68413 (duration: 00m 04s)
  • 00:15 mutante: imported jouncebot from github - https://gerrit.wikimedia.org/r/#/q/project:wikimedia/bots/jouncebot,n,z
  • 00:03 K4-713: updated fundraising civicrm to 0639c11636d9

July 24

  • 23:26 mutante: created gerrit project for jouncebot
  • 23:06 logmsgbot: maxsem Synchronized wmf-config: https://gerrit.wikimedia.org/r/149180 (duration: 00m 05s)
  • 21:53 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 20:41 mutante: rebooted wm-bot instance
  • 20:21 bblack: restarted backend varnish for parsoid on cp1058
  • 20:20 bblack: restarted backend varnish for parsoid on cp1045
  • 20:08 logmsgbot: reedy Synchronized php-1.24wmf15/extensions/Translate: (no message) (duration: 00m 15s)
  • 20:00 logmsgbot: reedy Synchronized php-1.24wmf15: (no message) (duration: 00m 59s)
  • 19:58 logmsgbot: reedy Synchronized php-1.24wmf14: (no message) (duration: 01m 11s)
  • 19:24 hashar: restarted Zuul
  • 18:44 ori: restarted jobrunners for 01c70b1a892ac3944655f84449e89e4508894101
  • 18:41 AaronSchulz: Updated jobrunners to 01c70b1a892ac3944655f84449e89e4508894101
  • 18:39 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:34 logmsgbot: aaron Synchronized php-1.24wmf14/includes/jobqueue/aggregator/JobQueueAggregatorRedis.php: ca031131396ee1830e239d0b6a314bb571840c11 (duration: 00m 06s)
  • 18:26 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
  • 18:24 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf15
  • 18:23 ori: Purged apache from SSL cluster; provisioned as a side-effect of I0b02a46f3 + I76a0d237f
  • 18:21 godog: updated swift ring to bring ms-be1013 weight to 2300
  • 18:17 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf14
  • 18:03 logmsgbot: reedy Finished scap: testwiki to 1.24wmf15 and build l10n cache (duration: 31m 12s)
  • 17:32 logmsgbot: reedy Started scap: testwiki to 1.24wmf15 and build l10n cache
  • 16:38 hashar: restarting Jenkins it is broken again
  • 16:10 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Swap wikimania2013wiki to wikimania2014wiki in wmgCentralAuthLoginIcon (duration: 00m 14s)
  • 15:55 bd808|deploy: Fetched de8022b to /a/common on tin; prod no-op change needed for beta
  • 15:40 bd808|deploy: Fetched c7ae85e to /a/common on tin; prod no-op needed for beta
  • 15:39 ottomata: temporarily stopping puppet on analytics1027
  • 15:14 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings-labs.php: Fix reference thumbnail settings syntax (duration: 00m 13s)
  • 15:13 cmjohnson1: swapping disk 8 es1001
  • 15:10 hashar: Clearing out old Zuul references on operations/puppet.git might cause merge errors
  • 15:10 logmsgbot: yurik Synchronized php-1.24wmf14/extensions/ZeroBanner: (no message) (duration: 01m 07s)
  • 15:08 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/ZeroBanner: (no message) (duration: 01m 11s)
  • 14:30 logmsgbot: yurik Synchronized wmf-config/mobile.php: Font for zero banner (duration: 01m 10s)
  • 13:38 hashar: Deleting old Zuul references in the Zuul maintained repository /srv/ssd/zuul/git/mediawiki/core/ on gallium bug 68481 . Should speed up merge operations on that repository.
  • 10:10 hashar: Zuul code being installed on lanthanum.eqiad.wmnet Will let us use a merger daemon there and the Zuul cloner client. 141758
  • 05:44 springle: labsdb1002 work in progress; it may misbehave. see labs-l for updates
  • 03:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 24 03:56:32 UTC 2014 (duration 56m 31s)
  • 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-24 03:08:05+00:00
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-24 02:36:35+00:00
  • 00:44 ori: installing linux-tools on mw1053 to run perf on jobrunner

July 23

  • 23:59 logmsgbot: maxsem Finished scap: Pick up messages forgotten during Zero deployment (duration: 26m 42s)
  • 23:39 ori: running sync-common on mw1053.eqiad.wmnet
  • 23:32 logmsgbot: maxsem Started scap: Pick up messages forgotten during Zero deployment
  • 23:26 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/MultimediaViewer/: (no message) (duration: 00m 03s)
  • 23:26 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/VisualEditor/: (no message) (duration: 00m 04s)
  • 23:19 logmsgbot: maxsem Synchronized php-1.24wmf13/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
  • 23:18 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
  • 22:39 mutante: removed platinum from icinga
  • 22:36 _joe_: installed mw1053 as the first hhvm jobrunner, currently stopped. Puppet disabled so that it won't restart the jobrunner automatically
  • 21:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I2f366fa93: Use luastandalone on HHVM (duration: 00m 03s)
  • 21:17 hashar: Zuul is all good. It just receives too many patches :-]
  • 20:31 bd808|deploy: Updated /a/common to 07834a9 (beta cluster: use luastandalone); no sync needed
  • 20:30 subbu: deployed parsoid version 47d4bc83
  • 20:27 hashar: Having no idea how to fix zuul. Restarting it and killing the whole queue :-/
  • 20:14 mutante: contacts.wm - set $base_url in default/settings.php to https URL, and $is_https='on' in bootstrap.inc (unpuppetized?)
  • 19:49 logmsgbot: awight Synchronized php-1.24wmf14/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 05s)
  • 19:49 logmsgbot: awight Synchronized php-1.24wmf13/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 05s)
  • 19:28 logmsgbot: awight Synchronized php-1.24wmf14/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 04s)
  • 19:27 logmsgbot: awight Synchronized php-1.24wmf13/extensions/CentralNotice: push many CentralNotice fixes, including GeoIP cookie and hide cookie (duration: 00m 04s)
  • 18:57 hashar: reenabled Gearman plugin in Jenkins. Jobs have been reregistered and seems to be proceeding again
  • 18:55 hashar: back. attempting to fix jenkins
  • 18:38 logmsgbot: yurik Synchronized php-1.24wmf14/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 24s)
  • 18:36 hashar: can't fix jenkins / zuul right now. Will be stalled for at least half an hour
  • 18:35 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 27s)
  • 18:33 hashar: Jenkins: disabled and reenabled Gearman plugin. The jobs were no longer registered in Zuul gearman server :-(
  • 18:32 hashar: Jenkins stalled
  • 17:45 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 07s)
  • 17:38 godog: launched a script on ms-fe1001 to collect thumb stats, no impact expected
  • 17:11 logmsgbot: awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: automatic translate workflow fix for Fundraising/ pages on meta.wmo (duration: 00m 04s)
  • 15:38 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/Wikidata: touch (duration: 00m 15s)
  • 15:34 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/Wikidata: Fix css issue in entity suggester on Wikidata (duration: 00m 17s)
  • 15:19 logmsgbot: reedy Synchronized php-1.24wmf14/resources/Resources.php: Fixing forgotten OOUI messages (duration: 00m 15s)
  • 15:11 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Remove flickrApiUrl from (duration: 00m 15s)
  • 14:19 akosiaris: upgraded php5 on mw1017 (test.wikipedia.org) deployment-apache0{1,2} (beta) to 5.3.10-1ubuntu3.13+wmf1
  • 12:42 hashar: upgraded gdnsd on gallium (used to lint operations/dns.git changes)
  • 09:57 hashar: Zuul migrated to zuul user :)
  • 09:43 hashar: zuul changing file ownership on gallium for /srv/ssd/zuul/git from jenkins:root to zuul:zuul
  • 09:42 hashar: breaking zuul
  • 05:29 springle: clone mariadb 10 labsdb1002 to labsdb100[13]
  • 04:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 23 04:09:54 UTC 2014 (duration 9m 53s)
  • 03:21 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-23 03:20:45+00:00
  • 02:50 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-23 02:49:34+00:00

July 22

  • 23:37 logmsgbot: ebernhardson Finished scap: Update flow in wmf/1.24wmf14 (duration: 17m 08s)
  • 23:20 logmsgbot: ebernhardson Started scap: Update flow in wmf/1.24wmf14
  • 21:57 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/WikimediaMessages/: Fix fatal for dumps (duration: 00m 15s)
  • 21:52 logmsgbot: reedy Synchronized php-1.24wmf13/extensions/WikimediaMessages/: Fix fatal for dumps (duration: 00m 13s)
  • 18:46 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf14 again
  • 18:45 logmsgbot: reedy Synchronized php-1.24wmf14/extensions/CirrusSearch/: Fix fatal (duration: 00m 15s)
  • 18:13 Reedy: Running sync-common on mw1081
  • 18:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias back to 1.24wmf13 due to Wikidata and Cirrus fatals
  • 18:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf14
  • 17:48 logmsgbot: mwalker Finished scap: Deploying Petition extension to the cluster (duration: 28m 27s)
  • 17:19 logmsgbot: mwalker Started scap: Deploying Petition extension to the cluster
  • 17:12 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata/extensions/ValueView/lib/jquery.ui/jquery.ui.suggester.js: touch jquery.ui.suggester.js for Wikidata (duration: 00m 05s)
  • 17:06 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata: Update Wikidata submodule for test wikidata, for real! (duration: 00m 06s)
  • 17:02 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata/extensions/Wikibase/lib/resources/wikibase.js: touch wikibase.js for test wikidata only, fix caching issues (duration: 00m 05s)
  • 16:55 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata/extensions/ValueView/lib/jquery.ui/jquery.ui.suggester.js: touch jquery.ui.suggester.js for Wikidata (duration: 00m 05s)
  • 16:48 logmsgbot: aude Synchronized php-1.24wmf14/extensions/Wikidata: Update Wikidata: js and json dump fixes (duration: 00m 11s)
  • 16:26 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: enable WikibaseClient on test wikidata (duration: 00m 07s)
  • 16:25 logmsgbot: aude Synchronized wmf-config/Wikibase.php: add settings for enabling WikibaseClient on test wikidata (duration: 00m 04s)
  • 16:18 logmsgbot: reedy Purged l10n cache for 1.24wmf12
  • 16:18 logmsgbot: reedy Purged l10n cache for 1.24wmf11
  • 16:17 logmsgbot: reedy Purged l10n cache for 1.24wmf10
  • 15:12 manybubbles: done with SWAT
  • 15:11 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - touching InitialiseSettings.php to make dblist change go (duration: 00m 06s)
  • 15:10 logmsgbot: manybubbles Synchronized commonsuploads.dblist: SWAT add mrwiki to commonsuploads list (duration: 00m 08s)
  • 15:06 logmsgbot: manybubbles Synchronized php-1.24wmf14/extensions/CirrusSearch/: SWAT small cirrus fixes (duration: 00m 08s)
  • 14:48 _joe_: removed old, unused puppet 2.7 packages from reprepro for trusty
  • 14:00 _joe_: reinstalling mw1053 in 5 minutes, downtime on icinga, puppet disabled, setting to 'false' everywhere in pybal
  • 05:31 bblack: authdns servers (mexia, rubidium, eeden) updated to gdnsd-1.11.4~precise1
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 22 03:12:10 UTC 2014 (duration 12m 9s)
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-22 02:36:27+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-22 02:13:40+00:00
  • 00:56 mutante: tungsten,fluorine, search1001-1006 - upgraded libssl

July 21

  • 23:44 mutante: graceful apache on magnesium
  • 23:42 legoktm: cleaned up stalled global rename of Felipegaspars --> L'editeur
  • 23:39 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/148270/ - revert previous change (duration: 00m 04s)
  • 23:36 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/TimedMediaHandler/: https://gerrit.wikimedia.org/r/#/c/148241/ (duration: 00m 04s)
  • 23:32 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/148249/ (duration: 00m 04s)
  • 23:30 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/148128 (duration: 00m 06s)
  • 23:29 logmsgbot: maxsem Synchronized php-1.24wmf14/extensions/AbuseFilter/: https://gerrit.wikimedia.org/r/#/c/148027/ (duration: 00m 06s)
  • 23:27 logmsgbot: maxsem Synchronized php-1.24wmf14/resources/: https://gerrit.wikimedia.org/r/#/c/147854/ (duration: 00m 05s)
  • 22:21 mutante: installing package upgrades on bast1001
  • 21:50 RobH: shutting down ms1002 for reclaim into labstore1003
  • 21:37 ottomata: running kafka preferred-replica-election to rebalance topics
  • 21:27 hashar: beta: removed build timeout from beta-update-databases-eqiad Jenkins jobs. There is a huge schema change being processed by update.php
  • 20:09 subbu: deployed parsoid version 1c9277d6
  • 17:24 mutante: elastic1009,analytics1004,silver, various misc. boxes - upgrading libssl
  • 17:16 mutante: installing package upgrades on iron
  • 16:19 godog: restarted uwsgi on tungsten
  • 16:02 andrewbogott: updated OpenStackManager on wikitech
  • 15:48 logmsgbot: demon Synchronized wmf-config: Undeploying CommunityVoice/ClientSide extensions (duration: 00m 08s)
  • 15:30 logmsgbot: demon Synchronized wmf-config/flaggedrevs.php: ukwiki gets FR for NS_MODULE (duration: 00m 04s)
  • 15:25 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Set $wgUploadNavigationUrl for plwikisource (duration: 00m 04s)
  • 15:25 logmsgbot: demon Synchronized php-1.24wmf14/extensions/CirrusSearch: CirrusSearch to master for 1.24wmf14 (duration: 00m 07s)
  • 15:22 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Set $wgForceUIMsgAsContentMsg for zhwikivoyage (duration: 00m 05s)
  • 15:17 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: TemplateData for fiwiki (duration: 00m 06s)
  • 13:41 _joe_: restarted apache on palladium
  • 12:01 apergos: started /usr/local/bin/dumpwikidatajson.sh in root screen session on snapshot1003
  • 03:11 springle: restarted apache on strontium
  • 02:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 21 02:57:03 UTC 2014 (duration 57m 2s)
  • 02:50 logmsgbot: krinkle Synchronized wmf-config/InitialiseSettings.php: I27c6f82af5e9b (duration: 00m 06s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-21 02:25:08+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-21 02:13:42+00:00

July 20

  • 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 20 02:57:59 UTC 2014 (duration 57m 58s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-20 02:25:54+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-20 02:14:47+00:00
  • 01:40 ori: synced docroot/default/index.html (I005f43b96: Add width/height attributes to img to fix reflow)

July 19

  • 23:33 logmsgbot: aaron Synchronized php-1.24wmf4/maintenance: 926e1997b53f563a4e7f3c540e32b45ddb24b3c5 & 017891ba41cc72987bf3cb441004a847d20105b4 (duration: 00m 08s)
  • 23:33 logmsgbot: aaron Synchronized php-1.24wmf4/includes: 926e1997b53f563a4e7f3c540e32b45ddb24b3c5 & 017891ba41cc72987bf3cb441004a847d20105b4 (duration: 00m 09s)
  • 15:43 bblack: restarted gitblit service on antimony
  • 05:02 Krinkle: Ungracefully restarting Zuul to clear the items stuck in the queue (picked a moment with no real items waiting in the queue).
  • 03:25 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool es1001 (duration: 00m 06s)
  • 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 19 02:56:30 UTC 2014 (duration 56m 29s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-19 02:26:26+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-19 02:15:20+00:00
  • 01:09 logmsgbot: awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 4) (duration: 00m 04s)
  • 01:08 logmsgbot: awight Synchronized php-1.24wmf13/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 4) (duration: 00m 04s)
  • 01:07 logmsgbot: awight Synchronized php-1.24wmf12/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 4) (duration: 00m 04s)
  • 00:36 logmsgbot: awight Synchronized php-1.24wmf14/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 3) (duration: 00m 04s)
  • 00:36 logmsgbot: awight Synchronized php-1.24wmf13/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 3) (duration: 00m 04s)
  • 00:36 logmsgbot: awight Synchronized php-1.24wmf12/extensions/FundraisingTranslateWorkflow: update FundraisingTranslateWorkflow submodule content (take 3) (duration: 00m 04s)

July 18

  • 22:04 logmsgbot: awight Synchronized php-1.24wmf14: update FundraisingTranslateWorkflow submodule (take 2) (duration: 00m 58s)
  • 22:03 logmsgbot: awight updated /a/common/php-1.24wmf14 to I1036dae02: Update mediawiki/core/vendor to head to 1.24wmf14
  • 21:00 logmsgbot: awight Synchronized php-1.24wmf13: update FundraisingTranslateWorkflow submodule (duration: 01m 04s)
  • 20:58 awight: for the record, I actually updated to ade90e0e22492d87e6069db3a359b22ef56401a6
  • 20:57 logmsgbot: awight updated /a/common/php-1.24wmf13 to Id3462554b: Made --maxtime a soft limit again
  • 20:50 logmsgbot: awight Synchronized php-1.24wmf12: update FundraisingTranslateWorkflow submodule (duration: 00m 49s)
  • 20:48 logmsgbot: awight Synchronized php-1.24wmf12: update FundraisingTranslateWorkflow submodule (duration: 00m 21s)
  • 20:48 logmsgbot: awight updated /a/common/php-1.24wmf12 to Idf3f49941: Updating ZeroBanner
  • 20:41 MaxSem: Load testing GeoData
  • 19:11 mutante: restarted apache on strontium.. sigh
  • 18:17 logmsgbot: aaron Synchronized php-1.24wmf13/maintenance/runJobs.php: ae053860dc36a07f05ab9e31299f2da0d2f66e85 (duration: 00m 03s)
  • 18:16 logmsgbot: aaron Synchronized php-1.24wmf14/maintenance/runJobs.php: 684c21c325370aa3baac631ae9a006fc8861b952 (duration: 00m 03s)
  • 18:05 logmsgbot: aaron Synchronized wmf-config/jobqueue-eqiad.php: Set "daemonized" flag for the redis job queue (duration: 00m 04s)
  • 17:33 cmjohnson: replacing disk 2 es1005
  • 17:25 mutante: temp. stopped icinga-wm to avoid channel spam
  • 17:24 mutante: puppetmaster on strontium had "Unexpected error in mod_passenger" causing puppet fails all over the place with error 500 on master, resumed normal after graceful
  • 17:21 mutante: graceful'ed apache on strontium
  • 14:37 godog: rolling reload of proxy-server on swift ms-fe1* to pick up changes
  • 13:19 _joe_: re-enabling puppet, applying on a sample of hosts created no change according to my tests.
  • 13:13 _joe_: temporarily disabling puppet on mw servers, will re-enable when I'm done with testing (again) the change
  • 11:20 godog: restart proxy-server on ms-fe1003, as suspected it wasn't running the latest version
  • 11:14 godog: restart proxy-server on ms-fe1003, double checking for a change in numbers reported to graphite
  • 10:04 godog: stagger reload swift {account,object,container} server in ms-be.eqiad to pick up recon changes
  • 06:01 AaronSchulz: Updated /srv/deployment/jobrunner to 4cddd5033efadf431e138c399b5d86542e32f196
  • 03:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 18 03:53:55 UTC 2014 (duration 53m 54s)
  • 03:22 ori: Updated jobrunner to d9520c9 and restarted service on all jobrunners
  • 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf14) at 2014-07-18 03:08:02+00:00
  • 02:45 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: Repool db1021, context RT 7916, warm up (duration: 00m 08s)
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-18 02:36:54+00:00

July 17

  • 23:49 logmsgbot: mwalker Finished scap: SWAT for 146651, 147102, 146925, 147331, 147332, and 147206
  • 23:19 logmsgbot: mwalker Started scap: SWAT for 146651, 147102, 146925, 147331, 147332, and 147206
  • 21:02 csteipp: deployed fix for bug68187
  • 20:29 ori: updated jobrunner to 71d84ea18d and restarted service
  • 18:36 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf14
  • 18:30 springle: db1021 raid write-cache failure, BBU at 9%
  • 18:14 springle: db1021 disabled sync_binlog, thread tied up on fsync
  • 18:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf13
  • 18:09 logmsgbot: reedy Synchronized wmf-config/: De-pool db1021 due to increasing replag (duration: 00m 14s)
  • 17:40 logmsgbot: reedy Finished scap: testwiki to 1.24wmf14 take 2 (duration: 33m 02s)
  • 17:30 Jeff_Green: payments1002 dist upgrade & reboot
  • 17:21 mutante: nickel (ganglia) apt-get upgrading packages
  • 17:13 Jeff_Green: dist-upgrade and reboot payments1003
  • 17:07 logmsgbot: reedy Started scap: testwiki to 1.24wmf14 take 2
  • 17:04 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.VsoJsYY6Q2" ' returned non-zero exit status 1 (duration: 02m 46s)
  • 17:03 RobH: payments4 is kernel updating (per jgreen)
  • 17:01 logmsgbot: reedy Started scap: testwiki to 1.24wmf14
  • 15:05 logmsgbot: manybubbles Synchronized php-1.24wmf13/extensions/MultimediaViewer/: SWAT - Moving repo icon back to the right-hand side in Media Viewer (duration: 00m 05s)
  • 15:03 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings-labs.php: SWAT deploy to keep us synced, but this is a noop in prod. only anything in beta. (duration: 00m 05s)
  • 07:27 springle: mariadb 10 on labsdb1002:3309 cloning s5 from sanitarium db1054:3308
  • 03:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 17 03:32:25 UTC 2014 (duration 32m 24s)
  • 02:47 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-17 02:46:24+00:00
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-17 02:23:08+00:00

July 16

  • 23:55 logmsgbot: maxsem Synchronized private: Clean up old mobile cruft (duration: 00m 05s)
  • 23:17 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 03s)
  • 23:13 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 14s)
  • 22:34 andrewbogott: temporarily fixed puppet on tin by restarting salt-master and salt-minion. A proper fix would involve upgrading to a salt version that fixes https://github.com/saltstack/salt/issues/6306
  • 22:29 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/ZeroBanner: (no message) (duration: 03m 55s)
  • 22:27 ori: restarted jobrunner service on all job runners
  • 22:18 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/ZeroBanner: (no message) (duration: 04m 31s)
  • 21:50 AaronSchulz: Updated job runners to 186b9b33
  • 21:08 legoktm: clearing Magog the Ogre's watchlist on enwp per request (173668 entries)
  • 21:01 logmsgbot: yurik Synchronized php-1.24wmf13/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 53s)
  • 20:56 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 04m 54s)
  • 20:22 subbu: deploy parsoid 060dcb54
  • 19:56 ottomata: reenabling puppet on analytics1027
  • 19:21 ottomata: temp disabling puppet on analytics1027
  • 17:57 akosiaris: clean puppet stored config database for osm-db100{1,2}.eqiad.wmnet, updating icinga
  • 16:49 Reedy: Restarted jenkins again
  • 16:12 Reedy: Restarted jenkins
  • 16:11 Reedy: Killed jenkins
  • 14:34 _joe_: moving the stale conf-enabled directory away on jobrunners, or when we upgrade to trusty all hell will break loose
  • 13:06 logmsgbot: oblivian gracefulled all apaches
  • 12:14 logmsgbot: oblivian gracefulled all apaches
  • 12:01 _joe_: removed stale files from /etc/apache2/conf-enabled on all mw hosts
  • 11:25 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: Take Cirrus as default from more wikis while we figure out load issues (duration: 00m 06s)
  • 10:32 _joe_: releasing a new apache config to all mediawikis
  • 08:54 godog: repool ms-fe1004
  • 08:51 godog: repool ms-fe1003 and depool ms-fe1004
  • 08:46 godog: repool ms-fe1002 and depool ms-fe1003
  • 08:39 godog: depool ms-fe1002 for swift upgrade
  • 05:54 springle: resuming page content model schema changes, osc_host.sh processes on terbium ok to kill in emergency
  • 04:22 springle: restarted gitblit on antimony
  • 03:04 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 16 03:03:41 UTC 2014 (duration 3m 40s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-16 02:26:12+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-16 02:14:32+00:00
  • 01:34 manybubbles: moving shards off of elastic101[789]

July 15

  • 23:20 logmsgbot: maxsem Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/146615/ (duration: 00m 04s)
  • 23:16 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,146471,n,z (duration: 00m 05s)
  • 23:14 logmsgbot: maxsem Synchronized php-1.24wmf13/includes/specials/SpecialVersion.php: (no message) (duration: 00m 04s)
  • 23:13 logmsgbot: maxsem Synchronized php-1.24wmf13/extensions/CirrusSearch/: https://gerrit.wikimedia.org/r/#q,146471,n,z (duration: 00m 04s)
  • 22:35 K4-713: synchronized payments to afa12be34769000bf8
  • 21:34 _joe_: disabling puppet on mw1001, tests
  • 21:26 logmsgbot: aude Synchronized php-1.24wmf13/extensions/Wikidata: Update submodule to fix entity search issue on Wikidata (duration: 00m 21s)
  • 21:15 ori: to test r146607, locally modified upstart conf for jobrunner on mw1001 to log to /var/log/mediawiki, and restarted service
  • 20:24 ori: restarted jobrunner on all jobrunners
  • 20:23 AaronSchulz: Deployed /srv/jobrunner to 31e54c564d369e89613db48977eec0a5891b6498
  • 20:21 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 21s)
  • 20:18 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf13
  • 20:12 Krinkle: Reloading Zuul to deploy If2312bcf18bdbe8dee
  • 20:12 bd808: log volume up after logstash restart
  • 20:10 bd808: restarted logstash on logstash1001; log volume looked to be down from "normal"
  • 19:55 Reedy: Applied extensions/UploadWizard/UploadWizard.sql to rowiki (re bug 59242)
  • 18:53 manybubbles: bouncing elastic1018 to pick up new merge policy. hopefully that'll help with io thrashing
  • 17:58 ori: _joe_ deployed jobrunner to all job runners
  • 17:40 manybubbles: my last attempt to lower the concurrent traffic for recovery was a failure - tried again and succeeded. that seems to have fixed the echo service disruption from taking elastic1017 out of service
  • 17:37 ori: updated jobrunner to bef32b9120
  • 17:29 manybubbles: elastic1017 went nuts again. just shutting elasticsearch off on it for now
  • 17:17 manybubbles: lowered Elasticsearch concurrent recovery streams to 2 (from 3) and total write rate across those streams to 20MB/sec (from 4MB/sec). This should prevent io thrash on recovery which looked to cause echo disruptions in service while recovering from some other disruption.
  • 16:25 _joe_: all mw servers updated
  • 16:10 _joe_: mw1100 and onwards updated
  • 16:00 _joe_: mw1060-mw1099 updated
  • 15:57 manybubbles: restarting Elasticsearch on elastic1017 - it's thrashing the disk again. I'm still not 100% sure why
  • 15:56 _joe_: mw1020-mw1059 updated
  • 15:53 _joe_: mw101[0-9] updated
  • 15:51 manybubbles: elasticsearch1017 is freaking out again - maybe there is something wrong with it. odds aren't good it picked up the same shard again after restart and that shard is somehow poison just for it and not the other two nodes with the same shard....
  • 15:47 _joe_: starting rolling update of all appservers to apache2 2.2.22-1ubuntu1.6, half of them are on 2.2.22-1ubuntu1.5 now
  • 15:42 manybubbles: setting the filter cache on one node in the cluster set it on all. yay, I guess. Anyway, I'm going to let it soak for a while.
  • 15:32 manybubbles: setting filter cache size to 20% on elastic1001 to see if it takes/helps us
  • 15:19 logmsgbot: anomie Synchronized wmf-config/: SWAT: Remove dead ULS variable gerrit:145861 (duration: 00m 10s)
  • 15:18 anomie: anomie actually committed a live hack someone left on tin (removing db1035)
  • 15:16 logmsgbot: anomie updated /a/common to I7ca6a16d5: Switch jawiki back to lsearchd
  • 13:52 manybubbles: after switching jawiki back to lsearchd by default load is mostly recovered. the cluster is still healing from bouncing elastic1017 and that'll take a while. the load will be a bit high during that but searches are coming back in a reasonable amount of time again
  • 13:42 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: jawiki back to lsearchd (duration: 00m 05s)
  • 13:38 manybubbles: elastic1017 had a load average of 60 - was thrashing in io. bounced Elasticsearch. let's see if it recovers on its own
  • 09:09 _joe_: restarting mailman on sodium, again, for testing
  • 08:50 godog: restart mailman on sodium after inodes freed
  • 07:27 _joe_: restarted mailman on sodium
  • 07:22 _joe_: stopping mailman on sodium for repairing
  • 06:54 _joe_: killed jenkins stale process on gallium, stuck in a futex while shutting down
  • 04:48 springle: db1035 crash cycle. down for memtest and stuff
  • 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 15 03:33:38 UTC 2014 (duration 33m 37s)
  • 03:01 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-15 03:00:03+00:00
  • 02:34 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1035, crashed (duration: 00m 13s)
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-15 02:29:02+00:00
  • 02:27 springle: powercycle db1035 unresponsive

July 14

  • 23:52 logmsgbot: mwalker Finished scap: Updating for SWAT 146304, 146306, 146149, 146165, 146166, 146282, and 146281. Also finishing awight's deploy of FundraisingTranslateWorkflow. (duration: 19m 42s)
  • 23:32 logmsgbot: mwalker Started scap: Updating for SWAT 146304, 146306, 146149, 146165, 146166, 146282, and 146281. Also finishing awight's deploy of FundraisingTranslateWorkflow.
  • 20:22 cscott: updated Parsoid to version d51e64097bb1b18e356584d4f3ddcfd90a6071ba
  • 19:57 ori: postponing jobrunner deployment to tomorrow; ran over time
  • 19:45 _joe_: doing the same on mw1064, segfaulted for the same reason
  • 19:44 _joe_: killed a lone apache2 child on mw1152, stuck in a futex, after a segfault of another apache process. Restarted apache, now working correctly
  • 19:03 godog: re-enabling mailman on sodium, missing list config restored
  • 18:49 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow on metawiki (t
  • 18:45 logmsgbot: awight Synchronized php-1.24wmf13/extensions/FundraisingTranslateWorkflow: Update FundraisingTranslateWorkflow extension (wmf13) (duration: 00m 05s)
  • 18:43 logmsgbot: awight Synchronized php-1.24wmf12/extensions/FundraisingTranslateWorkflow: Update FundraisingTranslateWorkflow extension (duration: 00m 05s)
  • 18:15 logmsgbot: awight Synchronized wmf-config: Revert: Deploying FundraisingTranslateWorkflow on metawiki (duration: 00m 04s)
  • 18:03 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow on metawiki (duration: 00m 05s)
  • 18:03 logmsgbot: awight updated /a/common to Ie7599fb6e: jawiki gets Cirrus as primary search
  • 17:43 Krinkle: npm-cache for integration slaves got corrupted again. Depooling/Repooling integration-slave100{1,2,3} one by one to clear cache and let it warm up again.
  • 17:35 Krinkle: Jenkins slaves in labs are unable to reach zuul.eqiad.wmnet
  • 17:10 andrewbogott: purging old local-* service group entries from labs ldap (via purgeOldServiceGroups.php)
  • 17:05 godog: started mailman on sodium post-reboot
  • 17:04 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: nlwiki getting cirrus as primary (duration: 00m 04s)
  • 15:11 logmsgbot: manybubbles Synchronized wmf-config: SWAT update cirrus settings for commons (duration: 00m 04s)
  • 15:04 logmsgbot: manybubbles Synchronized wmf-config: SWAT update cirrus settings for commons (duration: 00m 04s)
  • 15:01 logmsgbot: manybubbles Synchronized wmf-config: SWAT update cirrus settings for commons (duration: 00m 05s)
  • 14:39 _joe_: rebooted nescio, stuck and with console showing just a truncated log (timestamp only)
  • 14:33 mutante: powercycling sodium
  • 14:02 mutante: stat1002 - "Could not find declared class ::oozie"
  • 09:36 legoktm: ran initSiteStats.php on all wikivoyages for bug 64370
  • 09:02 godog: repool ms-fe1001 after upgrade, basic testing successful
  • 08:33 godog: depool ms-fe1001 for swift icehouse upgrade
  • 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 14 02:56:22 UTC 2014 (duration 56m 21s)
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-14 02:23:39+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-14 02:12:54+00:00

July 13

  • 22:12 ori: stopping puppet on rcs1001 to debug nginx issue
  • 21:03 Krinkle: git-deploy: Deploying integration/slave-scripts I7f2b476807465
  • 02:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 13 02:53:33 UTC 2014 (duration 53m 32s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-13 02:23:56+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-13 02:13:32+00:00
  • 02:12 legoktm: migratePass0.php finished a while back

July 12

  • 22:21 legoktm: running foreachwiki extensions/CentralAuth/maintenance/migratePass0.php (bug 67350)
  • 22:04 legoktm: checkLocalNames/checkLocalUser finished a few hours ago, I don't have a timestamp (bug 67350)
  • 13:51 godog: reboot ms-be1007, xfs problems on sdn, load at 300+
  • 07:39 legoktm: started running checkLocalUser.php --delete=1 on all CentralAuth wikis for bug 67350
  • 07:37 legoktm: started running checkLocalNames.php --delete=1 on all CentralAuth wikis for bug 67350
  • 02:52 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 12 02:51:47 UTC 2014 (duration 51m 46s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-12 02:25:47+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-12 02:15:33+00:00

July 11

  • 23:44 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow to labs (take 2) (duration: 00m 04s)
  • 23:36 logmsgbot: awight Synchronized wmf-config: Deploying FundraisingTranslateWorkflow to labs (duration: 00m 05s)
  • 23:34 logmsgbot: awight updated /a/common to I862a4afed: Fixup highlightTest.php
  • 22:44 mutante: upgraded libssl on wtp*
  • 22:33 Krinkle: Restarting Jenkins
  • 22:33 Krinkle: Pooled/depooled Jenkins slave on gallium
  • 22:31 Krinkle: jenkins/gallium's weekly w(h)ine hour is here.
  • 21:31 Krinkle: Reloading Zuul to deploy config change I993eba5ab7b70f924a2b925fea7c196db27c4cc3
  • 20:57 ottomata: disabling puppet on analytics1004 (AGH!)
  • 20:51 ottomata: bringing up some hadoop journalnodes (and datanodes)
  • 20:33 mutante: wikitech - graceful apache for ssl cipher list change
  • 18:19 mutante: OTRS - enabled STS, updated SSL cipher list, restarted Apache on iodine
  • 15:15 logmsgbot: hoo Synchronized php-1.24wmf13/extensions/Wikidata/: Fix the wbsearchentities API (duration: 00m 13s)
  • 15:14 logmsgbot: hoo Synchronized php-1.24wmf12/extensions/Wikidata/: Fix the wbsearchentities API (duration: 00m 16s)
  • 13:52 hashar: Jenkins: mediawiki/core change being queued while Jenkins is busy processing some history. That is normal, will resume soon ™
  • 12:07 hashar: Jenkins: dropping history of mwext-Wikibase-testextensions-master as well
  • 12:05 hashar_: Jenkins: manually removing history of mwext-Wikibase-client-tests and mwext-Wikibase-repo-tests . They are no more used since January
  • 08:54 hoo: Started rebuildItemsPerSite for wikidatawiki on terbium
  • 03:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 11 03:30:11 UTC 2014 (duration 30m 10s)
  • 03:01 logmsgbot: LocalisationUpdate completed (1.24wmf13) at 2014-07-11 03:00:20+00:00
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-11 02:30:24+00:00

July 10

  • 23:38 logmsgbot: mwalker Finished scap: Updating Core, VE, and GuidedTour for scap, 145400, 145401, 145431, and 145460 (duration: 16m 26s)
  • 23:22 logmsgbot: mwalker Started scap: Updating Core, VE, and GuidedTour for scap, 145400, 145401, 145431, and 145460
  • 20:00 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 19:46 logmsgbot: reedy Synchronized private: (no message) (duration: 00m 14s)
  • 19:45 csteipp: deployed patch for bug65778
  • 19:43 hashar: Jenkins upgrading Gearman plugin from 0.0.6 to 0.0.7 . That fix the way jobs labels are registered with Gearman
  • 19:16 hashar: Killed jenkins :-(
  • 18:37 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 13s)
  • 18:36 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 14s)
  • 18:10 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf13
  • 18:02 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf12
  • 17:25 logmsgbot: hoo Synchronized php-1.24wmf13/extensions/Wikidata/: Fix a UI issue and two API related flaws (same version as for wmf12) (duration: 00m 09s)
  • 17:21 logmsgbot: hoo Synchronized php-1.24wmf12/extensions/Wikidata/: Fix a UI issue and two API related flaws (duration: 00m 14s)
  • 16:04 godog: restarted pdns in turn on virt1000 and virt0 after opendj ulimit change
  • 15:56 hashar: gallium running a rather long du command in a screen. Need to have a good figure at how much disk space each jobs consume
  • 15:50 logmsgbot: reedy Finished scap: testwiki to 1.24wmf13 and build l10n cache (duration: 32m 09s)
  • 15:18 logmsgbot: reedy Started scap: testwiki to 1.24wmf13 and build l10n cache
  • 15:15 ottomata: reinstalling analytics1026 and analytics1027
  • 14:10 godog: ran swift-dispersion-populate on eqiad and esams swift clusters
  • 14:04 godog: cycle-restarting swift proxy-server on ms-fe to apply config updates
  • 13:09 godog: restart pdns on virt1000
  • 12:48 springle: ongoing schema changes: pl_from_namespace gerrit 117373. on terbium, osc_host.sh processes ok to kill in emergency
  • 12:43 godog: restart opendj on virt1000 with higher ulimit -n
  • 12:29 godog: restarted opendj on virt1000, ran out of fd
  • 10:29 godog: restart profiler-to-carbon on tungsten, seemingly cpu spinning
  • 09:48 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 13s)
  • 09:30 logmsgbot: oblivian gracefulled all apaches
  • 09:23 _joe_: doing a tagged run to sync apache config
  • 09:07 hashar: gallium err was July 5th and file was from a minute ago ... ignore me
  • 09:06 hashar: gallium deleted /var/lib/puppet/state/agent_catalog_run.lock from July 5th. Was preventing me to run puppet agent -tv
  • 08:02 logmsgbot: oblivian gracefulled all apaches
  • 07:52 _joe_: doing a tagged run of puppet on all appservers to sync apache config
  • 06:40 bblack: all normally-ulsfo traffic is back on ulsfo
  • 05:53 awight: edit CRM Drupal permissions
  • 03:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 10 03:46:36 UTC 2014 (duration 46m 35s)
  • 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-10 03:11:48+00:00
  • 02:49 mutante: argon,netmon1001, graceful'led apaches
  • 02:48 mutante: netmon1001 - DocumentRoot [/etc/apache2/undef] does not exist
  • 02:42 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-10 02:41:29+00:00
  • 02:38 mutante: argon,calcium,iron,rhenium,bast1001,oxygen,netmon1001 - upgraded SSL
  • 01:47 mutante: argon - Ignoring file 'puppet_base_2.7' in directory '/etc/apt/preferences.d/
  • 01:41 awight: update crm schema to wmf_civicrm 7020
  • 01:40 awight: update civicrm from 108802336e4d5f4aab9a6dbfa0ea434bddae0060 to 15cf86cb109a448f1982da9c91215eec73f28499
  • 01:38 mutante: potassium,hydrogen,search1016,nitrogen,analytics1024,chromium - upgrade SSL
  • 01:06 bblack: cleared icinga downtimes for ulsfo (we now have some traffic back there)
  • 00:50 logmsgbot: mattflaschen Synchronized php-1.24wmf11/extensions/GuidedTour/: GuidedTour cherry-pick to 1.24wmf11 in support of GettingStarted anonymous editor acquisition test (duration: 00m 09s)
  • 00:05 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/144938 (duration: 00m 04s)

July 9

  • 23:57 logmsgbot: maxsem Finished scap: SWAT, GettingStarted introduced a new message (duration: 26m 31s)
  • 23:44 mutante: deleted systemusers group on neon & mw1077 (to check it doesn't break anything)
  • 23:31 logmsgbot: maxsem Started scap: SWAT, GettingStarted introduced a new message
  • 23:22 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/GettingStarted/: (no message) (duration: 00m 03s)
  • 23:17 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/144857/ (duration: 00m 04s)
  • 22:42 mark: Enabling PAIX BGP sessions on cr2-ulsfo
  • 22:40 mark: Enabling WMF HQ BGP sessions on cr1-ulsfo
  • 22:38 mark: Enabling TiNet transit links on cr1-ulsfo
  • 22:35 mark: Enabling WMF HQ BGP sessions on cr2-ulsfo
  • 22:34 mark: Enabling NTT and HE transit links on cr2-ulsfo
  • 22:05 mutante: restarted apache on zirconium for config change
  • 20:07 subbu: deployed parsoid 1632288d
  • 18:36 logmsgbot: yurik Synchronized php-1.24wmf12/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 22s)
  • 18:29 logmsgbot: yurik Synchronized php-1.24wmf11/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 24s)
  • 17:17 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: eswiki cirrus (duration: 00m 04s)
  • 16:51 logmsgbot: csteipp Finished scap: Update CentralAuth for Global Rename (duration: 28m 46s)
  • 16:22 logmsgbot: csteipp Started scap: Update CentralAuth for Global Rename
  • 16:17 mark: ulsfo is now offline
  • 16:16 mark: Shutdown NTT BGP sessions on cr2-ulsfo
  • 16:13 mark: Shutdown TiNet BGP sessions on cr1-ulsfo
  • 16:10 mark: Shutdown IXP BGP sessions on cr2-ulsfo
  • 16:10 mark: Shutdown WMF HQ BGP sessions on cr2-ulsfo
  • 16:09 mark: Shutdown WMF HQ BGP sessions on cr1-ulsfo
  • 16:02 logmsgbot: hoo Synchronized php-1.24wmf12/extensions/Wikidata/: Update Wikibase to fix a fatal and various JS things (duration: 00m 14s)
  • 14:13 hashar: Jenkins: bringing back puppet-compiler02.eqiad.wmflabs node online. /tmp gets filled when running huge catalog compilations, which causes Jenkins to unpool the node :/
  • 13:30 godog: reboot ms-be1005, raid controller confused (?) after disk replacement
  • 12:52 godog: umounted sdg1 on ms-be1005, device disappeared, errors in dmesg
  • 12:35 bblack: enabled amssq47 text frontend cache in pybal for esams
  • 09:39 hashar: Jenkins had a bit of failure earlier due to the massive configuration update of mediawiki-core and mwext jobs. If that fails again the best thing is to stop Jenkins on gallium, wait for it to be killed or force kill -9 the java process, then restart Jenkins. Should sort it out
  • 09:30 hashar: restarted Zuul to clear out stalled items in queue
  • 09:12 hashar: Jenkins being slow because the mediawiki-core* jobs history cache has been wiped out while updating their configuration. Jenkins is busy processing the history :(
  • 09:02 hashar: Jenkins killing slave process on lanthanum. Some job is stalled and unrecoverable.
  • 08:53 godog: upgrade ms-be1013/1014/1015 (zone5) to icehouse swift
  • 08:51 hashar: Jenkins migrating jobs to use $ZUUL_URL instead of git://zuul.eqiad.wmnet Preparing to scale out Zuul merger to several nodes
  • 08:19 godog: upgrade ms-be1009/1010/1011 (zone4) to swift icehouse
  • 08:04 hashar: Jenkins: granted matanya the ability to manually trigger builds. Use case: the puppet compiler!
  • 08:02 godog: upgrade ms-be1005/1006/1007 (zone3) to swift icehouse
  • 03:37 mutante: ran puppet on neon - false puppet failure alarms
  • 02:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 9 02:54:37 UTC 2014 (duration 54m 36s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-09 02:25:33+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-09 02:14:38+00:00
  • 01:26 mutante: Bugzilla - enabled https://en.wikipedia.org/wiki/HTTP_Strict_Transport_Security
  • 00:50 mutante: restarted gitblit service

July 8

  • 22:50 mutante: radon (phab)- package and kernel upgrades, rebooting
  • 20:22 legoktm: finished running migrateAccount.php --attachbroken --attachmissing (bug 61876)
  • 20:07 legoktm: finished migrateAccount.php --safe, now starting migrateAccount.php --attachbroken
  • 20:05 mutante: restarted apache on ytterbium
  • 19:47 K4-713: updated payments fraud filters again
  • 19:47 legoktm: running migrateAccount.php --safe for accounts only existing on one wiki (bug 39817)
  • 19:27 mutante: this should have fixed all the services behind misc. varnish now getting an actual "A" rating on ssllabs
  • 19:20 mutante: arr, i meant "nginx", not varnish
  • 19:15 mutante: restarting varnish on cp1043/cp1044 (misc cluster)
  • 18:55 cmjohnson1: disconnecting serial cable from psw1-c2-eqiad
  • 18:50 csteipp: patch for bug66608 deployed to wmf11/12
  • 18:50 K4-713: updated fraud filters on payments cluster
  • 18:28 logmsgbot: reedy Synchronized robots-private.txt: (no message) (duration: 00m 14s)
  • 18:27 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 15s)
  • 18:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf12
  • 15:22 logmsgbot: reedy Purged l10n cache for 1.24wmf10
  • 15:21 logmsgbot: reedy Purged l10n cache for 1.24wmf9
  • 15:15 logmsgbot: anomie Synchronized php-1.24wmf11/extensions/Scribunto/: SWAT: Fix regression in os.date and os.time at module scope gerrit:144559 (duration: 00m 10s)
  • 15:14 logmsgbot: anomie Synchronized php-1.24wmf12/extensions/Scribunto/: SWAT: Fix regression in os.date and os.time at module scope gerrit:144511 (duration: 00m 11s)
  • 15:10 logmsgbot: anomie Synchronized php-1.24wmf11/extensions/UploadWizard/UploadWizard.config.php: SWAT: Flickr API is https-only now gerrit:144584 (duration: 00m 10s)
  • 15:04 logmsgbot: anomie Synchronized php-1.24wmf12/extensions/UploadWizard/UploadWizard.config.php: SWAT: Flickr API is https-only now gerrit:144583 (duration: 00m 10s)
  • 13:34 springle: slow transaction rollback in progress on db1001 librenms. other databases not affected, but librenms writes are timing out
  • 13:32 cmjohnson1: replacing disk 6 ms-be1005
  • 13:30 cmjohnson1: replacing disk 4 ms-be1007
  • 12:38 YuviPanda: disregard previous log message, was meant for labs
  • 12:37 YuviPanda: graphite reduced metrics count from 65k to 25k, monitoring io performance
  • 06:57 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: raise db traffic samplers to normal load (duration: 00m 06s)
  • 05:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1010, warm up (duration: 00m 06s)
  • 04:51 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1010 for upgrade (duration: 00m 06s)
  • 03:26 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 8 03:25:51 UTC 2014 (duration 25m 50s)
  • 03:00 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-08 02:59:33+00:00
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-08 02:29:00+00:00
  • 01:17 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1061, warm up (duration: 00m 06s)
  • 00:57 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1061 for upgrade (duration: 00m 07s)
  • 00:02 ^d: gerrit upgraded to 2.8.1-4-ga1048ce from 2.8.1-2-g724b796, back up. Might be slow for a bit while caches warm.

July 7

  • 23:38 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/ParserFunctions/: https://gerrit.wikimedia.org/r/#q,144510,n,z (duration: 00m 03s)
  • 23:38 logmsgbot: maxsem Synchronized php-1.24wmf12/includes/StubObject.php: https://gerrit.wikimedia.org/r/#/c/144509/ (duration: 00m 03s)
  • 23:22 logmsgbot: maxsem Synchronized visualeditor-default.dblist: (no message) (duration: 00m 03s)
  • 23:19 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/GWToolset: (no message) (duration: 00m 03s)
  • 23:18 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/GWToolset: (no message) (duration: 00m 03s)
  • 23:17 logmsgbot: maxsem Synchronized php-1.24wmf12/extensions/GWToolset: (no message) (duration: 00m 04s)
  • 23:12 logmsgbot: maxsem Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#q,144476,n,z & https://gerrit.wikimedia.org/r/#q,139569,n,z (duration: 00m 05s)
  • 23:04 logmsgbot: maxsem Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#q,144476,n,z & https://gerrit.wikimedia.org/r/#q,139569,n,z (duration: 00m 03s)
  • 22:41 logmsgbot: ori Synchronized wmf-config/mc.php: I8b66e9339: Make app servers connect to nutcracker on port 11212 (duration: 00m 03s)
  • 20:31 logmsgbot: ori Synchronized wmf-config/mc.php: Iea24b092b: Make mw1041 connect to nutcracker on port 11212 (duration: 00m 09s)
  • 20:03 subbu: deployed Parsoid 8ef7b6fe
  • 17:52 legoktm: deleted rows in centralauth's localnames and localuser tables for bug 67548
  • 17:02 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Cirrus on commons as primary (duration: 00m 04s)
  • 16:34 logmsgbot: aude Finished scap: Update Wikidata to mw1.24-wmf12 branch for group0 wikis (duration: 22m 33s)
  • 16:22 manybubbles: (Cirrus) load tested commons and eswiki over the last hour - both look fine.
  • 16:11 logmsgbot: aude Started scap: Update Wikidata to mw1.24-wmf12 branch for group0 wikis
  • 15:49 bd808: Logstash event volume looks better after restart. Probably related to bug 63490.
  • 15:32 bd808: Restarted logstash on logstash1001 because log volume looked lower than I thought it should be.
  • 15:16 cmjohnson1: reseating PEM2 cr1-eqiad
  • 15:08 godog: powercycled ms-be1007, unresponsive on console and remnants of a stack trace
  • 14:49 manybubbles: (Cirrus) Applying cache warmer configuration that went out last Thursday to all wikipedias.
  • 12:11 hashar: Jenkins job builder e1ddd23 fails for us :/ Moving back to parent commit
  • 12:09 hashar: Updated our Jenkins job builder fork 0972985..e1ddd23
  • 09:40 godog: upgrade ms-be1003/1004/1012 (zone2) to swift icehouse
  • 09:16 _joe_: restarting rhenium, pings but no ssh since 2 days, serial console is blank and unresponsive
  • 09:15 godog: upgrade ms-be1002/1008 (zone1) to swift icehouse
  • 02:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 7 02:52:10 UTC 2014 (duration 52m 9s)
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-07 02:23:48+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-07 02:12:44+00:00

July 6

  • 02:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 6 02:49:21 UTC 2014 (duration 49m 20s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-06 02:24:08+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-06 02:13:07+00:00

July 5

  • 02:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 5 02:52:04 UTC 2014 (duration 52m 3s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-05 02:26:08+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-05 02:15:05+00:00
  • 01:22 springle: ongoing osc_host.sh schema change jobs on terbium. fine to kill in an emergency

July 4

  • 20:05 hoo: Ran sync-common on fenari to update the docs on noc.wikimedia.org
  • 15:40 _joe_: restarting salt-minion, killing io hungry job on fenari running since jun 30, 00 AM
  • 12:28 akosiaris: executed dist-upgrade on virt1000. Keystone configure phase failed in keystone-manage db_sync and hence dpkg configure failed. It was trying to create an already existing index in the database. Dropped the index, ran dpkg --configure -a to recreate the index (and whatever else keystone-manage db_sync does). All is back to normal.
  • 03:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 4 03:28:29 UTC 2014 (duration 28m 28s)
  • 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf12) at 2014-07-04 03:02:49+00:00
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-04 02:32:29+00:00
  • 00:28 gwicke: deployed parsoid config change e21a534 to support VE on the OTRS wiki

July 3

  • 23:40 mutante: osmium - libboost-dev : Depends: libboost1.54-dev but it is not going to be installed
  • 23:33 mutante: rhenium (pmacct / flow) Out of memory: Kill process 3123 (pmacctd) score 1 or sacrifice child
  • 23:22 K4-713: updated payments to c5689f385b2f0a7bdc55c5752010e9eb
  • 23:17 logmsgbot: mwalker Synchronized php-1.24wmf12/extensions/VisualEditor/: Updating VisualEditor for 144081 (duration: 00m 12s)
  • 21:07 logmsgbot: oblivian gracefulled all apaches
  • 20:45 mutante: deleted analytics/kraken branch from ops/puppet via gerrit ui, ack'ed by ottomata
  • 20:12 bd808|deploy: Updated scap to ff04431 (restart-nutcracker script)
  • 19:53 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 14s)
  • 19:48 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 19:48 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 30s)
  • 19:47 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 20s)
  • 19:36 jgage: rebooting analytics1012 for bios change: cpufreq governor
  • 19:27 ottomata: disabling puppet on hadoop related analytics nodes, preparing for reinstall
  • 19:21 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 19:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf12
  • 19:12 logmsgbot: reedy Synchronized php-1.24wmf11/languages/Language.php: I039547b867b2eab47692dcc018c95b89975bc65d (duration: 00m 40s)
  • 18:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedia to 1.24wmf11
  • 18:41 logmsgbot: reedy Finished scap: testwiki to 1.24wmf12 and build l10n cache (duration: 30m 47s)
  • 18:18 ottomata: doing rolling restarts of zookeeper servers and kafka brokers to load up new zk timeout changes
  • 18:10 logmsgbot: reedy Started scap: testwiki to 1.24wmf12 and build l10n cache
  • 18:10 logmsgbot: reedy scap aborted: testwiki to 1.24wmf12 and build l10n cache (duration: 27m 26s)
  • 17:53 godog: reloading librenms, semi-broke it with a syslog search (again)
  • 17:46 godog: reloading librenms, semi-broke it with a syslog search
  • 17:42 logmsgbot: reedy Started scap: testwiki to 1.24wmf12 and build l10n cache
  • 16:38 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/EventLogging/: bug 67420 (duration: 00m 35s)
  • 16:34 paravoid: apt: uploading nutcracker backport for precise
  • 08:07 hashar: Jenkins restarted
  • 08:00 hashar: upgrading Jenkins (minor version bump 1.554.2 -> 1.554.3)
  • 03:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 3 03:38:46 UTC 2014 (duration 38m 45s)
  • 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-03 03:02:15+00:00
  • 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-07-03 02:31:35+00:00

July 2

  • 23:06 logmsgbot: maxsem Synchronized wmf-config/: (no message) (duration: 00m 07s)
  • 22:37 jgage: rebooting analytics1021 to change bios "system profile" from PPW (OS) to PPW (DAPC)
  • 22:19 logmsgbot: ebernhardson Finished scap: (no message) (duration: 36m 25s)
  • 22:16 jgage: rebooting analytics1022 to check bios cpufreq setting
  • 21:43 logmsgbot: ebernhardson Started scap: (no message)
  • 21:42 logmsgbot: ebernhardson Synchronized php-1.24wmf10/extensions/Mantle/: Sync new Mantle extension in 1.24wmf10 (duration: 00m 20s)
  • 21:40 robh: blog updated to newest release, no downtime
  • 21:38 jgage: rebooting analytics1021 to check bios cpufreq setting
  • 20:56 paravoid: pfw1-eqiad: s/mchenry/lead/; all smtp_out rules have [ polonium lead ] as destination-address now
  • 20:49 paravoid: switching non-wikimedia.org MX to polonium/lead (from polonium/mchenry)
  • 20:16 cscott: updated Parsoid to version 6afcb8df
  • 19:08 logmsgbot: yurik Synchronized php-1.24wmf11/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal - bug fix (duration: 01m 40s)
  • 19:00 logmsgbot: yurik Synchronized php-1.24wmf10/extensions/: update to JsonConfig, ZeroBanner, ZeroPortal - bug fix (duration: 01m 15s)
  • 18:42 logmsgbot: yurik Synchronized php-1.24wmf10/extensions/: Reverting previous update to JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 20s)
  • 18:37 logmsgbot: yurik Synchronized php-1.24wmf10/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 55s)
  • 18:33 logmsgbot: yurik Synchronized php-1.24wmf11/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 38s)
  • 18:33 paravoid: reprepro include, trusty-wikimedia (main/universe): nutcracker, libicu 4.8, libzip 0.11, hhvm, {php,hhvm}-wikidiff2, {php,hhvm}-fss, {php,hhvm}-luasandbox, ffmpeg2theora
  • 18:28 yurikR2: yurik ^ was a noop - comment fix
  • 18:28 logmsgbot: yurik Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 04s)
  • 18:26 logmsgbot: yurik Synchronized docroot/bits/WikipediaMobileFirefoxOS/: (no message) (duration: 01m 03s)
  • 16:30 mutante: upgrading jenkins to jenkins_1.554.3_all.deb on the apt repo
  • 15:19 manybubbles: done with SWAT for real this time
  • 15:17 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 28s)
  • 15:16 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 18s)
  • 15:08 manybubbles: *SWAT* complete
  • 15:07 manybubbles: swap complete - logged off of tin
  • 15:04 logmsgbot: manybubbles Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 05s)
  • 15:04 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 06s)
  • 15:00 logmsgbot: manybubbles Synchronized wmf-config: SWAT Remove two permissions from some editors on ruwiki (duration: 00m 07s)
  • 13:10 hashar: Jenkins being busy deleting history files
  • 13:02 hashar: Jenkins: dropping history of puppet related jobs after 90 days. 136992
  • 12:18 akosiaris: upgraded PHP5 to 5.3.10-1ubuntu3.12+wmf1 on deployment-apache01 and deployment-apache02 (beta)
  • 12:09 akosiaris: upgraded PHP5 to 5.3.10-1ubuntu3.12+wmf1 on test.wikipedia.org
  • 11:05 logmsgbot: hashar Synchronized wmf-config/InitialiseSettings.php: additional upload domain for Erasmus University 143593 bug 67355 (duration: 00m 06s)
  • 08:00 godog: upgrading ms-be1001 to swift icehouse
  • 07:45 godog: umounted (empty and broken) sdk1 from ms-be3003 and wipe its first sectors, no more remounts
  • 03:00 paravoid: rebooting lead
  • 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 2 02:56:33 UTC 2014 (duration 56m 32s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-02 02:26:24+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-07-02 02:14:53+00:00

July 1

  • 23:49 K4-713: updated payments cluster to c5689f385b2f0a7
  • 23:43 robh: any francium errors can be ignored, as the software doesn't fully deploy from puppet and it's not in service
  • 23:36 logmsgbot: maxsem Synchronized wmf-config/FeaturedFeedsWMF.php: https://gerrit.wikimedia.org/r/#/c/136316/ now for realz (duration: 00m 04s)
  • 23:34 logmsgbot: maxsem Synchronized wmf-config/FeaturedFeedsWMF.php: https://gerrit.wikimedia.org/r/#/c/136316/ (duration: 00m 04s)
  • 23:09 logmsgbot: maxsem Synchronized php-1.24wmf11/extensions/CentralAuth/: https://gerrit.wikimedia.org/r/#/c/143473/ (duration: 00m 05s)
  • 23:05 logmsgbot: maxsem Synchronized php-1.24wmf11/resources/: https://gerrit.wikimedia.org/r/#/c/142975/ (duration: 00m 05s)
  • 23:04 logmsgbot: maxsem Synchronized php-1.24wmf10/resources/: https://gerrit.wikimedia.org/r/#/c/142975/ (duration: 00m 19s)
  • 21:39 hoo: Set email for re-renamed dewiki account "Kolimak". Email and password got lost during a screwed rename.
  • 20:36 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/WikimediaMessages: bug 67387 (duration: 00m 15s)
  • 20:31 mutante: restarting apache on mw1217
  • 20:27 manybubbles: Adding cache warmers to all Cirrus indexes for group1 wikis with more than one shard except commons (commons is busy, it'll have to wait:)
  • 19:53 logmsgbot: aude Synchronized wmf-config/Wikibase.php: adjust property suggester setting for wikidata (duration: 00m 11s)
  • 19:14 logmsgbot: ori Synchronized php-1.24wmf10/resources/src/jquery.ui-themes/vector/jquery.ui.core.css: Ib09928248: vector/jquery.ui.core.css: Update rule for .ui-helper-hidden-accessible (bug 67243) (duration: 00m 05s)
  • 19:14 logmsgbot: ori Synchronized php-1.24wmf11/resources/src/jquery.ui-themes/vector/jquery.ui.core.css: Ib09928248: vector/jquery.ui.core.css: Update rule for .ui-helper-hidden-accessible (bug 67243) (duration: 00m 06s)
  • 18:42 andrewbogott: adding virt1008 to labs compute pool
  • 18:41 andrewbogott: switching puppet canary from virt1008 to virt1009
  • 18:38 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable property suggester on Wikidata (duration: 00m 10s)
  • 18:38 logmsgbot: aude Synchronized wmf-config/Wikibase.php: (no message) (duration: 00m 15s)
  • 18:30 logmsgbot: aaron Synchronized wmf-config/PrivateSettings.php: removed obsolete swift tampa config (duration: 00m 07s)
  • 18:15 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 18s)
  • 18:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf11
  • 17:54 logmsgbot: demon Synchronized php-1.24wmf10/extensions/Elastica: Updating to master, fixes fatal error (duration: 00m 07s)
  • 17:45 manybubbles: rebuilding cirrus index for commons to put it into fewer shards - it should be faster this way
  • 17:24 mutante: antimony: git.wikimedia.org]: Ensure set to :present but file type is link so no content will be synced
  • 17:24 logmsgbot: hoo Synchronized wmf-config/: Typos typos typso (duration: 00m 08s)
  • 17:21 mutante: restarting apache on antimony
  • 17:21 mutante: fixing svn.wikimedia.org apache site manually
  • 17:08 springle: restarted mysqld on db1046 m2 slave
  • 17:03 logmsgbot: demon Synchronized cirrus.dblist: Move remaining pool 4 lsearchd wikis (except commons) to Cirrus (duration: 00m 07s)
  • 15:09 manybubbles: done with SWAT deploy
  • 15:06 logmsgbot: manybubbles Synchronized php-1.24wmf11/extensions/CirrusSearch/: SWAT code to set up cache warmers (duration: 00m 05s)
  • 15:04 logmsgbot: manybubbles Synchronized wmf-config: SWAT - cirrus settings - cache warmers and shard counts (duration: 00m 06s)
  • 15:04 ottomata: temporarily disabling puppet on hafnium to test an eventlogging alert
  • 14:27 hashar: Stopping Jenkins it has some corrupted threads
  • 13:16 Jeff_Green: dist-upgrade and reboot tellurium
  • 13:08 Jeff_Green: dist-upgrade and reboot boron
  • 12:23 logmsgbot: reedy Synchronized multiversion/: (no message) (duration: 00m 23s)
  • 12:22 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 18s)
  • 12:16 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings-labs.php: (no message) (duration: 00m 20s)
  • 12:03 logmsgbot: reedy Synchronized wmf-config/interwiki.cdb: (no message) (duration: 00m 13s)
  • 12:00 Reedy: Manually created Echo tables on extension1
  • 11:55 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 13s)
  • 11:55 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 28s)
  • 11:53 Reedy: Manually created wikimania2015wiki database on 10.64.16.18
  • 11:48 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 11:48 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 14s)
  • 11:47 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 16s)
  • 10:49 _joe_: nginx restarted on all ulsfo hosts as well, we should be PFS-enabled now
  • 10:38 _joe_: esams restart finished, moving to ulsfo
  • 10:30 _joe_: all eqiad SSL terminators are now PFS enabled. Moving to rolling restarting esams
  • 10:09 _joe_: restarting nginx on ssl100* servers in sequence, to activate PFS
  • 08:47 godog: ms-be3003 sdk1 disk to 0 weight
  • 07:22 legoktm: finished running checkLocalNames.php and checkLocalUser.php for some wikivoyages to clean up bug 66535
  • 07:16 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1067 (duration: 00m 12s)
  • 07:06 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1067 during schema changes (duration: 00m 06s)
  • 06:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1063 (duration: 00m 06s)
  • 06:40 legoktm: starting to run checkLocalNames.php and checkLocalUser.php for some wikivoyages to clean up bug 66535
  • 06:34 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1063 during schema changes (duration: 00m 06s)
  • 06:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1060 (duration: 00m 06s)
  • 06:29 legoktm: ran fixInvalidStudent.php --wiki=enwiki --courseId=359 for bug 66624
  • 06:13 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1060 during schema changes (duration: 00m 07s)
  • 02:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 1 02:50:05 UTC 2014 (duration 50m 4s)
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-07-01 02:23:49+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-07-01 02:14:11+00:00

June 30

  • 23:03 awight: update tools from e894f1f77674b6b101ae0e1644e363ca52e319d9 to d605bdc2aaaef2d4b296a4d9567ed2831db86756
  • 23:02 logmsgbot: ori Synchronized wmf-config: Iba41a37a1: Keep thumbnail guessing enabled (duration: 00m 05s)
  • 22:14 mutante: re-enabled puppet on caesium
  • 21:43 mutante: disabling puppet on caesium
  • 21:35 Reedy: running mwscript updateSpecialPages.php --wiki=enwiki --only=Mostlinkedtemplates --override on terbium
  • 21:25 mutante: fixing releases.wikimedia.org Apache site, delete sites-enabled file broken by puppet, add symlink, graceful
  • 21:00 subbu: deployed parsoid 0b365d516
  • 19:44 _joe_: restarting pybal on lvs1005
  • 19:16 awight: updated payments from a04e536b6923f2228bb7f5fbf2caeed64a888742 to 2b6c527617dcde154cc298dd9697c9d57c9f3620
  • 18:41 awight: updated payments from a8138fefd940ba41812e5c07ca6bc74b63cb9bcf to a04e536b6923f2228bb7f5fbf2caeed64a888742
  • 17:38 manybubbles: Cirrus reindex update! all wikipedias finished their in place reindex except ruwiki - that one is running now. all group1 wikis finished their from mediawiki reindex except commons and mgwiktionary which are running now. started from mediawiki reindex of all wikipedias except for enwiki, itwiki, and cawiki which are already long done.
  • 17:12 logmsgbot: manybubbles Synchronized cirrus.dblist: Enabled CirrusSearch as the default search backend on 30 more wikis - take five (duration: 00m 04s)
  • 17:08 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis - take four (duration: 00m 04s)
  • 17:08 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis - for real for real (duration: 00m 04s)
  • 17:07 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis - for real (duration: 00m 04s)
  • 17:05 logmsgbot: manybubbles Synchronized wmf-config/: Enable CirrusSearch as the default search backend on 30 more wikis (duration: 00m 05s)
  • 15:43 logmsgbot: manybubbles Synchronized php-1.24wmf11/extensions/Wikidata/: (no message) (duration: 00m 09s)
  • 15:35 logmsgbot: manybubbles Synchronized php-1.24wmf11/extensions/VisualEditor/: SWAT Correctly VisualEditor - update full size in MediaSizeWidget (duration: 00m 07s)
  • 15:26 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - disable local uploads on Malay Wiktionary (duration: 00m 04s)
  • 15:23 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - remove completed mediaviewer surveys (duration: 00m 04s)
  • 15:19 _joe_: restarted profiler-to-carbon, stuck since _9_ days, will see that my patch gets deployed.
  • 15:15 logmsgbot: manybubbles Synchronized php-1.24wmf10/extensions/ProofreadPage: SWAT - fix ProofreadPage number of pages (duration: 00m 09s)
  • 14:48 godog: installed new swift ring on esams, decrease ms-be3003/sdk1 weight
  • 14:41 hoo: Cleared out a watchlist with 126652 entries on warwiki to resolve https://bugzilla.wikimedia.org/show_bug.cgi?id=67123
  • 13:31 godog: upgrade ms-fe300[12] to swift icehouse
  • 10:20 hashar: restarting zuul after a puppet change for /etc/zuul/zuul.conf
  • 07:53 godog: upgrading ms-be300[2-4] to swift icehouse
  • 02:49 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 30 02:48:28 UTC 2014 (duration 48m 27s)
  • 02:28 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1061, warm up (duration: 00m 07s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-30 02:23:56+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-30 02:14:23+00:00
  • 01:41 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1061 during schema changes (duration: 00m 07s)

June 29

  • 22:26 hoo: Manually cleared a watchlist on shwikt with 819846 entries, see https://bugzilla.wikimedia.org/show_bug.cgi?id=67123#c7
  • 22:10 hoo: Manually cleared a watchlist with 289436 entries, see https://bugzilla.wikimedia.org/show_bug.cgi?id=67123#c5
  • 16:44 hoo: Jenkins/ Zuul not reacting for at least half an hour now
  • 16:43 awight: update tools from 3a35482ab1fede2ccfcc49a64ec661b0cb013b81 to e894f1f77674b6b101ae0e1644e363ca52e319d9
  • 16:09 awight: updated payments from 6d74002f2634f41f7038daa7357ff6de55ee4880 to a8138fefd940ba41812e5c07ca6bc74b63cb9bcf
  • 02:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 29 02:44:35 UTC 2014 (duration 44m 34s)
  • 02:22 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-29 02:21:01+00:00

June 28

  • 17:16 ori: restarted lucene on search1016 per _joe_
  • 12:58 manybubbles: Cirrus reindex status: enwiki has almost finished its in place reindex, alphabetical wikipedias are at frwiki, all group1 wikis have finished their in place reindex. all group1 wikis are running from mediawiki reindex. itwiki and cawiki both finished both the in place and from mediawiki reindex. Haven't started alphabetical from mediawiki reindex yet for wikipedias. that is the only
  • 10:40 _joe_: restarting lucene on search1015, stuck. again.
  • 02:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 28 02:46:49 UTC 2014 (duration 46m 48s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-28 02:24:12+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-28 02:15:38+00:00

June 27

  • 23:15 awight: deployed payments config
  • 22:57 logmsgbot: csteipp Synchronized php-1.24wmf11/extensions/OAuth/frontend/specialpages/SpecialMWOAuth.php: Fix OAuth Logins for wmf11 (duration: 00m 18s)
  • 20:57 awight: updated crm from 340c43a15a84a9392ad5ef9fc2782243ff140deb to 17439326ca4488ece843a263fc14859b38cff0e9
  • 19:33 hashar: puppet-compiler: removed modules/varnish at root@puppet-compiler02:/opt/wmf/software/compare-puppet-catalogs/external/puppet and reset repo.
  • 19:07 awight: update crm from e2fe03a9cd51e30206d9a1114d62dfbd6960816b to 340c43a15a84a9392ad5ef9fc2782243ff140deb
  • 18:57 logmsgbot: aaron Synchronized wmf-config/PoolCounterSettings-eqiad.php: Pre-set FileRenderExpensive config
  • 18:34 bblack: updated puppet repo on virt0
  • 18:11 mutante: osmium - hhvm : Depends: libdouble-conversion1 but it is not going to be installed
  • 16:49 bblack: updated carbon repo varnish pkg to 3.0.5plus~x-wm6
  • 14:18 hashar: Updated our Jenkins Job Builder fork: e9db73d..0972985
  • 03:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 27 03:30:00 UTC 2014 (duration 29m 59s)
  • 03:06 logmsgbot: LocalisationUpdate completed (1.24wmf11) at 2014-06-27 03:05:31+00:00
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-27 02:35:20+00:00

June 26

  • 23:32 manybubbles: Cirrus rebuild progress - started large/high cirrus visibility wikis in group2 - enwiki, cawiki, and itwiki.
  • 23:31 manybubbles: Cirrus rebuild progress - alphabetical wikis in group2 are 2/3 of the way done with reindex - from mediawiki rebuild is maybe 20% done there
  • 23:31 manybubbles: Cirrus rebuild progress - big wikis in group1 are finished with in place reindex and well into from mediawiki rebuild.
  • 23:27 ori: Previous scap included I2cfcfaf06 as well
  • 23:23 logmsgbot: ori Finished scap: CirrusSearch updates: Iefe340729, Ie12418e54, Ie21fb352 (duration: 04m 59s)
  • 23:18 logmsgbot: ori Started scap: CirrusSearch updates: Iefe340729, Ie12418e54, Ie21fb352
  • 23:07 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Ie96265c4f: Add an Erasmus University domain to whitelist (duration: 00m 05s)
  • 23:07 logmsgbot: ori updated /a/common to Ie96265c4f: Add an Erasmus University domain to whitelist
  • 21:51 hashar: Zuul/Jenkins back up and operational.
  • 21:43 hashar: hardkilled Zuul :-( 6 events lost.
  • 21:38 hashar: restarting Zuul it has a bunch of stalled changes
  • 21:32 bblack: enabled cp301[78] frontends in pybal
  • 21:27 hashar: restarting Jenkins
  • 21:26 hashar: Zuul/Jenkins stalled apparently
  • 20:59 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Enable property suggester on testwikidata (duration: 00m 07s)
  • 20:58 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/WikimediaMessages/: (no message) (duration: 00m 15s)
  • 20:57 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/OAuth/: (no message) (duration: 00m 15s)
  • 20:48 logmsgbot: aude Finished scap: Update Wikidata, for enabling property suggester on testwikidata (duration: 31m 57s)
  • 20:16 logmsgbot: aude Started scap: Update Wikidata, for enabling property suggester on testwikidata
  • 19:18 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 19:14 logmsgbot: reedy Synchronized php-1.24wmf11/extensions/OAuth/: (no message) (duration: 00m 14s)
  • 19:06 RobH: blog is back online after a number of reboots due to raid rebuild issues
  • 18:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf11
  • 18:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf10
  • 18:15 logmsgbot: reedy Synchronized php-1.24wmf10/includes/api/ApiQueryRecentChanges.php: Id9c316733896a27ce3f6c3e0e5efdf62f7d1ff1b (duration: 00m 14s)
  • 18:08 ottomata: starting new elasticsearch nodes 1017,1018,1019
  • 18:04 RobH: aware of holmium issue (old varnish), in process of repair, blog is down
  • 17:05 logmsgbot: reedy Synchronized php-1.24wmf11/resources/Resources.php: I1237909d7e058137d55e5de9fa4d64fe1f7f9472 (duration: 00m 14s)
  • 17:04 logmsgbot: reedy Finished scap: l10nupdate for 1.24wmf11 for Skins I9395b0e1983122b12bedf003d6398da5ddfd5651 (duration: 16m 35s)
  • 16:48 logmsgbot: reedy Started scap: l10nupdate for 1.24wmf11 for Skins I9395b0e1983122b12bedf003d6398da5ddfd5651
  • 16:46 logmsgbot: reedy Purged l10n cache for 1.24wmf4
  • 16:45 logmsgbot: reedy Purged l10n cache for 1.24wmf5
  • 16:45 logmsgbot: reedy Purged l10n cache for 1.24wmf6
  • 16:44 logmsgbot: reedy Purged l10n cache for 1.24wmf7
  • 16:44 logmsgbot: reedy Purged l10n cache for 1.24wmf8
  • 16:32 logmsgbot: reedy Finished scap: testwiki to 1.24wmf11 and build l10n cache (duration: 27m 20s)
  • 16:05 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
  • 16:01 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.XetXfk5RPi" ' returned non-zero exit status 1 (duration: 00m 18s)
  • 16:00 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
  • 15:56 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.9SaYNRzegr" ' returned non-zero exit status 1 (duration: 00m 24s)
  • 15:55 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
  • 15:55 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.EjEynr9oww" ' returned non-zero exit status 1 (duration: 00m 55s)
  • 15:54 logmsgbot: reedy Started scap: testwiki to 1.24wmf11 and build l10n cache
  • 15:24 cmjohnson1: shutting down holmium to replace disk
  • 14:35 bblack: restarted nova-network on labnet1001
  • 14:26 hashar: updated zuul cloner in git repo and deployed zuul ( tag is wmf-deploy-20140626-1 )
  • 13:54 godog: remounted (broken) sdk1 on ms-be3003
  • 13:32 cmjohnson1: powering down dataset1001 -relocating to 10G rack
  • 13:26 logmsgbot: reedy Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/142142/ No-op for labs (duration: 00m 16s)
  • 12:55 hashar: Jenkins: updates jobs for extensions (phpunit and qunit) to use the mw-run-update-script.sh instead of update.php . That runs update.php twice, the first time logging sql to a file that can be archived. 141851
  • 12:48 mark: Deactivated BGP session to AS13030
  • 11:01 hashar: Replacing operations-puppet-validate job with operations-puppet-pplint-HEAD which is faster and can run concurrently on multiple boxes. 142223
  • 10:52 godog: stopping swift on ms-be3003
  • 10:12 godog: upgrading ms-be3001 to swift icehouse
  • 06:26 springle: ran operations/software maintain-replicas.pl and fedtables.pl on labsdbs for bug 59683
  • 05:54 Tim: on mw1014: reformatted the /tmp partition
  • 05:50 Tim: on mw1014: stopped job runner due to bad /tmp
  • 04:44 ori: mw1014 is sad, has filesystem issues: "Attempt to read block from filesystem resulted in short read while trying to open /tmp". Puppet can't run. Should be depooled.
  • 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 26 03:33:19 UTC 2014 (duration 33m 18s)
  • 03:02 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-26 03:01:43+00:00
  • 02:32 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-26 02:31:50+00:00

June 25

  • 23:43 awight: updated crm from f3389daa94e9ad924175bdf0d5bc09c4a26aeb8c to e2fe03a9cd51e30206d9a1114d62dfbd6960816b
  • 23:27 logmsgbot: catrope Finished scap: Updating Wikidata and TimedMediaHandler (duration: 04m 24s)
  • 23:23 logmsgbot: catrope Started scap: Updating Wikidata and TimedMediaHandler
  • 21:22 hashar: puppet fixed on gallium / lanthanum . It was missing a group definition. All fixed! Thanks Chase.
  • 20:53 hashar: puppet broken on gallium.wikimedia.org and lanthanum.eqiad.wmnet . That is being looked at.
  • 20:34 subbu: deployed parsoid 4ef9d6be
  • 19:38 manybubbles: restarted Cirrus scripts after incident - the index rebuilds had to be completely restarted - sanity checking was simply paused
  • 18:54 logmsgbot: yurik Synchronized wmf-config/PrivateSettings.php: Removed obsolete ZRMA user/pswd (duration: 01m 06s)
  • 18:46 logmsgbot: yurik Finished scap: Removing ZeroRatedMobileAccess ext settings, depl latest JsonConfig/ZeroBanner/Portal (duration: 29m 09s)
  • 18:17 logmsgbot: yurik Started scap: Removing ZeroRatedMobileAccess ext settings, depl latest JsonConfig/ZeroBanner/Portal
  • 17:41 logmsgbot: demon Synchronized wmf-config/: Cirrus back on for wikis that had it before. Back to square 1 (duration: 00m 04s)
  • 17:29 mwalker: updating fundraising tools from 5f3a7316b636c0723ce3fa353186d4041b662872 to cdc4b73bd59d27c8d386b6df629b1c574cfed85f
  • 17:06 manybubbles: success!
  • 17:06 logmsgbot: manybubbles Synchronized wmf-config/: try to fix cirrus (duration: 00m 04s)
  • 16:51 andrewbogott: restarted apache on palladium -- _that_ helped
  • 16:49 andrewbogott: it didn't help
  • 16:49 andrewbogott: restarting puppetmaster on palladium
  • 16:42 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Disable Cirrus everywhere but testwiki (duration: 00m 04s)
  • 16:23 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Roll back previous Cirrus deploy (duration: 00m 05s)
  • 16:23 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: Roll back previous Cirrus deploy (duration: 00m 04s)
  • 16:16 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: I4c54357a: most remaining wikis getting Cirrus as primary (duration: 00m 04s)
  • 16:16 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: I4c54357a: most remaining wikis getting Cirrus as primary (duration: 00m 04s)
  • 15:27 logmsgbot: manybubbles Synchronized php-1.24wmf9/extensions/Wikidata/: SWAT - fix rtl issue (duration: 00m 10s)
  • 15:23 ottomata: reinstalling elastic1017,1018,1019
  • 15:20 logmsgbot: manybubbles Synchronized php-1.24wmf10/extensions/Wikidata/: SWAT - fix rtl issue (duration: 00m 12s)
  • 14:10 Krinkle: Upgrade npm from v1.4.5 to v1.4.16 on integration-slave1001 and integration-slave1002
  • 14:10 Krinkle: Upgraded npm from v1.4.13 to v1.4.16 on integration-slave1003 to fix https://github.com/npm/npm/issues/5472 and repooling
  • 13:30 Krinkle: Depooling integration-slave1003 as almost every other -npm build on this node fails due to corrupted ~/.npm cache
  • 12:52 manybubbles: cirrus rebuild update: starting from mediawiki reindex step for all alphabetical wikis that have finished so far
  • 12:48 manybubbles: cirrus rebuild update: started rebuilding group1's indexes yesterday. commons and wikidata finished their in place pass and started their from mediawiki pass. The remaining wikis are running their in place pass in alphabetical order and currently on frwiktionary.
  • 12:25 hashar: Upgraded Zuul 9839edb..b7fc126 Brings patchset 20 of Zuul cloner ( https://review.openstack.org/#/c/70373/ )
  • 12:02 akosiaris: upgraded etherpad.wikimedia.org to etherpad-lite 1.4.0
  • 11:12 paravoid: switching inbound email for wikimedia.org to polonium/mchenry
  • 10:35 _joe_: restarted lucene on search1016 as it was stuck there as well, once search1015 is up and running
  • 10:06 _joe_: restarted lucene on search1015, it was stuck
  • 07:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: incremental LB bump on db1009 and db1021 traffic samplers (duration: 00m 07s)
  • 06:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1021 with traffic sampling (duration: 00m 09s)
  • 06:01 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1021, db1049 to normal load (duration: 00m 07s)
  • 05:05 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1049, warm up (duration: 00m 08s)
  • 02:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 25 02:48:53 UTC 2014 (duration 48m 52s)
  • 02:39 springle: xtrabackup clone db1005 to db1049
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-25 02:25:57+00:00
  • 02:19 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1049 (duration: 00m 11s)
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-25 02:13:27+00:00
  • 00:57 chasemp: added dns for wikimania 2015 (gerrit 140186)

June 24

  • 23:28 ori: apache-graceful-all was for Ifc9596cc7
  • 23:28 logmsgbot: ori gracefulled all apaches
  • 23:12 logmsgbot: maxsem Synchronized visualeditor.dblist: https://gerrit.wikimedia.org/r/141702 (duration: 00m 03s)
  • 23:11 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MultimediaViewer: (no message) (duration: 00m 05s)
  • 23:02 ori: apache graceful done by me for I543efda24, I29b34689e, and I1c269433e
  • 23:00 logmsgbot: root gracefulled all apaches
  • 20:53 hashar: Jenkins / Zuul deploying experimental pipeline 141827
  • 20:29 RoanKattouw: Restarting Apache on mw1220, getting lots of "Unable to allocate memory for pool" errors
  • 20:29 ottomata: rebooting analytics1021
  • 20:25 ottomata: reinitializing varnish topics with replication factor of 3
  • 20:02 hashar: updated our Jenkins Job Builder copy 416ee7d..e9db73d
  • 19:58 hashar: Upgraded Zuul on gallium.wikimedia.org to install the zuul-cloner of doom. 4f9fd51..9839edb Tagged wmf-deploy-20140624-1 in our repo.
  • 19:39 manybubbles: rebuilding search index for group1 wikis after upgrade today
  • 18:27 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
  • 18:25 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf10
  • 17:52 logmsgbot: manybubbles Synchronized wmf-config: Drop Cirrus indexes to five shards on rebuild and switch all wikis to new highlighter (duration: 00m 04s)
  • 17:44 logmsgbot: aaron Synchronized wmf-config/InitialiseSettings.php: Maintenance reports limit incremental increase (duration: 00m 08s)
  • 17:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: reduce db1021 load (duration: 00m 10s)
  • 17:06 akosiaris: restarted hadoop yarn on analytics1013
  • 15:36 bblack: VCL compilation is now in-sync everywhere but bits caches...
  • 15:21 logmsgbot: manybubbles Synchronized php-1.24wmf9/extensions/CirrusSearch/: SWAT Stop Cirrus from breaking RandomRootPage (duration: 00m 06s)
  • 15:15 logmsgbot: manybubbles Synchronized php-1.24wmf10/extensions/CirrusSearch/: SWAT Stop Cirrus from breaking RandomRootPage (duration: 00m 04s)
  • 15:06 logmsgbot: manybubbles Synchronized wmf-config/: SWAT - visual editor config changes and retire some beta features (duration: 00m 04s)
  • 15:05 logmsgbot: manybubbles Synchronized visualeditor-default.dblist: SWAT - Enable VisualEditor by default on Wikimania 2014 wiki (duration: 00m 04s)
  • 15:05 logmsgbot: manybubbles Synchronized visualeditor.dblist: SWAT - Enable VisualEditor by default on Wikimania 2014 wiki (duration: 00m 06s)
  • 15:03 logmsgbot: manybubbles Synchronized php-1.24wmf10/includes/config/GlobalVarConfig.php: SWAT - GlobalVarConfig should not throw exceptions for null-valued config settings (duration: 00m 05s)
  • 14:53 logmsgbot: hoo Synchronized wmf-config/CommonSettings.php: Enable Wikibase property suggester on beta (duration: 00m 07s)
  • 14:15 hashar: Jenkins: set SMTP server to wiki-mail.wikimedia.org; smtp.pmtpa.wmnet got deleted
  • 14:07 hashar: Jenkins is back
  • 13:59 Krinkle: Build logs in Jenkins incorrectly render ansi color codes since it was upgraded to 0.4.0. Downgrading to 0.3.1 and restarting Jenkins.
  • 09:55 godog: removing old salt master cache on palladium, moved yesterday out of the way
  • 06:59 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1049 (duration: 00m 08s)
  • 06:23 Nemo_bis: FYI no gerrit mail since yesterday 15 UTC, https://bugzilla.wikimedia.org/show_bug.cgi?id=67018
  • 02:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 24 02:47:14 UTC 2014 (duration 47m 13s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-24 02:25:43+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-24 02:13:38+00:00

June 23

  • 23:12 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/141102/ (duration: 00m 06s)
  • 23:12 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/141102/ (duration: 00m 05s)
  • 23:03 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/140897/ (duration: 00m 04s)
  • 20:05 subbu: deployed parsoid 392435a2 (deploy sha db94f88c)
  • 19:22 hashar: gallium / zuul : deleting /var/lib/zuul/git old Zuul repositories. They have been migrated to /srv/ssd/zuul/git/ ages ago
  • 19:20 jgage: ms-be3003 full root partition fixed, swift had written to /srv/swift-storage/sdk1 onto root due to unmounted sdk1
  • 17:38 bblack: lvs1005:eth3 was negotiated to 100mbps (???) - disable -> enable on switch fixed it
  • 17:36 godog: restarted salt-master on palladium, suspected job cleanup stuck
  • 17:04 bd808: Fixed dangling symlink for /etc/apache2/sites-enabled/logstash.wikimedia.org on logstash1001 by deleting symlink and forcing puppet run
  • 16:49 godog: added mw1149-52 back to pybal apache
  • 16:33 paravoid: switched inbound mail for all non-wikimedia.org domains from mchenry/sodium to polonium/mchenry (~16:00 + <= 1h TTL UTC)
  • 15:13 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Add a Library of Congress domain to wgCopyUploadsDomains gerrit:141308 (duration: 00m 14s)
  • 15:11 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Adjust group rights on ruwiki gerrit:140910 (duration: 00m 14s)
  • 15:10 logmsgbot: anomie Synchronized php-1.24wmf9/includes/api/ApiExpandTemplates.php: SWAT: Fix fatal in API action=expandtemplates with Scribunto gerrit:141416 (duration: 00m 15s)
  • 15:04 logmsgbot: anomie Synchronized php-1.24wmf10/includes/api/ApiExpandTemplates.php: SWAT: Fix fatal in API action=expandtemplates with Scribunto gerrit:141417 (duration: 00m 14s)
  • 14:55 andrewbogott: reenabling puppet on labstore1001, hoping it doesn't break labs
  • 14:38 hashar: Further upgraded Zuul up to upstream b8c24ce + our local hacks. Git tag is wmf-deploy-20140623-4
  • 14:14 hashar: upgraded Zuul by one commit (that introduces swift support, though it is disabled on our setup via a custom hack)
  • 13:20 paravoid: switching outbound email to polonium
  • 12:17 manybubbles: rebuilding Cirrus index on group0 wikis to pick up changes like results boosting from categories and wikitext search
  • 10:37 godog: powering down maerlant, decom-med
  • 10:05 godog: hardreset maerlant, stuck on console and no ssh
  • 09:40 paravoid: killing sodium's lighttpd compress cache
  • 07:21 _joe_: powercycled cp4018, stuck with a blank console
  • 02:59 springle: moving lighttpd compressed archives on sodium off / to regain inodes
  • 02:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 23 02:45:24 UTC 2014 (duration 45m 23s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-23 02:25:53+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-23 02:13:53+00:00
  • 00:38 legoktm: mail is stuck, lots of mails queued in exim

June 22

  • 22:25 _joe_: restarted apache on strontium, passenger crashed (again).
  • 21:06 logmsgbot: hoo Synchronized wmf-config/InitialiseSettings-labs.php: For cluster consistency... (duration: 00m 08s)
  • 19:24 godog: silenced LVS healthcheck on rendering.svc until 23:23 UTC
  • 02:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 22 02:41:30 UTC 2014 (duration 41m 29s)
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-22 02:23:50+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-22 02:12:50+00:00

June 21

  • 16:12 _joe_: restarted ms-be1012, see http://paste.debian.net/106247/ for console output
  • 02:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 21 02:45:17 UTC 2014 (duration 45m 16s)
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-21 02:28:59+00:00
  • 02:18 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-21 02:17:18+00:00

June 20

  • 22:58 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: Load nostalgia from skins rather than extensions when it exists (duration: 00m 04s)
  • 20:23 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: wfGetIP removal, code cleanup (duration: 00m 04s)
  • 20:22 logmsgbot: demon Synchronized wmf-config/throttle.php: wfGetIP removal, code cleanup (duration: 00m 05s)
  • 17:11 godog: expanded palladium's root to avoid filling up, suspected salt-master (RT #7721)
  • 16:53 bd808: Ran /usr/local/bin/sync-common on fenari to verify fix for bug 66844. It works!
  • 15:16 logmsgbot: reedy Synchronized wmf-config/extension-list-labs: (no message) (duration: 00m 16s)
  • 11:00 _joe_: restarted apache on palladium, passenger was dead and filling error logs
  • 03:35 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 20 03:34:06 UTC 2014 (duration 34m 5s)
  • 03:19 logmsgbot: LocalisationUpdate completed (1.24wmf10) at 2014-06-20 03:18:36+00:00
  • 02:35 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-20 02:34:14+00:00
  • 00:06 MaxSem: Running clearMessageBlobs.php

June 19

  • 23:52 MaxSem: that was a touch
  • 23:51 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MultimediaViewer/: (no message) (duration: 00m 04s)
  • 23:38 logmsgbot: maxsem Finished scap: Mark Traceur made me do it! (duration: 15m 14s)
  • 23:23 logmsgbot: maxsem Started scap: Mark Traceur made me do it!
  • 23:20 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MobileFrontend/: (no message) (duration: 00m 03s)
  • 23:19 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: (no message) (duration: 00m 04s)
  • 23:18 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/CirrusSearch/: (no message) (duration: 00m 03s)
  • 23:18 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/VisualEditor/: (no message) (duration: 00m 04s)
  • 23:16 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/VisualEditor/: (no message) (duration: 00m 04s)
  • 23:14 bd808: Restarted logstash service on logstash1001
  • 23:06 logmsgbot: maxsem Synchronized php-1.24wmf10/extensions/MultimediaViewer/: (no message) (duration: 00m 05s)
  • 23:06 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MultimediaViewer/: (no message) (duration: 00m 05s)
  • 22:52 bd808: Updated scap to 792a572
  • 21:21 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Set wmgMediaViewerBeta to false everywhere (duration: 00m 15s)
  • 21:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf10
  • 21:07 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf9 take 2
  • 19:31 mutante: started mysql on pc1002
  • 19:17 MatmaRex: <RobH> powercycled pc1002
  • 19:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias back to 1.24wmf8
  • 19:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf9
  • 19:00 logmsgbot: reedy Finished scap: scap 1.24wmf10 take 2... (duration: 22m 59s)
  • 18:37 ori: neon, logstash100x, zirconium, stat1001, netmon1001: replaced sites-enabled symlinks with their targets and forced puppet-run to clean up after Iddc778a28
  • 18:37 logmsgbot: reedy Started scap: scap 1.24wmf10 take 2...
  • 18:08 logmsgbot: reedy Started scap: testwiki to 1.24wmf10 and build l10n cache
  • 17:29 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix the entity selector (duration: 00m 09s)
  • 15:51 mutante: powercycling elastic1017 (went down and no console output)
  • 15:13 godog: removed old pmtpa swift stats from graphite
  • 15:04 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Put testwiki namespaces in the right place gerrit:140261 (duration: 00m 14s)
  • 15:04 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Put testwiki namespaces in the right place gerrit:140261 (duration: 00m 15s)
  • 15:02 logmsgbot: anomie Synchronized wmf-config/throttle.php: SWAT: Raise account creation limit for Telugu Wikipedia workshop on June 23 gerrit:140669 (duration: 00m 15s)
  • 14:30 cmjohnson1: replacing failed disk slot3 es1006
  • 13:01 _joe_: re-enable puppet on lvs1003
  • 11:26 logmsgbot: reedy Synchronized wmf-config/: touch (duration: 00m 15s)
  • 11:25 logmsgbot: reedy Synchronized commonsuploads.dblist: (no message) (duration: 00m 15s)
  • 11:00 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 14s)
  • 10:53 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 15s)
  • 10:52 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 15s)
  • 10:28 Reedy: manually ran sync-common tin on fenari
  • 10:09 logmsgbot: reedy Synchronized docroot/noc: (no message) (duration: 00m 15s)
  • 10:07 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
  • 10:04 logmsgbot: reedy Synchronized wmf-config/: I248fa7b98a8a0eea943c6643d1bf9c2ed36296b8 (duration: 00m 15s)
  • 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 19 03:33:36 UTC 2014 (duration 33m 35s)
  • 02:46 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-19 02:45:51+00:00
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-19 02:23:42+00:00

June 18

  • 23:09 awight: update crm from 26460d6eaec26861661322df8e9f07a8b0519677 to f3389daa94e9ad924175bdf0d5bc09c4a26aeb8c
  • 23:05 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/VisualEditor/: https://gerrit.wikimedia.org/r/#/c/140563/ (duration: 00m 03s)
  • 23:03 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/140250/ (duration: 00m 04s)
  • 22:30 bblack: rebooting lvs1004 + lvs1005
  • 22:10 bblack: turning lvs1003 pybal back on
  • 21:52 bblack: disable pybal on lvs1003, since 1006 seems to have all its interfaces :P
  • 21:34 bblack: rebooting lvs1003 for kernel/bios stuff
  • 21:00 bblack: rebooting lvs1006 for kernel/bios stuff
  • 20:23 subbu: deployed Parsoid 88a61f81 (deploy repo sha 470a5ef2)
  • 17:39 logmsgbot: yurik Synchronized docroot/bits/WikipediaMobileFirefoxOS/: (no message) (duration: 01m 09s)
  • 17:35 logmsgbot: yurik Synchronized php-1.24wmf9/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 14s)
  • 17:32 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/: Updating JsonConfig, ZeroBanner, ZeroPortal (duration: 01m 15s)
  • 17:26 logmsgbot: yurik Synchronized docroot/bits/WikipediaMobileFirefoxOS/: (no message) (duration: 01m 04s)
  • 17:10 RobH: magnesium back to proper function
  • 17:09 RobH: apache2ctl restart on magnesium, racktables wasn't working
  • 16:24 bblack: rebooting lvs4001 for kernel + num_queues
  • 16:19 bblack: rebooting lvs4002 for kernel + num_queues
  • 15:20 bblack: rebooting lvs4003 for kernel / num_queues updates
  • 15:17 bblack: rebooting lvs4004 for kernel / num_queues updates
  • 15:10 logmsgbot: anomie Synchronized php-1.24wmf9/extensions/Scribunto/engines/LuaCommon/SiteLibrary.php: SWAT: Fix Scribunto-related exceptions on testwiki gerrit:140370 (duration: 00m 14s)
  • 13:40 _joe_: restarted profiler-to-carbon, stuck (again) waiting for mwprof
  • 13:25 springle: script rt-7708.pl hitting m2-master eventlogging from terbium for RT #7708. fine to kill if necessary
  • 10:01 hashar: Updated our Jenkins job builder fork: 8cbc93a..416ee7d
  • 08:26 _joe_: disk is gone, powering down ms-be1007, opening ticket for disk replacement
  • 08:24 _joe_: stopped swift on ms-be1007, unmounting volume to check for repair
  • 06:01 springle: restarted gmetad on nickel while unbreaking the mysql graphs I broke on ganglia
  • 04:30 ori: enabled puppet on polonium (was disabled but nothing in SAL)
  • 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 18 02:58:22 UTC 2014 (duration 58m 21s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-18 02:25:03+00:00
  • 02:23 MaxSem: searchidx1001 outta sync - running sync-common
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-18 02:13:34+00:00
  • 02:05 Krinkle: Nevermind, graphite.wikimedia.org going down is due to overload which recovers eventually (it just has). Has become SNAFU/FIXME.
  • 02:02 Krinkle: graphite.wikimedia.org is down with HTTP 502 Bad Gateway errors
  • 01:49 ori: puppet freshness on tungsten and stat1001 can be fixed with https://gerrit.wikimedia.org/r/#/c/140269/

June 17

  • 20:19 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/140178/ (duration: 00m 04s)
  • 20:17 logmsgbot: maxsem Synchronized php-1.24wmf8/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/140178/ (duration: 00m 05s)
  • 20:01 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix editing site links (duration: 00m 24s)
  • 18:23 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 16s)
  • 18:22 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf9
  • 18:05 logmsgbot: demon Synchronized wmf-config/PoolCounterSettings-eqiad.php: Limit regex searches before they start landing on wikis (duration: 00m 04s)
  • 16:32 bblack: enabled amssq31-46 esams text frontend varnishes in pybal (were misconfigured; wrong domainname)
  • 15:18 logmsgbot: manybubbles Synchronized php-1.24wmf8/extensions/CirrusSearch/: SWAT - Fix Cirrus Special:Random (duration: 00m 04s)
  • 15:13 logmsgbot: manybubbles Synchronized php-1.24wmf9/extensions/CirrusSearch/: SWAT - Fix Cirrus Special:Random (duration: 00m 04s)
  • 15:02 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT - lower event logging rate for mediaviewer (duration: 00m 05s)
  • 13:51 _joe_: production puppet masters upgraded to puppet 3
  • 07:12 springle: starting updateCollation on s3 frwikinews from tin
  • 07:07 logmsgbot: springle Synchronized wmf-config/InitialiseSettings.php: $wgCategoryCollation to uca-fr on frwikinews (duration: 00m 07s)
  • 03:20 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 17 03:19:12 UTC 2014 (duration 19m 11s)
  • 02:35 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-17 02:34:09+00:00
  • 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-17 02:22:46+00:00

June 16

  • 23:12 logmsgbot: maxsem Synchronized php-1.24wmf8/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/139562/ (duration: 00m 05s)
  • 23:11 logmsgbot: maxsem Synchronized php-1.24wmf9/extensions/MobileFrontend/: https://gerrit.wikimedia.org/r/#/c/139562/ (duration: 00m 06s)
  • 23:05 logmsgbot: maxsem Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/139888/ (duration: 00m 08s)
  • 21:30 ori: upgraded eventlogging to 3012aad
  • 20:45 ori: updated eventlogging to b4b42effc6
  • 17:36 logmsgbot: csteipp Synchronized php-1.24wmf8/extensions/EducationProgram/includes/api/ApiAddStudents.php: Bug66631 (duration: 00m 05s)
  • 17:34 logmsgbot: csteipp Synchronized php-1.24wmf9/extensions/EducationProgram/includes/api/ApiAddStudents.php: (no message) (duration: 00m 05s)
  • 15:59 godog: manually ran update-ubuntu-mirror on carbon, successful
  • 15:57 awight: updated crm from e52a4eb1bfab622f612dc84f687678fff1fdbc04 to 26460d6eaec26861661322df8e9f07a8b0519677
  • 15:30 ottomata: reinstalling analytics1018
  • 13:38 twkozlowski: _joe_ also working on recovering the list which was deleted by mistake
  • 13:37 _joe_: closed wikimedia-de-by list
  • 13:13 _joe_: removing chip-l mailing list as per bug #63877
  • 13:03 godog: restarting swift-proxy-server on ms-fe1001 to test statsd metrics
  • 10:47 godog: restarting swift-proxy-server on ms-fe3002 to test statsd metrics
  • 10:23 hoo: Touched all 1.24wmf8 extension/wikidata files and ran sync-common after that on mw1070
  • 10:18 logmsgbot: hoo Synchronized php-1.24wmf8/extensions/Wikidata/: Update Wikidata to fix a suggester bug (duration: 00m 09s)
  • 10:16 godog: restarting swift-proxy-server on ms-fe3001 to test statsd metrics
  • 10:12 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix a suggester bug (duration: 00m 13s)
  • 09:29 apergos: restarted search1015 about 15 mins ago, it's now recovered afaict, restarted search1016, it's doing index setup now
  • 03:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 16 02:59:43 UTC 2014 (duration 59m 42s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-16 02:26:05+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-16 02:14:38+00:00

June 15

  • 17:44 logmsgbot: hoo Synchronized php-1.24wmf8/extensions/Wikidata/: Touched various JavaScripts (duration: 00m 09s)
  • 14:26 Reedy: Job runners were restarted on tmh100[12] and are now processing jobs
  • 14:15 godog: extended palladium root partition by +20G
  • 13:50 _joe|away: restarted mw-job-runner on tmh1001
  • 10:02 paravoid: nuked ms-be1001 sdj with zeros, reformatting and placing into production again
  • 02:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 15 02:58:21 UTC 2014 (duration 58m 20s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-15 02:26:03+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-15 02:14:46+00:00

June 14

  • 22:27 bawolff: video scalers seem to have stopped doing webVideoTranscode jobs
  • 20:24 legoktm: ran "delete from ep_students where student_user_id =0 limit 1;" on enwiki for bug 66624
  • 20:10 legoktm: ran "delete from ep_users_per_course where upc_user_id=0 limit 1" on enwiki for bug 66624
  • 19:19 paravoid: unmounting ms-be1001's sdj1, corrupted filesystem
  • 18:46 paravoid: rebooting ms-be1001, XFS: Internal error XFS_WANT_CORRUPTED_RETURN, lots of processes in D
  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 14 03:07:14 UTC 2014 (duration 7m 13s)
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-14 02:36:35+00:00
  • 02:36 bblack: enabled amssq43-46 frontends (esams text varnish) in pybal
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-14 02:16:38+00:00
  • 00:46 bblack: enabled amssq39-42 frontends (esams text varnish) in pybal

June 13

  • 22:01 manybubbles: logstash1002 seems to be properly restoring nodes to itself. I'll monitor it for the next few minutes but I believe my work here is done.
  • 21:55 manybubbles: bouncing logstash1002 because it seems stuck. not sure why. no useful logs.
  • 21:07 bblack: turned on amssq35-38 text frontends in esams (in pybal)
  • 20:57 awight: update crm from c38296add61421f87e12cb5b4f3dd68bdf2340db to e52a4eb1bfab622f612dc84f687678fff1fdbc04
  • 20:23 bblack: turned on amssq31-34 text frontends in esams
  • 18:41 mutante: DNS update - removing manutius' public IP
  • 18:31 mutante: shutting down manutius, decom
  • 18:22 logmsgbot: ori Synchronized php-1.24wmf9/extensions/Math: I498053de4: Fix the VisualEditor parts of Math-wmf9 with a working cherry pick of I7d5e1174 (duration: 00m 08s)
  • 16:55 logmsgbot: hoo Synchronized php-1.24wmf8/extensions/Wikidata/: Update Wikidata to fix JavaScript issues (duration: 00m 09s)
  • 16:45 logmsgbot: hoo Synchronized php-1.24wmf9/extensions/Wikidata/: Update Wikidata to fix JavaScript issues (duration: 00m 10s)
  • 16:31 Reedy: Finished creating mathoid tables on all wikis
  • 16:26 Reedy: Creating mathoid tables on all wikis
  • 16:11 mutante: manutius - decom, delete salt key, puppet cert, stopped services...
  • 15:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 13 15:16:09 UTC 2014 (duration 53m 11s)
  • 14:59 logmsgbot: reedy Synchronized wmf-config/: Disable MW_MATH_SOURCE for now (duration: 00m 15s)
  • 14:46 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-13 14:45:40+00:00
  • 14:36 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-13 14:35:41+00:00
  • 13:00 bblack: moved ge-3/0/0 - 3/0/15 from public to private vlan on cs2-esams (amssq31-46)
  • 10:02 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1049 (duration: 00m 12s)
  • 09:56 paravoid: deactivating eqiad<->HE, excessive packet loss/latency
  • 09:33 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1071 (duration: 00m 07s)
  • 08:10 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1070, depool db1071 (duration: 00m 12s)
  • 07:48 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1066, depool db1070 (duration: 00m 07s)
  • 07:19 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1065, depool db1066 (duration: 00m 13s)
  • 06:51 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1062, depool db1065 (duration: 00m 09s)
  • 06:09 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1062 (duration: 00m 12s)
  • 05:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1051 (duration: 00m 14s)
  • 03:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 13 03:53:17 UTC 2014 (duration 53m 16s)
  • 03:12 logmsgbot: LocalisationUpdate completed (1.24wmf9) at 2014-06-13 03:11:28+00:00
  • 02:35 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-13 02:34:41+00:00
  • 00:45 logmsgbot: ori Synchronized php-1.24wmf8/extensions/Math: Reverting Extension:Math to 1bb3bfa3b5656 (duration: 00m 05s)
  • 00:44 logmsgbot: ori Synchronized php-1.24wmf9/extensions/Math: Reverting Extension:Math to 1bb3bfa3b5656 (duration: 00m 06s)
  • 00:41 ori: removed Physikerwelt and Frédéric Wang from extension-Math group in Gerrit pending further inquiry into recent changes
  • 00:38 logmsgbot: ori Finished scap: fix any lingering inconsistencies in the state of the app servers (see https://gerrit.wikimedia.org/r/139089) (duration: 26m 59s)
  • 00:11 logmsgbot: ori Started scap: fix any lingering inconsistencies in the state of the app servers (see https://gerrit.wikimedia.org/r/139089)

June 12

  • 23:35 logmsgbot: ori Synchronized php-1.24wmf8/extensions/MobileFrontend: Re-syncing after submodule update (duration: 00m 06s)
  • 23:34 ori: ran sync-common on mw1151
  • 23:17 logmsgbot: catrope Synchronized php-1.24wmf9/extensions/MobileFrontend: (no message) (duration: 00m 04s)
  • 23:17 logmsgbot: catrope Synchronized php-1.24wmf8/extensions/MobileFrontend: (no message) (duration: 00m 05s)
  • 23:17 logmsgbot: catrope Synchronized php-1.24wmf8/extensions/VisualEditor: (no message) (duration: 00m 04s)
  • 23:07 Krinkle: integration-slave1003 is failing npm-test builds due to a cache corruption (filed as https://github.com/npm/npm/issues/5472). Manually cleared /mnt/home/jenkins-deploy/.npm/async on integration-slave1003.eqiad.wmflabs for now.
  • 23:05 MaxSem: Purging PageImages data from Wikibooks and Wikisource
  • 22:59 logmsgbot: catrope Synchronized wmf-config/: (no message) (duration: 00m 04s)
  • 22:46 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: disable MW_MATH_MATHML until mathoid table is created (BUG 66492) (duration: 00m 04s)
  • 22:31 logmsgbot: ori Synchronized php-1.24wmf8/extensions/WikimediaEvents: Update WikimediaEvents for Ibd36da416 (duration: 00m 03s)
  • 22:30 logmsgbot: ori Synchronized php-1.24wmf9/extensions/WikimediaEvents: Update WikimediaEvents for Ibd36da416 (duration: 00m 03s)
  • 21:11 logmsgbot: yurik Synchronized php-1.24wmf9/extensions/JsonConfig/: JsonConfig ext update, fixing bug 66555 (duration: 01m 03s)
  • 21:10 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/JsonConfig/: JsonConfig ext update, fixing bug 66555 (duration: 01m 04s)
  • 19:25 ottomata: stopping puppet on an18
  • 19:19 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf9
  • 19:19 ottomata: starting hadoop decom of analytics1018. This node will become a Kafka broker
  • 19:04 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf8
  • 19:04 MaxSem: Dropping old GeoData tables from everywhere
  • 18:52 logmsgbot: reedy Finished scap: 1.24wmf9 staging take 2... (duration: 15m 20s)
  • 18:37 logmsgbot: reedy Started scap: 1.24wmf9 staging take 2...
  • 18:06 logmsgbot: reedy Started scap: testwiki to 1.24wmf9 and build l10n cache
  • 17:49 ottomata: disabling puppet on analytics1012 and analytics1022
  • 17:48 ottomata: starting some kafka failure tests, I have scheduled downtime for some service checks in icinga, hopefully this will not be noisy
  • 17:41 ottomata: restarting elasticsearch on logstash servers
  • 17:34 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Enabling new zero ext on all wikis (duration: 01m 03s)
  • 17:22 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Attempting to enable new zero ext on zerowiki & ruwiki - take3 (duration: 01m 04s)
  • 17:06 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/: (no message) (duration: 01m 12s)
  • 17:05 greg-g: yurik's blank sync message could have been: Deploying new JsonConfig,ZeroBanner,ZeroPortal extensions (refactoring ZeroRatedMobileAccess ext)
  • 17:04 logmsgbot: yurik Synchronized php-1.24wmf7/extensions/: (no message) (duration: 01m 15s)
  • 15:31 logmsgbot: manybubbles Synchronized wmf-config/throttle.php: SWAT: Raise account creation limit for eswiki outreach event (duration: 00m 05s)
  • 13:39 bblack: enabling cp301[34] esams mobile frontends in pybal
  • 11:18 hashar: Gerrit: created mediawiki/services/cxserver/deploy repository for Nikerabbit and kart_
  • 05:52 paravoid: cr1-esams/cr2-knams: dismantling amslvs BGP peerings
  • 05:46 paravoid: amslvs[1234]: stopping pybal
  • 03:40 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 12 03:39:07 UTC 2014 (duration 39m 6s)
  • 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-12 03:02:09+00:00
  • 02:58 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1051 (duration: 01m 08s)
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-12 02:32:07+00:00
  • 01:47 ori: graceful'd appservers for I0e66ee0a1: 2.4 compat: load mod_filter for AddOutputFilterByType
  • 00:44 bblack: ran "puppetca -s palladium.eqiad.wmnet" on palladium to get agent running again, someone borked/regenerated the key there 6 hours ago?
  • 00:20 mwalker: clearMessageBlobs.php killed because we fixed the problem in a more different way
  • 00:17 logmsgbot: mwalker Synchronized php-1.24wmf8/extensions/MultimediaViewer/resources/mmv/ui/mmv.ui.canvasButtons.js: poking cache for multimediaviewer messages (duration: 00m 04s)
  • 00:05 logmsgbot: aaron Synchronized php-1.24wmf8/includes/EditPage.php: e11d41dd366b039bff79e247368b6bff1245ea5e (duration: 00m 07s)

June 11

  • 23:50 mwalker: clearing resourceloader blobs on commonswiki to try and force a multimediaviewer message "mwscript extensions/WikimediaMaintenance/clearMessageBlobs.php --wiki=commonswiki"
  • 23:49 awight: updated SmashPig from 98b1f348aa55f6a3aac441db08a59ca309fade7a to 22e2923a3a030b17815181574f9ca99b38c5f2dc
  • 23:41 logmsgbot: mwalker Finished scap: SWAT deploy for MultimediaViewer, CentralNotice, and testwiki config (duration: 24m 16s)
  • 23:16 logmsgbot: mwalker Started scap: SWAT deploy for MultimediaViewer, CentralNotice, and testwiki config
  • 23:10 Krinkle: Running deleteEqualMessages.php on trwiki (bug 43917)
  • 22:58 logmsgbot: yurik Synchronized wmf-config/: Restoring to ZRMA for now (duration: 01m 04s)
  • 22:22 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Attempting to enable new zero ext on zerowiki & ruwiki - take2 (duration: 01m 06s)
  • 22:19 ^d: restarted elasticsearch on logstash1003, complaining about heap.
  • 22:06 logmsgbot: yurik Synchronized wmf-config/InitialiseSettings.php: Attempting to enable new zero ext on zerowiki & ruwiki (duration: 01m 12s)
  • 21:58 logmsgbot: yurik Synchronized php-1.24wmf8/extensions/JsonConfig/: (no message) (duration: 01m 11s)
  • 21:56 logmsgbot: yurik Synchronized php-1.24wmf7/extensions/JsonConfig/: (no message) (duration: 01m 09s)
  • 21:50 logmsgbot: yurik Finished scap: (no message) (duration: 25m 51s)
  • 21:46 ori: Disabling Puppet on mw1149. It's a former bits app server that isn't in PyBal so it isn't getting traffic. Going to stage some proposed changes for apache-config and operations/puppet there.
  • 21:24 logmsgbot: yurik Started scap: (no message)
  • 21:05 logmsgbot: yurik Finished scap: Deploying 3 new ext (JsonConfig, ZeroBanner, ZeroPortal), but they are not enabled anywhere yet (duration: 05m 03s)
  • 21:00 logmsgbot: yurik Started scap: Deploying 3 new ext (JsonConfig, ZeroBanner, ZeroPortal), but they are not enabled anywhere yet
  • 20:07 gwicke: deployed Parsoid 3de0dba15
  • 19:18 bblack: rebooting lvs3003 for 3.13 kernel
  • 19:17 logmsgbot: marktraceur Finished scap: MultimediaViewer fixes for cards 630, 429, and 697 (duration: 18m 45s)
  • 19:17 greg-g: mw1151 *still* giving permission denied errors (publickey), what's the status, yo?
  • 19:03 bblack: rebooting lvs3002 for 3.13 kernel + XPS
  • 18:59 logmsgbot: marktraceur Started scap: MultimediaViewer fixes for cards 630, 429, and 697
  • 18:44 ottomata: disabling puppet on analytics1012 to allow for more replica threads to catch up with current broker replicas...maybe :)
  • 18:41 awight: updated crm from b6815d29de97b80a0ab65db576213a604f0c7cb9 to c38296add61421f87e12cb5b4f3dd68bdf2340db
  • 18:03 Krinkle: Reloading Zuul to deploy I5d154a4002d08
  • 16:43 bblack: shutting off lvs3002.esams pybal to test XPS balancing of live traffic on lvs3004.esams + 3.13
  • 16:30 bblack: rebooting lvs3004 (inactive uploads LVS) for 3.13 again
  • 14:52 hashar: Jenkins restarting (plugin upgrades)
  • 14:48 bblack: rebooting lvs3004.esams (inactive uploads LVS) for 3.13 kernel
  • 14:41 _joe_: manually ran 'planet' on en.planet to restore technews
  • 14:40 hashar: Jenkins updating plugins
  • 13:56 paravoid: upgrading mw1153-mw1160, tmh1001-tmh1002 for USN-2244-1
  • 12:21 _joe_: set up a secondary remote named 'readonly' in /a/common on tin, to use with the icinga check for unmerged commits
  • 11:40 akosiaris: manually cleaning librenms tables. db1001 is going to have increased load for some time. The approach is automatable, see http://jira.observium.org/browse/OBSERVIUM-757
  • 11:32 godog: restarted uwsgi on tungsten, a lot of accesses to reqstats.edits.*.submits
  • 10:45 godog: restarted uwsgi on tungsten, hung on fetching many metrics
  • 09:54 _joe_: restarted apache on palladium - passenger crashed
  • 05:26 paravoid: restarting all swift daemons across the cluster to fix runaway threads due to rsyslog restart
  • 05:04 springle: beginning schema changes bug 49193 page_content_model
  • 03:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 11 03:28:14 UTC 2014 (duration 28m 13s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-11 02:28:18+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-11 02:14:43+00:00

June 10

  • 23:34 andrewbogott: updated labs Trusty image w/puppet3, made default
  • 23:19 mutante: rebooting unresponsive ms-be1003
  • 21:09 RobH: monthly sms credit check: 1,447.36 SMS credits. will check again in 30 days
  • 19:47 hashar: Jenkins restarted apparently properly. Any breakage would probably be related to the version switch :-D
  • 19:45 ottomata: power cycling analytics1012, attempting to reinstall as kafka broker with new kafka partman recipe
  • 19:42 hashar: Jenkins upgraded from 1.532.2 to 1.554.2 (i.e. bumped to a new LTS version).
  • 19:37 hashar: Broke Jenkins by silently upgrading it :-(
  • 19:09 Krinkle: git-deploy: Deploying integration/slave-scripts I9521890b911714edf2
  • 18:59 logmsgbot: reedy Synchronized php-1.24wmf8/skins/vector/components/tabs.less: (no message) (duration: 00m 14s)
  • 18:58 mutante: shutting down ekrem
  • 18:18 logmsgbot: reedy Synchronized wmf-config/InitialiseSettings.php: Enable data transclusion for wikiquote (duration: 00m 14s)
  • 18:15 logmsgbot: reedy Synchronized docroot and w: Update non Wikipedias to 1.24wmf8 (duration: 00m 16s)
  • 18:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Update non Wikipedias to 1.24wmf8
  • 18:14 logmsgbot: reedy Synchronized php-1.24wmf8/extensions/Wikidata/: (no message) (duration: 00m 16s)
  • 17:28 _joe|away: restarted profiler-to-carbon, stuck waiting for data from mwprof
  • 15:21 mutante: ekrem - rm from stored configs/icinga
  • 15:12 mutante: ekrem - revoke salt,puppet keys, stop agents/minion
  • 07:42 springle: enabled pt-slave-delay for dbstore1001, 24h all shards
  • 06:12 springle: xtrabackup clone db1043 to db1048
  • 04:57 springle: db1048 down for upgrade
  • 03:40 springle: switched mchenry to use m2-master/m2-slave for OTRS address lookups
  • 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 10 03:24:19 UTC 2014 (duration 24m 18s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-10 02:28:14+00:00
  • 02:27 springle: switched traffic db1048 to db1020. broke gerrit briefly; see ops email
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-10 02:14:41+00:00
  • 01:33 chasemp: restarted gerrit on ytterbium
  • 01:01 manybubbles: upgraded all elasticsearch servers in production to 1.2.1. They are just restoring the last few shards on the last node now and they'll spend a few hours tonight rebalancing after the upgrade but otherwise I'm done.
  • 00:41 mwalker: updating donationinterface on payments from b4c5cf1bceb70d65eae28cdd0873036dc33c8992 to 6d74002f2634f41f7038daa7357ff6de55ee4880 for worldpay form error

June 9

  • 23:58 manybubbles: lied - upgrading elastic1014
  • 23:57 manybubbles: upgrading elastic1015
  • 23:30 Krinkle: Reloading Zuul to deploy 6727b8b
  • 23:12 logmsgbot: maxsem Synchronized php-1.24wmf8/extensions/MobileApp: (no message) (duration: 00m 03s)
  • 23:11 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/MobileApp: (no message) (duration: 00m 03s)
  • 23:03 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://bugzilla.wikimedia.org/66377 (duration: 00m 04s)
  • 20:42 manybubbles: upgraded elastic1007-elastic1010 without issue - starting elastic1010
  • 20:08 subbu: deployed Parsoid 9b673587 (deploy sha 7d0097a1)
  • 19:23 ottomata: disabling puppet on analytics1012
  • 18:59 ottomata: decommissioning analytics1012 in hadoop cluster, this will become a Kafka broker
  • 17:58 manybubbles: elastic1004-1006 upgraded without trouble - cluster is working on filling elastic1006 before moving on to 1007, and the rest
  • 17:04 andrewbogott: switching labs to puppet3
  • 17:03 awight: update crm from b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7 to b6815d29de97b80a0ab65db576213a604f0c7cb9
  • 16:30 manybubbles: upgrading elastic1003 - upgrade is going well so far so I'm going to stop watching it as closely and let it be more automated
  • 15:28 manybubbles: elastic1001 went well, doing 1002 by hand again
  • 15:17 logmsgbot: anomie Synchronized php-1.24wmf8/extensions/Wikidata: SWAT: Wikidata entity suggester bug fixes gerrit:138339 (duration: 00m 16s)
  • 15:12 greg-g: mw1151 still "permission denied" during deploys
  • 15:12 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable TemplateData GUI on Portuguese Wikipedia gerrit:137986 (duration: 00m 14s)
  • 15:09 logmsgbot: anomie Synchronized php-1.24wmf7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: SWAT: VE fix for focus regression gerrit:137978 (duration: 00m 15s)
  • 15:06 andrewbogott: beta updating all instances to puppet 3 via a cherry-pick of https://gerrit.wikimedia.org/r/#/c/137898/ on deployment-salt
  • 15:05 logmsgbot: anomie Synchronized php-1.24wmf8/extensions/VisualEditor/modules/ve-mw/: SWAT: VE fix for focus regression and alignment issues gerrit:137971 gerrit:138122 (duration: 00m 14s)
  • 15:01 manybubbles: successfully synced plugins, upgrading elastic1001 to make sure everything is working ok with it - then we'll run through the others more quickly
  • 14:57 manybubbles: syncing elasticsearch plugins for 1.2.1 - any elasticsearch restart from here on out needs to come with 1.2.1 or the node will break.
  • 14:54 manybubbles: starting Elasticsearch upgrade with elastic1001
  • 07:14 springle: disabled puppet on analytics1021 to avoid kafka broker restarting with missing mount
  • 05:15 springle: xtrabackup clone db1046 to db1020
  • 04:44 springle: umount /dev/sdf on analytics1021, fs in r/o mode, kafka broker not running. no checks yet
  • 03:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 9 03:23:05 UTC 2014 (duration 23m 4s)
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-09 02:28:08+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-09 02:14:46+00:00

June 8

  • 23:27 p858snake|l: icinga has been shitting in the channel for 9+ hours (before I went to bed) about Varnishkafka, nothing noted in SAL. Here be a note about it.
  • 03:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 8 03:21:28 UTC 2014 (duration 21m 27s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-08 02:27:21+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-08 02:14:10+00:00

June 7

  • 23:48 hoo: Fixed four CentralAuth log entries on meta which were logged for WikiSets/0
  • 21:36 manybubbles: that means I turned off puppet and shut down Elasticsearch on elastic1017 - you can expect the cluster to go yellow for half an hour or so while the other nodes rebuild the redundancy that elastic1017 had
  • 21:35 manybubbles: after consulting logs - elastic1017 has had high io wait since it was deployed - I'm taking it out of rotation
  • 21:31 manybubbles: elastic1017 is sick - thrashing to death on io - restarting Elasticsearch to see if it recovers unthrashed
  • 17:56 godog: restarted ES on elastic1017.eqiad.wmnet (at 17:22 UTC)
  • 03:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jun 7 03:23:32 UTC 2014 (duration 23m 31s)
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-07 02:29:57+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-07 02:16:30+00:00

June 6

  • 23:51 Krinkle: Restarted Jenkins, force stopped Zuul, started Zuul, configured Jenkins via web interface (disable Gearman, save, enable Gearman); Seems to be back up now, finally.
  • 22:52 mutante: same for rhenium, titanium, bast1001, calcium, carbon, ytterbium, stat1003
  • 22:42 RoanKattouw: Restarting Jenkins didn't help, jobs still aren't making it across from Zuul into Jenkins
  • 22:36 RoanKattouw: Restarting stuck Jenkins
  • 22:35 mutante: same for holmium, hafnium, silver, netmon1001, magnesium, neon, antimony
  • 22:17 mutante: upgraded ssl packages on zirconium
  • 21:57 Krinkle: Took Jenkins slave on gallium temporarily offline and back online to resolve possible stagnation
  • 20:56 awight_: updated crm from ded541894a70922e098fb3ea48306c8ec0f0f6aa to b38497a9d0ef75fe2b20b03b649ac13a5e3f47a7
  • 18:24 mwalker: updating payments from e823354822c7a35e6c2069d3e72180a45dbc89dc to b4c5cf1bceb70d65eae28cdd0873036dc33c8992 for globalcollect oid hack
  • 14:04 hashar: Gerrit back. chase rebooted it :)
  • 13:55 hashar: Gerrit having some troubles: error: RPC failed; result=22, HTTP code = 503 (while cloning CirrusSearch )
  • 12:58 cmjohnson1: replacing raid controller db1020
  • 06:12 Tim: on osmium installed nodejs for testing
  • 04:24 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jun 6 04:23:08 UTC 2014 (duration 23m 7s)
  • 03:13 logmsgbot: LocalisationUpdate completed (1.24wmf8) at 2014-06-06 03:12:19+00:00
  • 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-06 02:42:28+00:00
  • 00:38 bblack: nginx restarted on ssl*
  • 00:16 mutante: fixed permissions on bugzilla's index.cgi, sry

June 5

  • 23:18 logmsgbot: maxsem Synchronized php-1.24wmf7/includes/ChangeTags.php: https://gerrit.wikimedia.org/r/#/c/137563/ (duration: 00m 03s)
  • 23:16 logmsgbot: maxsem Synchronized php-1.24wmf8/includes/ChangeTags.php: https://gerrit.wikimedia.org/r/#/c/137563/ (duration: 00m 03s)
  • 23:06 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/TemplateData: https://gerrit.wikimedia.org/r/#/c/137751/ (duration: 00m 04s)
  • 22:15 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ife5081549: Put $wgRCFeeds[rcs100x] config behind $wmfRealm check (duration: 00m 04s)
  • 22:12 logmsgbot: ori updated /a/common to Ife5081549: Put $wgRCFeeds['rcs100x'] config behind $wmfRealm check
  • 21:48 ori: updated eventlogging to a8602c1d879f
  • 21:34 MaxSem: Renaming geo_killlist and geo_updates to *_old
  • 18:36 logmsgbot: reedy Synchronized wmf-config/: (no message) (duration: 00m 14s)
  • 18:35 logmsgbot: reedy Synchronized database lists: (no message) (duration: 00m 13s)
  • 18:17 Reedy: Created FlaggedRevs tables on ckbwiki
  • 18:11 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Update group0 to 1.24wmf8
  • 18:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf7
  • 17:00 logmsgbot: reedy Synchronized wmf-config/: Wrap some long lines, add some docs (duration: 00m 26s)
  • 16:43 bblack: rebooting lvs3002
  • 16:36 paravoid: downpref all of amslvs* in favor of lvs30*
  • 16:17 paravoid: downprefing amslvs1, upprefing lvs3001
  • 16:02 mark: Connected cp3018:eth1 to cr1-esams:xe-0/0/3 (unconfigured)
  • 15:59 _joe_: disabling puppet on virt1000 while we test the puppet3 upgrade on virt0
  • 15:48 logmsgbot: reedy Finished scap: 2nd scap for 1.24wmf8, should be effectively a no-op (duration: 12m 33s)
  • 15:35 logmsgbot: reedy Started scap: 2nd scap for 1.24wmf8, should be effectively a no-op
  • 15:21 logmsgbot: anomie Synchronized php-1.24wmf6/extensions/VisualEditor/modules/ve-mw/ui/dialogs/: SWAT: Use <visualeditor-toolbar-cite-label> correctly in the Media and Reference toolbars gerrit:136783 (duration: 00m 15s)
  • 15:18 logmsgbot: anomie Synchronized php-1.24wmf7/extensions/VisualEditor/modules/ve-mw/ui/dialogs/: SWAT: Use <visualeditor-toolbar-cite-label> correctly in the Media and Reference toolbars gerrit:136782 (duration: 00m 12s)
  • 15:04 logmsgbot: anomie Synchronized php-1.24wmf7/extensions/Popups/resources/: SWAT: Hovercard animation fixes gerrit:137530 gerrit:137531 gerrit:137532 (duration: 00m 14s)
  • 14:57 logmsgbot: reedy Finished scap: testwiki to 1.24wmf8 and build l10n cache (duration: 26m 23s)
  • 14:54 hashar: restarting Zuul
  • 14:31 logmsgbot: reedy Started scap: testwiki to 1.24wmf8 and build l10n cache
  • 14:15 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.hiiCprts7Z" ' returned non-zero exit status 1 (duration: 00m 17s)
  • 14:14 logmsgbot: reedy Started scap: testwiki to 1.24wmf8 and build l10n cache
  • 14:07 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/a/common/wmf-config/extension-list" --output="/tmp/tmp.WtQBrR6JUp" ' returned non-zero exit status 1 (duration: 01m 08s)
  • 14:06 logmsgbot: reedy Started scap: testwiki to 1.24wmf8 and build l10n cache
  • 14:05 logmsgbot: reedy Purged l10n cache for 1.24wmf5
  • 13:58 hashar: Adding unit tests Jenkins job for most mediawiki extensions 137578
  • 12:05 godog: powercycling ms-be1005, no ssh, no console
  • 10:28 godog: restarted uwsgi on tungsten
  • 09:24 godog: moving bits traffic to the general appserver pool in eqiad
  • 04:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jun 5 04:09:50 UTC 2014 (duration 9m 49s)
  • 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-05 03:02:00+00:00
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-05 02:32:06+00:00
  • 02:23 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1007 (duration: 01m 26s)
  • 00:46 bblack: lvs3002 (live uploads lb for esams) is running ntpd

June 4

  • 23:43 Tim: on searchidx1001: restarting lsearchd and indexer
  • 23:40 logmsgbot: mwalker Finished scap: Scapping for SWAT; MultiMedia viewer and config changes (duration: 22m 16s)
  • 23:20 Tim: on searchidx1001: as a temporary hack to work around scap disk full errors, set up a bind mount at /usr/local/apache/common-local linking to a directory in /a, by local modification of /etc/fstab
  • 23:18 logmsgbot: mwalker Started scap: Scapping for SWAT; MultiMedia viewer and config changes
  • 21:56 logmsgbot: yurik Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/136503/ (duration: 01m 07s)
  • 21:54 logmsgbot: yurik Synchronized mobilelanding.php: (no message) (duration: 01m 07s)
  • 20:47 MaxSem: Truncating geo_killlist everywhere
  • 20:33 subbu: deployed Parsoid 165a2042 (deploy sha fc1b1ed4)
  • 19:04 bd808|deploy: Restarted elasticsearch on logstash1001; JVM OOM
  • 19:00 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/GeoData/: (no message) (duration: 00m 04s)
  • 18:58 logmsgbot: maxsem Synchronized php-1.24wmf6/extensions/GeoData/: (no message) (duration: 00m 03s)
  • 18:43 bd808|deploy: mw1151 gave an ssh denied error for MaxSem during sync-dir
  • 18:40 logmsgbot: maxsem Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/136487/ (duration: 00m 04s)
  • 17:54 mutante: shutting down solr1001-1003
  • 17:47 logmsgbot: yurik Synchronized php-1.24wmf7/extensions/ZeroRatedMobileAccess/: (no message) (duration: 01m 07s)
  • 17:44 logmsgbot: yurik Synchronized php-1.24wmf6/extensions/ZeroRatedMobileAccess/: (no message) (duration: 01m 06s)
  • 17:27 mutante: stopping puppet/salt on solr100[13], removed from icinga
  • 16:36 robh: blog.wikimedia.org updated to latest wp version
  • 16:13 mutante: installing package upgrades on bast1001
  • 16:11 mutante: installing package upgrades on iron
  • 15:59 mutante: killing puppet certs,salt keys for solr100[13].eqiad - decom
  • 15:28 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: close wikimania2013wiki for real (duration: 00m 10s)
  • 15:28 logmsgbot: manybubbles Synchronized closed.dblist: close wikimania2013wiki (duration: 00m 09s)
  • 15:23 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: close wikimania2013wiki (duration: 00m 10s)
  • 15:21 logmsgbot: manybubbles Synchronized php-1.24wmf6/extensions/MobileApp/: (no message) (duration: 00m 10s)
  • 15:15 logmsgbot: manybubbles Synchronized php-1.24wmf7/extensions/MobileApp/: (no message) (duration: 00m 08s)
  • 15:07 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT deploy for media viewer (duration: 00m 13s)
  • 14:57 mutante: cleaning up duplicate cronjobs on terbium - all log to /var/log/mediawiki now
  • 12:53 hashar: Zuul upgraded (git tag wmf-deploy-20140604). Merges are now done by an independent process, zuul-merger
  • 12:43 hashar: upgrading Zuul to split the merger part to an independent process. Short unscheduled downtime starting in a few minutes
  • 07:51 _joe_: rebooted ms-be1001, host unresponsive to ping, blank console
  • 06:14 springle: starting online schema change, bug 66089 gerrit 137149
  • 04:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jun 4 04:26:32 UTC 2014 (duration 26m 31s)
  • 03:35 Krinkle: Deploy I882e3fa57b2e5e3de in Zuul and reload config
  • 03:16 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-04 03:15:34+00:00
  • 02:47 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-04 02:46:06+00:00

June 3

  • 23:14 logmsgbot: ori Synchronized php-1.24wmf7/extensions/MobileApp: SWAT cherry-picks for MobileApp (with patch) (duration: 00m 04s)
  • 23:11 logmsgbot: ori Synchronized php-1.24wmf6/extensions/MobileApp: SWAT cherry-picks for MobileApp (duration: 00m 04s)
  • 23:10 logmsgbot: ori Synchronized php-1.24wmf7/extensions/MobileApp: SWAT cherry-picks for MobileApp (duration: 00m 03s)
  • 23:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I9dac0dc6a80: Set $wgIncludejQueryMigrate = true; for all wikis (duration: 00m 03s)
  • 22:41 logmsgbot: marktraceur Finished scap: Update Media Viewer preference string for wmf7 - already backported to wmf6 (duration: 13m 19s)
  • 22:38 Krinkle: git-deploy: Deploying integration/slave-scripts If2e2e675802f
  • 22:27 logmsgbot: marktraceur Started scap: Update Media Viewer preference string for wmf7 - already backported to wmf6
  • 21:49 logmsgbot: marktraceur updated /a/common to I409703a11: Enable MMV by default on dewiki beta.
  • 21:25 logmsgbot: marktraceur Synchronized mediaviewer.dblist: Enable media viewer by default on enwiki (duration: 00m 06s)
  • 21:18 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: Throttle the MMV event logging a bit more for the launch today (duration: 00m 06s)
  • 21:17 logmsgbot: marktraceur updated /a/common to I549906510: Launch Media Viewer for all users on English wikipedia
  • 21:09 logmsgbot: marktraceur Synchronized wmf-config/InitialiseSettings.php: Touch InitialiseSettings.php because that's what we do (duration: 00m 06s)
  • 21:08 logmsgbot: marktraceur Synchronized mediaviewer.dblist: Add dewiki to the on-by-default list for Media Viewer (duration: 00m 06s)
  • 21:08 logmsgbot: marktraceur updated /a/common to Ie237b0ae1: Launch Media Viewer for all users on German wikipedia
  • 20:51 MaxSem: Disabled GeoData updates on terbium
  • 20:41 hashar: repack command: find /srv/ssd/gerrit/ -type d -name '*.git' -print -exec git --git-dir="{}" repack -afd \; -exec git --git-dir="{}" pack-refs --all \;
  • 20:41 hashar: Jenkins repacking gerritslave replicas on gallium and lanthanum. Running in screen as hashar -> gerritslave
  • 18:14 logmsgbot: reedy Synchronized wmf-config/: Stop sending IRC RC to PMTPA (duration: 00m 17s)
  • 18:07 logmsgbot: reedy Synchronized docroot and w: (no message) (duration: 00m 14s)
  • 18:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: All non wikipedias to 1.24wmf7
  • 15:45 akosiaris: merged https://gerrit.wikimedia.org/r/#/c/133515/ which enabled ferm on hydrogen/chromium
  • 15:41 logmsgbot: anomie Finished scap: SWAT: Update i18n for MultimediaViewer gerrit:136718 (duration: 17m 56s)
  • 15:23 logmsgbot: anomie Started scap: SWAT: Update i18n for MultimediaViewer gerrit:136718
  • 15:03 logmsgbot: anomie Synchronized wmf-config/InitialiseSettings.php: SWAT: Lower MediaViewer sampling for enwiki and dewiki gerrit:136717 (duration: 00m 14s)
  • 13:05 paravoid: salt * start procps
  • 11:13 _joe_: restarted jobrunners as they were blocked by restarting via cron
  • 10:58 godog: try restarting mw-job-runner on mw1012
  • 03:42 springle: revert to lvm snapshot on db1046, xfs being crotchety
  • 03:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jun 3 03:16:22 UTC 2014 (duration 16m 21s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-03 02:25:48+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-03 02:14:12+00:00
  • 01:32 logmsgbot: reedy Synchronized wmf-config/CommonSettings.php: wgCentralAuthRC to EQIAD rc ircd (duration: 00m 14s)
  • 00:28 awight: update crm from 5f6217d8f4d750087dcd37faca6b41de82d2362e to ded541894a70922e098fb3ea48306c8ec0f0f6aa

June 2

  • 23:34 logmsgbot: maxsem Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/136428/ (duration: 00m 03s)
  • 23:22 logmsgbot: maxsem Synchronized php-1.24wmf6/extensions/Flow/: (no message) (duration: 00m 04s)
  • 23:21 logmsgbot: maxsem Synchronized php-1.24wmf6/extensions/VisualEditor/: (no message) (duration: 00m 03s)
  • 23:20 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/VisualEditor/: (no message) (duration: 00m 04s)
  • 23:18 logmsgbot: maxsem Synchronized php-1.24wmf7/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/136936/ (duration: 00m 05s)
  • 22:49 Krinkle: Repooled integration-slave1003 in Jenkins.
  • 22:37 Krinkle: Hack-patching integration-slave1003.eqiad.wmflabs per https://bugzilla.wikimedia.org/show_bug.cgi?id=61508#c2
  • 21:30 mutante: searchidx1001 - low disk space, gzip MegaSAS.log, delete old kernel headers
  • 21:18 awight: updated crm from b6e004f7349507523423c59170274150a44b0aaf to 5f6217d8f4d750087dcd37faca6b41de82d2362e
  • 20:09 gwicke: deployed Parsoid 04a4bf2b
  • 20:08 hashar: Jenkins unpooled integration-slave1003; npm is outdated there and does not trust npmregistry.org (bug 61508)
  • 19:29 awight: updated crm from ce64066316e77f6fc3545c6265e2d81e3ef773c4 to b6e004f7349507523423c59170274150a44b0aaf
  • 19:18 awight: update crm from 5b231163e9e880de5b9787d40b679a6723748aca to ce64066316e77f6fc3545c6265e2d81e3ef773c4
  • 18:58 logmsgbot: csteipp Synchronized php-1.24wmf6/includes/upload/UploadBase.php: (no message) (duration: 00m 04s)
  • 18:51 logmsgbot: csteipp Synchronized php-1.24wmf7/includes/upload/UploadBase.php: (no message) (duration: 00m 06s)
  • 18:41 awight: updated tools from d257e8445e028b758b1d1fa90c857667d4faac62 to cbcd14a84f7bc8682822d3b1910b48bfd932b00d
  • 17:15 chasemp: disabling ircd on ekrem
  • 17:05 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: Cache search suggestions for 3 hours instead of 6 (duration: 00m 04s)
  • 17:03 chasemp: moving irc.wikimedia.org to argon
  • 16:27 ottomata: ran preferred-replica-election to fix vk delivery errors
  • 16:24 logmsgbot: demon Synchronized wmf-config/throttle.php: Library of Israel editathon (duration: 00m 04s)
  • 16:07 manybubbles: rebuilding all english non-wikipedias with unicode normalization
  • 15:36 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy - more import sources and upload domains (duration: 00m 04s)
  • 15:34 manybubbles: reindexing all hebrew wikis to switch them from the hebrew analyzer to proper unicode normalization
  • 15:33 ottomata: attempting to powercycle analytics1015, it is not responding to pings, no output on console
  • 15:33 logmsgbot: manybubbles Synchronized wmf-config/: SWAT deploy changing some search settings (duration: 00m 05s)
  • 15:26 hashar: restarted Zuul. All jobs lost :-(
  • 15:25 hashar: Zuul stuck in a loop reporting a change :-(
  • 15:20 hashar: Jenkins/Zuul stuck. Depooling/Repooling some slaves to reregister jobs with Zuul
  • 14:51 ottomata: chown -R datasets /data/xmldatadumps/public/other/pagecounts-ez on dataset1001 to accompany 70a7f61, fixing bug 66005
  • 12:44 akosiaris: manually ran puppet on mw11991
  • 07:21 hashar: restarted Zuul unintentionally
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jun 2 03:11:05 UTC 2014 (duration 11m 4s)
  • 03:04 ori: ..on vanadium.
  • 03:03 ori: moving /var/log/eventlogging/archive/* to /srv/eventlogging-logs to free up space on the root partition. unpuppetized for now, sadly.
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-02 02:23:57+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-02 02:12:53+00:00
  • 02:06 logmsgbot: tstarling Synchronized php-1.24wmf6: Revert "Use square bounding boxes for default-sized thumbnails" (duration: 01m 18s)
  • 02:02 logmsgbot: tstarling Synchronized php-1.24wmf7: (no message) (duration: 01m 31s)

June 1

  • 05:41 awight: updated payments from 7c695e9c4c7386a7585b6067df29b8caaaa089f0 to e823354822c7a35e6c2069d3e72180a45dbc89dc
  • 03:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jun 1 03:09:28 UTC 2014 (duration 9m 27s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-06-01 02:23:58+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-06-01 02:13:23+00:00

May 31

  • 03:14 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat May 31 03:13:47 UTC 2014 (duration 13m 46s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-05-31 02:26:20+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-31 02:15:48+00:00

May 30

  • 22:00 greg-g: ori sync'd out a config change to Add $wgRCFeeds entries for RCStream on rcs100[12].eqiad.wmnet
  • 21:57 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 03s)
  • 21:21 hashar: Zuul restarted (somehow by mistake)
  • 20:39 logmsgbot: bd808 Synchronized php-1.24wmf7/extensions/Elastica/Elastica/test/lib/Elastica/Test/IndexTest.php: Touched to test sync-file (duration: 00m 04s)
  • 20:36 logmsgbot: bd808 Synchronized wmf-config/db-eqiad.php: Touched db-eqiad.php to test sync-file (duration: 00m 04s)
  • 20:35 logmsgbot: bd808 Synchronized wmf-config/throttle.php: Touched to test sync-file (duration: 00m 04s)
  • 20:35 bd808|deploy: Scap updated to 6c0c4f0
  • 19:45 logmsgbot: bd808 Synchronized wmf-config/throttle.php: Touched to test sync-file (duration: 00m 05s)
  • 19:40 logmsgbot: bd808 Synchronized wmf-config/db-eqiad.php: Touched db-eqiad.php to test sync-file (duration: 00m 03s)
  • 19:33 logmsgbot: bd808 Synchronized README: Testing sync-file (duration: 00m 06s)
  • 19:30 bd808|deploy: Scap updated to c4204dd
  • 17:21 andrewbogott: restarted pdns on virt0 and virt1000
  • 16:07 andrewbogott: changed the 'Ops' gid to 700 in ldap
  • 16:02 akosiaris: enabled VT on thallium/mercury for ganeti evaluation purposes
  • 14:54 _joe_: ran scap-rebuild-cdb on mw1163
  • 13:47 hashar: Jenkins: removing label hasBrowserTests from labs slaves 136315
  • 13:43 hashar: Jenkins: removing label hasHhvm from labs slaves 136315
  • 13:42 hashar: Jenkins: removing label hasJenkinsDebianGlue from labs slaves 136315
  • 13:26 hashar: Jenkins lowering number of executors on labs slave from 5 to 4 since they have 4 CPU
  • 13:25 hashar: Jenkins pooling a third CI slave integration-slave1003.
  • 09:26 hashar_: Zuul is processing jobs again. For reference bug is bug 63760
  • 09:24 hashar_: Jenkins: disconnecting and reconnecting labs slaves to reregister them with Zuul
  • 09:17 hashar: Jenkins/Zuul locked
  • 03:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 30 03:46:09 UTC 2014 (duration 46m 8s)
  • 03:01 logmsgbot: LocalisationUpdate completed (1.24wmf7) at 2014-05-30 03:00:14+00:00
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-30 02:30:51+00:00

May 29

  • 23:36 logmsgbot: csteipp synchronized php-1.24wmf6/includes/specials/SpecialPasswordReset.php
  • 23:19 ^d: cleaned up /var/cache/apt on searchidx1001, freed up ~20% of the disk, should be fine now
  • 23:09 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I09d92987: Flow-ify mw:Talk:Search'
  • 23:07 logmsgbot: ori synchronized cirrus.dblist 'I3f09dff7e: All Wikipedias with 100k pages or less getting Cirrus as primary'
  • 23:05 logmsgbot: ori updated /a/common to I3f09dff7e: All Wikipedias with 100k pages or less getting Cirrus as primary
  • 20:29 logmsgbot: marktraceur synchronized wmf-config/InitialiseSettings.php 'Enable Media Viewer on all wikisources by default'
  • 20:28 logmsgbot: marktraceur updated /a/common to I95348e0d4: Launch Media Viewer for all users on all Wikisources
  • 20:25 logmsgbot: anomie synchronized php-1.24wmf6/includes/api 'Revert revert of gerrit:120827, underlying bug should be fixed now'
  • 20:19 logmsgbot: anomie synchronized php-1.24wmf6/extensions/EducationProgram/includes/api/ApiListStudents.php 'Backport fix for bugzilla:65906'
  • 20:05 logmsgbot: anomie synchronized php-1.24wmf7/extensions/EducationProgram/includes/api/ApiListStudents.php 'Backport fix for bugzilla:65906'
  • 19:40 ^d: hewiki elastic index was missing geodata mappings. re-map + in place reindex failed spectacularly. rebuilding from scratch now.
  • 19:38 ori: mw1163: mkdir -p /usr/local/apache/common-local && chown mwdeploy:mwdeploy /usr/local/apache/common-local
  • 19:09 cmjohnson1: powering down mw1151 for disk replacement
  • 19:01 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf7
  • 18:49 cmjohnson1: removing mw1151 from pybal and dsh groups to replace disk and reinstall
  • 18:47 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf6 take 3
  • 18:44 logmsgbot: reedy synchronized php-1.24wmf6/includes/api/
  • 18:19 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: rv that
  • 18:18 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf6
  • 15:49 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: (no message)
  • 15:42 logmsgbot: reedy Finished scap: testwiki to 1.24wmf7 and build l10n cache (duration: 24m 24s)
  • 15:17 logmsgbot: reedy Started scap: testwiki to 1.24wmf7 and build l10n cache
  • 15:06 logmsgbot: anomie synchronized php-1.24wmf6/extensions/VisualEditor/modules/ve-mw/ 'SWAT: VisualEditor URL decoding and image alignment fixes. gerrit:135922 gerrit:135946'
  • 04:26 logmsgbot: springle synchronized README 'test sync-file'
  • 04:20 ori: updated scap to 9ba9014: Partially revert "Convert sync-dir and sync-file to python"
  • 04:13 ori: re-enabled puppet on tin
  • 03:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu May 29 03:57:59 UTC 2014 (duration 57m 58s)
  • 03:55 logmsgbot: ori Synchronized README: Debugging sync-file (duration: 00m 06s)
  • 03:51 springle: db1009 mariadb 5.5.37 live trial with low load
  • 03:49 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1009 in s2, take #2'
  • 03:45 ori: disabled puppet on tin and copied sync-common-file from mediawiki/tools/scap@8f2a8356c38 into /usr/local/bin to debug sync issue
  • 03:44 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1009 in s2 (duration: 00m 08s)
  • 03:11 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-29 03:10:15+00:00
  • 02:36 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-29 02:35:45+00:00
  • 01:22 awight: updated SmashPig from e9964bfec47b3796dab0a19a9545cc3abb23fde6 to 98b1f348aa55f6a3aac441db08a59ca309fade7a
  • 01:16 awight: (rollback)
  • 01:16 awight: updated SmashPig from 03015f3827fedea9d0f89c791604ad08ec97ba71 to e9964bfec47b3796dab0a19a9545cc3abb23fde6
  • 01:04 awight: update SmashPig from f64f79f13cf4ab560d0bb5bd69690c827a821629 to 03015f3827fedea9d0f89c791604ad08ec97ba71

May 28

  • 23:39 logmsgbot: demon Synchronized wmf-config/CommonSettings.php: Including external CologneBlue/Modern skins, if they exist (duration: 00m 07s)
  • 22:27 awight: updated crm from 65a433b5564f42c3aa4f310cd4bb938ae70f841d to 5b231163e9e880de5b9787d40b679a6723748aca
  • 22:27 awight: updated tools from 1e8029544dc19a84f6d1adf2783266e16d19ef1f to d257e8445e028b758b1d1fa90c857667d4faac62
  • 21:23 RoanKattouw: Restarting Parsoid Varnishes per gwicke's request
  • 20:38 mutante: enabling puppet on osmium
  • 20:18 hashar: Jenkins: killed all phantomjs processes on gallium. They were eating all available memory. All three processes were VisualEditor qunit tests.
  • 20:13 subbu: deployed parsoid a234af8c0 (deploy sha f17506eb)
  • 20:12 hashar: gallium (Jenkins master) sent to swap somehow :-(
  • 20:02 logmsgbot: bd808 Finished scap: no-op scap deleted.dblist (duration: 10m 40s)
  • 19:52 bd808|deploy: Horrible log message; should be "no-op scap to test code changes"
  • 19:51 logmsgbot: bd808 Started scap: no-op scap deleted.dblist
  • 19:49 logmsgbot: bd808 Synchronized database lists: (no message) (duration: 00m 03s)
  • 19:48 logmsgbot: bd808 Synchronized robots.txt: Testing sync-file in php (duration: 00m 03s)
  • 19:08 bd808|deploy: Symlinks for mergeCdbFileUpdates, mwversionsinuse, refreshCdbJsonFiles, scap-rebuild-cdbs, scap-recompile and sync-common on tin still pointing to /srv/scap/bin instead of /srv/deployment/scap/scap/bin
  • 19:06 robh: mw1053 reinstalling
  • 18:54 logmsgbot: bd808 Synchronized robots.txt: Testing sync-file in php (duration: 00m 05s)
  • 18:28 mutante: running puppet on jobrunners
  • 18:27 ^d: jobrunners back up now, should slowly catch back up
  • 18:24 bd808|deploy: Scap updated to fd7e538; Trebuchet fetch and checkout failed for mw1053.eqiad.wmnet
  • 18:06 bd808|deploy: Restarted logstash on logstash1001; log event volume suspiciously low for the last ~35 minutes
  • 17:59 ^d: all job runners halted at 17:39? graphite shows no jobs being run, runJobs on fluorine also has nothing since the timestamp.
  • 17:41 logmsgbot: yurik synchronized php-1.24wmf6/extensions/ZeroRatedMobileAccess/
  • 17:37 logmsgbot: yurik synchronized php-1.24wmf5/extensions/ZeroRatedMobileAccess/
  • 17:32 mwalker: enabling worldpay in BE (payments from 5136b0b6852f3e949e4dc847f7137f1b7bc3037b to 7c695e9c4c7386a7585b6067df29b8caaaa089f0)
  • 16:47 hashar: Jenkins/Zuul back. Jobs meant to be run on labs instances ended up not being registered anymore with the Zuul Gearman server. That must be a bug in the Jenkins Gearman plugin :-( bug 63760
  • 16:31 hashar: Jenkins / Zuul locked. Looking into it
  • 16:29 _joe_: restarted mwprof/profiler-to-carbon
  • 15:09 logmsgbot: anomie synchronized php-1.24wmf6/extensions/Wikidata 'SWAT: Fix issue with Wikidata rollback gerrit:135767'
  • 13:52 logmsgbot: reedy synchronized wmf-config/CommonSettings.php
  • 13:26 logmsgbot: reedy synchronized wmf-config/ 'Enable REL1_23 in ExtensionDistributor'
  • 13:24 manybubbles: restarting elastic1001 to revert it back to mmapfs - niofs wasn't better. worse, even.
  • 11:23 manybubbles: restarting elastic1001 to try out niofs (instead of mmapfs) on advice from a lucene developer
  • 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed May 28 03:24:24 UTC 2014 (duration 24m 23s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-28 02:24:55+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-28 02:13:02+00:00

May 27

  • 23:06 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I7fdafede7: Disable the anonymous signup invite experiment'
  • 22:00 mutante: caesium, release files: changed file owner groups mwupld->releasers-mediawiki, mobileupld->releasers-mobile (to match switch to yaml groups)
  • 18:39 logmsgbot: reedy synchronized docroot and w
  • 18:38 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.24wmf6
  • 18:31 logmsgbot: reedy synchronized php-1.24wmf6/includes/SkinTemplate.php
  • 18:23 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php
  • 17:46 logmsgbot: aude synchronized php-1.24wmf6/extensions/Wikidata 'JS fixes for Wikidata'
  • 17:29 mutante: welcome new deployer cscott
  • 16:04 gwicke: restarted parsoids after another surge in load
  • 15:48 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: move-categorypages permission changes on fawiki gerrit:135426'
  • 15:42 logmsgbot: anomie synchronized php-1.24wmf5/extensions/UniversalLanguageSelector/resources/ 'SWAT: Update ULS to fix beta feature gerrit:135535'
  • 15:40 logmsgbot: anomie synchronized php-1.24wmf6/extensions/UniversalLanguageSelector/resources/ 'SWAT: Update ULS to fix beta feature gerrit:135310'
  • 15:32 logmsgbot: anomie synchronized php-1.24wmf6/includes/Title.php 'SWAT: Check correct message in category moving gerrit:135211'
  • 15:27 logmsgbot: anomie synchronized php-1.24wmf5/includes/Title.php 'SWAT: Check correct message in category moving gerrit:135210'
  • 15:21 logmsgbot: anomie synchronized php-1.24wmf6/includes/HistoryBlob.php 'SWAT: Revert another visibility change that causes errors bugzilla:65665 gerrit:135574'
  • 15:15 logmsgbot: anomie synchronized php-1.24wmf6/includes/revisiondelete/ 'SWAT: Revert another visibility change that causes fatal errors bugzilla:65733 gerrit:135389'
  • 15:13 logmsgbot: anomie synchronized php-1.24wmf5/includes/revisiondelete/ 'SWAT: Revert another visibility change that causes fatal errors bugzilla:65733 gerrit:135388'
  • 15:05 logmsgbot: anomie synchronized php-1.24wmf6/extensions/VisualEditor/modules/ve-mw/ 'SWAT: Fix for VisualEditor image alignment regression gerrit:135171'
  • 12:19 Reedy: Created SecurePoll tables on zerowiki, legalteamwiki, zhwikivoyage, viwikivoyage, tyvwiki
  • 11:40 godog: restart apache2 on tungsten, many report.py hung
  • 05:42 gwicke: restarted parsoids after load surge
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue May 27 03:12:29 UTC 2014 (duration 12m 28s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-27 02:24:51+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-27 02:13:08+00:00
  • 01:39 springle: starting updateCollation on s2 cs-wiki from tin
  • 01:36 logmsgbot: springle synchronized wmf-config/InitialiseSettings.php '$wgCategoryCollation to uca-cs on cswiki'

May 26

  • 09:17 hashar: bugzilla.bugs_fulltext bug was bug 65762
  • 09:16 _joe_: repaired table bugzilla.bugs_fulltext on db1001 as it was marked as crashed
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon May 26 03:10:02 UTC 2014 (duration 10m 1s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-26 02:24:39+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-26 02:13:01+00:00

May 25

  • 03:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 25 03:08:47 UTC 2014 (duration 8m 46s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-25 02:24:10+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-25 02:13:47+00:00

May 24

  • 03:02 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat May 24 03:01:50 UTC 2014 (duration 1m 49s)
  • 02:21 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-24 02:20:11+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-24 02:12:25+00:00
  • 00:34 mutante: fixing Aaron's and Ariel's file permissions on fenari

May 23

  • 19:43 logmsgbot: anomie synchronized php-1.24wmf6/includes/HistoryBlob.php 'Backport fix for bug 65665 to 1.24wmf6 gerrit:135089'
  • 19:24 awight: updated fr-tools from 73921d4b4a7ba69b703340ed56e513f8ae8e0bb5 to 1e8029544dc19a84f6d1adf2783266e16d19ef1f
  • 18:37 mwalker: updated payments wiki from d99177518b741e7fe18ffda86c83f93c72e164a6 for worldpay
  • 18:22 Jeff_Green: ran authdns-update to merge new wikimedia.community dns zone
  • 17:02 bd808: Starting rolling update of elasticsearch for logstash cluster
  • 16:20 bd808: restarted elasticsearch on logstash1002
  • 16:17 bd808: Elasticsearch on logstash1002 dead due to OOM at 2014-05-23T00:34:03Z
  • 14:52 hashar: killed -9 a remaining Jenkins process
  • 14:21 _joe_: killed zuul server, as was stuck
  • 13:50 _joe_: killed & started jenkins, jvm stuck, unresponsive to jstack
  • 13:17 manybubbles: restarting jenkins because it seems stuck
  • 11:05 mark: Setup BFD on Zayo link between cr2-ulsfo and cr1-eqiad
  • 11:01 mark: Setup BFD on GTT link between cr1-ulsfo and cr2-eqiad
  • 07:30 _joe_: powercycling ms-be1007, unresponsive, console blank, no way to debug
  • 04:17 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 23 04:16:15 UTC 2014 (duration 16m 14s)
  • 03:23 logmsgbot: LocalisationUpdate completed (1.24wmf6) at 2014-05-23 03:22:07+00:00
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-23 02:38:33+00:00

May 22

  • 23:22 mutante: osmium,mw1151 fixed UID of mwalker (605->2454)
  • 23:16 bd808: Ran sync-common manually on osmium and mw1151
  • 23:14 mwalker: sync-dir failed for osmium and mw1151
  • 23:14 logmsgbot: mwalker synchronized php-1.24wmf6/extensions/VisualEditor 'Syncing the extension manually because of scap failures on osmium, mw1010, mw1070, mw1161, mw1201, and mw1151'
  • 23:11 logmsgbot: mwalker Finished scap: SWAT Update to VisualEditor 134941 (duration: 03m 04s)
  • 23:08 logmsgbot: mwalker Started scap: SWAT Update to VisualEditor 134941
  • 21:43 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Ifd048bafe0eb4af8765cee20a3d93d7663b1bcdf'
  • 21:33 logmsgbot: reedy synchronized multiversion/MWMultiVersion.php
  • 21:29 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Iecdd8c5e60a142363b40e34d4fe2f27f0e5feef5'
  • 21:22 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'touching'
  • 21:21 logmsgbot: demon synchronized wmf-config/flaggedrevs.php 'Removing FR from mw.org'
  • 21:19 logmsgbot: demon synchronized flaggedrevs.dblist 'Removing FR from mw.org'
  • 20:10 logmsgbot: marktraceur synchronized wmf-config/InitialiseSettings.php 'Sync for mediaviewer.dblist change'
  • 20:06 logmsgbot: marktraceur synchronized mediaviewer.dblist 'Enabling Media Viewer on itwiki and ruwiki by default'
  • 20:04 logmsgbot: marktraceur updated /a/common to I1c658bf65: Remove VE formula editor from BF whitelist (graduated)
  • 19:54 logmsgbot: reedy Purged l10n cache for 1.24wmf4
  • 19:54 logmsgbot: reedy Purged l10n cache for 1.24wmf3
  • 19:54 logmsgbot: reedy Purged l10n cache for 1.24wmf2
  • 19:53 logmsgbot: reedy Purged l10n cache for 1.24wmf1
  • 19:53 logmsgbot: reedy Purged l10n cache for 1.23wmf22
  • 19:52 logmsgbot: reedy Purged l10n cache for 1.23wmf21
  • 19:38 paravoid: cr1/2-ulsfo: BGP peering with AS11820 (WMF Corp HQ)
  • 19:28 Reedy: Ran patch-fr_page_rev-index.sql patch on fawiki
  • 19:27 Reedy: Created flaggedrevs_statistics table on fawiki
  • 19:04 logmsgbot: reedy synchronized wmf-config/
  • 18:30 logmsgbot: reedy synchronized wmf-config/ 'I7a02f2615d98428b6f27514e75d935d36e44fcb1'
  • 18:22 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf6
  • 18:04 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf5
  • 17:17 mutante: powercycling ms-be1012
  • 17:05 logmsgbot: reedy Finished scap: testwiki to 1.24wmf6 and build l10n cache (duration: 28m 31s)
  • 16:37 logmsgbot: reedy Started scap: testwiki to 1.24wmf6 and build l10n cache
  • 16:30 mutante: maerlant - it was down for ~8d, old test host that didn't really do anything, revoked salt/puppet certs, removing from Icinga..
  • 15:52 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Fix typo in MultimediaViewer logging config gerrit:134837'
  • 15:31 logmsgbot: anomie synchronized php-1.24wmf5/extensions/MultimediaViewer/ 'SWAT: Deploy new MultimediaViewer logging to wmf5 wikis gerrit:134804'
  • 15:22 logmsgbot: anomie synchronized wmf-config/CommonSettings.php 'SWAT: Disable old MultimediaViewer logging and pre-enable new logging gerrit:134343'
  • 15:21 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Disable old MultimediaViewer logging and pre-enable new logging gerrit:134343'
  • 15:13 logmsgbot: anomie synchronized php-1.24wmf5/extensions/MultimediaViewer/tests/qunit/mmv/ui/ 'SWAT: Fix qunit tests for MultimediaViewer gerrit:134807'
  • 14:31 paravoid: pushing new swift rings
  • 03:38 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu May 22 03:37:31 UTC 2014 (duration 37m 30s)
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-22 02:38:18+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-22 02:14:49+00:00

May 21

  • 23:28 logmsgbot: maxsem synchronized php-1.24wmf5/extensions/MultimediaViewer 'touch'
  • 23:22 logmsgbot: maxsem synchronized php-1.24wmf5/extensions/MultimediaViewer/ 'https://gerrit.wikimedia.org/r/#/c/134750/'
  • 23:16 logmsgbot: maxsem synchronized php-1.24wmf5/extensions/Flow 'https://gerrit.wikimedia.org/r/#/c/134746/'
  • 22:18 Krinkle: Running deleteEqualMessages.php on guwiki (bug 43917)
  • 21:03 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'GeoData using Elasticsearch everywhere'
  • 20:07 subbu: deployed Parsoid 95929801b (deploy sha ae83633a)
  • 19:56 awight: updated tools from c1f50f6909b04768f3a8faa50b25e88a43f89606 to 73921d4b4a7ba69b703340ed56e513f8ae8e0bb5
  • 19:05 mutante: welcome new deployer tgr
  • 18:22 andrewbogott: restarting gerrit service
  • 15:47 logmsgbot: anomie synchronized php-1.24wmf5/tests/qunit/suites/resources/mediawiki/mediawiki.user.test.js 'May as well sync this too'
  • 15:45 logmsgbot: anomie synchronized php-1.24wmf5/resources/src/mediawiki/mediawiki.user.js 'SWAT: Use mw.log.deprecate to track user() and anonymous()'
  • 15:28 logmsgbot: anomie synchronized php-1.24wmf5/includes/filerepo/file/LocalFile.php 'SWAT: Tweaked timestamp logic in recordUpload2'
  • 15:19 logmsgbot: anomie synchronized php-1.24wmf5/includes/filerepo/file/LocalFile.php 'SWAT: Replace FOR UPDATE with LockManager use in LocalFile::lock()'
  • 15:18 logmsgbot: anomie synchronized php-1.24wmf5/includes/filebackend/FileBackend.php 'SWAT: Replace FOR UPDATE with LockManager use in LocalFile::lock()'
  • 14:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'raise db1068 to normal load'
  • 12:30 hashar: Jenkins: updated sysadmin email address from nobody@integration.wikimedia.org to jenkins-bot@wikimedia.org
  • 12:22 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'pool db1068 in s4, warm up'
  • 07:25 awight: updated tools from d2437564c56881f6b879403f2f6f2f554b6b0391 to c1f50f6909b04768f3a8faa50b25e88a43f89606
  • 07:17 awight: updated tools from ee31fc94b17c11a48ddac19aabfcdaab69fd2f72 to d2437564c56881f6b879403f2f6f2f554b6b0391
  • 03:42 springle: resume xtrabackup db1049 to db1068, throttled
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed May 21 03:12:31 UTC 2014 (duration 12m 30s)
  • 03:13 mutante: merging Change-Id: I2827d1ef347 and starting icinga fixed it
  • 03:09 springle: killed db1068 xtrabackup, saturating db1064 network
  • 03:01 mutante: icinga broken on neon due to missing servicegroup 'analytics_eqiad'
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-21 02:29:03+00:00
  • 02:19 springle: xtrabackup clone db1049 to db1068
  • 02:18 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 reduce db1049 load while cloning'
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-21 02:15:36+00:00
  • 02:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 raise db1056 to normal load, depool db1011'

May 20

  • 23:53 logmsgbot: maxsem synchronized php-1.24wmf4/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/#/c/134504/'
  • 23:52 logmsgbot: maxsem synchronized php-1.24wmf5/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/#/c/134504/'
  • 23:40 logmsgbot: maxsem synchronized php-1.24wmf5/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/134517'
  • 23:39 logmsgbot: maxsem synchronized php-1.24wmf4/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/134517'
  • 23:14 logmsgbot: maxsem synchronized php-1.24wmf4/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/#/c/134405/'
  • 23:11 logmsgbot: maxsem synchronized php-1.24wmf5/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/#/c/134405/'
  • 22:00 logmsgbot: aaron synchronized wmf-config/filebackend.php '69201b4caf703ef1ab52b38be29c80b4e939fdc2 - no-op'
  • 21:57 logmsgbot: aaron synchronized wmf-config/filebackend.php 'Removed old tampa config'
  • 21:41 logmsgbot: bsitu synchronized wmf-config/InitialiseSettings.php 'Re-enable flow on mediawiki:Talk:Design'
  • 21:39 logmsgbot: bsitu updated /a/common to I037cd0a42: Re-enable flow on Talk:Design ( Removed LQT code )
  • 21:30 logmsgbot: bsitu synchronized wmf-config/InitialiseSettings.php 'Disable flow on mediawiki:Talk:Design'
  • 21:28 logmsgbot: bsitu updated /a/common to Idde23abd3: Undo "enable flow on Talk:Design"
  • 21:11 logmsgbot: bsitu synchronized wmf-config/InitialiseSettings.php 'Enable Flow on 3 mediawiki talk pages'
  • 21:08 logmsgbot: bsitu updated /a/common to I549967ca2: Group1 wikis to 1.24wmf5
  • 20:35 Krinkle: Reload zuul to deploy I80496db747a8668be
  • 19:14 hoo: fixed Wikidata for php-1.24wmf5 on mw1138 by manually removing it and then running sync-common
  • 19:09 bd808: ran sync-common on mw1138
  • 19:02 bd808: Updated scap to 7b6fc47
  • 18:44 logmsgbot: aude synchronized php-1.24wmf5/extensions/Wikidata 'Fix jquery error tooltip issue'
  • 18:33 bd808|deploy: Gave up on updating scap with trebuchet
  • 18:32 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.24wmf5
  • 18:22 RobH: restarting salt minions on mw servers
  • 18:12 bd808|deploy: `git deploy sync` for scap ended with "0/230 minions completed fetch"
  • 15:15 andrewbogott: running salt "ms-be*" cmd.run "kill $(ps aux | grep 'find / -user' | awk '{print $2}')" to kill runaway 'finds' on swifts
  • 14:47 andrewbogott: running 'find' commands on many hosts to chown files for users with new UIDs.
  • 13:20 Krinkle: git-deploy: Deploying integration/slave-scripts I4a4e2a4c90fb6
  • 03:38 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue May 20 03:37:32 UTC 2014 (duration 37m 31s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-20 02:33:31+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-20 02:19:33+00:00

May 19

  • 23:37 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'autopatrolled group on dewikivoyage'
  • 23:26 logmsgbot: catrope synchronized php-1.24wmf5/extensions/VisualEditor
  • 23:25 logmsgbot: catrope synchronized php-1.24wmf5/extensions/MobileFrontend
  • 23:24 logmsgbot: catrope synchronized php-1.24wmf5/extensions/Wikidata
  • 22:19 andrewbogott: Coren restarted opendj and I restarted pdns on virt1000. Opendj was refusing connections for unclear reasons
  • 21:34 Krinkle: Running deleteEqualMessages.php on urwiki (bug 43917)
  • 21:33 Krinkle: Running deleteEqualMessages.php on commonswiki (bug 43917)
  • 20:53 superm401: sync-l10nupdate-1 1.24wmf4 had one error on mw1218
  • 20:27 gwicke: updated Parsoid to 3ac048d7c4b
  • 20:05 csteipp: fix deployed for bug 65501
  • 20:00 Krinkle: Running deleteEqualMessages.php on fowiki (bug 43917)
  • 20:00 Krinkle: Running deleteEqualMessages.php on enwikinews (bug 43917)
  • 19:59 Krinkle: Reloading zuul to deploy I0b8051074da39edcac
  • 19:16 bd808: Added display of exception-json events to fatalmonitor logstash dashboard
  • 19:01 logmsgbot: bd808 Purged l10n cache for 1.24wmf3
  • 19:00 logmsgbot: bd808 Purged l10n cache for 1.23wmf22
  • 18:59 logmsgbot: bd808 Purged l10n cache for 1.23wmf21
  • 18:59 logmsgbot: bd808 Purged l10n cache for 1.23wmf21
  • 18:46 logmsgbot: reedy Finished scap: no-op to test for errors (duration: 02m 45s)
  • 18:44 logmsgbot: reedy Started scap: no-op to test for errors
  • 18:43 Reedy: rm -rf /usr/local/apache/common-local/php-1.23wmf20 against all apaches
  • 18:37 Reedy: Ran sync-common locally on mw1015
  • 18:32 greg-g: mw1010.eqiad.wmnet::common for sync-common, not sure for cdb. (sez superm401)
  • 18:23 superm401: 1 server failed for sync-common. 2 servers failed for sync-rebuild-cdbs
  • 18:18 logmsgbot: mattflaschen Finished scap: Deploy GettingStarted and enable experiment for de, en, fr, and it (duration: 18m 53s)
  • 17:59 logmsgbot: mattflaschen Started scap: Deploy GettingStarted and enable experiment for de, en, fr, and it
  • 15:26 logmsgbot: manybubbles synchronized php-1.24wmf5/resources/lib/oojs-ui/ 'fix panellayout'
  • 15:18 logmsgbot: manybubbles synchronized php-1.24wmf4/extensions/CirrusSearch/ 'adding url parameter to suppress snippets and one to suggest suggestions to cirrus'
  • 15:12 manybubbles: SWAT deployed cirrus update for wmf5 and looks good. doing for wmf4 now.
  • 15:10 logmsgbot: manybubbles synchronized php-1.24wmf5/extensions/CirrusSearch/
  • 14:53 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'touched and synced InitialiseSettings.php to make update to cirrus.dblist take hold - resyncing to mw1171'
  • 14:51 _joe_: powercycled mw1171, dead and serial console stuck
  • 14:51 logmsgbot: manybubbles synchronized cirrus.dblist 'Switch cirrus to the primary backend for zh-yue wikipedia - resyncing to mw1171'
  • 14:40 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'touched and synced InitialiseSettings.php to make update to cirrus.dblist take hold'
  • 14:38 logmsgbot: manybubbles synchronized cirrus.dblist 'Switch cirrus to the primary backend for zh-yue wikipedia'
  • 13:34 Krinkle: Running deleteEqualMessages.php on mtwiki (bug 43917)
  • 13:17 Krinkle: Running deleteEqualMessages.php on zh_min_nanwiki (bug 43917)
  • 13:07 Krinkle: Running deleteEqualMessages.php on zh_yuewiki (bug 43917)
  • 12:01 Krinkle: Running deleteEqualMessages.php on suwiki (bug 43917)
  • 03:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon May 19 03:09:00 UTC 2014 (duration 8m 59s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-19 02:25:51+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-19 02:13:37+00:00
  • 00:30 Tim: on osmium: stopping job runners in order to fix cgroup permissions issue

May 18

  • 03:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 18 03:06:03 UTC 2014 (duration 6m 2s)
  • 02:24 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-18 02:23:29+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-18 02:12:46+00:00

May 17

  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat May 17 03:07:12 UTC 2014 (duration 7m 11s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-17 02:24:46+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-17 02:14:13+00:00
  • 01:58 mutante: powercycling labsdb1003
  • 00:23 ori: varnishadm on cp1056 confirms that varnish recognizes mw1151 as "sick"
  • 00:20 ori: stopping apache and disabling puppet on mw1151 so that varnish stops forwarding reqs to it
  • 00:16 Krinkle: On mw1151, Gadget::loadStructuredList() returns false, memcached has no value for 'enwiki:gadgets-definition:7' and is unable to store it.
  • 00:11 ori: Krinkle identified weird RL responses as all originating in mw1151; dmesg shows ata1 disk troubles: "failed command: READ DMA EXT", "sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed"
  • 00:04 logmsgbot: krinkle synchronized php-1.24wmf4/includes/resourceloader/ResourceLoader.php 'I718fcf23d'

May 16

  • 23:07 awight: updated crm from 0b8aa8aa046935b6cfc67c10ebe10396d5e42745 to 65a433b5564f42c3aa4f310cd4bb938ae70f841d
  • 22:56 logmsgbot: aaron synchronized php-1.24wmf4/includes/db/Database.php '182e42c173b9ab0c2bc5d753879a000b1ff39e77'
  • 22:54 logmsgbot: aaron synchronized php-1.24wmf5/includes/db/Database.php '8829ffc72d3332d348a1a2e58d525e54e126bad5'
  • 22:17 awight: tools updated from 85bb7293d83517086e3609f03365aecde9f58c71 to ee31fc94b17c11a48ddac19aabfcdaab69fd2f72
  • 21:37 logmsgbot: ori synchronized php-1.24wmf4/extensions/MultimediaViewer 'Update MultimediaViewer for I0df067a61: Add sampling to unsampled event logging'
  • 21:33 logmsgbot: ori synchronized php-1.24wmf5/extensions/MultimediaViewer 'Update MultimediaViewer for I0df067a61: Add sampling to unsampled event logging'
  • 21:10 mwalker: updating fundraising smashpig from 2fdf982b20f1cbeaf9f57af64ef21b5b69a36f6e to f64f79f13cf4ab560d0bb5bd69690c827a821629
  • 20:41 awight: update crm from 243641de631b712c4a29ca1f3618771b78dadeae to 0b8aa8aa046935b6cfc67c10ebe10396d5e42745
  • 18:43 awight: update tools from a40c0caa18a0efd93bc5d3f7f68386fbc36bf1fa to 85bb7293d83517086e3609f03365aecde9f58c71
  • 18:12 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I3c453b0949f4e: Tweak MediaViewer sampling settings'
  • 18:07 logmsgbot: ori synchronized wmf-config 'Ia43821231: Add sampling control setting for MediaViewer event'
  • 18:04 logmsgbot: ori updated /a/common to Ia43821231: Add sampling control setting for MediaViewer event logging
  • 17:47 logmsgbot: demon synchronized wmf-config/CommonSettings.php 'GeoData to Elastic for all wikivoyages'
  • 17:11 mwalker: updating staging payments servers as well from 5e24b953dcff5305099e152139e6e93daba8aeec to d99177518b741e7fe18ffda86c83f93c72e164a6
  • 17:10 mwalker: and updated to 1.22.6
  • 17:10 mwalker: moved fundraising wiki from 6a1d4983319038edeb88dc34a1c220ecaec1cbde to d99177518b741e7fe18ffda86c83f93c72e164a6 -- including json i18n changes
  • 16:55 manybubbles: "in place" reindexing (for cirrus) all the wikipedias after the deploy train hit them yesterday
  • 16:53 logmsgbot: demon synchronized wmf-config/CommonSettings.php 'Removing old WikiEditor settings'
  • 16:44 RobH: partial zirconium downtime
  • 16:44 RobH: I logged into zirconium, but it had recovered by the time I checked it.
  • 16:33 mwalker: updated fundraising civicrm from 7a23465e620211739421cce3ad57c62597eb8cc3 to 75c1a50b8aa7e7b6f218d7c420932a8fc53a0a34 for an exchange rates fix
  • 16:26 qchris: updated gerrit's hooks-bugzilla plugin to version 2.8.1.2 to allow talking to bugzilla-4.4.4
  • 13:54 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1056 in s4, warm up'
  • 13:17 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'raise db1070 and db1071 to normal load'
  • 10:14 springle: xtrabackup clone db1049 to db1056
  • 10:13 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'reduce db1049 load while cloning'
  • 09:40 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'pool db1070 and db1071 in s1, warm up'
  • 06:20 logmsgbot: ori synchronized php-1.24wmf4/maintenance/compareParserCache.php 'Ica69a3ef2: Added a script to compare current parser output to cache (no impact on prod; syncing for consistency)'
  • 03:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 16 03:54:03 UTC 2014 (duration 54m 2s)
  • 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf5) at 2014-05-16 03:08:04+00:00
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-16 02:38:41+00:00
  • 02:07 springle: xtrabackup db1070 to db1071
  • 01:02 logmsgbot: ori synchronized php-1.24wmf5/includes/parser 'I12a60b5cc: Revert "Declare visibility on class properties of includes/parser/"'
  • 00:54 hoo: rebuildItemsPerSite finished running for Wikidata (after about 30h).
  • 00:32 hoo: manually ran rebuildEntityPerPage for Wikidata to fix 2 broken records
  • 00:07 logmsgbot: maxsem synchronized wmf-config 'https://gerrit.wikimedia.org/r/#/c/131762/'
  • 00:00 logmsgbot: ori synchronized wmf-config/squid.php 'Id188979c1: Use whole subnets in squid.php list for XFF acceptance'

May 15

  • 23:57 logmsgbot: ori updated /a/common to Id188979c1: Use whole subnets in squid.php list for XFF acceptance
  • 23:38 logmsgbot: ori synchronized php-1.24wmf4/includes 'Ia3b12fb9: Speed up CIDR matching from $wgSquidServersNoPurge'
  • 23:19 logmsgbot: ori synchronized php-1.24wmf5/includes 'Ia3b12fb9: Speed up CIDR matching from $wgSquidServersNoPurge'
  • 23:05 logmsgbot: ori synchronized wmf-config/CirrusSearch-production.php 'Iae07852b1: Elasticsearch plugin juggling'
  • 22:54 logmsgbot: ori synchronized wmf-config 'I51a55c4e2, Ia6c01a913, I594848ce0, and I594848ce0'
  • 22:50 logmsgbot: ori updated /a/common to Ifae836de5: Swapping GeoData backend for enwikivoyage
  • 22:28 logmsgbot: ori synchronized php-1.24wmf4/extensions/EventLogging 'Update EventLogging to master for I89819bd943'
  • 22:26 logmsgbot: ori synchronized php-1.24wmf5/extensions/EventLogging 'Update EventLogging to master for I89819bd943'
  • 22:06 awight: updated tools from 93fda5da99674eca221e0abf53ad499583b27cfb to a40c0caa18a0efd93bc5d3f7f68386fbc36bf1fa
  • 22:04 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'GeoData using elasticsearch on enwikivoyage'
  • 21:04 awight: updated all tools from 47407c16d9922b17af70146416913abfe50b728d to 93fda5da99674eca221e0abf53ad499583b27cfb
  • 20:16 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'I5da4aa5db7b5d3c1843a6fd68d0a7c62a2bbfb4e'
  • 19:56 mwalker: updated fundraising tools repo for screenshots, worldpay auditing, live analysis, and... stomp! from 0eb485c8b6db5f06805976860bce7aa8b0d6444b to 47407c16d9922b17af70146416913abfe50b728d
  • 19:09 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf5
  • 18:55 ori: deploying twemproxy module on mw106*, they may complain for a moment
  • 18:52 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf4
  • 18:46 logmsgbot: reedy Finished scap: testwiki to 1.24wmf5 and build l10n cache (duration: 27m 47s)
  • 18:32 mutante: mw1053 was already disabled in pybal, though; see RT 7408, 7435
  • 18:31 mutante: mw1053 sits at disk partitioning dialog (via mgmt)
  • 18:29 Reedy: mw1053 is pingable but not ssh-able
  • 18:18 logmsgbot: reedy Started scap: testwiki to 1.24wmf5 and build l10n cache
  • 17:53 Jeff_Green: adjusted exim conf on mchenry to route donate.wm.o mail to barium instead of aluminium
  • 16:43 mwalker: disabled qc and put site_offline and maintenance_mode on civicrm to true
  • 15:20 logmsgbot: anomie synchronized php-1.24wmf4/extensions/MultimediaViewer 'SWAT: Deploy change 133446 to fix bug 65225 in MultimediaViewer'
  • 14:03 springle: xtrabackup clone db1056 to db1070
  • 13:59 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1056 while cloning'
  • 13:44 cmjohnson1: sodium going down again for a different disk replacement
  • 13:16 cmjohnson1: shutting down sodium to replace sdb
  • 12:56 godog: restarting gerrit on ytterbium, clones over https seemingly stuck
  • 12:24 manybubbles|away: "in place" reindexing group1 wikis after the deployment train updated cirrus yesterday. They'll need a full reindex after that is done which will take some time but is required to fix issues with redirects not showing up off of the main namespace
  • 11:56 godog: installed openjdk-7-jdk on ytterbium to attempt gerrit thread dump
  • 10:15 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1009 for raid tests'
  • 06:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'move s5 api traffic to db1005'
  • 05:19 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'move s4 commonswiki api traffic to db1042'
  • 04:20 springle: installed db1073
  • 03:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu May 15 03:14:04 UTC 2014 (duration 14m 3s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-15 02:26:09+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-15 02:14:31+00:00

May 14

  • 23:42 logmsgbot: mwalker synchronized wmf-config/InitialiseSettings.php 'Poking settings to try and apply them'
  • 23:29 logmsgbot: mwalker synchronized visualeditor.dblist 'Another part of 132409 (visual editor)'
  • 23:27 K4-713: updated payments from 78cc4285bdeb6eecba3efc75e4a04c8b886561e4 to 5e24b953dcff5305099e152139e6e93daba8aeec
  • 23:27 logmsgbot: mwalker synchronized wmf-config/ 'SWAT of 132409 (visual editor) and 130274 (abuse filter)'
  • 22:04 logmsgbot: maxsem synchronized php-1.24wmf3/extensions/MobileFrontend/ 'bug 65042'
  • 22:03 marktraceur: cscott deployed a jenkins job change that pushes parsoid git files to beta-labs for version purposes
  • 22:03 logmsgbot: maxsem synchronized php-1.24wmf4/extensions/MobileFrontend/ 'bug 65042'
  • 20:38 awight: updated crm from 3fd3b94834f94529841ad4a695ecd73c98e487bc to 7a23465e620211739421cce3ad57c62597eb8cc3
  • 20:32 bd808: Restarting logstash on logstash1001.eqiad.wmnet due to missing messages from some (all?) logs
  • 19:58 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'No more LQT on wikimania2011wiki'
  • 18:32 Krinkle: integration-slave1001 had its 8GB / /dev/vda1 100% full. Purging /tmp/perf-*.map brought it back to 41%
  • 18:25 Krinkle: integration-slave1001 is having issues writing to disk
  • 17:50 logmsgbot: yurik synchronized php-1.24wmf4/extensions/ZeroRatedMobileAccess/
  • 17:47 logmsgbot: yurik synchronized php-1.24wmf3/extensions/ZeroRatedMobileAccess/
  • 17:30 logmsgbot: yurik synchronized wmf-config/CommonSettings.php
  • 15:44 chasemp: disabling puppet on tungsten to try tweaking carbon settings to affect queue drops (for the better)
  • 14:28 cmjohnson1: mw1053 going down for disk replacement
  • 13:27 bblack: restarting pybals on lvs300x
  • 12:30 _joe_: restarted uwsgi on tungsten
  • 09:46 mark: Started PyBal on lvs300* and established BGP sessions with the routers
  • 09:43 mark: Setup BGP configuration for lvs300* on cr1-esams and cr2-knams, with elevated MEDs to keep them as last resorts
  • 04:18 Tim: deploying apache configuration change with fixes
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed May 14 03:10:36 UTC 2014 (duration 10m 35s)
  • 03:01 Tim: reverting apache change
  • 02:53 Tim: deploying apache configuration change https://gerrit.wikimedia.org/r/106109
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-14 02:25:08+00:00
  • 02:21 springle: upgrade db1043, rebuild as m3 master
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-14 02:13:23+00:00
  • 00:18 manybubbles: bouncing elasticsearch on elastic1015 to pick up gc logging configuration. It might warn but shouldn't cause any service disruption.

May 13

  • 23:27 logmsgbot: maxsem synchronized php-1.24wmf4/includes/exception/MWException.php 'https://gerrit.wikimedia.org/r/#/c/133184/'
  • 23:25 logmsgbot: maxsem synchronized php-1.24wmf3/includes/exception/MWException.php 'https://gerrit.wikimedia.org/r/#/c/133183/'
  • 23:21 MaxSem: Ran namespaceDupes after adding new namespaces to zhwikisource - no problems found
  • 23:18 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/127584'
  • 22:43 chasemp: ms-be1009 rebooted as it had locked up, swift seems to have recovered
  • 22:35 mutante: created new gerrit projects for phabricator, arcanist and libphutil
  • 22:22 ori: restarting tungsten to verify fix for gdash/graphite initialization
  • 21:33 ori: gdash and graphite currently down; chase & ori debugging
  • 21:29 manybubbles: I caused elasticsearch1015 to drop out of the Elasticsearch cluster by trying to take a heap dump on it. Don't do that. It stops the application for many seconds.
  • 19:07 logmsgbot: yurik synchronized php-1.24wmf3/extensions/ZeroRatedMobileAccess/
  • 19:02 logmsgbot: yurik synchronized php-1.24wmf4/extensions/ZeroRatedMobileAccess/
  • 18:49 cmjohnson1: replacing failed disk dataset1001
  • 18:25 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch for I1681addaed690b652822c0296b7a3e9b84de93b6'
  • 18:22 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf4
  • 15:52 logmsgbot: anomie synchronized php-1.24wmf4/resources/Resources.php 'SWAT: Deploy jQuery Migrate to 1.24wmf4'
  • 15:51 logmsgbot: anomie synchronized php-1.24wmf4/resources/lib/jquery/jquery.migrate.js 'SWAT: Deploy jQuery Migrate to 1.24wmf4'
  • 04:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue May 13 04:07:53 UTC 2014 (duration 7m 52s)
  • 03:01 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-13 03:00:07+00:00
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-13 02:30:33+00:00

May 12

  • 23:28 logmsgbot: yurik synchronized php-1.24wmf3/extensions/ZeroRatedMobileAccess/
  • 23:24 logmsgbot: yurik synchronized php-1.24wmf4/extensions/ZeroRatedMobileAccess/
  • 20:11 gwicke: deployed Parsoid d1c778ea3
  • 17:00 mutante: re-enabled mw1186 in pybal
  • 16:58 logmsgbot: manybubbles Finished scap: scapping again to get mw1186 synced up (duration: 00m 56s)
  • 16:57 logmsgbot: manybubbles Started scap: scapping again to get mw1186 synced up
  • 16:49 mutante: disabled mw1186 in pybal
  • 16:46 mutante: mw1186 - down, powercycling
  • 16:39 logmsgbot: manybubbles Finished scap: fix php symlink (duration: 04m 25s)
  • 16:35 logmsgbot: manybubbles Started scap: fix php symlink
  • 16:09 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php
  • 16:07 logmsgbot: manybubbles Finished scap: update visual editor for swat deploy (duration: 28m 58s)
  • 15:38 logmsgbot: manybubbles Started scap: update visual editor for swat deploy
  • 11:01 springle: killed bunch of slow Flow\Formatter\ContributionsQuery::queryRevisions queries on flowdb
  • 03:14 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon May 12 03:13:40 UTC 2014 (duration 13m 39s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-12 02:25:54+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-12 02:14:23+00:00

May 11

  • 15:26 logmsgbot: reedy synchronized php-1.24wmf3/thumb.php
  • 15:23 logmsgbot: reedy synchronized php-1.24wmf4/thumb.php
  • 14:21 cmjohnson1: power cycling asw-d5-eqiad
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 11 03:10:37 UTC 2014 (duration 10m 36s)
  • 02:23 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-11 02:22:02+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-11 02:12:49+00:00

May 10

  • 19:58 logmsgbot: hoo synchronized php-1.24wmf3/extensions/Wikidata/ 'Resyncing Wikidata for mw1122'
  • 18:31 hoo: approved an oauth request by Aaron Halfaker by making myself oauth admin for a moment
  • 17:36 logmsgbot: hoo synchronized php-1.24wmf4/extensions/Wikidata/ 'Update Wikidata to fix the JSON dump generation'
  • 17:35 logmsgbot: hoo synchronized php-1.24wmf3/extensions/Wikidata/ 'Update Wikidata to fix the JSON dump generation'
  • 17:11 marktraceur: pushed new uploadwizard qunit job to Jenkins
  • 16:30 logmsgbot: reedy updated /a/common to I415e67919: Memory limit to 235M
  • 15:42 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'I415e679197b97e2babe50544cf1e8c26c13a598a'
  • 13:59 logmsgbot: bsitu synchronized php-1.24wmf4/extensions/Flow 'Update Flow'
  • 13:46 Krinkle: Reloading Zuul to deploy I403760f1f6dd1bc2
  • 12:07 logmsgbot: hoo synchronized php-1.24wmf4/extensions/Wikidata/ 'Update Wikibase to fix performance issues with dumpJson'
  • 12:05 logmsgbot: hoo synchronized php-1.24wmf3/extensions/Wikidata/ 'Update Wikibase to fix performance issues with dumpJson (2nd run)'
  • 12:03 logmsgbot: hoo synchronized php-1.24wmf3/extensions/Wikidata/ 'Update Wikibase to fix performance issues with dumpJson'
  • 08:53 paravoid: rack D5 down, switch unresponsive; minimal impact (mw1201-1203, 1208-1210)
  • 06:56 logmsgbot: aaron synchronized php-1.24wmf3/img_auth.php '264967c58eccb6dae872ab7345d08f8381ac43a7'
  • 06:46 logmsgbot: aaron synchronized php-1.24wmf4/img_auth.php 'b08af402ef2de7b2c79f71d848c2b8ae98b47be0'
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat May 10 03:12:27 UTC 2014 (duration 12m 26s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-10 02:25:31+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-10 02:15:38+00:00
  • 00:02 awight: updated crm from 237d463ed3e275c217a4f497ed30d2f7f20100eb to 3fd3b94834f94529841ad4a695ecd73c98e487bc

May 9

  • 23:09 K4-713: updated payments cluster from 3be44f5d14c00a893a985f3aad86b6b59507a987 to 78cc4285bdeb6eecba3efc75e4a04c8b886561e4
  • 19:24 logmsgbot: reedy synchronized wmf-config/ 'I5265c408443212536a5ed96d910caba50c22e767'
  • 18:56 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Enable EducationProgram on ukwiki'
  • 18:52 logmsgbot: reedy updated /a/common to I0d1ea1639: Remove 1.23wmf13 through 1.23wmf20
  • 18:48 logmsgbot: reedy synchronized docroot and w
  • 18:44 logmsgbot: reedy updated /a/common to I25d891030: Do not optimize commons for new highlighter
  • 18:41 Reedy: Created EducationProgram tables on ukwiki
  • 15:45 manybubbles: reindexing commons to unbreak file searches on wikis not using the experimental highlighter
  • 15:43 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Do not optimize commons for new highlighter on commons'
  • 15:03 manybubbles: rebuilding hewiki's cirrus index so it can pick up hebmorph too
  • 14:52 logmsgbot: spage synchronized php-1.24wmf4/extensions/Flow 'Fix Flow add new topics and reply in 1.24wmf4'
  • 13:40 logmsgbot: reedy updated /a/common to I721c36406: Add en-rtl to wgExtraLanguageNames for beta
  • 13:24 logmsgbot: reedy updated /a/common to I75a80a998: beta: create a RTL english wiki
  • 10:43 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'raise db1067 and db1066 to normal load. depool db1043'
  • 09:32 logmsgbot: hoo synchronized php-1.24wmf3/extensions/Wikidata/ 'Fix Job injection error handling'
  • 09:24 logmsgbot: hoo synchronized php-1.24wmf3/extensions/Wikidata/ 'Fix Job injection error handling'
  • 08:36 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'I4657fe64572fb3db22e3b48a87df7112b2248e35'
  • 08:21 hashar: apt-get upgraded apache on gallium and lanthanum
  • 08:18 hashar: Jenkins: unpooled integration-slave1001 and rebooting the instance.
  • 08:16 hashar: Jenkins: unpooled integration-slave1002 and rebooting the instance.
  • 08:16 hashar: restarting Zuul (seems some jobs are not properly registered)
  • 06:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'pool db1066 in s2, db1067 in s1, warm up'
  • 05:00 springle: installed db1067, db1070, db1071
  • 04:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 9 04:08:07 UTC 2014 (duration 8m 6s)
  • 03:47 springle: xtrabackup clone db1036 to db1067, db1051 to db1066
  • 03:46 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'reduce db1036 and db1051 load while cloning'
  • 03:24 springle: installed db106[678]
  • 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf4) at 2014-05-09 03:08:55+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-09 02:40:06+00:00
  • 00:03 logmsgbot: maxsem synchronized php-1.24wmf3/extensions/MobileFrontend/ 'bug 65042'
  • 00:01 logmsgbot: maxsem synchronized php-1.24wmf4/extensions/MobileFrontend/ 'bug 65042'

May 8

  • 23:11 logmsgbot: maxsem synchronized php-1.24wmf3/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/132299'
  • 23:08 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Replica count for commonswiki_file -- syncing with what's already live'
  • 23:07 logmsgbot: maxsem synchronized php-1.24wmf4/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/132299'
  • 21:30 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'wmgVectorBetaPersonalBar to true for all wikis'
  • 21:28 logmsgbot: reedy updated /a/common to I44f67444c: group0 to 1.24wmf4
  • 20:37 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf3 and group0 to 1.24wmf4
  • 20:32 logmsgbot: reedy updated /a/common to I11e5ca294: FUTURE: Fifth batch of pilot sites for Media Viewer
  • 20:05 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'touch'
  • 20:04 logmsgbot: demon synchronized mediaviewer.dblist 'mediaviewer for svwiki, eswiki, jawiki, ptwiki'
  • 18:56 DJ-K4: enabled fredge queue consumption + jenkins job
  • 17:10 logmsgbot: reedy Finished scap: Build l10n cache for 1.24wmf4 and move testwiki (duration: 17m 05s)
  • 16:53 logmsgbot: reedy Started scap: Build l10n cache for 1.24wmf4 and move testwiki
  • 16:52 chasemp: rebooted ms-be1006 since it dropped dead
  • 16:52 logmsgbot: reedy Finished scap: Build l10n cache for 1.24wmf4 and move testwiki (duration: 11m 42s)
  • 16:42 manybubbles: reindexing the hebrew wikis other than hewikipedia now that they are on wmf3 so they can have hebmorph
  • 16:40 logmsgbot: reedy Started scap: Build l10n cache for 1.24wmf4 and move testwiki
  • 16:39 manybubbles: rebuilding enwiki's cirrus index to optimize for new highlighter
  • 16:14 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to wmf3
  • 16:13 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'raise db106[45] to normal load'
  • 15:13 logmsgbot: manybubbles synchronized php-1.24wmf3/extensions/CirrusSearch/ 'updating Cirrus to pick up some fixes'
  • 15:08 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'engage new highlighter on some more wikis'
  • 15:00 manybubbles: rebuilding all hebrew wikis _except_ hebrew wikipedia and hebrew wikisource to pick up hebmorph. hewikisource got it this morning. hewiki will get it this afternoon after the deployment train
  • 13:24 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1064 in s4, db1065 in s1'
  • 12:59 manybubbles: rebuilding cirrus index for hewikisource to pick up hebmorph
  • 10:18 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'reduce db1049 and db1051 load while cloning'
  • 10:16 springle: xtrabackup clone db1051 to db1065
  • 09:35 springle: xtrabackup clone db1049 to db1064
  • 08:55 springle: installed db106[45]
  • 08:35 logmsgbot: reedy synchronized docroot and w
  • 08:18 logmsgbot: reedy synchronized php-1.24wmf4 'staging'
  • 08:01 logmsgbot: reedy updated /a/common to I7f2d2b25d: Allow all users on OfficeWiki to send mass messages
  • 03:44 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu May 8 03:43:39 UTC 2014 (duration 43m 38s)
  • 02:57 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-08 02:56:02+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-08 02:28:17+00:00

May 7

  • 23:36 logmsgbot: demon Finished scap: no-op scap, for ori (duration: 09m 01s)
  • 23:27 logmsgbot: demon Started scap: no-op scap, for ori
  • 23:17 logmsgbot: ori Finished scap: No changes; testing scap to osmium (again) (duration: 03m 25s)
  • 23:14 logmsgbot: ori Started scap: No changes; testing scap to osmium (again)
  • 23:14 logmsgbot: ori scap aborted: No changes; testing scap to osmium (duration: 11m 27s)
  • 23:06 awight: update crm from 1740219e38091ba4f7afe6545ea189b27340bf86 to 237d463ed3e275c217a4f497ed30d2f7f20100eb
  • 23:02 logmsgbot: ori Started scap: No changes; testing scap to osmium
  • 21:37 K4-713: updated payments from a7fa0d64da2c56586c83cf92babb65bac857be2e to 3be44f5d14c00a893a985f3aad86b6b59507a987
  • 21:00 K4-713: synchronized payments from 4811f6d3d80d126c to a7fa0d64da2c56586
  • 20:49 K4-713: revlocked payments to 4811f6d3d80d126c due to strange errors during an attempted deploy of a7fa0d64da2c56586
  • 20:17 awight: drush pm-uninstall wmf_fredge_qc
  • 20:16 awight: bad call: drush en wmf_fredge_qc -- need to rollback the module schema version and try again with "fredge" creds
  • 20:14 awight: crm updated from cfe34fe0b10861167199a8f72bba279b9cac5e6e to 1740219e38091ba4f7afe6545ea189b27340bf86
  • 20:07 subbu: deployed parsoid 71f4e884 (with deploy sha 9a62899d)
  • 17:19 logmsgbot: yurik synchronized php-1.24wmf2/extensions/ZeroRatedMobileAccess/
  • 17:16 logmsgbot: yurik synchronized php-1.24wmf3/extensions/ZeroRatedMobileAccess/
  • 16:33 manybubbles: performing a rolling restart on elasticsearch nodes in production to pick up new plugins: experimental-highlight 0.0.8 and analysis-hebrew 1.1.0
  • 16:30 _joe_: restarted mwprof/profiler-to-carbon on tungsten, stuck somehow
  • 15:40 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Raised redundancy for commonswiki_file back up, config to match'
  • 15:27 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Remove obsolete $wmgCirrusIsBuilding (no functionality change)'
  • 15:26 logmsgbot: anomie synchronized wmf-config/CirrusSearch-common.php 'SWAT: Remove obsolete $wmgCirrusIsBuilding (no functionality change)'
  • 15:19 anomie: namespaceDupes.php on OfficeWiki done (that was quick)
  • 15:18 anomie: Running maintenance/namespaceDupes.php on OfficeWiki
  • 15:16 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Change wgMetaNamespace for OfficeWiki and add alias'
  • 15:12 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Allow all users on OfficeWiki to send mass messages (for real this time)'
  • 15:09 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Allow all users on OfficeWiki to send mass messages'
  • 15:03 logmsgbot: anomie synchronized wmf-config/InitialiseSettings.php 'SWAT: Set $wgUploadMissingFileUrl for enwiki'
  • 10:21 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1049 to full steam'
  • 09:40 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1049 in s4'
  • 07:54 hashar: Jenkins: installing Claim plugin (allow folks to comment on builds and mark them)
  • 06:48 springle: again
  • 05:59 springle: powercycled unresponsive neon, swapdeath + oom killer
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed May 7 03:11:36 UTC 2014 (duration 11m 35s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-07 02:27:23+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-07 02:14:44+00:00
  • 00:02 mutante: upgrading libtiff on imagescalers

May 6

  • 23:18 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/131855'
  • 21:57 mutante: gracefull'ing apaches
  • 20:43 logmsgbot: demon synchronized php-1.24wmf3/extensions/CirrusSearch 'Rolling Cirrus back to known-good state'
  • 20:42 logmsgbot: demon synchronized php-1.24wmf2/extensions/CirrusSearch 'Rolling Cirrus back to known-good state'
  • 20:27 logmsgbot: demon synchronized php-1.24wmf3/extensions/CirrusSearch/includes/Hooks.php 'Fix typehinting'
  • 20:26 logmsgbot: demon synchronized php-1.24wmf2/extensions/CirrusSearch/includes/Hooks.php 'Fix typehinting'
  • 20:19 logmsgbot: demon synchronized php-1.24wmf3/extensions/CirrusSearch/CirrusSearch.php 'I2638b695: fix for page moves'
  • 20:19 logmsgbot: demon synchronized php-1.24wmf2/extensions/CirrusSearch/CirrusSearch.php 'I2638b695: fix for page moves'
  • 20:18 logmsgbot: demon synchronized php-1.24wmf3/extensions/CirrusSearch/includes/Hooks.php 'I2638b695: fix for page moves'
  • 20:17 logmsgbot: demon synchronized php-1.24wmf2/extensions/CirrusSearch/includes/Hooks.php 'I2638b695: fix for page moves'
  • 19:00 mutante: enabling puppet on netmon1001
  • 17:38 hoo: Changed email for global account "ElphiBot".
  • 16:48 logmsgbot: mattflaschen synchronized php-1.24wmf3/extensions/GettingStarted/ 'GettingStarted token and logging deployment'
  • 16:45 logmsgbot: mattflaschen synchronized php-1.24wmf2/extensions/GettingStarted/ 'GettingStarted token and logging deployment'
  • 16:19 hoo: Changed email for global account "Elph".
  • 15:42 akosiaris: killed zuul processes on gallium and restarted the service
  • 09:16 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue May 6 09:15:20 UTC 2014
  • 09:08 logmsgbot: LocalisationUpdate completed (1.22wmf15) at Tue May 6 09:07:45 UTC 2014
  • 09:01 logmsgbot: faidon synchronized wmf-config/squid.php 'add Swift to squid.php'
  • 08:59 logmsgbot: faidon updated /a/common to Ica9086dcd: Add Swift frontends to squid.php
  • 08:45 ottomata: re-enabling puppet agent on analytics1022; kafka broker is caught up there and is fully in all ISRs
  • 08:36 ottomata: temporarily disabling puppet on analytics1026 to troubleshoot a camus import problem
  • 06:55 springle: hammering dbstore1001 with dumps in screen session. ignore replag
  • 06:31 ori: deleting rotated logs in /var/log/eventlogging/archive that are older than 90 days (on vanadium)
  • 04:47 springle: mydumper/myloader clone db1042 to db1049
  • 04:06 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1049 for maintenance'
  • 03:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue May 6 03:47:41 UTC 2014 (duration 47m 40s)
  • 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-06 03:02:03+00:00
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-06 02:27:21+00:00
  • 00:01 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'Update wgFlowCacheVersion to 4.2'

May 5

  • 23:57 logmsgbot: ori updated /a/common to Id1f2e0acf: Drop wgFlowCacheKey from CommonSettings.php
  • 23:54 logmsgbot: ori Finished scap: SWAT deploy for VisualEditor and Flow cherry-picks (duration: 09m 55s)
  • 23:44 logmsgbot: ori Started scap: SWAT deploy for VisualEditor and Flow cherry-picks
  • 23:26 logmsgbot: ori synchronized php-1.24wmf3/extensions/EventLogging 'Update EventLogging for Id23b37fbe for SWAT.'
  • 23:23 logmsgbot: ori synchronized php-1.24wmf2/extensions/EventLogging 'Update EventLogging for Id23b37fbe for SWAT.'
  • 21:07 jgage: trying on analytics1022: https://wikitech.wikimedia.org/wiki/Analytics/Kraken/Kafka/Administration#Recovering_a_laggy_broker_replica
  • 20:58 RobH: ssl1001-1003 now have updated unified cert in service
  • 20:58 jgage: both kafka brokers back in service
  • 20:54 RobH: cp4001-4020 unified cert and nginx service reloaded, back in service
  • 20:50 RobH: ssl1006 and ssl1009 are responsive to nginx and back in service
  • 20:43 RobH: ssl1009 was refusing connections both before and after my ssl cert update. ssl1006 is presently refusing connections post update. they are set to disabled in pybal
  • 20:39 RobH: ssl1008 back into service, ssl1009 already depooled
  • 20:38 jgage: forced kafka broker reelection
  • 20:34 RobH: ssl1007 going back into service, ssl1008 depooling
  • 20:25 RobH: depooled ssl1006/7 for update
  • 20:25 RobH: ssl1004/5 returned to service (and puppet agents enabled)
  • 20:21 RobH: puppet agent has been re-enabled on ssl1001-1003
  • 20:19 RobH: ssl1004/5 disabled for update
  • 20:18 RobH: putting ssl1002/3 back into service
  • 20:15 subbu: deployed parsoid f2f1f1d7 (with deploy sha 71072f8a)
  • 19:58 RobH: ssl1001 back in service, ssl1002-1003 set to disabled in pybal
  • 19:18 RobH: depooling ssl1001 to test new certs live on system
  • 19:09 RobH: disabled puppet on cp40XX, ssl10XX, and ssl30XX
  • 19:08 logmsgbot: bblack synchronized wmf-config/squid.php 'REVERT: Update wgSquidServersNoPurge to use whole subnets for XFF checking'
  • 19:07 logmsgbot: bblack updated /a/common to Iaf4d57d54: Revert "Use whole subnets in squid.php list for XFF acceptance"
  • 19:03 logmsgbot: bblack synchronized wmf-config/squid.php 'Update wgSquidServersNoPurge to use whole subnets for XFF checking'
  • 19:01 logmsgbot: bblack updated /a/common to I5a2d86ef0: Use whole subnets in squid.php list for XFF acceptance
  • 17:05 logmsgbot: aaron synchronized wmf-config/CommonSettings.php 'Revert "Increased htmlCacheUpdate throttle"'
  • 16:00 logmsgbot: anomie synchronized php-1.24wmf3/extensions/MobileFrontend/ 'SWAT: Backport change 131237 to 1.24wmf3 to fix bug in MobileFrontend'
  • 15:59 logmsgbot: anomie synchronized php-1.24wmf2/extensions/MobileFrontend/ 'SWAT: Backport change 131237 to 1.24wmf2 to fix bug in MobileFrontend'
  • 15:49 logmsgbot: anomie synchronized php-1.24wmf2/includes/specials/SpecialAllmessages.php 'SWAT: Backport change 131041 to 1.24wmf2 to fix bug in Special:AllMessages'
  • 15:37 logmsgbot: anomie synchronized php-1.24wmf2/includes/specials/SpecialAllmessages.php 'SWAT: Backport change 131041 to 1.24wmf2 to fix bug in Special:AllMessages'
  • 15:24 logmsgbot: anomie synchronized php-1.24wmf3/includes/specials/SpecialAllmessages.php 'SWAT: Backport change 131041 to 1.24wmf3 to fix bug in Special:AllMessages'
  • 15:12 logmsgbot: anomie synchronized php-1.24wmf3/includes/api/ApiLogin.php 'SWAT: Backport change 131056 to 1.24wmf3 to fix bug 64727'
  • 15:10 logmsgbot: anomie synchronized php-1.24wmf2/includes/api/ApiLogin.php 'SWAT: Backport change 131056 to 1.24wmf2 to fix bug 64727'
  • 12:45 akosiaris: removing various sdtpa devices from LibreNMS
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon May 5 03:11:15 UTC 2014 (duration 11m 14s)
  • 02:32 ^demon|away: antimony: ran very very aggressive repacking on mediawiki/core, operations/puppet, mediawiki/extensions/{UploadWizard,CentralAuth,CentralNotice,DonationInterface,FlaggedRevs,AbuseFilter,BlueSpiceExtensions,Translate,WikimediaMessages,EducationProgram,UniversalLanguageSelector,Wikibase}, pywikibot/{core,compat}, operations/dumps/tests. Basically anything taking up >90MB on disk. Probably not the cause of gitblit's wonkiness but they're certainly not helping matters.
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-05 02:25:34+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-05 02:13:18+00:00

May 4

  • 20:57 logmsgbot: aaron synchronized php-1.24wmf3/thumb.php 'c5ebd2aefce9e3fc5b994053078754021176f411'
  • 20:40 logmsgbot: aaron synchronized php-1.24wmf3/thumb.php '6c230cbbc6ffa4d8909e88961ebf75755cf9c9d9'
  • 19:24 logmsgbot: ori updated /a/common to I2916ef3bd: labs: stream recent changes to redis
  • 09:58 _joe_: restarted gitblit, stuck on GC as usual.
  • 08:40 _joe_: restarted apache on tungsten as it was stuck communicating with uwsgi
  • 03:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun May 4 03:08:41 UTC 2014 (duration 8m 40s)
  • 02:26 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-04 02:25:06+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-04 02:12:57+00:00

May 3

  • 19:41 logmsgbot: hoo synchronized wmf-config/ 'Documentation only change'
  • 06:02 ori: disabled puppet on osmium to test hhvm build
  • 03:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat May 3 03:09:40 UTC 2014 (duration 9m 39s)
  • 02:27 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-03 02:26:41+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-03 02:14:43+00:00

May 2

  • 23:13 logmsgbot: aaron synchronized wmf-config/filebackend.php 'Made private wikis use thumb_handler.php for thumbnails'
  • 22:30 logmsgbot: aaron synchronized wmf-config/filebackend.php 'Removed useless "handlerUrl" config'
  • 21:21 hashar: Jenkins is back :-]
  • 21:14 hashar: restarting Jenkins (making sure the java process properly disappears)
  • 21:14 hashar: zuul jenkins stuck again :(
  • 20:31 logmsgbot: krinkle synchronized php-1.24wmf3/resources/Resources.php 'Ia12998fb11c686'
  • 20:29 logmsgbot: krinkle synchronized php-1.24wmf2/resources/Resources.php 'Ia12998fb11c686'
  • 17:47 RoanKattouw: Restarting stuck jenkins
  • 16:04 _joe_: depooled mw1053 for hardware problems
  • 15:13 andrewbogott: resetting a bunch more UIDs. Running find-and-chown again, but this time not on the swifts: salt -E '^(?!ms-be|labstore|snapshot).*$'
  • 13:46 paravoid: swift @ eqiad: setting zone 5 (ms-be1013/1014/1015) to weight 2000, i.e. 66%
  • 12:06 hashar: updated our Jenkins Job Builder copy abbf318..8df6bab
  • 06:17 ori: re-enabled puppet on osmium and hafnium
  • 04:02 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri May 2 04:01:04 UTC 2014 (duration 1m 3s)
  • 03:10 logmsgbot: LocalisationUpdate completed (1.24wmf3) at 2014-05-02 03:09:16+00:00
  • 02:40 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-02 02:39:39+00:00

May 1

  • 23:43 bd808: Restarted logstash on logstash1001; MaxSem noticed that many recursion-guard logs were not being completely reassembled and JVM had one CPU maxed out.
  • 23:08 logmsgbot: maxsem synchronized php-1.24wmf2/extensions/CommonsMetadata/ 'https://gerrit.wikimedia.org/r/#/c/130971/'
  • 22:58 paravoid: disabling puppet on holmium; manually overriding completely broken varnish config
  • 20:10 bd808: Deployed scap 92ea0e9 via trebuchet (not actively used yet)
  • 19:17 logmsgbot: reedy synchronized wmf-config/
  • 18:55 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf3
  • 18:42 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.24wmf2
  • 18:07 manybubbles: upgrading highlighter plugin on elasticsearch machines - the cluster will go yellow for a few hours during the rolling restart
  • 16:43 logmsgbot: reedy Finished scap: testwiki to 1.24wmf3 (duration: 29m 54s)
  • 16:18 subbu: deployed parsoid 5e05c585 (with deploy sha ca2db96d)
  • 16:16 ottomata: reinstalling elastic1008
  • 16:13 logmsgbot: reedy Started scap: testwiki to 1.24wmf3
  • 16:10 logmsgbot: reedy updated /a/common to I832b45db6: Correct a domain in wgCopyUploadsDomains
  • 16:01 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'Enable cirrus as a betafeature on all wikis which did not already have it.'
  • 15:51 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT fix GWtoolset url and add some more logos'
  • 15:40 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT fix GWtoolset url and add some more logos'
  • 15:35 andrewbogott: reassigning a ton of UIDs in production; running a couple dozen 'find' commands to chown files
  • 15:34 logmsgbot: manybubbles synchronized php-1.24wmf2/includes/Article.php 'SWAT update to prevent fatal in backwards compatibility method'
  • 15:27 logmsgbot: manybubbles synchronized php-1.24wmf2/extensions/VisualEditor/ 'SWAT update for firefox focus'
  • 15:08 logmsgbot: manybubbles synchronized php-1.24wmf2/extensions/Wikidata/ 'SWAT update for time parsing and formatting'
  • 08:15 springle: switching s1-analytics-slave db1047 enwiki to tokudb
  • 03:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu May 1 03:17:39 UTC 2014 (duration 17m 38s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-05-01 02:33:34+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-05-01 02:21:44+00:00

April 30

  • 23:58 bblack: mobile caches now sync zero carriers/proxies from zero.wm.org rather than noc(fenari) temp hack solution
  • 23:14 logmsgbot: ori synchronized php-1.24wmf2/extensions/VisualEditor 'Ibaf0cc823bfe: Update VisualEditor for cherry-picks'
  • 20:34 logmsgbot: yurik synchronized php-1.24wmf2/extensions/ZeroRatedMobileAccess/
  • 20:31 logmsgbot: yurik synchronized php-1.24wmf1/extensions/ZeroRatedMobileAccess/
  • 18:08 logmsgbot: yurik synchronized wmf-config/mobile.php
  • 17:37 yurikR: yurik Added $wmgZeroRatedMobileAccessApiUserName / password
  • 17:36 logmsgbot: yurik synchronized wmf-config/PrivateSettings.php
  • 17:17 logmsgbot: yurik synchronized php-1.24wmf2/extensions/ZeroRatedMobileAccess/
  • 17:11 logmsgbot: yurik synchronized php-1.24wmf1/extensions/ZeroRatedMobileAccess/
  • 17:10 logmsgbot: aaron synchronized php-1.24wmf2/thumb.php '93a33d733fa81a9a5396083ded6aa28a74f08a98'
  • 17:07 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Removed "TMHTransformFrame" pool counter entry'
  • 16:19 Krinkle: restarting stuck Jenkins
  • 16:16 Krinkle: Deploying Id248bd6706f32a on Zuul and reloading service
  • 09:29 springle: dbstore100[12] replicating m2 eventlogging
  • 06:54 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Removed GetLocalFileCopy pool counter entry'
  • 03:23 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 30 03:23:13 UTC 2014 (duration 23m 12s)
  • 02:33 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-04-30 02:33:12+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-30 02:22:23+00:00
  • 01:58 Krinkle: Deploying Ia82779635d762a3 on Zuul, reloading services
  • 00:02 logmsgbot: ebernhardson synchronized php-1.24wmf1/extensions/MultimediaViewer/ 'I684d44a0b5'

April 29

  • 23:59 logmsgbot: ebernhardson synchronized php-1.24wmf2/extensions/Wikidata/ 'I84c2283e07'
  • 23:52 logmsgbot: ebernhardson synchronized php-1.24wmf2/extensions/MultimediaViewer/ 'I84f8e347f'
  • 23:37 logmsgbot: ori synchronized php-1.24wmf2/skins 'I66c56c577bad'
  • 23:37 logmsgbot: ori synchronized php-1.24wmf2/extensions/VisualEditor 'I5818dce62'
  • 23:25 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php 'Enable MediaViewer survey on Spanish and Dutch Wikipedia'
  • 23:22 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I59e1fa87e: Include language-0 categories for betawikiversity'
  • 23:21 logmsgbot: ori updated /a/common to I59e1fa87e: Include language-0 categories for betawikiversity
  • 23:16 logmsgbot: ebernhardson synchronized php-1.24wmf2/extensions/WikiEditor 'Update WikiEditor to 1.24wmf2'
  • 21:59 logmsgbot: spage synchronized wmf-config/InitialiseSettings.php 'enable Flow on two mw talk pages for James_F'
  • 21:25 Krinkle: Running deleteEqualMessages.php on newwiki (bug 43917)
  • 21:12 AaronSchulz: populateImageSha1 fixer script finished on all wikis
  • 20:47 manybubbles: rebuilding search indexes for group1 wikis after the train upgraded cirrus for them
  • 20:13 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I56ae921ca: Change group name on the Persian Wikipedia'
  • 20:13 logmsgbot: ori updated /a/common to I56ae921ca: Change group name on the Persian Wikipedia
  • 18:16 logmsgbot: reedy synchronized multiversion/
  • 18:09 logmsgbot: reedy synchronized wmf-config/ 'I52293b29a87e2c645735b37215e4113e561e47da'
  • 18:04 logmsgbot: reedy synchronized docroot and w
  • 18:03 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.24wmf2
  • 17:13 manybubbles: raising number of replicas of enwiki's cirrus index from 1 to 2. cluster will probably complain while they allocate
  • 16:53 RobH: osmium install complete, ticket resolved, ready for ^d and ori to take over
  • 16:29 logmsgbot: reedy updated /a/common to Idb2a86791: Increased htmlCacheUpdate throttle
  • 16:18 logmsgbot: aaron synchronized wmf-config/CommonSettings.php 'Increased htmlCacheUpdate throttle'
  • 16:05 logmsgbot: manybubbles synchronized php-1.24wmf2/extensions/Wikidata/ 'SWAT upgrade wikidata for date parsing fixes'
  • 15:47 manybubbles: rebuilding test2wiki's cirrus index after swat deploy
  • 15:45 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT add autopatrolled group to shwiktionary and draft namespace to chapcomwiki'
  • 15:38 logmsgbot: manybubbles synchronized wmf-config/CirrusSearch-common.php 'SWAT deploy - move group0 wikis to experimental highlighter and give enwiki its redundency back'
  • 15:37 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT deploy - extra setting for cirrus and new groups and sources for gwtoolset'
  • 15:32 manybubbles: cirrus deploys look good, moving on to twkozlowski's requests
  • 15:31 logmsgbot: manybubbles synchronized wmf-config/CirrusSearch-common.php 'SWAT deploy - move group0 wikis to experimental highlighter and give enwiki its redundency back'
  • 15:31 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'SWAT deploy - extra setting for cirrus'
  • 15:14 logmsgbot: manybubbles synchronized php-1.24wmf2/extensions/CirrusSearch/ 'SWAT upgrade - improves as yet undeployed highlighter config'
  • 14:40 logmsgbot: faidon synchronized php-1.24wmf2/extensions/GettingStarted/ 'Revert GettingStarted anon tokens'
  • 14:39 logmsgbot: faidon synchronized php-1.24wmf1/extensions/GettingStarted/ 'Revert GettingStarted anon tokens'
  • 13:45 Krinkle: Running deleteEqualMessages.php on lnwiki (bug 43917)
  • 13:41 logmsgbot: bblack synchronized wmf-config/InitialiseSettings.php 'Revert "Unset $wgUseXVO"'
  • 13:10 Krinkle: Running deleteEqualMessages.php on nlwiki (bug 43917)
  • 12:35 Krinkle: Running deleteEqualMessages.php on cswikiversity (bug 43917)
  • 12:08 Krinkle: Running deleteEqualMessages.php on cswiktionary (bug 43917)
  • 11:31 logmsgbot: bblack synchronized wmf-config/InitialiseSettings.php 'Revert "Unset $wgUseXVO"'
  • 10:34 akosiaris: restarted gitblit on antimony
  • 10:20 hashar: restarting Zuul
  • 10:14 hashar: Jenkins / Zuul : upgrading python-gear from 0.4.0-1 to 0.5.4-1 . Should fix a bunch of jobs registrations issues in Zuul Gearman. bug 63760
  • 09:59 akosiaris: update python-gear on apt.wikimedia.org to 0.5.4-1
  • 08:30 akosiaris: Published carbon's IPv6 address in DNS. apt.wikimedia.org and ubuntu.wikimedia.org are now IPv6 enabled
  • 05:25 AaronSchulz: Manually removed a few tens of thousands of duplicate Cyberbot jobs
  • 03:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 29 03:48:29 UTC 2014 (duration 48m 28s)
  • 03:02 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-04-29 03:02:55+00:00
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-29 02:37:21+00:00
  • 02:24 bblack: wiped disk cache (via mkfs) on cp1055 to (hopefully) clear crash-restart cycle, backend back in service now
  • 01:53 Krinkle: Running deleteEqualMessages.php on cswiki (bug 43917)
  • 00:49 Tim: on cp1055: backend varnish is continually panicking and restarting its child, will try to stop/start service
  • 00:22 logmsgbot: aaron synchronized php-1.24wmf1/includes/WikiPage.php '3505cf933d874ea44bd5a3f3ffe210598ef7eec2'
  • 00:14 logmsgbot: aaron synchronized php-1.24wmf2/includes/WikiPage.php '119fd9fc17b3c309b9065b54f4c83ede7d20498b'
  • 00:00 logmsgbot: mwalker Finished scap: SWAT for 129813, 129640, 129708, 129707, and 130246 (duration: 11m 37s)

April 28

  • 23:49 logmsgbot: mwalker Started scap: SWAT for 129813, 129640, 129708, 129707, and 130246
  • 23:26 logmsgbot: aaron synchronized php-1.24wmf2/maintenance/runJobs.php '91dddcaffa58430204e2bf3c612d893b2710f33b'
  • 22:43 jgage: rebooting db1047 due to unpingable and unresponsive on mgmt console
  • 22:28 logmsgbot: mflaschen synchronized php-1.24wmf2/extensions/GettingStarted/ 'Revert token/TrackedPageContentSaveComplete GettingStarted change'
  • 22:28 logmsgbot: mflaschen synchronized php-1.24wmf1/extensions/GettingStarted/ 'Revert token/TrackedPageContentSaveComplete GettingStarted change'
  • 20:47 Krinkle: Running deleteEqualMessages.php on sqwiki (bug 43917)
  • 20:38 paravoid: apache-graceful-all after tuning php.ini's expose_php setting
  • 20:12 logmsgbot: reedy synchronized wmf-config/db-labs.php
  • 20:08 apergos: restarted gmetad on nickel
  • 20:02 gwicke: deployed Parsoid cab9348e using deploy 9e9030d
  • 20:00 logmsgbot: mflaschen synchronized wmf-config/InitialiseSettings.php 'Update GettingStarted config for new format'
  • 19:59 logmsgbot: mflaschen synchronized php-1.24wmf2/extensions/GettingStarted/ 'Sync GettingStarted for Growth team deploy'
  • 19:58 logmsgbot: mflaschen synchronized php-1.24wmf1/extensions/GettingStarted/ 'Sync GettingStarted for Growth team deploy'
  • 19:30 Krinkle: Running deleteEqualMessages.php on simplewiki (bug 43917)
  • 19:25 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I5e0709ef0: Unset $wgUseXVO'
  • 19:25 logmsgbot: ori updated /a/common to I5e0709ef0: Unset $wgUseXVO
  • 19:22 Krinkle: Running deleteEqualMessages.php on rowiktionary (bug 43917)
  • 19:01 Krinkle: Running deleteEqualMessages.php on bat-smgwiki (bug 43917)
  • 18:48 Krinkle: Running deleteEqualMessages.php on afwikiquote (bug 43917)
  • 18:41 hashar: Jenkins disconnected lanthanum slave, killed all jenkins-slave process on it and repooled server.
  • 18:39 Krinkle: Running deleteEqualMessages.php on abwiki (bug 43917)
  • 17:54 manybubbles: deploying a new version of our Elasticsearch highlighter by doing a rolling restart on Elasticsearch machines - should cause no interruption of service
  • 16:51 akosiaris: executed graceful-stop, start for apaches in order to load the new php-luasandbox apache module
  • 15:08 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'Add new sources to gwtoolset and namespaces to hewikisource'
  • 12:48 _joe_: restarted apache on wikitech-static
  • 12:29 Krinkle: Running deleteEqualMessages.php on cvwiki (bug 43917)
  • 12:29 Krinkle: Running deleteEqualMessages.php on afwiki (bug 43917)
  • 11:46 Krinkle: Running deleteEqualMessages.php on bpywiki (bug 43917)
  • 11:00 springle: completed schema change, bug 64411, page_props.pp_sortkey
  • 08:49 springle: reloading db1046 from fresh m2 dump
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 28 03:12:11 UTC 2014 (duration 12m 10s)
  • 03:00 springle: starting online schema change, bug 64411, page_props.pp_sortkey
  • 02:30 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-04-28 02:30:04+00:00
  • 02:22 springle: powercycle db1046 unresponsive
  • 02:20 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-28 02:20:39+00:00

April 27

  • 14:49 paravoid: stopping pybal on lvs300[1-4] to avoid the logspam
  • 05:31 springle: mariadb sql dump in progress db1048 /a for rebuilding db1046. ok to kill if necessary
  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 27 03:08:01 UTC 2014 (duration 8m 0s)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-04-27 02:28:32+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-27 02:19:43+00:00

April 26

  • 23:27 logmsgbot: aaron synchronized php-1.24wmf2/includes/profiler/Profiler.php '7e20cdd2ba0381b81d3b43c8743fa4202a76bd61'
  • 13:43 logmsgbot: hoo synchronized wmf-config/InitialiseSettings-labs.php 'Syncing for cluster consistency'
  • 13:42 logmsgbot: hoo updated /a/common to Ic98928d54: Have Commons on Beta Labs use $stdlogo
  • 13:26 springle: db1016 xfs head behind tail. reverted to last snapshot volume
  • 12:57 springle: powercycle db1016 unresponsive
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 26 03:11:16 UTC 2014 (duration 11m 15s)
  • 02:31 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-04-26 02:31:41+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-26 02:22:41+00:00

April 25

  • 20:29 logmsgbot: mwalker synchronized php-1.24wmf2/extensions/WikiEditor/ 'Reverting some faulty WikiEditor code for bug 64289'
  • 20:28 logmsgbot: mwalker synchronized php-1.24wmf1/extensions/WikiEditor/ 'Reverting some faulty WikiEditor code for bug 64289'
  • 18:02 K4-713: adjusted antifraud filters on payments
  • 17:08 Jeff_Green: reenabled puppet and notifications for iodine
  • 16:22 manybubbles: Elasticsearch rolling restart complete.
  • 14:46 Jeff_Green: disabled icinga notifications for iodine too...
  • 14:44 Jeff_Green: puppet stopped on iodine, doing manual spamassassin training
  • 12:58 springle: upgrading db1047 (analytics slave) to mariadb 10
  • 12:28 manybubbles: Performing rolling restart of Cirrus's Elasticsearch servers to upgrade a plugin. Low risk because it won't be used by the general public until Mondayish so a Friday push should be ok.
  • 12:07 ottomata: stopping puppet on analytics1026 to test more frequent runs of Camus
  • 12:02 logmsgbot: reedy synchronized wmf-config/
  • 11:35 logmsgbot: reedy synchronized docroot and w
  • 09:41 logmsgbot: reedy synchronized wmf-config/
  • 09:15 logmsgbot: reedy synchronized wmf-config/ 'I4a68dc8321b7b302f5e89b5adafcff096f2ac35b'
  • 09:13 logmsgbot: reedy synchronized multiversion/ 'I4a68dc8321b7b302f5e89b5adafcff096f2ac35b'
  • 08:43 logmsgbot: reedy synchronized docroot and w
  • 08:39 logmsgbot: reedy updated /a/common to I57b6d055e: Update flow cache version to 4.2
  • 07:22 springle: up to 5x pt-table-sync running on db1048 m2 master for eventlogging migration. ok to kill if necessary
  • 03:44 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 25 03:44:16 UTC 2014 (duration 44m 15s)
  • 03:03 logmsgbot: LocalisationUpdate completed (1.24wmf2) at 2014-04-25 03:03:33+00:00
  • 02:37 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-25 02:37:06+00:00
  • 00:47 mwalker: updating payments servers from e6d188f0dfcd57406acb58aa2b5bf45e48117c33 to a7fa0d64da2c56586c83cf92babb65bac857be2e for worldpay
  • 00:29 bd808|deploy: Ori was able to fix permissions and second scap test worked as expected
  • 00:28 logmsgbot: bd808 Finished scap: no-op scap to validate I24149ab and Ie967901 (try 2) (duration: 05m 02s)
  • 00:26 bd808|deploy: File permissions on /srv/scap/scap/*.{py,pyc} were not consistently a+r which is needed for scap-rebuild-cdbs
  • 00:23 logmsgbot: bd808 Started scap: no-op scap to validate I24149ab and Ie967901 (try 2)

April 24

  • 23:55 logmsgbot: bd808 Finished scap: no-op scap to validate I24149ab and Ie967901 (duration: 02m 51s)
  • 23:52 logmsgbot: bd808 Started scap: no-op scap to validate I24149ab and Ie967901
  • 23:46 awight: perform crm schema update 7018
  • 23:36 bd808|deploy: Running scap-rebuild-cdbs on tin to test python port
  • 23:17 logmsgbot: mwalker synchronized wmf-config/CommonSettings.php 'Updating flow configuration 129589'
  • 23:16 logmsgbot: mwalker synchronized php-1.24wmf2/extensions/Flow 'Updating flow for 129589 and 129604'
  • 22:53 AaronSchulz: Running PopulateImageSha1.php for all multi-versioned files on all wikis to fix broken SHA-1s
  • 21:22 springle: eventlogging dump loading on db1048 m2 master in screen. ok to kill if necessary
  • 21:18 hashar: restarting Zuul
  • 21:01 mwalker: updating payments from 4811f6d3d80d126c8b3c89c11d20cc6416cb58f6 to e6d188f0dfcd57406acb58aa2b5bf45e48117c33 for donationinterface / worldpay updates
  • 20:39 paravoid: shutting down sdtpa, cr1-sdtpa, csw1-sdtpa, msw1-sdtpa and other sdtpa hosts gone forever
  • 20:37 Coren: sync-apache for 126969 and 91339
  • 19:59 logmsgbot: reedy synchronized docroot and w
  • 19:54 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Removed redundant pool counter config'
  • 19:00 ori: eventlogging data streaming into db1048; db1047 consumer decom'd.
  • 19:00 logmsgbot: reedy synchronized wmf-config/
  • 18:59 logmsgbot: reedy synchronized database lists files:
  • 18:43 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Rest of group0 to 1.24wmf2
  • 18:38 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.24wmf1
  • 18:33 springle: begin eventlogging migration db1047 to db1048 (m2), RT #7081
  • 16:42 hashar: restarted both Zuul and Jenkins
  • 16:29 logmsgbot: reedy synchronized php-1.24wmf2/extensions/Wikidata 'I43988505ea0fd7ac6b1278a50237e0e1d3ee0e9e'
  • 16:25 hashar: restarting Zuul. Got asked to be stopped.
  • 16:15 hashar: Jenkins ended up being stalled due to a known unfigured out issue :/
  • 16:13 hashar: Killed a leftover jenkins process on gallium
  • 15:50 akosiaris: scheduled a safe restart of jenkins
  • 15:48 logmsgbot: hoo updated /a/common to I53de8d84b: Add two languages not supported by MediaWiki to testwikidata
  • 15:48 logmsgbot: hoo synchronized wmf-config/InitialiseSettings.php 'Add two languages not supported by MediaWiki to testwikidata'
  • 15:47 andrewbogott: zuul on gallium is dead and I don't know why
  • 15:45 andrewbogott: restarted jenkins and zuul on gallium
  • 14:02 logmsgbot: reedy Finished scap: testwiki to 1.24wmf2 build l10n cache (take 2) (duration: 15m 46s)
  • 13:46 logmsgbot: reedy Started scap: testwiki to 1.24wmf2 build l10n cache (take 2)
  • 13:45 logmsgbot: reedy Finished scap: testwiki to 1.24wmf2 build l10n cache (duration: 08m 16s)
  • 13:37 logmsgbot: reedy Started scap: testwiki to 1.24wmf2 build l10n cache
  • 13:32 logmsgbot: reedy updated /a/common to I543df75e3: Remove $wgDisableTextSearch and $wgDisableSearchUpdate overrides.
  • 12:22 akosiaris: restarted morebots after upgrade of adminbot to 1.7.5
  • 12:00 Krinkle: Running deleteEqualMessages.php on alswiki (bug 43917)
  • 12:00 Krinkle: Running deleteEqualMessages.php on suwiki (bug 43917)
  • 12:00 Krinkle: Running deleteEqualMessages.php on tlwiki (bug 43917)
  • 11:59 Krinkle: Running deleteEqualMessages.php on nahwiktionary (bug 43917)
  • 11:53 paravoid: reenabling ospf3 between cr1-eqiad/cr2-knams
  • 11:50 paravoid: fixing private4/private6 ACLs to be consistent across all routers
  • 06:47 _joe_: also ran puppet node clean to revoke certs, facts, etc (cp3013.esams.wikimedia.org cp3014.esams.wikimedia.org)
  • 06:38 _joe_: ran puppetstoredconfigclean.rb for cp3013.esams.wikimedia.org cp3014.esams.wikimedia.org
  • 05:20 springle: xtrabackup dbstore1001 to dbstore1002
  • 03:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 24 03:29:12 UTC 2014 (duration 29m 11s)
  • 02:46 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-24 02:45:58+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-24 02:29:14+00:00

April 23

  • 23:40 logmsgbot: catrope synchronized php-1.24wmf1/extensions/VisualEditor/lib/ve/modules/ve/ui/widgets/ve.ui.SurfaceWidget.js 'Fix surface focusing bug in Firefox'
  • 23:39 logmsgbot: catrope synchronized php-1.24wmf1/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js 'Unbreak badtoken recovery in mobile VE'
  • 23:38 logmsgbot: catrope synchronized php-1.24wmf1/resources/src/jquery/jquery.suggestions.js 'Handle CSS ellipsis when calculating suggestions widths'
  • 23:37 logmsgbot: catrope synchronized php-1.23wmf22/resources/src/jquery/jquery.suggestions.js 'Handle CSS ellipsis when calculating suggestions widths'
  • 21:55 bblack: moved cp301[34] ethernet ports to private1-esams
  • 20:14 Krinkle: Running deleteEqualMessages.php on mlwiki (bug 43917)
  • 20:14 Krinkle: Running deleteEqualMessages.php on miwiki (bug 43917)
  • 20:11 Krinkle: Running deleteEqualMessages.php on gvwiki (bug 43917)
  • 20:11 Krinkle: Running deleteEqualMessages.php on euwiktionary (bug 43917)
  • 20:04 subbu: deployed Parsoid 9c99b0be (deploy SHA cf5eb4d0)
  • 19:51 Krinkle: Running deleteEqualMessages.php on brwiki (bug 43917)
  • 19:51 Krinkle: Running deleteEqualMessages.php on afwiki (bug 43917)
  • 19:50 Krinkle: Running deleteEqualMessages.php on iawiki (bug 43917)
  • 19:37 Krinkle: Running deleteEqualMessages.php on hrwiktionary (bug 43917)
  • 19:37 Krinkle: Running deleteEqualMessages.php on hrwiki (bug 43917)
  • 19:34 Krinkle: Running deleteEqualMessages.php on dawiki (bug 43917)
  • 19:00 logmsgbot: reedy synchronized wmf-config/ 'I543df75e364171a71a48f18429972b662b542894'
  • 18:58 logmsgbot: reedy updated /a/common to I865a08779: Fix $wmgBetaFeaturesWhitelist for labs too
  • 18:58 Krinkle: Running deleteEqualMessages.php on amwiki (bug 43917)
  • 17:59 logmsgbot: demon synchronized wmf-config/InitialiseSettings-labs.php 'no-op in prod, for completeness'
  • 17:52 logmsgbot: demon synchronized all-labs.dblist 'no-op for prod, syncing for completeness'
  • 17:52 logmsgbot: demon synchronized wikiversions-labs.json 'no-op for prod, syncing for completeness'
  • 17:51 logmsgbot: demon updated /a/common to I960a792bc: Override wgSearchTypeAlternatives for beta to remove lucene
  • 15:58 akosiaris: updated adminbot on apt.wikimedia.org to 1.7.5
  • 15:57 manybubbles: rebuilding commons' cirrus search index
  • 15:55 logmsgbot: manybubbles synchronized php-1.24wmf1/extensions/CirrusSearch/maintenance/updateOneSearchIndexConfig.php 'swat update to fix maintenance script'
  • 15:46 ottomata: temporarily disabling puppet on analytics1003 to test some kafkatee settings
  • 15:32 apergos: rebooting dataset2 hoping to detect the arrays on reboot
  • 14:40 akosiaris: unexported /vol/{originals,thumbs} on nas1001-a, nas1-a
  • 14:30 akosiaris: break replication for volumes originals, thumbs on nas1001-a, nas1-a
  • 14:03 paravoid: adding AS path 1257 6830 (Tele2 -> UPC) to avoided paths @ cr1-esams/cr2-knams, multiple users reporting slowness issues
  • 13:52 akosiaris: unmounted /vol/originals and /vol/thumbs on fenari (was /mnt/upload7, /mnt/thumbs2) see RT #7076
  • 12:54 logmsgbot: marc synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 12:50 hashar: Jenkins back
  • 12:46 hashar: restarting jenkins
  • 12:46 hashar: Jenkins: upgrading email-ext and JobConfigHistory plugins (the latter now supports slaves configs!)
  • 10:35 hashar: Jenkins: update lanthanum slave agent to use java7
  • 10:32 hashar: Jenkins switching integration-slave1001.eqiad.wmflabs java to use Java 7 . In https://integration.wikimedia.org/ci/computer/integration-slave1001/configure changed JavaPath to /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java
  • 07:45 mutante: nfs1 - delete some old kernels and zip mw logs last touched in 2012/13 to free some disk on /
  • 07:32 mutante: nfs1 - re-enabled puppet
  • 07:12 mutante: nfs2 - revoke puppet cert,salt key,stored configs
  • 06:26 mutante: db48,db63 - revoke puppet cert, salt key, kill from storedconfigs
  • 03:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 23 03:55:03 UTC 2014 (duration 55m 2s)
  • 03:09 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-23 03:08:58+00:00
  • 02:46 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-23 02:46:47+00:00
  • 02:25 manybubbles: restarted rebuilding commons' Cirrus index after something crashed. Going to get more logging out of it if it crashes again, or it'll work. Either way. Like last time, the Elasticsearch check might freak out for a bit after it finishes because shards are assigning. That can be ignored for an hour or so.

April 22

  • 23:30 logmsgbot: ori Finished scap: I595446dc5, If2c57846f, Iaa232298e (duration: 00m 45s)
  • 23:30 logmsgbot: ori Started scap: I595446dc5, If2c57846f, Iaa232298e
  • 23:28 logmsgbot: ori synchronized php-1.24wmf1/extensions/EventLogging 'Update EventLogging for Iaa232298e: Set line-height for code icon on schema pages (bug 64251)'
  • 23:27 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'If2c57846f: Enable survey option in MediaViewer on a few more wikis'
  • 23:26 logmsgbot: ori updated /a/common to If2c57846f: Enable survey option in MediaViewer on a few more wikis
  • 23:24 logmsgbot: ori synchronized php-1.23wmf22/extensions/MultimediaViewer 'Update MultimediaViewer for I595446dc5: Add more survey languages (fr, de, pt/pr-br)'
  • 23:23 logmsgbot: ori synchronized php-1.24wmf1/extensions/MultimediaViewer 'Update MultimediaViewer for I595446dc5: Add more survey languages (fr, de, pt/pr-br)'
  • 22:28 logmsgbot: aaron synchronized php-1.24wmf1/maintenance/populateImageSha1.php '32d9206'
  • 22:22 logmsgbot: reedy synchronized php-1.24wmf1/extensions/TimedMediaHandler 'I7483c8b7ec75f5149998da2b530ca04'
  • 21:31 logmsgbot: spage synchronized php-1.24wmf1/extensions/Flow/modules/discussion/styles/mixins/collapse.less 'Fix Flow collapsed topics on mw.org'
  • 21:14 logmsgbot: spage synchronized wmf-config/InitialiseSettings.php 'Enable Flow on Compact Personal Bar talk'
  • 21:12 logmsgbot: spage updated /a/common to I851651247: Non wikipedias to 1.24wmf1
  • 20:35 logmsgbot: demon synchronized wmf-config/CommonSettings.php 'No op in prod, disables lsearchd completely for beta'
  • 20:28 ottomata: turning on varnishkafka on text varnishes
  • 20:12 MatmaRex: wikibugs replaced by pywikibugs (https://github.com/valhallasw/pywikibugs) and moved to #wikimedia-dev (at last!)
  • 20:12 manybubbles: rebuilding the search index for a few wikis - might cause the Elasticsearch health check to freak out because it sucks
  • 19:51 MatmaRex: wikibugs is down, let's not bring it back up
  • 19:31 Krinkle: Reloading Zuul to deploy config change I9c2f94b138244ab8
  • 19:05 hashar: Jenkins: killed the Jenkins java process on deployment-bastion.eqiad.wmflabs to free up the executor and threads entirely.
  • 18:55 hashar: restarted Zuul to clean up some stuck jobs from the queue
  • 18:49 hashar: Jenkins deployment-bastion.eqiad.wmflabs is back online: Slave successfully connected and online
  • 18:47 logmsgbot: reedy synchronized docroot and w
  • 18:46 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: non wikipedias to 1.24wmf1
  • 18:43 RobH: tridge back and accessible
  • 18:42 hashar: Jenkins: depooling / repooling deployment-bastion.eqiad.wmflabs slave; it locked up somehow, and its executors are no longer taken into account by the Jenkins master
  • 18:33 RobH: resurrecting tridge in pmtpa
  • 18:00 RobH: tridge is coming down for relocation, shouldn't disrupt anything but backups in progress
  • 17:52 bblack: disable cp301[34] (mobile varnish frontends) in pybal on fenari
  • 17:21 awight: update crm from 7dafce5 to cfe34fe
  • 16:45 mark: Reenabled Apache and puppet on fenari
  • 16:40 logmsgbot: aaron synchronized php-1.24wmf1/includes/filerepo/file/LocalFile.php 'e9807d0'
  • 16:15 logmsgbot: reedy updated /a/common to I55954c612: Commit updated interwiki.cdb file
  • 15:47 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Collection back on, server move over'
  • 15:14 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Icb6b4bad: Updated $wgForceUIMsgAsContentMsg for commonswiki'
  • 14:59 cmjohnson1: shutting down and relocating virt0 and pdf2
  • 14:50 logmsgbot: marc synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 14:41 logmsgbot: marc synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 14:33 manybubbles: populating cirrus indexes for all remaining wikis
  • 14:19 akosiaris: added bblack account on all junipers
  • 14:18 manybubbles: building new elasticsearch indexes for the last wikis that didn't have them. the cluster may go red as the indexes are assigned. silly nagios check.
  • 14:15 logmsgbot: manybubbles synchronized wmf-config/ 'cirrus for more wikis and disable collection for more'
  • 14:13 logmsgbot: manybubbles synchronized docroot/noc/createTxtFileSymlinks.sh 'noncirrus is removed'
  • 14:09 cmjohnson1: mchenry and sanger going down for server relocation
  • 13:26 mark: Disabled puppet and apache on fenari
  • 13:25 paravoid: second pass of swiftrepl eqiad->esams
  • 12:04 logmsgbot: faidon synchronized wmf-config/squid.php 'add cp3013/cp3014 IPv6 addresses'
  • 12:04 logmsgbot: faidon updated /a/common to If8f39abee: squid.php: add cp3013/cp3014 IPv6 addresses
  • 10:35 akosiaris: upgraded php-luasandbox to 1.9.1 on beta (deployment-apache0{1,2})
  • 10:25 akosiaris: upgraded php-luasandbox to 1.9-1 on test.wikimedia.org
  • 10:18 mutante: harmon - delete salt key
  • 09:04 mutante: hooper - revoked puppet cert
  • 08:47 mutante: upgrading Bugzilla to 4.4.4
  • 06:58 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Unbreak $wmgBetaFeaturesWhitelist'
  • 06:44 mutante: db77 - revoke puppet cert,salt key,rm from monitoring
  • 05:14 springle: db68 down. s1-analytics-slave cname to db1007
  • 04:35 paravoid: reactivate esams<->HE & eqiad<->HE peerings; issues are confirmed to be resolved
  • 03:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 22 03:32:29 UTC 2014 (duration 32m 28s)
  • 02:43 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-22 02:43:01+00:00
  • 02:30 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-22 02:29:34+00:00
  • 01:16 awight: nevermind previous update.
  • 01:11 awight: update from 7dafce5 to cfe34fe
  • 00:18 logmsgbot: catrope synchronized php-1.24wmf1/extensions/MobileFrontend/javascripts/modules/editor/VisualEditorOverlay.js 'Fix JS error on save'
  • 00:15 Reedy: torrus (on manutius) is down
  • 00:05 RoanKattouw: Restarted ircecho on ekrem, IRC working again now
  • 00:00 K4-713: updated payments from 2819549 --> 4811f6d

April 21

  • 23:57 RoanKattouw: Started ircd on ekrem, startup doesn't seem to be puppetized
  • 23:24 logmsgbot: catrope synchronized php-1.23wmf22/extensions/MultimediaViewer 'SWAT deploy cherry-picks'
  • 23:24 logmsgbot: catrope synchronized php-1.24wmf1/extensions/MobileFrontend 'SWAT deploy cherry-picks'
  • 23:23 logmsgbot: catrope synchronized php-1.24wmf1/extensions/MultimediaViewer 'SWAT deploy cherry-picks'
  • 23:23 logmsgbot: catrope synchronized php-1.24wmf1/extensions/VisualEditor 'SWAT deploy cherry-picks'
  • 23:16 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'Lithuanian namespace aliases for betawikiversity'
  • 23:11 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'BetaFeatures whitelist'
  • 23:11 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'Beta Features whitelist'
  • 22:50 RoanKattouw: Restarted stuck Jenkins
  • 21:33 awight: updated payments: af35b7b --> 2819549
  • 20:45 andrewbogott: rebooted wtp1018
  • 20:45 subbu: deployed Parsoid ec51e5d1 (deploy SHA 0dd607fc)
  • 20:31 manybubbles: rolling restart on remaining Elasticsearch servers to get the plugin (1010, 1011, 1012, 1015)
  • 19:45 manybubbles: rolling restart on more of the elasticsearch servers to pick up plugins (06, 07, 09)
  • 19:15 cmjohnson1: shutting down and relocating dobson
  • 19:15 cmjohnson1: shutting down and relocating linne
  • 17:39 cmjohnson: shutting down and relocating fenari
  • 17:37 cmjohnson: shutting down mexia to relocate to 12th floor
  • 17:25 logmsgbot: aaron synchronized php-1.24wmf1/includes/filerepo/file/LocalFile.php '2026e4a'
  • 17:22 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Adjust large file download pool counter config to tie up less workers'
  • 17:18 logmsgbot: aaron synchronized php-1.23wmf22/includes/filerepo/file/LocalFile.php '01ce288'
  • 17:13 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'renderfile-nonstandard throttle config'
  • 17:03 logmsgbot: aaron synchronized php-1.24wmf1/thumb.php '44c4658'
  • 17:02 logmsgbot: aaron synchronized php-1.23wmf22/thumb.php '9591365'
  • 16:34 ottomata: reinstalling elastic1013 (elastic1014 is still coming back online, but I don't want there to be an extra eligible master for long)
  • 16:04 ottomata: reinstalling elastic1014
  • 15:22 cmjohnson1: dataset2 going down to be relocated to the 12th floor
  • 13:41 manybubbles: rolling restart on some of the Elasticsearch servers to pick up new plugins. should not cause any trouble.
  • 13:05 Reedy: De-activated status.wm.o monitor for icinga due to false positive from HTTP auth
  • 12:54 paravoid: demoting myself, removing Commons crat/admin rights
  • 12:41 paravoid: escalating myself to Commons bureaucrat/admin, then adding GWToolset privileges
  • 12:40 paravoid: deleting 29 GWToolset XML under Swift's wikipedia-commons-gwtoolset-metadata container for user Fæ/
  • 11:51 logmsgbot: reedy synchronized php-1.23wmf22/extensions/TimedMediaHandler 'I7483c8b7ec75f5149998da2b530ca04'
  • 11:50 paravoid: deactivating esams<->HE peering, >90% packet loss between lon<->nyc
  • 11:49 paravoid: deactivating eqiad<->HE peering, >90% packet loss between lon<->nyc
  • 11:45 logmsgbot: reedy synchronized php-1.23wmf22/extensions/TimedMediaHandler 'I7483c8b7ec75f5149998da2b530ca0467ac70de7'
  • 03:55 springle: reset pc100* slaves previously replicating from pmtpa
  • 03:32 ori: 5.5k fatals over last 20 hrs, of which 3.5k are calls to doTransform() on a non-object at TimedMediaThumbnail.php:201, and 0.9k are Lua API OOMs at LuaSandbox/Engine.php:264
  • 03:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 21 03:30:15 UTC 2014 (duration 30m 14s)
  • 03:26 ori: ap_busy_workers spike on image scalers eqiad, started ~2:55, subsided around ~3:20
  • 02:42 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-21 02:42:30+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-21 02:29:49+00:00

April 20

  • 18:51 ori: restarted grrrit-wm by following instructions on https://wikitech.wikimedia.org/wiki/Grrrit-wm#Restarting_the_bot
  • 11:12 Nemo_bis: grrrit dead: 10.28 -!- grrrit-wm [tools.lolr@208.80.155.145] has quit
  • 03:26 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 20 03:26:48 UTC 2014 (duration 26m 47s)
  • 02:39 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-20 02:39:23+00:00
  • 02:28 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-20 02:28:09+00:00

April 19

  • 03:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 19 03:27:52 UTC 2014 (duration 27m 51s)
  • 02:41 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-19 02:41:02+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-19 02:29:33+00:00

April 18

  • 21:00 hashar: Jenkins renamed mw-jenkinsbot irc bot to wmf-insecte (french for "bug"). Updated IRC conf to point to chat.freenode.net:7000 with SSL.
  • 19:02 bblack: enabled cp301[34] varnish mobile frontends in esams pybal
  • 17:50 bblack: cp301[34] reinstalls complete, should stay ok in monitoring
  • 17:48 ottomata: reinstalling elastic1008
  • 16:20 springle: db48 mysqld shutdown for decom
  • 16:20 bblack: ignore cp301[34] msgs, reinstalling them
  • 16:10 springle: db63 mysqld shutdown for decom
  • 15:53 ottomata: reinstalling elastic1007
  • 15:52 springle: db48 mysqld set read_only, disabled m2 repl to db1048
  • 15:51 ottomata: disabling puppet on elastic1007 and elastic1008 for reformatting
  • 15:45 mutante: DNS update - removing Tampa msbe/msfe
  • 15:38 Jeff_Green: switched mchenry to use db1048/db1049 for OTRS address lookups
  • 15:24 mutante: DNS update - removing all the Tampa mw/srv mgmt
  • 15:15 mutante: DNS update - removing lvs1-6
  • 14:54 mutante: es5,es6 - revoke puppet certs, salt keys, icinga
  • 14:51 ottomata: powering down stat1 for decom
  • 14:43 mutante: ms-fe[14] - shutting down
  • 14:41 ottomata: disabling puppet on stat1 for decom
  • 14:37 mutante: ms-be 1-12, Tampa Swift boxes, shutdown
  • 14:24 mutante: ms-fe[14] - stop puppet,revoke certs,remove icinga
  • 13:54 mutante: ms-be1-12 - removing from puppet,salt,icinga
  • 13:06 mutante: Bugzilla Apache, changed SSL cipher suite in I7e9adc182dc, might cost a few % performance but zirconium had plenty
  • 11:48 hashar: removing mw-jenkinsbot (the wikimedia jenkins installation) from #wikimedia-labs
  • 10:10 hashar: Jenkins upgraded to 1.532.3.
  • 10:06 hashar: Upgrading Jenkins to latest LTS version 1.532.3
  • 07:57 mutante: DNS update - remove api.svc, arptest.pmtpa ..
  • 06:31 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Next round of wikis done building Cirrus indexes, throw into beta mode'
  • 04:04 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 18 04:04:21 UTC 2014 (duration 4m 20s)
  • 03:06 logmsgbot: LocalisationUpdate completed (1.24wmf1) at 2014-04-18 03:06:06+00:00
  • 02:39 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-18 02:39:51+00:00
  • 00:21 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'Ie9b265be9: Enable GlobalCssJs on testwiki & test2wiki (2/2)'
  • 00:21 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'Ie9b265be9: Enable GlobalCssJs on testwiki & test2wiki (1/2)'
  • 00:20 logmsgbot: ori updated /a/common to Ie9b265be9: Enable GlobalCssJs on testwiki & test2wiki
  • 00:18 logmsgbot: ori Finished scap: Cherry-pick Ibe8e67ebf for MobileFrontend on 1.23wmf22 and 1.24wmf1; add GlobalCssJs extension to 1.24wmf1 and 1.23wmf22 (duration: 32m 53s)

April 17

  • 23:45 logmsgbot: ori Started scap: Cherry-pick Ibe8e67ebf for MobileFrontend on 1.23wmf22 and 1.24wmf1; add GlobalCssJs extension to 1.24wmf1 and 1.23wmf22
  • 23:39 logmsgbot: ori scap failed: CalledProcessError Command '/usr/local/bin/mw-update-l10n' returned non-zero exit status 1 (duration: 00m 24s)
  • 23:38 logmsgbot: ori Started scap: Cherry-pick Ibe8e67ebf for MobileFrontend on 1.23wmf22 and 1.24wmf1; add GlobalCssJs extension to 1.24wmf1
  • 23:37 logmsgbot: ori updated /a/common to I2a2abd7f3: Add GlobalCssJs to extension-list
  • 23:29 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I52378a4b4: Add meta to legalteamwiki import sources'
  • 23:28 logmsgbot: ori updated /a/common to I52378a4b4: Add meta to legalteamwiki import sources
  • 23:27 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I373df6138: Normalize TextExtracts config handling (2/2)'
  • 23:27 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I373df6138: Normalize TextExtracts config handling (1/2)'
  • 23:26 logmsgbot: ori updated /a/common to I373df6138: Normalize TextExtracts config handling
  • 23:24 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I7841f74b0: Kill all vestiges of $wgMFRemovableClasses (2/2)'
  • 23:24 logmsgbot: ori synchronized wmf-config/mobile.php 'I7841f74b0: Kill all vestiges of $wgMFRemovableClasses (1/2)'
  • 23:23 logmsgbot: ori updated /a/common to I7841f74b0: Kill all vestiges of $wgMFRemovableClasses
  • 23:18 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I1795c70d1: Create a FeaturedFeed for the Tech News bulletin (2/2)'
  • 23:17 logmsgbot: ori synchronized wmf-config/FeaturedFeedsWMF.php 'I1795c70d1: Create a FeaturedFeed for the Tech News bulletin (1/2)'
  • 23:16 logmsgbot: ori updated /a/common to I1795c70d1: Create a FeaturedFeed for the Tech News bulletin
  • 23:14 logmsgbot: ori synchronized php-1.23wmf22/extensions/ApiSandbox 'I9a56b2c5a: Update ApiSandbox'
  • 23:12 K4-713: updates antifraud rules in payments
  • 22:03 andrewbogott: updated default labs precise image (heartbleed fix)
  • 20:41 manybubbles: elastic1016 restarted and not freaking out any more.
  • 20:37 _joe_: restarting gitblit in order to prevent crippling due to the usual memory leak
  • 20:28 manybubbles: restarting elastic1016 - it is freaking out. If it happens again I'll dig deeper, but for now I consider it a fluke of the rolling restarts today....
  • 20:20 RobH: sorry for the misc-web-lb issues folks, they should be resolved at this time (for now)
  • 20:19 paravoid: lvs1002/1005: commenting first resolv.conf entry until we have a more permanent fix, restarting pybal
  • 20:18 paravoid: disabling puppet on lvs1002/lvs1005
  • 19:57 RobH: still working on issue
  • 19:57 RobH: both cp1043 and cp1044 seem online and serving nginx service, but pybal says they are down; still working
  • 19:46 ottomata: power off emery
  • 19:40 RobH: replacing ticket.wikimedia.org cert/key, apache may hiccup
  • 19:33 RobH: blog.w.o cert replacement successful
  • 19:30 ottomata: disabling puppet on emery for decommission
  • 19:29 RobH: blog.w.o certificate swap (yes, again ;), apache may hiccup
  • 19:10 logmsgbot: reedy synchronized wmf-config/
  • 19:09 logmsgbot: reedy synchronized database lists files: I6fc44d3eb829d656d352dab652148dd327b06679
  • 19:04 ottomata: reinstalling elastic1001
  • 18:59 logmsgbot: faidon synchronized wmf-config/CommonSettings.php 'reenable CN CrossWiki Hiding'
  • 18:58 logmsgbot: faidon updated /a/common to Ie95165065: Reenable CentralNotice CrossWiki Hiding
  • 18:57 logmsgbot: reedy synchronized wmf-config/
  • 18:34 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Touch for I0c36c65bb9f405e03b84d3f6c6b93acda522c5c9'
  • 18:33 logmsgbot: reedy synchronized database lists files: I0c36c65bb9f405e03b84d3f6c6b93acda522c5c9
  • 18:30 ottomata: switching erbium udp2log instance from consuming multicast relay to unicast direct from varnishes
  • 18:21 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.24wmf1
  • 18:10 ottomata: stopping puppet on elastic1001 and elastic1002, reinstalling elastic1002
  • 18:02 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf22
  • 16:06 logmsgbot: reedy Finished scap: testwiki to 1.24wmf1 and build l10n cache (duration: 26m 06s)
  • 15:47 logmsgbot: anomie synchronized php-1.23wmf22/extensions/VisualEditor 'SWAT: 126913 - backport to wmf22 of critical fixes for the Math extension's VisualEditor tool'
  • 15:47 logmsgbot: anomie synchronized php-1.23wmf22/extensions/Math 'SWAT: 126913 - backport to wmf22 of critical fixes for the Math extension's VisualEditor tool'
  • 15:40 logmsgbot: reedy Started scap: testwiki to 1.24wmf1 and build l10n cache
  • 15:28 logmsgbot: faidon synchronized wmf-config/CommonSettings.php 'disable CN CrossWiki Hiding again'
  • 15:27 logmsgbot: faidon updated /a/common to If74ba5a52: Revert "Enable CentralNotice CrossWiki Hiding"
  • 15:17 manybubbles: upgraded site plugins on Elasticsearch nodes
  • 15:04 ottomata: reinstalling elastic1016
  • 14:02 logmsgbot: reedy updated /a/common to I290bd1ea6: Remove further pmtpa remnants
  • 13:41 manybubbles: synced experimental highlighter to elasticsearch nodes - they'll pick it up on restart
  • 11:05 logmsgbot: reedy synchronized wmf-config/ 'I290bd1ea628563646c02651041fa2cec4a320b56'
  • 10:56 mutante: lvs3,lvs4,lvs5,lvs6 - shutdown
  • 10:42 mutante: lvs1, lvs2 shutdown
  • 10:15 mutante: re-deleting unaccepted salt keys for virt2,5-11
  • 10:11 mutante: lvs1-6 - disable puppet,salt,revoke certs,keys
  • 08:28 mutante: db35,db38 - shutdown
  • 07:47 mutante: db35,db38, stop puppet and salt, revoke certs,keys
  • 07:45 mutante: restarting gitblit
  • 03:49 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 17 03:49:19 UTC 2014 (duration 49m 18s)
  • 03:06 subbu: deployed Parsoid 0bccf02c (deploy SHA 5e25f3b05) @ 1:30 pm PST, Apr 16th, 2014
  • 03:02 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-17 03:02:25+00:00
  • 02:33 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-17 02:33:47+00:00
  • 02:03 springle: stop mysqld on db35 (m1) for decom
  • 01:42 springle: xtrabackup db63 to db60

April 16

  • 23:19 logmsgbot: mwalker Finished scap: SWAT deploy: configuration change 126223 and multimediaviewer 126852 (duration: 04m 07s)
  • 23:15 logmsgbot: mwalker Started scap: SWAT deploy: configuration change 126223 and multimediaviewer 126852
  • 20:03 RobH: osmium cleared from salt, puppetca, and puppetstoredconfig for reinstall with trusty (ignore any icinga alerts, there are no pages)
  • 18:58 ottomata: reinstalling elastic1015
  • 17:54 logmsgbot: reedy synchronized php-1.23wmf22/includes/jobqueue/ 'I4b4dbe4637dc50cd4630ef19d54f01efba10e138'
  • 17:09 paravoid: starting swiftrepl on copper for eqiad->esams copy
  • 17:08 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I7b6e5c2d7: Enable web fonts by default on Hebrew Wikisource'
  • 17:07 logmsgbot: ori synchronized fc-list 'Ib7b2bc21a: updated fonts list and sorted it, rt #810'
  • 17:06 logmsgbot: ori updated /a/common to I7b6e5c2d7: Enable web fonts by default on Hebrew Wikisource
  • 16:37 logmsgbot: reedy synchronized php-1.23wmf21/includes/jobqueue/JobQueueRedis.php 'I678ab55ae3678b5cd944393f2f2048851625f153'
  • 16:36 logmsgbot: reedy synchronized php-1.23wmf22/includes/jobqueue/JobQueueRedis.php 'I678ab55ae3678b5cd944393f2f2048851625f153'
  • 15:38 ottomata: reinstalling elastic1012
  • 13:50 manybubbles: restarting elastic1009 to suck up new config
  • 13:50 manybubbles: raised the number of replicas for labswiki's search directly in elasticsearch because I can't easily do it for cirrus due to access restrictions
  • 13:45 ottomata: reinstalling elastic1011
  • 13:22 mutante: DNS update - remove virt5-15
  • 12:11 mutante: virt5-11 - shut down
  • 11:40 akosiaris: upgraded python-voluptuous on apt.wikimedia.org to 0.8.2-1wmf1
  • 11:39 hashar: Upgraded Zuul to wmf-deploy-20140416-3 (bring in a84f0e4 - "Make queue processing more efficient" which was much needed)
  • 11:29 hashar: upgraded Zuul to wmf-deploy-20140416-2
  • 11:15 mutante: virt5-11 removing from icinga
  • 11:03 mutante: virt5-11 revoked puppet certs and salt keys
  • 10:56 mutante: stopping puppet on virt5-11
  • 10:47 hashar: Upgraded Zuul on gallium to wmf-deploy-20140416 (depends on python-voluptuous 0.7+ , Alexandros packaged 0.8.2 which I manually installed to validate).
  • 09:26 mutante: disabling mw1163 in pybal
  • 07:03 mutante: zirconium - upgrading apache2, php5 packages
  • 06:07 springle: stop mysqld on db38 (x1) for decom
  • 03:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 16 03:46:23 UTC 2014 (duration 46m 22s)
  • 02:55 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-16 02:55:28+00:00
  • 02:28 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-16 02:28:01+00:00
  • 00:43 K4-713: updated listener credentials on thulium

April 15

  • 23:25 logmsgbot: mwalker synchronized wmf-config/abusefilter.php '126168 more abuse filter configuration fun'
  • 23:21 logmsgbot: mwalker Finished scap: Configuration change 126163 and MultimediaViewer 126158 (duration: 02m 15s)
  • 23:19 logmsgbot: mwalker Started scap: Configuration change 126163 and MultimediaViewer 126158
  • 23:08 logmsgbot: mwalker Finished scap: Configuration changes, 113656, 121834, 126065 (duration: 03m 11s)
  • 23:05 logmsgbot: mwalker Started scap: Configuration changes, 113656, 121834, 126065
  • 23:01 hashar: restarting Zuul to clear leaked file descriptor (known issue, fixed upstream)
  • 22:12 awight: crm updated from e3f2859 to 7dafce5
  • 21:51 manybubbles: restarting elastic1009 again
  • 21:39 hashar: jenkins /var/lib/git cleaned up on gallium
  • 21:16 manybubbles: restarting elastic1009 to test performance changes. cluster will go yellow for a few minutes. might go red (wikitech is busted)
  • 21:15 hashar: Jenkins is processing jobs again
  • 21:14 hashar: cleared /tmp/ on integration-slave1002 (filled up by hhvm job, known issue, bug filed already)
  • 21:12 hashar: Zuul locked again :/ Unpooling and repooling Jenkins slaves.
  • 19:50 RoanKattouw: Restarting stuck Jenkins
  • 19:31 manybubbles: setting refresh interval on elasticsearch indexes to 30s to test effect on load
  • 19:24 logmsgbot: reedy synchronized wmf-config/
  • 19:20 logmsgbot: reedy synchronized php-1.23wmf22/includes/PrefixSearch.php 'I82b5ca65864099c180d915055c43e6839bd4f4a2'
  • 19:07 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikisources back to 1.23wmf22
  • 19:07 ottomata: reinstalling elastic1010
  • 19:07 logmsgbot: reedy synchronized php-1.23wmf22/extensions/ProofreadPage
  • 18:41 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikisources back to 1.23wmf21 due to ProofreadPage fatal
  • 18:36 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.23wmf22
  • 17:09 paravoid: stopped pybal on lvs1005
  • 17:06 cmjohnson1: fixing lvs1005 eth1 cable
  • 16:56 cmjohnson1: mw1057 replacing ethernet cable
  • 16:50 manybubbles: raised "new generation" size on elastic1009 to test a performance theory
  • 16:50 cmjohnson1: mw1093 replacing ethernet cable
  • 16:40 cmjohnson1: replacing eth cable on mw1193
  • 16:31 hashar: ... all Jenkins jobs are using /srv/ssd/gerrit instead
  • 16:30 hashar: gallium had two Gerrit replication streams, one of them got removed 122419 thus deleting the target directories under /var/lib/git
  • 16:22 cmjohnson1: shutting down mw1163 to replace DIMM
  • 16:18 cmjohnson1: swapping bad disk slot 4 on dataset1001
  • 16:13 paravoid: moving ms-fe3xxx/ms-be3xxx to private1-esams
  • 16:05 ottomata: reinstalling elastic1009
  • 15:21 logmsgbot: anomie synchronized php-1.23wmf21/extensions/Flow 'SWAT: Flow: Prevent logspam on enwiki 125930'
  • 15:13 logmsgbot: anomie synchronized php-1.23wmf21/extensions/Flow 'SWAT: Flow: Prevent logspam on enwiki 125930'
  • 15:02 mutante: DNS update - removing Tampa service IPs
  • 13:51 hashar: Jenkins compressing console logs of builds. On gallium as user jenkins : find /var/lib/jenkins/jobs -wholename '*/builds/*/log' -type f -exec gzip --best {} \;
  • 13:42 hashar: Command executed (as gerritslave user): find /srv/ssd/gerrit -type d -name '*.git' -exec bash -c 'echo; date; cd {}; echo; pwd; echo; git repack -ad; date;' \;
  • 13:41 hashar: Repacking Gerrit replicated repositories on lanthanum and gallium (both under /srv/ssd/gerrit/ )
  • 13:13 andrewbogott: shutdown and decommissioned virt12
  • 12:19 paravoid: adding ms-be101[345] to Swift eqiad's rings, at 33% weight; old rings kept at ms-fe1001:~/swift-2014-04-14
  • 11:30 mutante: DNS update - removed dbdump.pmtpa.wmnet
  • 11:26 mutante: DNS update - remove db64,db65,db66,db67,db70
  • 10:55 mutante: db64,db67 - powerdown via mgmt
  • 10:51 mutante: db65,db66 - shutdown
  • 10:07 mutante: db70 - powerdown via mgmt
  • 09:47 mutante: db64-67 - puppetstoredconfigclean.rb db${db}.pmtpa.wmnet ; puppetca --clean db${db}.pmtpa.wmnet ; salt-key -d db${db}.pmtpa.wmnet
  • 07:02 springle: shutdown db67 for decom. analytics data is backed up on dbstore1002
  • 06:47 springle: moving pmtpa m1 and x1 slaves to db73 and db69 on 12th floor
  • 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 15 03:25:52 UTC 2014 (duration 25m 51s)
  • 02:42 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-15 02:42:48+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-15 02:22:54+00:00
  • 00:09 gwicke: enabled wikinews family in Parsoid with temporary live patch to un-break VE deploy

April 14

  • 23:43 logmsgbot: ori Finished scap: (no message) (duration: 04m 31s)
  • 23:39 ori: scap: php-1.23wmf22/extensions/VisualEditor 2b0979f...0652ad2 (I12e5c9751)
  • 23:38 logmsgbot: ori Started scap: (no message)
  • 23:17 logmsgbot: ori synchronized php-1.23wmf22/skins/vector/variables.less 'Ibcdaff017: Revert body font stack to be just sans-serif'
  • 23:15 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I22f25730d: Enable VisualEditor for opt-in on Meta (2/2)'
  • 23:15 logmsgbot: ori synchronized visualeditor.dblist 'I22f25730d: Enable VisualEditor for opt-in on Meta (1/2)'
  • 23:14 logmsgbot: ori updated /a/common to I22f25730d: Enable VisualEditor for opt-in on Meta
  • 23:12 logmsgbot: ori synchronized visualeditor.dblist 'I59f5a6e0b: Enable VisualEditor on French Wikinews'
  • 23:12 logmsgbot: ori updated /a/common to I59f5a6e0b: Enable VisualEditor on French Wikinews
  • 22:56 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Limit large (djvu) file downloads for thumbnails'
  • 20:37 mwalker: updating payments wiki for worldpay currencies (from af35b7b to 8a93c17)
  • 20:13 subbu: deployed Parsoid fba548cbf (deploy repo sha d0e12ddf)
  • 17:47 paravoid: fixing /e/n/interfaces for static configuration: gadolinium hafnium labsdb1001 labsdb1002 labsdb1003 labstore1001 searchidx1001 ssl1005 ssl1006 ssl1009 virt1001 ytterbium
  • 17:37 paravoid: fixing /e/n/interfaces for static configuration for cp40xx, lvs40xx
  • 17:14 mutante: brewster - power down, could not revive due to disk or SATA controller fail
  • 16:57 ottomata1: shutting down elastic1006 for reinstall
  • 16:45 mutante: powering brewster back on
  • 16:40 paravoid: powering up brewster
  • 16:13 mutante: deleted old svn apache config on formey, started apache
  • 15:22 paravoid: restarting virt0's salt-master, glance-api, glance-registry, keystone, nova-scheduler
  • 15:11 logmsgbot: manybubbles synchronized wmf-config/CirrusSearch-common.php 'SWAT Cirrus update to improve performance'
  • 15:09 logmsgbot: manybubbles synchronized php-1.23wmf21/extensions/CirrusSearch/ 'SWAT deploy to improve performance'
  • 14:48 paravoid: upgrading all snapshot* hosts
  • 14:38 paravoid: upgrading all packages & staggered restart of all of swift (ms-fe/ms-be)
  • 13:22 logmsgbot: reedy synchronized php-1.23wmf22/includes/api/ApiFeedRecentChanges.php 'I268d0a53067738ba96bee74c593358b0b28cc083'
  • 13:22 logmsgbot: reedy synchronized php-1.23wmf21/includes/api/ApiFeedRecentChanges.php 'I268d0a53067738ba96bee74c593358b0b28cc083'
  • 13:15 paravoid: staggered upgrades for all pending updates on all mw* boxes & restarting apaches/other core services
  • 11:08 mutante: brewster - shut down
  • 10:49 logmsgbot: reedy synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 10:05 apergos: had to toss extensions/Elastica on virt1000 and run git submodule update --init --recursive; seems to be working now
  • 09:26 mutante: deleting huge pybal log on lvs3001
  • 09:01 mutante: brewster - stop lighttpd,bacula-fd,haproxy,dhcp3-server,rsync,nrpe,salt
  • 07:54 mutante: brewster - disabling puppet agent, removed from site.pp, revoke puppet cert
  • 03:23 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 14 03:22:58 UTC 2014 (duration 22m 57s)
  • 02:42 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-14 02:42:05+00:00
  • 02:23 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-14 02:22:58+00:00

April 13

  • 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 13 03:19:45 UTC 2014 (duration 19m 44s)
  • 02:39 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-13 02:39:11+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-13 02:20:49+00:00

April 12

  • 05:03 logmsgbot: ori updated /a/common to I5f900190c: Replace $channel with $variant; make it Beta-only
  • 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 12 03:21:43 UTC 2014 (duration 21m 42s)
  • 02:42 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-12 02:42:07+00:00
  • 02:23 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-12 02:23:02+00:00

April 11

  • 23:55 RoanKattouw: Restarting stuck Jenkins
  • 23:35 K4-713: synchronized payments to af35b7b
  • 23:25 K4-713: synchronized payments to b321163
  • 19:50 ottomata: upgraded wikitech to MediaWiki 1.23wmf22, applied security patch
  • 18:19 ottomata: rebooting elastic1003
  • 18:14 Krinkle: git-deploy: Deploying integration/slave-scripts I38b90e8c08d7cb
  • 18:08 Krinkle: git-deploy: Deploying integration/slave-scripts I04d8e308daedb3ccb8
  • 17:41 Krinkle: git-deploy: Deploying integration/slave-scripts 'Ia9ee438fa2675170'
  • 14:27 ottomata: reinstalling elastic1005
  • 04:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 11 04:33:40 UTC 2014 (duration 33m 39s)
  • 03:47 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-11 03:47:01+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-11 02:41:33+00:00
  • 02:29 ori_: graphite: carbon instance 'f' saturates a cpu core. it's the instance that mediawiki profiling data gets hashed to. collector should probably emit to statsd and have statsd compute per-minute rollups
  • 00:06 marktraceur: leaving MultimediaViewer slightly broken on enwiki based on the fact that logged-in users seem mostly unaffected and other wikis aren't seeing issues, will investigate more tomorrow and fix on Monday

April 10

  • 23:54 bd808: Enabled beta update Jenkins job (https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/)
  • 23:37 logmsgbot: mwalker Finished scap: Attempting to regenerate i18n keys for multimediaviewer (duration: 03m 33s)
  • 23:34 logmsgbot: mwalker Started scap: Attempting to regenerate i18n keys for multimediaviewer
  • 23:16 logmsgbot: mwalker synchronized wmf-config/filebackend.php
  • 23:09 logmsgbot: mwalker synchronized wmf-config/InitialiseSettings.php 'touched to see if that pushes changes to FileBackend.php'
  • 23:03 mwalker: sync-common for 125340 and 125335
  • 22:53 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/modules/ve-mw/ui/tools/ 'touch *.js'
  • 22:35 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/modules/ve-mw/ui/tools/ve.ui.MWReferenceDialogTool.js 'touch'
  • 22:12 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/lib/ve/modules/ve/ui/ve.ui.Toolbar.js 'touch'
  • 22:10 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/lib/ve/lib/oojs-ui/oojs-ui.js 'touch'
  • 22:10 logmsgbot: krinkle synchronized php-1.23wmf21/resources/startup.js 'touch'
  • 22:10 logmsgbot: krinkle synchronized php-1.23wmf21/resources/oojs-ui/oojs-ui.js 'touch'
  • 21:50 Krinkle: VisualEditor throws uncaught error on load for 1.23wmf21 wikis (bug 63791)
  • 21:15 bd808: Disabled beta update Jenkins job (https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/) so that scap testing can happen in beta.
  • 19:29 ottomata: reinstalling elastic1004
  • 19:19 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'I5501078cee871fb9df03e085547b7a047ef5bd7e'
  • 19:16 logmsgbot: ori synchronized wmf-config/InitialiseSettings-labs.php 'Ia79b1b848: Work around bug 63780 by specifying a siteParamsCallback'
  • 19:15 logmsgbot: ori updated /a/common to Ia79b1b848: Work around bug 63780 by specifying a siteParamsCallback
  • 18:44 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 18:43 logmsgbot: reedy synchronized database lists files: Enable MediaViewer on mediawikiwiki
  • 18:42 logmsgbot: reedy synchronized docroot and w
  • 18:41 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf22
  • 18:37 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf21
  • 18:33 logmsgbot: reedy synchronized php-1.23wmf21/extensions/MultimediaViewer
  • 16:55 ottomata: shutting down elastic1003 for reinstall and reformat
  • 16:42 logmsgbot: reedy updated /a/common to Ie72029103: Add/update symlinks
  • 16:40 logmsgbot: reedy Finished scap: testwiki to 1.23wmf22 and build l10n cache (duration: 24m 45s)
  • 16:15 logmsgbot: reedy Started scap: testwiki to 1.23wmf22 and build l10n cache
  • 16:14 logmsgbot: reedy updated /a/common to I2cccebdd7: wikidatawiki back to 1.23wmf21
  • 15:03 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki back to 1.23wmf21...
  • 15:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 10 15:00:01 UTC 2014 (duration 27m 49s)
  • 14:17 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-10 14:17:16+00:00
  • 13:49 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-10 13:49:24+00:00
  • 13:31 logmsgbot: reedy Finished scap: l10n cache update for wikidatawiki (duration: 19m 15s)
  • 13:12 logmsgbot: reedy Started scap: l10n cache update for wikidatawiki
  • 12:42 bblack: removed broken pdns_gmetric cronjob on lvs boxes
  • 09:44 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I107179a27: Move HHVM extension blacklist below extract($globals) so it isn't simply clobbered'
  • 09:42 logmsgbot: ori updated /a/common to I107179a27: Move HHVM extension blacklist below extract($globals) so it isn't simply clobbered
  • 09:39 hashar: Zuul processed its backlog. Had to disconnect/reconnect the labs slaves. There is some weird bug occurring :-(
  • 09:29 hashar: Jenkins: disabling Gearman client in https://integration.wikimedia.org/ci/configure and reenabling it
  • 09:20 hashar: Jenkins unpooling both slave labs using the web interface and killing the Jenkins client running as jenkins-deploy. Will repool so the job can be reregistered properly (bug 63760)
  • 09:11 mutante: DNS update - removing ms6
  • 09:04 hashar: Jenkins: a bunch of jobs are not being triggered properly. Taking traces.
  • 08:55 mutante: ms6 - shutdown -h now
  • 08:42 mutante: forcing Bugzilla logout for all users
  • 08:19 logmsgbot: aude synchronized php-1.23wmf20/extensions/Wikidata
  • 08:09 logmsgbot: aude synchronized php-1.23wmf20/extensions/Wikidata
  • 07:57 logmsgbot: aude rebuilt wikiversions.cdb and synchronized wikiversions files: Rebuild wikiversions and put wikidata on 1.23wmf20
  • 07:53 logmsgbot: aude synchronized wikiversions.json 'Put Wikidata back on 1.23wmf20, due to localisation cache issues'
  • 07:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 10 07:21:16 UTC 2014 (duration 7m 21s)
  • 06:46 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-10 06:45:59+00:00
  • 06:27 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-10 06:27:30+00:00
  • 06:27 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I20bbe05cc: Avoid using bits on beta-hhvm.wmflabs.org'
  • 06:26 logmsgbot: ori updated /a/common to I20bbe05cc: Avoid using bits on beta-hhvm.wmflabs.org
  • 06:15 ori: Some interface messages are missing on wikidata.org. Started a manual l10nupdate.
  • 04:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 10 04:11:47 UTC 2014 (duration 11m 46s)
  • 03:19 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-10 03:19:18+00:00
  • 03:01 logmsgbot: ori synchronized multiversion/MWMultiVersion.php 'Ibdbac982b: Update multiversion regexp for *.beta-hhvm.wmflabs.org'
  • 03:01 logmsgbot: ori updated /a/common to Ibdbac982b: Update multiversion regexp for *.beta-hhvm.wmflabs.org
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-10 02:22:11+00:00
  • 01:49 logmsgbot: ori synchronized wmf-config/InitialiseSettings-labs.php 'I697f7e4a6: Use $channel to branch on interpreter'
  • 01:48 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I697f7e4a6: Use $channel to branch on interpreter'
  • 01:48 logmsgbot: ori updated /a/common to I697f7e4a6: Use '$channel' to branch on interpreter
  • 01:08 K4-713: updated payments to e1d00b61a703
  • 01:06 Krinkle: git-deploy: Deploying integration/slave-scripts If2539c
  • 01:05 Krinkle: Undid local patch to "grunt-lib-phantomjs/phantomjs/main.js" (for bug 63579) in "/srv/deployment/integration/slave-scripts" on gallium
  • 00:20 logmsgbot: yurik synchronized wmf-config/InitialiseSettings.php
  • 00:08 awight: updated crm from e726e42 to e3f2859
  • 00:06 K4-713: updated payments to 70dce8f4bc7

April 9

  • 23:36 logmsgbot: ebernhardson synchronized php-1.23wmf21/extensions/Math/modules/VisualEditor/ve.ui.MWMathInspectorTool.js 'Update Math VE tool to use a command in 1.23wmf21'
  • 23:32 logmsgbot: ebernhardson synchronized wmf-config/CommonSettings.php 'Update Flow cache version'
  • 23:22 logmsgbot: ebernhardson synchronized php-1.23wmf21/extensions/Flow 'Backport fix DB-to-cache pipeline for 1.23wmf21'
  • 23:05 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings-labs.php 'Enable math VE plugin on labs'
  • 23:04 Krinkle: Jenkins and Zuul are back up. Queues have not been preserved.
  • 23:01 ^d: gerrit: reloaded bugzilla plugin to force it to log back in
  • 23:00 Krinkle: Restarting Jenkins because I have no clue what is going on and have no time to investigate yet another random clogging of all jobs. Restart ought to fix it.
  • 22:54 Krinkle: Zuul has lots of queued jobs for npm slaves, but neither Jenkins nor integration-slave1001.eqiad.wmflabs and 1002 themselves have anything queued. They're idle, responsive and waiting for jobs.
  • 22:47 Krinkle: Jenkins slaves in labs seem to be down. Zuul is stacking up jobs for hasNpm nodes (integration slaves in labs). Both slaves have 7/7 executors idle.
  • 22:33 hoo: Logged out all Bugzilla users by deleting all session cookie data from mysql
  • 19:15 logmsgbot: csteipp synchronized php-1.23wmf21/extensions/CentralAuth/maintenance
  • 19:10 logmsgbot: yurik synchronized wmf-config/InitialiseSettings.php
  • 17:38 logmsgbot: yurik synchronized wmf-config/InitialiseSettings.php
  • 17:22 manybubbles: regenerating Elasticsearch index from mediawiki for testwiki to soak up geo changes.
  • 16:48 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/124880'
  • 16:41 manybubbles: reindexed testwiki to soak up geo changes
  • 16:37 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/124876'
  • 16:32 logmsgbot: maxsem synchronized php-1.23wmf21/extensions/GeoData
  • 16:28 manybubbles: fiddling with Elasticsearch cluster balancing options trying to get enwiki better balanced
  • 16:17 logmsgbot: aude synchronized php-1.23wmf21/extensions/Wikidata 'Switch Wikidata back to previous version of Wikibase'
  • 15:52 mutante: ms6 - revoke puppet cert, salt key, remove from icinga
  • 15:02 ottomata: stopped puppet on emery to test sqstat on analytics1003
  • 14:48 ottomata: disabling puppet to test sqstat on analytics1003
  • 14:14 RobH: otrs back up, live hacked apache change, now working on permanent puppet change (puppet is disabled on iodine at present)
  • 14:02 RobH: yes, otrs is totally ssl borked, robh is working on it
  • 14:00 mutante: adding filippo to ops/wmf LDAP groups
  • 13:58 RobH: updating otrs cert
  • 09:19 logmsgbot: hashar synchronized wmf-config/InitialiseSettings.php '[] = 'musees.cg70.fr'; 124754 bug 63449'
  • 08:39 hashar: Gerrit Letting JenkinsBot submit changes on apps/android/*
  • 03:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 9 03:33:25 UTC 2014 (duration 33m 24s)
  • 02:43 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-09 02:43:52+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-09 02:19:25+00:00
  • 01:25 Krinkle: Bug 63579 is still happening occasionally. Leaving patch on gallium in place for now.
  • 01:09 ori: Debugging uWSGI init scripts on tungsten; expect some Graphite / Gdash flapping.
  • 00:15 ori: graphite webapp 502 caused by uwsgi's init script not restarting the service correctly
  • 00:07 Krinkle: graphite.wikimedia.org (e.g. https://graphite.wikimedia.org/render/?) is serving 502 Bad Gateway, ori is investigating
  • 00:04 Krinkle: To investigate bug 63579, manually patched "grunt-lib-phantomjs/phantomjs/main.js" in "/srv/deployment/integration/slave-scripts" on gallium

April 8

  • 23:34 logmsgbot: mwalker synchronized php-1.23wmf21/extensions/MultimediaViewer/ 'Updating MultimediaViewer for 124510'
  • 23:08 logmsgbot: csteipp synchronized php-1.23wmf21/extensions/CentralAuth/maintenance 'Push maintenance script for token reset'
  • 21:21 logmsgbot: bd808 Purged l10n cache for 1.23wmf19
  • 21:20 logmsgbot: bd808 Purged l10n cache for 1.23wmf18
  • 21:12 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.23wmf21
  • 19:58 manybubbles: finished upgrading to Elasticsearch 1.1.0. The process went well with no issues other than knocking out search in labs 3 times for 30 seconds apiece and logging lots of nasty warnings to irc. I've started the process to fix search in labs so it won't happen again.
  • 19:56 manybubbles: upgraded all elasticsearch servers except elastic1008. that is coming now.
  • 18:45 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I4b18e4ce8: Change wgServer and wgCanonicalServer for arbcom wikis'
  • 18:45 logmsgbot: ori updated /a/common to I4b18e4ce8: Change wgServer and wgCanonicalServer for arbcom wikis
  • 18:13 logmsgbot: bd808 Finished scap: group0 wikis to 1.23wmf21 (with patch for bug 63659) (duration: 03m 18s)
  • 18:10 logmsgbot: bd808 Started scap: group0 wikis to 1.23wmf21 (with patch for bug 63659)
  • 18:01 logmsgbot: hoo synchronized wmf-config/InitialiseSettings.php 'Touch to clear config. cache'
  • 17:56 hoo: changed the Wikidata wb_changes_dispatch position of all wikiquote wikis to 118158153
  • 17:37 logmsgbot: hoo synchronized php-1.23wmf20/extensions/Wikidata/extensions/Wikibase/lib/resources/wikibase.Site.js 'touch'
  • 17:37 logmsgbot: aude synchronized wmf-config/Wikibase.php 'bump wgCacheEpoch for wikidata after enabling wikiquote site links'
  • 17:35 ottomata: restarted gmetad on nickel to fix ganglia
  • 17:29 logmsgbot: aude synchronized wikidataclient.dblist 'Enable Wikibase on Wikiquote'
  • 17:29 logmsgbot: aude synchronized wmf-config 'config changes to enable Wikibase on Wikiquote'
  • 17:22 logmsgbot: aude synchronized wmf-config/CirrusSearch-labs.php 'config change for beta, to enable highlighting'
  • 17:16 manybubbles: finished upgrading elastic1001-1006. starting on 1007. yay progress.
  • 17:03 logmsgbot: aude synchronized php-1.23wmf20/extensions/Wikidata 'Update Wikidata build, to allow populating sites table on wikiquote'
  • 16:31 aude: added sites and site_identifiers core tables on wikiquote
  • 16:28 hashar: Jenkins: killed jenkins-slave java process on gallium and repooled gallium slave. It was no longer registered in Zuul :-/
  • 14:32 manybubbles: no harm done, just lost time
  • 14:32 manybubbles: woops, just restarted elastic1002. silly me
  • 14:31 manybubbles: upgrading elastic1001
  • 13:54 manybubbles: they'll pick it up during the rolling restart today to upgrade to 1.1.0
  • 13:53 manybubbles: synced first Elasticsearch plugin to production Elasticsearch servers
  • 13:46 RobH: upgraded libssl on holmium
  • 13:39 Jeff_Green: update & reboot tellurium
  • 13:39 RobH: replacing the blog cert, if holmium crashes I didn't do it correctly.
  • 13:37 mutante: restarting gitblit
  • 12:56 logmsgbot: reedy updated /a/common to Id15ddc665: Revert "Group0 wikis to 1.23wmf21"
  • 10:21 Jeff_Green: update & reboot barium
  • 10:15 Jeff_Green: update & reboot samarium
  • 07:47 _joe|away: restarted nginx on cp1044 and cp1043
  • 05:47 apergos: shot many old apache processes running as stats user from 2013, on stat1001 (restarting apache runs it as www-data user)
  • 05:39 apergos: restarted apache on fenari, magnesium, ytterbium, antimony
  • 05:31 _joe_: upgraded openssl on cp10* and cp30* servers as well
  • 04:46 Tim: on dataset1001: upgraded libssl and restarted lighttpd
  • 04:43 Tim: restarted apache on the above list, failed on labs-ns1, virt1000, ytterbium
  • 04:41 Tim: upgraded libssl on zirconium.wikimedia.org,neon.wikimedia.org,netmon1001.wikimedia.org,iodine.wikimedia.org,ytterbium.wikimedia.org,gerrit.wikimedia.org,virt1000.wikimedia.org,labs-ns1.wikimedia.org,stat1001.wikimedia.org
  • 04:38 Ryan_Lane: upgrading libssl on virt0
  • 04:37 Ryan_Lane: upgrading libssl on virt1000
  • 04:15 Tim: also upgraded libssl on cp4001-4019. Restarted nginx on these servers and also the previous list.
  • 04:03 Tim: upgrading libssl on ssl1001,ssl1002,ssl1003,ssl1004,ssl1005,ssl1006,ssl1007,ssl1008,ssl1009,ssl3001.esams.wikimedia.org,ssl3002.esams.wikimedia.org,ssl3003.esams.wikimedia.org
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 8 03:11:04 UTC 2014 (duration 11m 3s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-08 02:34:56+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-08 02:15:58+00:00
  • 01:06 logmsgbot: bd808 Finished scap: revert group0 to 1.23wmf21 (testwiki still on 1.23wmf21) (duration: 09m 54s)
  • 00:56 logmsgbot: bd808 Started scap: revert group0 to 1.23wmf21 (testwiki still on 1.23wmf21)
  • 00:54 logmsgbot: bd808 scap aborted: group0 to 1.23wmf21 (testing python change for mwversionsinuse) (again) (duration: 00m 25s)
  • 00:54 logmsgbot: bd808 Started scap: group0 to 1.23wmf21 (testing python change for mwversionsinuse) (again)
  • 00:53 logmsgbot: bd808 scap aborted: group0 to 1.23wmf21 (testing python change for mwversionsinuse) (duration: 02m 57s)
  • 00:50 logmsgbot: bd808 Started scap: group0 to 1.23wmf21 (testing python change for mwversionsinuse)
  • 00:25 logmsgbot: catrope synchronized php-1.23wmf21/extensions/VisualEditor 'it helps if you run git submodule update first'
  • 00:24 logmsgbot: catrope synchronized php-1.23wmf20/extensions/VisualEditor 'it helps if you run git submodule update first'

April 7

  • 23:58 logmsgbot: catrope synchronized php-1.23wmf20/extensions/VisualEditor/ 'VisualEditor bug fixes'
  • 23:57 logmsgbot: catrope synchronized php-1.23wmf20/skins/vector/variables.less 'Remove troublesome fonts from font stack'
  • 23:50 logmsgbot: catrope synchronized php-1.23wmf21/resources/oojs-ui/ 'Update OOJS-UI for bug fixes'
  • 23:50 logmsgbot: catrope synchronized php-1.23wmf21/extensions/VisualEditor/ 'VisualEditor bug fixes'
  • 23:49 logmsgbot: catrope synchronized php-1.23wmf21/skins/vector/variables.less 'Remove troublesome fonts from font stack'
  • 23:45 logmsgbot: catrope synchronized php-1.23wmf21/extensions/VisualEditor/ 'VisualEditor bug fixes'
  • 23:42 logmsgbot: catrope synchronized php-1.23wmf21/resources/oojs-ui/ 'Update OOJS-UI for bug fixes'
  • 23:42 logmsgbot: catrope synchronized php-1.23wmf21/skins/vector/variables.less 'Remove troublesome fonts from font stack'
  • 23:17 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'SWAT changes: other projects bar on frwikisource, import sources'
  • 23:06 logmsgbot: bd808 Finished scap: test2wiki to 1.23wmf20 (duration: 16m 48s)
  • 22:49 logmsgbot: bd808 Started scap: test2wiki to 1.23wmf20
  • 22:38 logmsgbot: bd808 Finished scap: test2wiki to 1.23wmf21 (duration: 12m 07s)
  • 22:26 logmsgbot: bd808 Started scap: test2wiki to 1.23wmf21
  • 22:12 logmsgbot: bd808 Finished scap: Testing 1.23wmf21 l10n changes (duration: 01m 31s)
  • 22:10 logmsgbot: bd808 Started scap: Testing 1.23wmf21 l10n changes
  • 22:07 logmsgbot: bd808 Finished scap: Testing 1.23wmf21 l10n changes (duration: 03m 49s)
  • 22:04 logmsgbot: bd808 Started scap: Testing 1.23wmf21 l10n changes
  • 21:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 7 21:13:14 UTC 2014 (duration 1m 59s)
  • 20:40 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-07 20:40:02+00:00
  • 20:23 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-07 20:23:14+00:00
  • 19:08 ottomata: temporarily disabling puppet on analytics 1009, 1010, 1019, 1020 to bring up new journalnodes
  • 18:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 7 18:39:21 UTC 2014 (duration 15m 30s)
  • 18:23 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Added "downloadtiff" pool counter config'
  • 18:13 AaronSchulz: shwiki queue finished emptying out in staggered loop on terbium
  • 18:03 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-07 18:03:48+00:00
  • 17:36 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-07 17:36:02+00:00
  • 17:24 bd808: Manually running l10nupdate with new --verbose flag to capture log output
  • 14:34 MaxSem: Rebuilding GeoData index
  • 14:09 hashar: Jenkins cleared swap on gallium (swapoff -a && swapon -a). Makes ganglia graph nicer :D
  • 13:15 apergos: reenabled puppet on dataset2, testing done
  • 12:23 apergos: disabled puppet on dataset2, testing
  • 11:19 logmsgbot: reedy Finished scap: because we're scappy... (rebuilding l10n cache for 1.23wmf21 (duration: 18m 04s)
  • 11:01 logmsgbot: reedy Started scap: because we're scappy... (rebuilding l10n cache for 1.23wmf21
  • 10:45 hashar: integration Getting PHP Composer installed on labs slaves. 124305
  • 09:21 paravoid: reactivating peerings with HE, issues reportedly resolved
  • 09:04 hashar: restarted Zuul
  • 08:54 hashar: gallium killed console-kit-daemon process which was eating a lot of memory
  • 08:42 hashar: Restarting Jenkins, out of Java heap space. Something is leaking memory
  • 08:41 hashar: Jenkins being broken for some reason AGAIN !
  • 05:04 ori: Zuul is stuck: <http://i.imgur.com/o5ghCam.jpg> (617kb image)
  • 02:56 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 7 02:56:12 UTC 2014 (duration 56m 11s)
  • 02:20 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-07 02:20:10+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-07 02:13:42+00:00

April 6

  • 02:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 6 02:53:13 UTC 2014 (duration 53m 12s)
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-06 02:18:43+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-06 02:12:57+00:00
  • 01:32 jamesofu_: sugar down for move to labs

April 5

  • 04:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 5 04:15:00 UTC 2014 (duration 50m 39s)
  • 03:56 logmsgbot: andrew rebuilt wikiversions.cdb and synchronized wikiversions files: Revert mw.org, test2wiki and testwikidatawiki to 1.23wmf20 due to localisation issue
  • 03:51 Andrew: Reverting mw.org, test2 and test.wikidata back to 1.23wmf20
  • 03:41 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-05 03:41:36+00:00
  • 03:36 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-05 03:36:04+00:00
  • 03:23 Andrew: Actually, going to rerun l10nupdate first just to check.
  • 03:22 Andrew: Going to revert deployment of 1.23wmf21 again - still broken
  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 5 03:08:33 UTC 2014 (duration 8m 32s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-05 02:34:54+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-05 02:14:07+00:00

April 4

  • 21:34 logmsgbot: bd808 Finished scap: Group0 to 1.23wmf21 (again) (duration: 14m 35s)
  • 21:19 logmsgbot: bd808 Started scap: Group0 to 1.23wmf21 (again)
  • 19:28 hashar: Jenkins: unpooled slave agent on lanthanum, killed the java agent on it and repooled it.
  • 19:22 hashar: Jenkins is processing jobs again. Queue unchanged so it will resume everything
  • 19:16 hashar: restarting Jenkins
  • 19:07 hashar: Jenkins unpooling gallium slave
  • 19:05 hashar: Zuul / Jenkins stalled again.
  • 18:43 csteipp: redeployed updated patch for bug 63251 to fix a reported bug
  • 16:10 _joe_: restarting gitblit, for the last time today
  • 15:06 _joe_: restarting gitblit as it has eaten up all of its ram again and is thrashing cpu
  • 12:32 mutante: hume - shutting down
  • 12:06 mutante: hume - disable puppet/salt/monitoring
  • 11:13 mutante: restarting gitblit with new option to use incremental GC in an attempt to fix timeouts caused by GC eating CPU
  • 08:07 paravoid: deactivating cr1-eqiad<->HE peerings, transatlantic par2<->ash1 is congested
  • 07:25 mutante: restarting gitblit
  • 05:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 4 05:45:07 UTC 2014 (duration 18m 25s)
  • 04:56 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-04 04:56:06+00:00
  • 04:45 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-04 04:45:01+00:00
  • 04:20 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: unbreak test2.wp and test.wikidata as well
  • 04:17 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: mw.org back to 1.23wmf20
  • 03:43 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 4 03:43:03 UTC 2014 (duration 43m 2s)
  • 03:28 ori: Interface messages are missing on group0 / 1.23wmf21 wikis (mediawikiwiki, testwiki, test2wiki, and testwikidata)
  • 02:50 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-04 02:50:26+00:00
  • 02:24 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-04 02:24:51+00:00
  • 01:08 logmsgbot: krinkle synchronized php-1.23wmf21/resources 'I6e93d9ab0e4a926c09c'

April 3

  • 22:00 logmsgbot: demon synchronized wmf-config/CirrusSearch-production.php 'lowering cache time, for testing'
  • 21:55 logmsgbot: demon updated /a/common/php-1.23wmf20 to Ic853ebff4: Cherry-pick I550eb4b0a8fa18344e8b0de3ec85d61c2122ffb8
  • 21:54 logmsgbot: demon synchronized php-1.23wmf20/extensions/CirrusSearch 'Cirrus back to master again'
  • 21:50 logmsgbot: ori synchronized multiversion/updateBitsBranchPointers 'updateBitsBranchPointers: get rid of 'static-stable' branch link'
  • 21:50 logmsgbot: ori updated /a/common to Ic1602c045: updateBitsBranchPointers: get rid of 'static-stable' branch link
  • 21:46 logmsgbot: demon synchronized php-1.23wmf20/extensions/CirrusSearch 'Rolling back to 1.23wmf20 branch point from master'
  • 21:38 logmsgbot: demon synchronized php-1.23wmf20/extensions/CirrusSearch 'Updating Cirrus to master'
  • 21:33 logmsgbot: demon synchronized wmf-config/CirrusSearch-production.php 'italian wikis getting interwiki search. they're my favorite beta testers'
  • 19:23 logmsgbot: reedy synchronized docroot and w
  • 19:21 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf21
  • 19:17 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias actually to 1.23wmf20
  • 19:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf20
  • 19:09 logmsgbot: reedy Finished scap: testwiki to 1.23wmf21 and build l10n cache (duration: 38m 23s)
  • 18:30 logmsgbot: reedy Started scap: testwiki to 1.23wmf21 and build l10n cache
  • 18:23 logmsgbot: reedy updated /a/common to I835c2b1d5: Depool. See RT 7191.
  • 11:10 paravoid: IPv4 eqiad<->esams private link also elevated by ~15ms but no packet loss observed
  • 11:09 paravoid: affects both IPv6 transit at esams (slowdowns) as well as IPv6 eqiad<->esams
  • 11:08 paravoid: deactivating cr1-esams<->HE peering, latency > 160ms, over at 200ms (congestion?); back to 84ms now;
  • 10:51 akosiaris: temporarily stopped squid on brewster
  • 10:26 hashar: Jenkins job mediawiki-core-phpunit-hhvm is back around thanks to 123573
  • 06:28 paravoid: powercycling ms-be1003, unresponsive, no console output
  • 04:43 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'return upgraded DB slaves to normal load'
  • 04:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 repool db1015, warm up'
  • 04:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 depool db1015 for upgrade'
  • 04:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 repool db1037, warm up'
  • 03:53 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 depool db1037 for upgrade'
  • 03:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 3 03:53:18 UTC 2014 (duration 53m 16s)
  • 03:34 springle: db1020 raid controller dimm ecc errors
  • 03:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 depool db1020 for upgrade'
  • 03:12 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 repool db1019, warm up'
  • 02:57 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 depool db1019 for upgrade'
  • 02:56 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1060, warm up'
  • 02:48 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-03 02:48:01+00:00
  • 02:47 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1060 for upgrade'
  • 02:45 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1061, warm up'
  • 02:35 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1061 for upgrade'
  • 02:24 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-04-03 02:24:07+00:00

April 2

  • 23:47 logmsgbot: aaron synchronized wmf-config/CommonSettings.php 'Bumped wgJobBackoffThrottling for htmlCacheUpdate to 15'
  • 23:47 mwalker: ... deploy was for mobile frontend 123454
  • 23:46 logmsgbot: mwalker synchronized php-1.23wmf20/extensions/MobileFrontend 'SWAT deploy for MaxSem'
  • 20:23 subbu: deployed Parsoid 33471172 with deploy repo sha 5c620e54
  • 19:03 logmsgbot: ori synchronized php-1.23wmf20/extensions/WikimediaEvents 'Update WikimediaEvents for I7fdaa5524: Use simple random sampling to log deprecated usage at 1:100'
  • 19:03 logmsgbot: ori synchronized php-1.23wmf19/extensions/WikimediaEvents 'Update WikimediaEvents for I7fdaa5524: Use simple random sampling to log deprecated usage at 1:100'
  • 17:00 andrewbogott: fixed updating crons on wikitech-status, I think. Time will tell...
  • 16:19 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'Lower timeout on prefix searches and make the cirrus.dblist sync I just did take effect.'
  • 16:19 logmsgbot: manybubbles synchronized cirrus.dblist 'Cirrus as primary for most of group1'
  • 16:14 akosiaris: banned tools-exec-03.eqiad.wmflabs using manual iptables on ytterbium
  • 15:20 ottomata: stopping puppet on stat1
  • 14:27 hashar: Jenkins applying label contintLabsSlave on slaves in labs used for ci (integration-slave1001 and 1002)
  • 14:15 hashar: Jenkins deleting pmtpa slaves (they all have been shutdown and jobs got deleted)
  • 14:00 manybubbles: tried restarting some lsearchd services (carefully) to clear out some crashing when searching for a particular query term. It caused pool queue full errors.... serves me right for trying?
  • 11:20 mutante: running CheckUser/maintenance/purgeOldData.php on all wikis
  • 09:42 akosiaris: rsynced brewster /srv to carbon
  • 09:34 mutante: restarting gitblit on antimony
  • 09:14 mutante: DNS update - removing capella
  • 09:09 mutante: DNS update - removing ms10
  • 05:31 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'normal loads for all upgraded slaves'
  • 04:53 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1062, warm up'
  • 04:45 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1062 for upgrade'
  • 04:42 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 repool db1039, warm up'
  • 04:27 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 depool db1039 for upgrade'
  • 03:56 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 repool db1006, warm up'
  • 03:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 2 03:48:31 UTC 2014 (duration 48m 30s)
  • 03:46 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 depool db1006 for upgrade'
  • 03:43 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 repool db1045, warm up'
  • 03:27 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 depool db1045 for upgrade'
  • 03:21 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 repool db1059, warm up'
  • 03:07 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 depool db1059 for upgrade'
  • 03:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1063, warm up'
  • 02:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1063 for upgrade'
  • 02:52 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-02 02:52:48+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-04-02 02:29:18+00:00
  • 02:22 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 repool db1027, warm up'
  • 02:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 depool db1027 for upgrade'
  • 01:16 logmsgbot: ori synchronized php-1.23wmf20/extensions/WikimediaEvents 'Undeployed change from earlier SWAT deploy'
  • 01:16 logmsgbot: ori synchronized php-1.23wmf19/extensions/WikimediaEvents 'Undeployed change from earlier SWAT deploy'
  • 01:05 logmsgbot: ori synchronized php-1.23wmf19/extensions/WikimediaEvents
  • 01:04 logmsgbot: ori synchronized php-1.23wmf19/extensions/EventLogging
  • 01:02 logmsgbot: ori synchronized php-1.23wmf20/extensions/WikimediaEvents
  • 01:02 logmsgbot: ori synchronized php-1.23wmf20/extensions/EventLogging

April 1

  • 23:48 logmsgbot: ebernhardson synchronized php-1.23wmf19/extensions/WikimediaEvents/ 'Update WikimediaEvents to master'
  • 23:48 logmsgbot: ebernhardson synchronized php-1.23wmf19/extensions/EventLogging/ 'Update EventLogging to master'
  • 23:47 logmsgbot: ebernhardson synchronized php-1.23wmf20/extensions/EventLogging/ 'Update EventLogging to master'
  • 23:46 logmsgbot: ebernhardson synchronized php-1.23wmf20/extensions/WikimediaEvents/ 'Update WikimediaEvents to master'
  • 23:32 logmsgbot: ebernhardson synchronized docroot and w
  • 21:42 hashar: Ganglia in labs is more or less back in activity: http://ganglia.wmflabs.org/ No clue what it is graphing though
  • 21:27 hashar: jenkins killed stuck build (5 hours+) of beta-update-databases-eqiad . Might have been blocking Jenkins build queue
  • 19:09 Reedy: ori gracefulled mw1018, mw1050, mw1061, mw1070, mw1139, mw1179
  • 18:45 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: zerowiki to 1.23wmf20
  • 18:43 logmsgbot: reedy updated /a/common to If887effe5: Add zerowiki
  • 18:43 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 18:42 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Add zerowiki
  • 18:41 logmsgbot: reedy synchronized database lists files:
  • 18:37 logmsgbot: reedy synchronized docroot and w
  • 18:36 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: non wikipedias to 1.23wmf20
  • 18:28 mutante: ms10 - shut down
  • 18:19 mutante: ms10 - disable puppet, revoke puppet cert, salt key, icinga..
  • 18:03 logmsgbot: ori synchronized php-1.23wmf20/includes/profiler/ProfilerSimple.php 'Iad91f1d12: Send profiled items under the correct name'
  • 18:02 logmsgbot: ori synchronized php-1.23wmf19/includes/profiler/ProfilerSimple.php 'Iad91f1d12: Send profiled items under the correct name'
  • 17:34 mutante: logging to eqiad wikitech after Andrew switched over
  • 16:05 andrewbogott: switching wikitech to read-only, migrating to eqiad
  • 15:06 logmsgbot: reedy updated /a/common to If3ca3d486: beta: adjust $wgCaptchaDirectory
  • 15:01 hashar: Gerrit super slow again :-(
  • 14:46 mutante: added oblivion to root-auth-keys
  • 14:17 mutante: welcome new shell user oblivion
  • 14:04 hashar: Gerrit flushed a few caches related to user accounts / LDAP
  • 13:43 mutante: adding oblivion to ops and wmf LDAP groups
  • 08:44 mutante: solr1/2 - revoke puppet certs
  • 08:43 mutante: solr3 - delete salt key, puppet cert
  • 03:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 db1034 full steam'
  • 02:02 logmsgbot: LocalisationUpdate failed: git pull of extensions failed
  • 01:58 springle: killed research queries on db1047. email me
  • 01:35 springle: restarted sanitarium s3 instance for additional private wikis
  • 00:59 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 repool db1034, warm up'

Archives