Server Admin Log

From Wikitech

April 16

  • 13:50 manybubbles: restarting elastic1009 to suck up new config
  • 13:50 manybubbles: raised the number of replicas for labswiki's search directly in elasticsearch because I can't easily do it through cirrus due to access restrictions
  • 13:45 ottomata: reinstalling elastic1011
  • 13:22 mutante: DNS update - remove virt5-15
  • 12:11 mutante: virt5-11 - shut down
  • 11:40 akosiaris: upgraded python-voluptuous on apt.wikimedia.org to 0.8.2-1wmf1
  • 11:39 hashar: Upgraded Zuul to wmf-deploy-20140416-3 (bring in a84f0e4 - "Make queue processing more efficient" which was much needed)
  • 11:29 hashar: upgraded Zuul to wmf-deploy-20140416-2
  • 11:15 mutante: virt5-11 removing from icinga
  • 11:03 mutante: virt5-11 revoked puppet certs and salt keys
  • 10:56 mutante: stopping puppet on virt5-11
  • 10:47 hashar: Upgraded Zuul on gallium to wmf-deploy-20140416 (depends on python-voluptuous 0.7+; Alexandros packaged 0.8.2, which I manually installed to validate).
  • 09:26 mutante: disabling mw1163 in pybal
  • 07:03 mutante: zirconium - upgrading apache2, php5 packages
  • 06:07 springle: stop mysqld on db38 (x1) for decom
  • 03:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 16 03:46:23 UTC 2014 (duration 46m 22s)
  • 02:55 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-16 02:55:28+00:00
  • 02:28 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-16 02:28:01+00:00
  • 00:43 K4-713: updated listener credentials on thulium
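
The 13:50 entry above raises a replica count directly in Elasticsearch. A minimal sketch of what such a change can look like through the index settings endpoint (`_settings` with `index.number_of_replicas`) follows. The endpoint URL, the index name, and the replica count are all assumptions; the log does not record them.

```shell
#!/bin/sh
# Assumption: Elasticsearch is reachable here; override ES_URL as needed.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the JSON body for a replica-count change.
# $1 = desired number of replicas
replica_settings_body() {
    printf '{"index":{"number_of_replicas":%d}}\n' "$1"
}

# Send the change to one index.
# $1 = index name (hypothetical, not given in the log), $2 = replica count
set_replicas() {
    curl -s -XPUT "${ES_URL}/$1/_settings" \
         -H 'Content-Type: application/json' \
         -d "$(replica_settings_body "$2")"
}

# Only print the request body here; call set_replicas to actually send it.
replica_settings_body 2
```

Printing the body first makes it easy to review the change before anything touches the cluster.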

April 15

  • 23:25 logmsgbot: mwalker synchronized wmf-config/abusefilter.php '126168 more abuse filter configuration fun'
  • 23:21 logmsgbot: mwalker Finished scap: Configuration change 126163 and MultimediaViewer 126158 (duration: 02m 15s)
  • 23:19 logmsgbot: mwalker Started scap: Configuration change 126163 and MultimediaViewer 126158
  • 23:08 logmsgbot: mwalker Finished scap: Configuration changes, 113656, 121834, 126065 (duration: 03m 11s)
  • 23:05 logmsgbot: mwalker Started scap: Configuration changes, 113656, 121834, 126065
  • 23:01 hashar: restarting Zuul to clear leaked file descriptor (known issue, fixed upstream)
  • 22:12 awight: crm updated from e3f2859 to 7dafce5
  • 21:51 manybubbles: restarting elastic1009 again
  • 21:39 hashar: jenkins /var/lib/git cleaned up on gallium
  • 21:16 manybubbles: restarting elastic1009 to test performance changes. cluster will go yellow for a few minutes. might go red (wikitech is busted)
  • 21:15 hashar: Jenkins is processing jobs again
  • 21:14 hashar: cleared /tmp/ on integration-slave1002 (filled up by hhvm job, known issue, bug filed already)
  • 21:12 hashar: Zuul locked again :/ Unpooling and repooling Jenkins slaves.
  • 19:50 RoanKattouw: Restarting stuck Jenkins
  • 19:31 manybubbles: setting refresh interval on elasticsearch indexes to 30s to test effect on load
  • 19:24 logmsgbot: reedy synchronized wmf-config/
  • 19:20 logmsgbot: reedy synchronized php-1.23wmf22/includes/PrefixSearch.php 'I82b5ca65864099c180d915055c43e6839bd4f4a2'
  • 19:07 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikisources back to 1.23wmf22
  • 19:07 ottomata: reinstalling elastic1010
  • 19:07 logmsgbot: reedy synchronized php-1.23wmf22/extensions/ProofreadPage
  • 18:41 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikisources back to 1.23wmf21 due to ProofreadPage fatal
  • 18:36 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.23wmf22
  • 17:09 paravoid: stopped pybal on lvs1005
  • 17:06 cmjohnson1: fixing lvs1005 eth1 cable
  • 16:56 cmjohnson1: mw1057 replacing ethernet cable
  • 16:50 manybubbles: raised "new generation" size on elastic1009 to test a performance theory
  • 16:50 cmjohnson1: mw1093 replacing ethernet cable
  • 16:40 cmjohnson1: replacing eth cable on mw1193
  • 16:31 hashar: ... all Jenkins jobs are using /srv/ssd/gerrit instead
  • 16:30 hashar: gallium had two Gerrit replication streams; one of them was removed (122419), which deleted the target directories under /var/lib/git
  • 16:22 cmjohnson1: shutting down mw1163 to replace DIMM
  • 16:18 cmjohnson1: swapping bad disk slot 4 on dataset1001
  • 16:13 paravoid: moving ms-fe3xxx/ms-be3xxx to private1-esams
  • 16:05 ottomata: reinstalling elastic1009
  • 15:21 logmsgbot: anomie synchronized php-1.23wmf21/extensions/Flow 'SWAT: Flow: Prevent logspam on enwiki 125930'
  • 15:13 logmsgbot: anomie synchronized php-1.23wmf21/extensions/Flow 'SWAT: Flow: Prevent logspam on enwiki 125930'
  • 15:02 mutante: DNS update - removing Tampa service IPs
  • 13:51 hashar: Jenkins compressing console logs of builds. On gallium as user jenkins : find /var/lib/jenkins/jobs -wholename '*/builds/*/log' -type f -exec gzip --best {} \;
  • 13:42 hashar: Command executed (as gerritslave user): find /srv/ssd/gerrit -type d -name '*.git' -exec bash -c 'echo; date; cd {}; echo; pwd; echo; git repack -ad; date;' \;
  • 13:41 hashar: Repacking Gerrit replicated repositories on lanthanum and gallium (both under /srv/ssd/gerrit/ )
  • 13:13 andrewbogott: shutdown and decommissioned virt12
  • 12:19 paravoid: adding ms-be101[345] to Swift eqiad's rings, at 33% weight; old rings kept at ms-fe1001:~/swift-2014-04-14
  • 11:30 mutante: DNS update - removed dbdump.pmtpa.wmnet
  • 11:26 mutante: DNS update - remove db64,db65,db66,db67,db70
  • 10:55 mutante: db64,db67 - powerdown via mgmt
  • 10:51 mutante: db65,db66 - shutdown
  • 10:07 mutante: db70 - powerdown via mgmt
  • 09:47 mutante: db64-67 - puppetstoredconfigclean.rb db${db}.pmtpa.wmnet ; puppetca --clean db${db}.pmtpa.wmnet ; salt-key -d db${db}.pmtpa.wmnet
  • 07:02 springle: shutdown db67 for decom. analytics data is backed up on dbstore1002
  • 06:47 springle: moving pmtpa m1 and x1 slaves to db73 and db69 on 12th floor
  • 03:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 15 03:25:52 UTC 2014 (duration 25m 51s)
  • 02:42 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-15 02:42:48+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-15 02:22:54+00:00
  • 00:09 gwicke: enabled wikinews family in Parsoid with temporary live patch to un-break VE deploy
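
The 09:47 entry above gives the per-host cleanup for db64-67 as a one-line template using `${db}`. A minimal dry-run sketch of that loop follows. It only prints the three commands from the log for each host and executes nothing; the function name `decom_commands` is an illustrative choice, not something from the log.

```shell
#!/bin/sh
# Dry run: print the cleanup commands from the 09:47 entry for db64..db67.
# Nothing is executed; review the output before running any of it by hand.
decom_commands() {
    for db in 64 65 66 67; do
        host="db${db}.pmtpa.wmnet"
        echo "puppetstoredconfigclean.rb ${host}"   # drop stored puppet config
        echo "puppetca --clean ${host}"             # revoke the puppet cert
        echo "salt-key -d ${host}"                  # delete the salt key
    done
}

decom_commands
```

Keeping the loop as a printer rather than an executor leaves the destructive step under manual control, which suits a one-off decommission.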

April 14

  • 23:43 logmsgbot: ori Finished scap: (no message) (duration: 04m 31s)
  • 23:39 ori: scap: php-1.23wmf22/extensions/VisualEditor 2b0979f...0652ad2 (I12e5c9751)
  • 23:38 logmsgbot: ori Started scap: (no message)
  • 23:17 logmsgbot: ori synchronized php-1.23wmf22/skins/vector/variables.less 'Ibcdaff017: Revert body font stack to be just sans-serif'
  • 23:15 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I22f25730d: Enable VisualEditor for opt-in on Meta (2/2)'
  • 23:15 logmsgbot: ori synchronized visualeditor.dblist 'I22f25730d: Enable VisualEditor for opt-in on Meta (1/2)'
  • 23:14 logmsgbot: ori updated /a/common to I22f25730d: Enable VisualEditor for opt-in on Meta
  • 23:12 logmsgbot: ori synchronized visualeditor.dblist 'I59f5a6e0b: Enable VisualEditor on French Wikinews'
  • 23:12 logmsgbot: ori updated /a/common to I59f5a6e0b: Enable VisualEditor on French Wikinews
  • 22:56 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Limit large (djvu) file downloads for thumbnails'
  • 20:37 mwalker: updating payments wiki for worldpay currencies (from af35b7b to 8a93c17)
  • 20:13 subbu: deployed Parsoid fba548cbf (deploy repo sha d0e12ddf)
  • 17:47 paravoid: fixing /e/n/interfaces for static configuration: gadolinium hafnium labsdb1001 labsdb1002 labsdb1003 labstore1001 searchidx1001 ssl1005 ssl1006 ssl1009 virt1001 ytterbium
  • 17:37 paravoid: fixing /e/n/interfaces for static configuration for cp40xx, lvs40xx
  • 17:14 mutante: brewster - power down, could not revive due to disk or SATA controller fail
  • 16:57 ottomata1: shutting down elastic1006 for reinstall
  • 16:45 mutante: powering brewster back on
  • 16:40 paravoid: powering up brewster
  • 16:13 mutante: deleted old svn apache config on formey, started apache
  • 15:22 paravoid: restarting virt0's salt-master, glance-api, glance-registry, keystone, nova-scheduler
  • 15:11 logmsgbot: manybubbles synchronized wmf-config/CirrusSearch-common.php 'SWAT Cirrus update to improve performance'
  • 15:09 logmsgbot: manybubbles synchronized php-1.23wmf21/extensions/CirrusSearch/ 'SWAT deploy to improve performance'
  • 14:48 paravoid: upgrading all snapshot* hosts
  • 14:38 paravoid: upgrading all packages & staggered restart of all of swift (ms-fe/ms-be)
  • 13:22 logmsgbot: reedy synchronized php-1.23wmf22/includes/api/ApiFeedRecentChanges.php 'I268d0a53067738ba96bee74c593358b0b28cc083'
  • 13:22 logmsgbot: reedy synchronized php-1.23wmf21/includes/api/ApiFeedRecentChanges.php 'I268d0a53067738ba96bee74c593358b0b28cc083'
  • 13:15 paravoid: staggered upgrades for all pending updates on all mw* boxes & restarting apaches/other core services
  • 11:08 mutante: brewster - shut down
  • 10:49 logmsgbot: reedy synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 10:05 apergos: had to toss extensions/Elastica on virt1000 and run git submodule update --init --recursive; seems to be working now
  • 09:26 mutante: deleting huge pybal log on lvs3001
  • 09:01 mutante: brewster - stop lighttpd,bacula-fd,haproxy,dhcp3-server,rsync,nrpe,salt
  • 07:54 mutante: brewster - disabling puppet agent, removed from site.pp, revoke puppet cert
  • 03:23 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 14 03:22:58 UTC 2014 (duration 22m 57s)
  • 02:42 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-14 02:42:05+00:00
  • 02:23 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-14 02:22:58+00:00

April 13

  • 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 13 03:19:45 UTC 2014 (duration 19m 44s)
  • 02:39 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-13 02:39:11+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-13 02:20:49+00:00

April 12

  • 05:03 logmsgbot: ori updated /a/common to I5f900190c: Replace $channel with $variant; make it Beta-only
  • 03:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 12 03:21:43 UTC 2014 (duration 21m 42s)
  • 02:42 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-12 02:42:07+00:00
  • 02:23 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-12 02:23:02+00:00

April 11

  • 23:55 RoanKattouw: Restarting stuck Jenkins
  • 23:35 K4-713: synchronized payments to af35b7b
  • 23:25 K4-713: synchronized payments to b321163
  • 19:50 ottomata: upgraded wikitech to MediaWiki 1.23wmf22, applied security patch
  • 18:19 ottomata: rebooting elastic1003
  • 18:14 Krinkle: git-deploy: Deploying integration/slave-scripts I38b90e8c08d7cb
  • 18:08 Krinkle: git-deploy: Deploying integration/slave-scripts I04d8e308daedb3ccb8
  • 17:41 Krinkle: git-deploy: Deploying integration/slave-scripts 'Ia9ee438fa2675170'
  • 14:27 ottomata: reinstalling elastic1005
  • 04:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 11 04:33:40 UTC 2014 (duration 33m 39s)
  • 03:47 logmsgbot: LocalisationUpdate completed (1.23wmf22) at 2014-04-11 03:47:01+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-11 02:41:33+00:00
  • 02:29 ori_: graphite: carbon instance 'f' saturates a cpu core. it's the instance that mediawiki profiling data gets hashed to. collector should probably emit to statsd and have statsd compute per-minute rollups
  • 00:06 marktraceur: leaving MultimediaViewer slightly broken on enwiki based on the fact that logged-in users seem mostly unaffected and other wikis aren't seeing issues, will investigate more tomorrow and fix on Monday

April 10

  • 23:54 bd808: Enabled beta update Jenkins job (https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/)
  • 23:37 logmsgbot: mwalker Finished scap: Attempting to regenerate i18n keys for multimediaviewer (duration: 03m 33s)
  • 23:34 logmsgbot: mwalker Started scap: Attempting to regenerate i18n keys for multimediaviewer
  • 23:16 logmsgbot: mwalker synchronized wmf-config/filebackend.php
  • 23:09 logmsgbot: mwalker synchronized wmf-config/InitialiseSettings.php 'touched to see if that pushes changes to FileBackend.php'
  • 23:03 mwalker: sync-common for 125340 and 125335
  • 22:53 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/modules/ve-mw/ui/tools/ 'touch *.js'
  • 22:35 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/modules/ve-mw/ui/tools/ve.ui.MWReferenceDialogTool.js 'touch'
  • 22:12 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/lib/ve/modules/ve/ui/ve.ui.Toolbar.js 'touch'
  • 22:10 logmsgbot: krinkle synchronized php-1.23wmf21/extensions/VisualEditor/lib/ve/lib/oojs-ui/oojs-ui.js 'touch'
  • 22:10 logmsgbot: krinkle synchronized php-1.23wmf21/resources/startup.js 'touch'
  • 22:10 logmsgbot: krinkle synchronized php-1.23wmf21/resources/oojs-ui/oojs-ui.js 'touch'
  • 21:50 Krinkle: VisualEditor throws uncaught error on load for 1.23wmf21 wikis (bug 63791)
  • 21:15 bd808: Disabled beta update Jenkins job (https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/) so that scap testing can happen in beta.
  • 19:29 ottomata: reinstalling elastic1004
  • 19:19 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'I5501078cee871fb9df03e085547b7a047ef5bd7e'
  • 19:16 logmsgbot: ori synchronized wmf-config/InitialiseSettings-labs.php 'Ia79b1b848: Work around bug 63780 by specifying a siteParamsCallback'
  • 19:15 logmsgbot: ori updated /a/common to Ia79b1b848: Work around bug 63780 by specifying a siteParamsCallback
  • 18:44 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 18:43 logmsgbot: reedy synchronized database lists files: Enable MediaViewer on mediawikiwiki
  • 18:42 logmsgbot: reedy synchronized docroot and w
  • 18:41 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf22
  • 18:37 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf21
  • 18:33 logmsgbot: reedy synchronized php-1.23wmf21/extensions/MultimediaViewer
  • 16:55 ottomata: shutting down elastic1003 for reinstall and reformat
  • 16:42 logmsgbot: reedy updated /a/common to Ie72029103: Add/update symlinks
  • 16:40 logmsgbot: reedy Finished scap: testwiki to 1.23wmf22 and build l10n cache (duration: 24m 45s)
  • 16:15 logmsgbot: reedy Started scap: testwiki to 1.23wmf22 and build l10n cache
  • 16:14 logmsgbot: reedy updated /a/common to I2cccebdd7: wikidatawiki back to 1.23wmf21
  • 15:03 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki back to 1.23wmf21...
  • 15:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 10 15:00:01 UTC 2014 (duration 27m 49s)
  • 14:17 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-10 14:17:16+00:00
  • 13:49 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-10 13:49:24+00:00
  • 13:31 logmsgbot: reedy Finished scap: l10n cache update for wikidatawiki (duration: 19m 15s)
  • 13:12 logmsgbot: reedy Started scap: l10n cache update for wikidatawiki
  • 12:42 bblack: removed broken pdns_gmetric cronjob on lvs boxes
  • 09:44 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I107179a27: Move HHVM extension blacklist below extract($globals) so it isn't simply clobbered'
  • 09:42 logmsgbot: ori updated /a/common to I107179a27: Move HHVM extension blacklist below extract($globals) so it isn't simply clobbered
  • 09:39 hashar: Zuul processed its backlog. Had to disconnect/reconnect the labs slaves. There is some weird bug occurring :-(
  • 09:29 hashar: Jenkins: disabling Gearman client in https://integration.wikimedia.org/ci/configure and reenabling it
  • 09:20 hashar: Jenkins unpooling both slave labs using the web interface and killing the Jenkins client running as jenkins-deploy. Will repool so the job can be reregistered properly (bug 63760)
  • 09:11 mutante: DNS update - removing ms6
  • 09:04 hashar: Jenkins bunch of jobs are not being triggered properly. Taking traces.
  • 08:55 mutante: ms6 - shutdown -h now
  • 08:42 mutante: forcing Bugzilla logout for all users
  • 08:19 logmsgbot: aude synchronized php-1.23wmf20/extensions/Wikidata
  • 08:09 logmsgbot: aude synchronized php-1.23wmf20/extensions/Wikidata
  • 07:57 logmsgbot: aude rebuilt wikiversions.cdb and synchronized wikiversions files: Rebuild wikiversions and put wikidata on 1.23wmf20
  • 07:53 logmsgbot: aude synchronized wikiversions.json 'Put Wikidata back on 1.23wmf20, due to localisation cache issues'
  • 07:21 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 10 07:21:16 UTC 2014 (duration 7m 21s)
  • 06:46 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-10 06:45:59+00:00
  • 06:27 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-10 06:27:30+00:00
  • 06:27 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I20bbe05cc: Avoid using bits on beta-hhvm.wmflabs.org'
  • 06:26 logmsgbot: ori updated /a/common to I20bbe05cc: Avoid using bits on beta-hhvm.wmflabs.org
  • 06:15 ori: Some interface messages are missing on wikidata.org. Started a manual l10nupdate.
  • 04:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 10 04:11:47 UTC 2014 (duration 11m 46s)
  • 03:19 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-10 03:19:18+00:00
  • 03:01 logmsgbot: ori synchronized multiversion/MWMultiVersion.php 'Ibdbac982b: Update multiversion regexp for *.beta-hhvm.wmflabs.org'
  • 03:01 logmsgbot: ori updated /a/common to Ibdbac982b: Update multiversion regexp for *.beta-hhvm.wmflabs.org
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-10 02:22:11+00:00
  • 01:49 logmsgbot: ori synchronized wmf-config/InitialiseSettings-labs.php 'I697f7e4a6: Use $channel to branch on interpreter'
  • 01:48 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I697f7e4a6: Use $channel to branch on interpreter'
  • 01:48 logmsgbot: ori updated /a/common to I697f7e4a6: Use '$channel' to branch on interpreter
  • 01:08 K4-713: updated payments to e1d00b61a703
  • 01:06 Krinkle: git-deploy: Deploying integration/slave-scripts If2539c
  • 01:05 Krinkle: Undid local patch to "grunt-lib-phantomjs/phantomjs/main.js" (for bug 63579) in "/srv/deployment/integration/slave-scripts" on gallium
  • 00:20 logmsgbot: yurik synchronized wmf-config/InitialiseSettings.php
  • 00:08 awight: updated crm from e726e42 to e3f2859
  • 00:06 K4-713: updated payments to 70dce8f4bc7

April 9

  • 23:36 logmsgbot: ebernhardson synchronized php-1.23wmf21/extensions/Math/modules/VisualEditor/ve.ui.MWMathInspectorTool.js 'Update Math VE tool to use a command in 1.23wmf21'
  • 23:32 logmsgbot: ebernhardson synchronized wmf-config/CommonSettings.php 'Update Flow cache version'
  • 23:22 logmsgbot: ebernhardson synchronized php-1.23wmf21/extensions/Flow 'Backport fix DB-to-cache pipeline for 1.23wmf21'
  • 23:05 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings-labs.php 'Enable math VE plugin on labs'
  • 23:04 Krinkle: Jenkins and Zuul are back up. Queues have not been preserved.
  • 23:01 ^d: gerrit: reloaded bugzilla plugin to force it to log back in
  • 23:00 Krinkle: Restarting Jenkins because I have no clue what is going on and have no time to investigate yet another random clogging of all jobs. Restart ought to fix it.
  • 22:54 Krinkle: Zuul has lots of queued jobs for npm slaves, but neither Jenkins nor integration-slave1001.eqiad.wmflabs and 1002 themselves have anything queued. They're idle, responsive and waiting for jobs.
  • 22:47 Krinkle: Jenkins slaves in labs seem to be down. Zuul is stacking up jobs for hasNpm nodes (integration slaves in labs). Both slaves have 7/7 executors idle.
  • 22:33 hoo: Logged out all Bugzilla users by deleting all session cookie data from mysql
  • 19:15 logmsgbot: csteipp synchronized php-1.23wmf21/extensions/CentralAuth/maintenance
  • 19:10 logmsgbot: yurik synchronized wmf-config/InitialiseSettings.php
  • 17:38 logmsgbot: yurik synchronized wmf-config/InitialiseSettings.php
  • 17:22 manybubbles: regenerating Elasticsearch index from mediawiki for testwiki to soak up geo changes.
  • 16:48 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/124880'
  • 16:41 manybubbles: reindexed testwiki to soak up geo changes
  • 16:37 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/124876'
  • 16:32 logmsgbot: maxsem synchronized php-1.23wmf21/extensions/GeoData
  • 16:28 manybubbles: fiddling with Elasticsearch cluster balancing options trying to get enwiki better balanced
  • 16:17 logmsgbot: aude synchronized php-1.23wmf21/extensions/Wikidata 'Switch Wikidata back to previous version of Wikibase'
  • 15:52 mutante: ms6 - revoke puppet cert, salt key, remove from icinga
  • 15:02 ottomata: stopped puppet on emery to test sqstat on analytics1003
  • 14:48 ottomata: disabling puppet to test sqstat on analytics1003
  • 14:14 RobH: otrs back up, live hacked apache change, now working on a permanent puppet change (puppet is disabled on iodine at present)
  • 14:02 RobH: yes, otrs is totally ssl borked, robh is working on it
  • 14:00 mutante: adding filippo to ops/wmf LDAP groups
  • 13:58 RobH: updating otrs cert
  • 09:19 logmsgbot: hashar synchronized wmf-config/InitialiseSettings.php '[] = 'musees.cg70.fr'; 124754 bug 63449'
  • 08:39 hashar: Gerrit Letting JenkinsBot submit changes on apps/android/*
  • 03:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 9 03:33:25 UTC 2014 (duration 33m 24s)
  • 02:43 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-09 02:43:52+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-09 02:19:25+00:00
  • 01:25 Krinkle: Bug 63579 is still happening occasionally. Leaving patch on gallium in place for now.
  • 01:09 ori: Debugging uWSGI init scripts on tungsten; expect some Graphite / Gdash flapping.
  • 00:15 ori: graphite webapp 502 caused by uwsgi's init script not restarting the service correctly
  • 00:07 Krinkle: graphite.wikimedia.org (e.g. https://graphite.wikimedia.org/render/?) is serving 502 Bad Gateway, ori is investigating
  • 00:04 Krinkle: To investigate bug 63579, manually patched "grunt-lib-phantomjs/phantomjs/main.js" in "/srv/deployment/integration/slave-scripts" on gallium

April 8

  • 23:34 logmsgbot: mwalker synchronized php-1.23wmf21/extensions/MultimediaViewer/ 'Updating MultimediaViewer for 124510'
  • 23:08 logmsgbot: csteipp synchronized php-1.23wmf21/extensions/CentralAuth/maintenance 'Push maintenance script for token reset'
  • 21:21 logmsgbot: bd808 Purged l10n cache for 1.23wmf19
  • 21:20 logmsgbot: bd808 Purged l10n cache for 1.23wmf18
  • 21:12 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.23wmf21
  • 19:58 manybubbles: finished upgrading to Elasticsearch 1.1.0. The process went well with no issues other than knocking out search in labs 3 times for 30 seconds apiece and logging lots of nasty warnings to irc. I've started the process to fix search in labs so it won't happen again.
  • 19:56 manybubbles: upgraded all elasticsearch servers except elastic1008. that is coming now.
  • 18:45 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I4b18e4ce8: Change wgServer and wgCanonicalServer for arbcom wikis'
  • 18:45 logmsgbot: ori updated /a/common to I4b18e4ce8: Change wgServer and wgCanonicalServer for arbcom wikis
  • 18:13 logmsgbot: bd808 Finished scap: group0 wikis to 1.23wmf21 (with patch for bug 63659) (duration: 03m 18s)
  • 18:10 logmsgbot: bd808 Started scap: group0 wikis to 1.23wmf21 (with patch for bug 63659)
  • 18:01 logmsgbot: hoo synchronized wmf-config/InitialiseSettings.php 'Touch to clear config. cache'
  • 17:56 hoo: changed the Wikidata wb_changes_dispatch position of all wikiquote wikis to 118158153
  • 17:37 logmsgbot: hoo synchronized php-1.23wmf20/extensions/Wikidata/extensions/Wikibase/lib/resources/wikibase.Site.js 'touch'
  • 17:37 logmsgbot: aude synchronized wmf-config/Wikibase.php 'bump wgCacheEpoch for wikidata after enabling wikiquote site links'
  • 17:35 ottomata: restarted gmetad on nickel to fix ganglia
  • 17:29 logmsgbot: aude synchronized wikidataclient.dblist 'Enable Wikibase on Wikiquote'
  • 17:29 logmsgbot: aude synchronized wmf-config 'config changes to enable Wikibase on Wikiquote'
  • 17:22 logmsgbot: aude synchronized wmf-config/CirrusSearch-labs.php 'config change for beta, to enable highlighting'
  • 17:16 manybubbles: finished upgrading elastic1001-1006. starting on 1007. yay progress.
  • 17:03 logmsgbot: aude synchronized php-1.23wmf20/extensions/Wikidata 'Update Wikidata build, to allow populating sites table on wikiquote'
  • 16:31 aude: added sites and site_identifiers core tables on wikiquote
  • 16:28 hashar: Jenkins: killed jenkins-slave java process on gallium and repooled gallium slave. It was no longer registered in Zuul :-/
  • 14:32 manybubbles: no harm done, just lost time
  • 14:32 manybubbles: woops, just restarted elastic1002. silly me
  • 14:31 manybubbles: upgrading elastic1001
  • 13:54 manybubbles: they'll pick it up during the rolling restart today to upgrade to 1.1.0
  • 13:53 manybubbles: synced first Elasticsearch plugin to production Elasticsearch servers
  • 13:46 RobH: upgraded libssl on holmium
  • 13:39 Jeff_Green: update & reboot tellurium
  • 13:39 RobH: replacing the blog cert, if holmium crashes I didn't do it correctly.
  • 13:37 mutante: restarting gitblit
  • 12:56 logmsgbot: reedy updated /a/common to Id15ddc665: Revert "Group0 wikis to 1.23wmf21"
  • 10:21 Jeff_Green: update & reboot barium
  • 10:15 Jeff_Green: update & reboot samarium
  • 07:47 _joe|away: restarted nginx on cp1044 and cp1043
  • 05:47 apergos: shot many old apache processes running as stats user from 2013, on stat1001 (restarting apache runs it as www-data user)
  • 05:39 apergos: restarted apache on fenari magnesium ytterbium antimony
  • 05:31 _joe_: upgraded openssl on cp10* and cp30* servers as well
  • 04:46 Tim: on dataset1001: upgraded libssl and restarted lighttpd
  • 04:43 Tim: restarted apache on the above list, failed on labs-ns1, virt1000, ytterbium
  • 04:41 Tim: upgraded libssl on zirconium.wikimedia.org,neon.wikimedia.org,netmon1001.wikimedia.org,iodine.wikimedia.org,ytterbium.wikimedia.org,gerrit.wikimedia.org,virt1000.wikimedia.org,labs-ns1.wikimedia.org,stat1001.wikimedia.org
  • 04:38 Ryan_Lane: upgrading libssl on virt0
  • 04:37 Ryan_Lane: upgrading libssl on virt1000
  • 04:15 Tim: also upgraded libssl on cp4001-4019. Restarted nginx on these servers and also the previous list.
  • 04:03 Tim: upgrading libssl on ssl1001,ssl1002,ssl1003,ssl1004,ssl1005,ssl1006,ssl1007,ssl1008,ssl1009,ssl3001.esams.wikimedia.org,ssl3002.esams.wikimedia.org,ssl3003.esams.wikimedia.org
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Apr 8 03:11:04 UTC 2014 (duration 11m 3s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-08 02:34:56+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-08 02:15:58+00:00
  • 01:06 logmsgbot: bd808 Finished scap: revert group0 to 1.23wmf21 (testwiki still on 1.23wmf21) (duration: 09m 54s)
  • 00:56 logmsgbot: bd808 Started scap: revert group0 to 1.23wmf21 (testwiki still on 1.23wmf21)
  • 00:54 logmsgbot: bd808 scap aborted: group0 to 1.23wmf21 (testing python change for mwversionsinuse) (again) (duration: 00m 25s)
  • 00:54 logmsgbot: bd808 Started scap: group0 to 1.23wmf21 (testing python change for mwversionsinuse) (again)
  • 00:53 logmsgbot: bd808 scap aborted: group0 to 1.23wmf21 (testing python change for mwversionsinuse) (duration: 02m 57s)
  • 00:50 logmsgbot: bd808 Started scap: group0 to 1.23wmf21 (testing python change for mwversionsinuse)
  • 00:25 logmsgbot: catrope synchronized php-1.23wmf21/extensions/VisualEditor 'it helps if you run git submodule update first'
  • 00:24 logmsgbot: catrope synchronized php-1.23wmf20/extensions/VisualEditor 'it helps if you run git submodule update first'

April 7

  • 23:58 logmsgbot: catrope synchronized php-1.23wmf20/extensions/VisualEditor/ 'VisualEditor bug fixes'
  • 23:57 logmsgbot: catrope synchronized php-1.23wmf20/skins/vector/variables.less 'Remove troublesome fonts from font stack'
  • 23:50 logmsgbot: catrope synchronized php-1.23wmf21/resources/oojs-ui/ 'Update OOJS-UI for bug fixes'
  • 23:50 logmsgbot: catrope synchronized php-1.23wmf21/extensions/VisualEditor/ 'VisualEditor bug fixes'
  • 23:49 logmsgbot: catrope synchronized php-1.23wmf21/skins/vector/variables.less 'Remove troublesome fonts from font stack'
  • 23:45 logmsgbot: catrope synchronized php-1.23wmf21/extensions/VisualEditor/ 'VisualEditor bug fixes'
  • 23:42 logmsgbot: catrope synchronized php-1.23wmf21/resources/oojs-ui/ 'Update OOJS-UI for bug fixes'
  • 23:42 logmsgbot: catrope synchronized php-1.23wmf21/skins/vector/variables.less 'Remove troublesome fonts from font stack'
  • 23:17 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'SWAT changes: other projects bar on frwikisource, import sources'
  • 23:06 logmsgbot: bd808 Finished scap: test2wiki to 1.23wmf20 (duration: 16m 48s)
  • 22:49 logmsgbot: bd808 Started scap: test2wiki to 1.23wmf20
  • 22:38 logmsgbot: bd808 Finished scap: test2wiki to 1.23wmf21 (duration: 12m 07s)
  • 22:26 logmsgbot: bd808 Started scap: test2wiki to 1.23wmf21
  • 22:12 logmsgbot: bd808 Finished scap: Testing 1.23wmf21 l10n changes (duration: 01m 31s)
  • 22:10 logmsgbot: bd808 Started scap: Testing 1.23wmf21 l10n changes
  • 22:07 logmsgbot: bd808 Finished scap: Testing 1.23wmf21 l10n changes (duration: 03m 49s)
  • 22:04 logmsgbot: bd808 Started scap: Testing 1.23wmf21 l10n changes
  • 21:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 7 21:13:14 UTC 2014 (duration 1m 59s)
  • 20:40 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-07 20:40:02+00:00
  • 20:23 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-07 20:23:14+00:00
  • 19:08 ottomata: temporarily disabling puppet on analytics 1009, 1010, 1019, 1020 to bring up new journalnodes
  • 18:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 7 18:39:21 UTC 2014 (duration 15m 30s)
  • 18:23 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Added "downloadtiff" pool counter config'
  • 18:13 AaronSchulz: shwiki queue finished emptying out in staggered loop on terbium
  • 18:03 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-07 18:03:48+00:00
  • 17:36 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-07 17:36:02+00:00
  • 17:24 bd808: Manually running l10nupdate with new --verbose flag to capture log output
  • 14:34 MaxSem: Rebuilding GeoData index
  • 14:09 hashar: Jenkins cleared swap on gallium (swapoff -a && swapon -a). Makes ganglia graph nicer :D
  • 13:15 apergos: reenabled puppet on dataset2, testing done
  • 12:23 apergos: disabled puppet on dataset2, testing
  • 11:19 logmsgbot: reedy Finished scap: because we're scappy... (rebuilding l10n cache for 1.23wmf21) (duration: 18m 04s)
  • 11:01 logmsgbot: reedy Started scap: because we're scappy... (rebuilding l10n cache for 1.23wmf21)
  • 10:45 hashar: integration Getting PHP Composer installed on labs slaves. 124305
  • 09:21 paravoid: reactivating peerings with HE, issues reportedly resolved
  • 09:04 hashar: restarted Zuul
  • 08:54 hashar: gallium killed console-kit-daemon process which was eating a lot of memory
  • 08:42 hashar: Restarting Jenkins, out of Java heap space. Something is leaking memory
  • 08:41 hashar: Jenkins being broken for some reason AGAIN !
  • 05:04 ori: Zuul is stuck: <http://i.imgur.com/o5ghCam.jpg> (617kb image)
  • 02:56 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Apr 7 02:56:12 UTC 2014 (duration 56m 11s)
  • 02:20 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-07 02:20:10+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-07 02:13:42+00:00

April 6

  • 02:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Apr 6 02:53:13 UTC 2014 (duration 53m 12s)
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-06 02:18:43+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-06 02:12:57+00:00
  • 01:32 jamesofu_: sugar down for move to labs

April 5

  • 04:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 5 04:15:00 UTC 2014 (duration 50m 39s)
  • 03:56 logmsgbot: andrew rebuilt wikiversions.cdb and synchronized wikiversions files: Revert mw.org, test2wiki and testwikidatawiki to 1.23wmf20 due to localisation issue
  • 03:51 Andrew: Reverting mw.org, test2 and test.wikidata back to 1.23wmf20
  • 03:41 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-05 03:41:36+00:00
  • 03:36 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-05 03:36:04+00:00
  • 03:23 Andrew: Actually, going to rerun l10nupdate first just to check.
  • 03:22 Andrew: Going to revert deployment of 1.23wmf21 again - still broken
  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Apr 5 03:08:33 UTC 2014 (duration 8m 32s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-05 02:34:54+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-05 02:14:07+00:00

April 4

  • 21:34 logmsgbot: bd808 Finished scap: Group0 to 1.23wmf21 (again) (duration: 14m 35s)
  • 21:19 logmsgbot: bd808 Started scap: Group0 to 1.23wmf21 (again)
  • 19:28 hashar: Jenkins: unpooled slave agent on lanthanum, killed the java agent on it and repooled it.
  • 19:22 hashar: Jenkins is processing jobs again. Queue unchanged so it will resume everything
  • 19:16 hashar: restarting Jenkins
  • 19:07 hashar: Jenkins unpooling gallium slave
  • 19:05 hashar: Zuul / Jenkins stalled again.
  • 18:43 csteipp: redeployed updated patch for bug63251 to fix a reported bug
  • 16:10 _joe_: restarting gitblit, for the last time today
  • 15:06 _joe_: restarting gitblit as it has eaten up all of its ram again and is thrashing cpu
  • 12:32 mutante: hume - shutting down
  • 12:06 mutante: hume - disable puppet/salt/monitoring
  • 11:13 mutante: restarting gitblit with new option to use incremental GC in an attempt to fix timeouts caused by GC eating CPU
  • 08:07 paravoid: deactivating cr1-eqiad<->HE peerings, transatlantic par2<->ash1 is congested
  • 07:25 mutante: restarting gitblit
  • 05:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 4 05:45:07 UTC 2014 (duration 18m 25s)
  • 04:56 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-04 04:56:06+00:00
  • 04:45 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-04 04:45:01+00:00
  • 04:20 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: unbreak test2.wp and test.wikidata as well
  • 04:17 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: mw.org back to 1.23wmf20
  • 03:43 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Apr 4 03:43:03 UTC 2014 (duration 43m 2s)
  • 03:28 ori: Interface messages are missing on group0 / 1.23wmf21 wikis (mediawikiwiki, testwiki, test2wiki, and testwikidata)
  • 02:50 logmsgbot: LocalisationUpdate completed (1.23wmf21) at 2014-04-04 02:50:26+00:00
  • 02:24 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-04 02:24:51+00:00
  • 01:08 logmsgbot: krinkle synchronized php-1.23wmf21/resources 'I6e93d9ab0e4a926c09c'

April 3

  • 22:00 logmsgbot: demon synchronized wmf-config/CirrusSearch-production.php 'lowering cache time, for testing'
  • 21:55 logmsgbot: demon updated /a/common/php-1.23wmf20 to Ic853ebff4: Cherry-pick I550eb4b0a8fa18344e8b0de3ec85d61c2122ffb8
  • 21:54 logmsgbot: demon synchronized php-1.23wmf20/extensions/CirrusSearch 'Cirrus back to master again'
  • 21:50 logmsgbot: ori synchronized multiversion/updateBitsBranchPointers 'updateBitsBranchPointers: get rid of 'static-stable' branch link'
  • 21:50 logmsgbot: ori updated /a/common to Ic1602c045: updateBitsBranchPointers: get rid of 'static-stable' branch link
  • 21:46 logmsgbot: demon synchronized php-1.23wmf20/extensions/CirrusSearch 'Rolling back to 1.23wmf20 branch point from master'
  • 21:38 logmsgbot: demon synchronized php-1.23wmf20/extensions/CirrusSearch 'Updating Cirrus to master'
  • 21:33 logmsgbot: demon synchronized wmf-config/CirrusSearch-production.php 'italian wikis getting interwiki search. they're my favorite beta testers'
  • 19:23 logmsgbot: reedy synchronized docroot and w
  • 19:21 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf21
  • 19:17 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias actually to 1.23wmf20
  • 19:15 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf20
  • 19:09 logmsgbot: reedy Finished scap: testwiki to 1.23wmf21 and build l10n cache (duration: 38m 23s)
  • 18:30 logmsgbot: reedy Started scap: testwiki to 1.23wmf21 and build l10n cache
  • 18:23 logmsgbot: reedy updated /a/common to I835c2b1d5: Depool. See RT 7191.
  • 11:10 paravoid: IPv4 eqiad<->esams private link also elevated by ~15ms but no packet loss observed
  • 11:09 paravoid: affects both IPv6 transit at esams (slowdowns) as well as IPv6 eqiad<->esams
  • 11:08 paravoid: deactivating cr1-esams<->HE peering, latency > 160ms, over at 200ms (congestion?); back to 84ms now;
  • 10:51 akosiaris: temporarily stopped squid on brewster
  • 10:26 hashar: Jenkins job mediawiki-core-phpunit-hhvm is back around thanks to 123573
  • 06:28 paravoid: powercycling ms-be1003, unresponsive, no console output
  • 04:43 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'return upgraded DB slaves to normal load'
  • 04:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 repool db1015, warm up'
  • 04:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 depool db1015 for upgrade'
  • 04:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 repool db1037, warm up'
  • 03:53 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 depool db1037 for upgrade'
  • 03:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Apr 3 03:53:18 UTC 2014 (duration 53m 16s)
  • 03:34 springle: db1020 raid controller dimm ecc errors
  • 03:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 depool db1020 for upgrade'
  • 03:12 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 repool db1019, warm up'
  • 02:57 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 depool db1019 for upgrade'
  • 02:56 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1060, warm up'
  • 02:48 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-03 02:48:01+00:00
  • 02:47 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1060 for upgrade'
  • 02:45 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1061, warm up'
  • 02:35 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1061 for upgrade'
  • 02:24 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-04-03 02:24:07+00:00

April 2

  • 23:47 logmsgbot: aaron synchronized wmf-config/CommonSettings.php 'Bumped wgJobBackoffThrottling for htmlCacheUpdate to 15'
  • 23:47 mwalker: ... deploy was for mobile frontend 123454
  • 23:46 logmsgbot: mwalker synchronized php-1.23wmf20/extensions/MobileFrontend 'SWAT deploy for MaxSem'
  • 20:23 subbu: deployed Parsoid 33471172 with deploy repo sha 5c620e54
  • 19:03 logmsgbot: ori synchronized php-1.23wmf20/extensions/WikimediaEvents 'Update WikimediaEvents for I7fdaa5524: Use simple random sampling to log deprecated usage at 1:100'
  • 19:03 logmsgbot: ori synchronized php-1.23wmf19/extensions/WikimediaEvents 'Update WikimediaEvents for I7fdaa5524: Use simple random sampling to log deprecated usage at 1:100'
  • 17:00 andrewbogott: fixed updating crons on wikitech-status, I think. Time will tell...
  • 16:19 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'Lower timeout on prefix searches and make the cirrus.dblist sync I just did take effect.'
  • 16:19 logmsgbot: manybubbles synchronized cirrus.dblist 'Cirrus as primary for most of group1'
  • 16:14 akosiaris: banned tools-exec-03.eqiad.wmflabs. using manual iptables on ytterbium
  • 15:20 ottomata: stopping puppet on stat1
  • 14:27 hashar: Jenkins applying label contintLabsSlave on slaves in labs used for ci (integration-slave1001 and 1002)
  • 14:15 hashar: Jenkins deleting pmtpa slaves (they all have been shutdown and jobs got deleted)
  • 14:00 manybubbles: tried restarting some lsearchd services (carefully) to clear out some crashing when searching for a particular query term. It caused pool queue full errors.... serves me right for trying?
  • 11:20 mutante: running CheckUser/maintenance/purgeOldData.php on all wikis
  • 09:42 akosiaris: rsynced brewster /srv to carbon
  • 09:34 mutante: restarting gitblit on antimony
  • 09:14 mutante: DNS update - removing capella
  • 09:09 mutante: DNS update - removing ms10
  • 05:31 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'normal loads for all upgraded slaves'
  • 04:53 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1062, warm up'
  • 04:45 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1062 for upgrade'
  • 04:42 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 repool db1039, warm up'
  • 04:27 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 depool db1039 for upgrade'
  • 03:56 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 repool db1006, warm up'
  • 03:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Apr 2 03:48:31 UTC 2014 (duration 48m 30s)
  • 03:46 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 depool db1006 for upgrade'
  • 03:43 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 repool db1045, warm up'
  • 03:27 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 depool db1045 for upgrade'
  • 03:21 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 repool db1059, warm up'
  • 03:07 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 depool db1059 for upgrade'
  • 03:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1063, warm up'
  • 02:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1063 for upgrade'
  • 02:52 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-04-02 02:52:48+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-04-02 02:29:18+00:00
  • 02:22 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 repool db1027, warm up'
  • 02:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 depool db1027 for upgrade'
  • 01:16 logmsgbot: ori synchronized php-1.23wmf20/extensions/WikimediaEvents 'Undeployed change from earlier SWAT deploy'
  • 01:16 logmsgbot: ori synchronized php-1.23wmf19/extensions/WikimediaEvents 'Undeployed change from earlier SWAT deploy'
  • 01:05 logmsgbot: ori synchronized php-1.23wmf19/extensions/WikimediaEvents
  • 01:04 logmsgbot: ori synchronized php-1.23wmf19/extensions/EventLogging
  • 01:02 logmsgbot: ori synchronized php-1.23wmf20/extensions/WikimediaEvents
  • 01:02 logmsgbot: ori synchronized php-1.23wmf20/extensions/EventLogging

April 1

  • 23:48 logmsgbot: ebernhardson synchronized php-1.23wmf19/extensions/WikimediaEvents/ 'Update WikimediaEvents to master'
  • 23:48 logmsgbot: ebernhardson synchronized php-1.23wmf19/extensions/EventLogging/ 'Update EventLogging to master'
  • 23:47 logmsgbot: ebernhardson synchronized php-1.23wmf20/extensions/EventLogging/ 'Update EventLogging to master'
  • 23:46 logmsgbot: ebernhardson synchronized php-1.23wmf20/extensions/WikimediaEvents/ 'Update WikimediaEvents to master'
  • 23:32 logmsgbot: ebernhardson synchronized docroot and w
  • 21:42 hashar: Ganglia in labs is more or less back in activity: http://ganglia.wmflabs.org/ No clue what it is graphing though
  • 21:27 hashar: jenkins killed stuck build (5 hours+) of beta-update-databases-eqiad . Might have been blocking Jenkins build queue
  • 19:09 Reedy: ori gracefulled mw1018, mw1050, mw1061, mw1070, mw1139, mw1179
  • 18:45 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: zerowiki to 1.23wmf20
  • 18:43 logmsgbot: reedy updated /a/common to If887effe5: Add zerowiki
  • 18:43 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 18:42 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Add zerowiki
  • 18:41 logmsgbot: reedy synchronized database lists files:
  • 18:37 logmsgbot: reedy synchronized docroot and w
  • 18:36 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: non wikipedias to 1.23wmf20
  • 18:28 mutante: ms10 - shut down
  • 18:19 mutante: ms10 - disable puppet, revoke puppet cert,salt key,icinga..
  • 18:03 logmsgbot: ori synchronized php-1.23wmf20/includes/profiler/ProfilerSimple.php 'Iad91f1d12: Send profiled items under the correct name'
  • 18:02 logmsgbot: ori synchronized php-1.23wmf19/includes/profiler/ProfilerSimple.php 'Iad91f1d12: Send profiled items under the correct name'
  • 17:34 mutante: logging to eqiad wikitech after Andrew switched over
  • 16:05 andrewbogott: switching wikitech to read-only, migrating to eqiad
  • 15:06 logmsgbot: reedy updated /a/common to If3ca3d486: beta: adjust $wgCaptchaDirectory
  • 15:01 hashar: Gerrit super slow again :-(
  • 14:46 mutante: added oblivion to root-auth-keys
  • 14:17 mutante: welcome new shell user oblivion
  • 14:04 hashar: Gerrit flushed a few caches related to user accounts / LDAP
  • 13:43 mutante: adding oblivion to ops and wmf LDAP groups
  • 08:44 mutante: solr1/2 - revoke puppet certs
  • 08:43 mutante: solr3 - delete salt key, puppet cert
  • 03:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 db1034 full steam'
  • 02:02 logmsgbot: LocalisationUpdate failed: git pull of extensions failed
  • 01:58 springle: killed research queries on db1047. email me
  • 01:35 springle: restarted sanitarium s3 instance for additional private wikis
  • 00:59 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 repool db1034, warm up'

March 31

  • 23:30 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 depool db1034 for upgrade'
  • 23:22 ori: End of SWAT deploy.
  • 23:21 logmsgbot: ori synchronized php-1.23wmf19/extensions 'I6f0f1b18d: Update MobileFrontend, PageImages and TextExtracts for bug 63248'
  • 23:16 logmsgbot: ori synchronized php-1.23wmf20/extensions 'I6f0f1b18d: Update MobileFrontend, PageImages and TextExtracts for bug 63248'
  • 23:05 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I532f8ee7c: Add "upload" custom debug group to $wgDebugLogGroups'
  • 23:05 logmsgbot: ori updated /a/common to I532f8ee7c: Add 'upload' custom debug group to wgDebugLogGroups
  • 23:03 ori: Starting SWAT deploy window.
  • 20:36 awight: updated crm from 5151d97 to e726e42
  • 19:31 logmsgbot: yurik synchronized docroot/bits/WikipediaMobileFirefoxOS/
  • 16:37 logmsgbot: aaron synchronized wmf-config/PoolCounterSettings-eqiad.php 'Added "downloadpdf" pool counter config'
  • 16:11 bd808: Upgraded kibana on logstash cluster to e317bc6
  • 16:08 mutante: DNS update - remove tampa search pools (a second time, was duplicate, heh)
  • 15:33 Jeff_Green: reenable fundraising services, reenable silicon icinga monitoring
  • 14:28 bd808: Started logstash on logstash1001
  • 14:23 bd808: Elasticsearch upgraded to 1.0.1 on logstash100[123]
  • 14:18 hashar: beta cluster DNS entries migrated to point to the EQIAD datacenter. Keeping pmtpa instances around for a couple days
  • 14:16 bd808: Stopped elasticsearch on logstash100[123]
  • 14:13 bd808: Stopped logstash on logstash1001
  • 13:48 Jeff_Green: icinga alerts disabled for silicon
  • 13:34 Jeff_Green: fundraising-system full-stop for hardware repairs on queue server
  • 11:57 hashar: Jenkins: updating jslint jobs to run a PHP based json linter, which will bail out whenever json files are invalid. bug 58279 113958
  • 10:48 hashar: Jenkins fixed up browsertests jobs. Bundler could not compile gems on the eqiad slaves 122346
  • 09:46 hashar: Jenkins: applied hasBrowserTests label on both labs slave. Unblocks the browser tests which were still tied to a deleted instance 122341
  • 07:23 logmsgbot: faidon synchronized php-1.23wmf20/extensions/SpamBlacklist/SpamBlacklist.php 'revert I694860b - SpamBlacklist'
  • 07:22 logmsgbot: faidon synchronized php-1.23wmf19/extensions/SpamBlacklist/SpamBlacklist.php 'revert I694860b - SpamBlacklist'
  • 05:47 logmsgbot: faidon synchronized php-1.23wmf19/extensions/SpamBlacklist/SpamBlacklist.php 'local hack to test memcached traffic increase theory'
  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Mar 31 03:08:39 UTC 2014 (duration 8m 38s)
  • 02:30 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-03-31 02:30:26+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-31 02:13:38+00:00

March 30

  • 04:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 db1034 to full steam'
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Mar 30 03:11:14 UTC 2014 (duration 11m 13s)
  • 02:31 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-03-30 02:31:23+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-30 02:14:34+00:00

March 29

  • 17:31 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 17:30 logmsgbot: reedy synchronized database lists files: Update size related dblists
  • 17:28 logmsgbot: reedy updated /a/common to I7e9595ce0: Prepare GeoData config for Elasticsearch switchover
  • 16:54 Reedy: afl_namespace tinyint -> int on all medium wikis
  • 16:06 hoo: 800+ connections open on db1021, db1037, db1045 (s5)
  • 15:49 Reedy: afl_namespace tinyint -> int on all small wikis
  • 15:10 Reedy: afl_namespace tinyint -> int on mediawikiwiki
  • 05:59 csteipp: deployed patch for bug63251 for wmf19 and wmf20
  • 03:16 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Mar 29 03:16:24 UTC 2014 (duration 16m 23s)
  • 02:36 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-03-29 02:36:39+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-29 02:16:33+00:00
  • 00:47 paravoid: kill updateSpecial MWScript on terbium & pt-kill updateSpecial frwiki query running for 15m

March 28

  • 18:37 ori: added jgonera to wmf-deployment; was already a deployer but not in relevant gerrit group
  • 17:50 logmsgbot: demon synchronized wmf-config/CommonSettings.php 'Prepping GeoData config for Elasticsearch support. No-op'
  • 17:49 logmsgbot: demon synchronized wmf-config/InitialiseSettings-labs.php 'Prepping GeoData config for Elasticsearch support. No-op'
  • 17:48 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Prepping GeoData config for Elasticsearch support. No-op'
  • 17:03 mutante: DNS update - remove db40,db62
  • 16:56 mutante: shut down db40
  • 15:43 hashar: Jenkins removing pbuilder images from gallium.wikimedia.org and lanthanum.eqiad.wmnet production slaves. Packages should now be built on the labs slaves.
  • 14:11 mutante: shut down db62
  • 13:59 mutante: db40 - revoke puppet cert,salt key,remove from monitoring
  • 13:57 mutante: db40 - stopping mysql
  • 13:56 mutante: db62 - revoke puppet cert,salt key,remove from monitoring
  • 12:26 hashar: Jenkins: migrating both labs slaves puppet master to integration-puppetmaster.eqiad.wmflabs
  • 10:56 hashar: Jenkins: depooling integration-debian-builder pmtpa slave. Jobs should be built on the eqiad slaves now. debian-glue jobs might be broken for a while until I figure out how to get cow builder setup properly.
  • 09:43 paravoid: removing language subdomains for wikidata
  • 09:30 logmsgbot: reedy synchronized wmf-config/PrivateSettings.php 'Uncomment AdminSettings.php'
  • 04:25 ori: sync-l10nupdate ran to completion. messages that were previously missing now render.
  • 04:01 RoanKattouw: Running sync-l10nupdate to sync l10n cache
  • 03:56 Krinkle: Rebuilding i18n using mw-update-l10n, syncing ExtensionMessages-1.23wmf20.php doesn't actually regenerate the cache
  • 03:55 RoanKattouw: Running mw-update-l10n. Skipped wmf19, is now rebuilding l10ncache for wmf20 because I updated ExtensionMessages-1.23wmf20.php
  • 03:54 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 28 03:54:22 UTC 2014 (duration 46m 33s)
  • 03:49 logmsgbot: catrope synchronized wmf-config/ExtensionMessages-1.23wmf20.php 'Regenerated to add new JSON dirs that were missing'
  • 03:49 RoanKattouw: Maybe the new scap scripts regenerate ExtensionMessages-$version.php at the wrong time (i.e. do things in the wrong order)?
  • 03:48 Krinkle: Scap/mw-update-l10n (at branch cut) or LocalisationUpdate must've run mergeMessagesFileList.php wrongly somehow
  • 03:45 RoanKattouw: ExtensionMessages-1.23wmf20.php is missing entries in $wgMessagesDirs, regenerating added them
  • 03:33 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-03-28 03:33:49+00:00
  • 03:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 28 03:27:41 UTC 2014 (duration 27m 40s)
  • 03:19 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-28 03:19:47+00:00
  • 03:08 RoanKattouw: Rerunning LocalisationUpdate in an attempt to figure out what's going on
  • 03:06 Krinkle: Interface messages of some extensions in 1.23wmf20 that use i18n/json are broken (mw-ui-feature-user-count, betafeatures-toplink, wikimediamessages-desc, ..)
  • 02:53 Krinkle: BetaFeatures is missing message "betafeatures-toplink" on every page on mediawiki.org
  • 02:51 logmsgbot: LocalisationUpdate completed (1.23wmf20) at 2014-03-28 02:51:49+00:00
  • 02:33 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 increase db1007 temporarily'
  • 02:19 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-28 02:19:40+00:00
  • 02:06 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 adjust loads' (edit: done to recover from apache icinga freakout dropping DB connections. db1028 hit max_connections while db1034 was still warming up; see gerrit 121569 commit message)
  • 01:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 repool db1034'
  • 00:08 logmsgbot: mwalker synchronized wmf-config/InitialiseSettings.php 'Attempting to poke frwiktionary into using VE'

March 27

  • 23:58 logmsgbot: mwalker synchronized php-1.23wmf19/extensions/LocalisationUpdate/ 'Updating for 121560'
  • 23:57 logmsgbot: mwalker synchronized visualeditor.dblist 'Actually syncing for 121439'
  • 23:56 logmsgbot: mwalker synchronized visualeditor-default.dblist 'Actually syncing for 121439'
  • 23:48 logmsgbot: mwalker synchronized php-1.23wmf20/extensions/Wikidata/ 'Updating Wikibase (not multimediaviewer) with 121545...'
  • 23:45 logmsgbot: mwalker synchronized php-1.23wmf19/extensions/Wikidata/ 'Updating Wikibase (not multimedia viewer) with 121544...'
  • 23:43 logmsgbot: mwalker synchronized php-1.23wmf19/extensions/MultimediaViewer/ 'Updating Wikibase with 121544...'
  • 23:41 logmsgbot: mwalker synchronized php-1.23wmf20/extensions/MultimediaViewer/ 'Updating Wikibase with 121545...'
  • 23:40 logmsgbot: mwalker synchronized php-1.23wmf20/extensions/MultimediaViewer/ 'Actually updating multimediaviewer with 121555...'
  • 23:26 logmsgbot: mwalker synchronized php-1.23wmf20/extensions/MultimediaViewer/ 'Updating multimediaviewer with 121555'
  • 23:23 logmsgbot: mwalker synchronized php-1.23wmf20/skins/vector/components/common.less 'Syncing for 'Follow-up to typography changes to Vector
  • 23:23 logmsgbot: ori synchronized multiversion/MWMultiVersion.php 'Ie4c431ae1: MWMultiVersion: treat *.hhvm.beta as *.wikipedia.beta'
  • 23:22 logmsgbot: ori updated /a/common to Ie4c431ae1: MWMultiVersion: treat *.hhvm.beta as *.wikipedia.beta
  • 23:21 logmsgbot: mwalker synchronized php-1.23wmf20/resources/oojs-ui 'Syncing for Update OOjs UI to v0.1.0-pre'
  • 23:07 logmsgbot: mwalker synchronized wmf-config/ 'Enabling VE on French Wiktionary & French Wikibooks'
  • 21:54 Krinkle: Reloading Zuul to deploy I243258bc2b1770524285ec7
  • 20:01 logmsgbot: hoo synchronized php-1.23wmf20/extensions/Wikidata/ 'Update Wikidata to fix a problem with SpecialMobileWatchlist'
  • 19:58 logmsgbot: hoo synchronized php-1.23wmf19/extensions/Wikidata/ 'Update Wikidata to fix a problem with SpecialMobileWatchlist'
  • 18:51 K4-713: updated payments cluster to 051fc4f
  • 18:40 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf20
  • 18:19 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf19
  • 17:52 logmsgbot: bd808 Purged l10n cache for 1.23wmf17
  • 17:44 logmsgbot: bd808 Finished scap: testwiki to php-1.23wmf20 and rebuild l10n cache (duration: 17m 29s)
  • 17:31 hashar: Zuul is all happy.
  • 17:26 logmsgbot: bd808 Started scap: testwiki to php-1.23wmf20 and rebuild l10n cache
  • 17:24 bd808|deploy: Deleted /a/common/php-1.23wmf1[12] on tin
  • 17:22 hashar: Zuul managed to retrigger all jobs that got stalled. The root cause is that the 'gallium' slave was no longer processing jobs. The way to fix it is to unpool the slave from the Jenkins web interface (mark it offline) and repool it. I also raised the number of executors, some of the executors might be stalled for some reason
  • 17:13 hashar: repooled Jenkins slave 'gallium'
  • 17:09 hashar: Jenkins restarting gallium slave.
  • 17:09 hashar: Changes are stalled in Zuul because jobs tied to the 'gallium' slaves are not being processed.
  • 14:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 depool db1034 after crash'
  • 13:07 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 db1021 full steam'
  • 11:51 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 repool db1021 warm up'
  • 11:22 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 depool db1021 for schema changes'
  • 10:02 springle: db1034 running custom mariadb 5.5.34 build. if in doubt, depool
  • 09:57 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 db1034 full steam'
  • 08:40 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's7 pool db1034, warm up'
  • 05:27 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php
  • 05:27 logmsgbot: demon synchronized noncirrus.dblist
  • 03:18 springle: xtrabackup clone db1007 to db1034
  • 03:04 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'touch'
  • 03:03 logmsgbot: demon synchronized noncirrus.dblist
  • 02:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Mar 27 02:11:30 UTC 2014 (duration 11m 29s)
  • 02:04 logmsgbot: LocalisationUpdate failed (1.23wmf19) at 2014-03-27 02:04:37+00:00
  • 02:04 logmsgbot: LocalisationUpdate failed (1.23wmf18) at 2014-03-27 02:04:01+00:00
  • 01:36 Krinkle: Reloading Zuul to deploy I8a7ccef26da45d9ed7c7705df77246001bb85544
  • 00:31 RobH: acknowledge silicon's check raid in icinga, already on rt7136

March 26

  • 23:41 logmsgbot: ori synchronized php-1.23wmf19/extensions/MobileFrontend/includes/specials/SpecialMobileWatchlist.php 'If18397782: Fix the watchlist header'
  • 23:38 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I310a42c51: Crats should add users to gwtoolset group on Commons'
  • 23:37 logmsgbot: ori updated /a/common to I310a42c51: Crats should add users to gwtoolset group on Commons
  • 23:28 paravoid: swapping virt1000 LDAP certificate
  • 23:24 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I5b565f47b: No longer force recentchangestext as content message'
  • 23:24 logmsgbot: ori updated /a/common to I5b565f47b: No longer force recentchangestext as content message
  • 20:43 hashar: out of despair, restarting Zuul
  • 20:42 hashar: I have NOT done that change.
  • 20:42 hashar: I have done that change. I merged fast forward to the commit before I186e3ade7: Update Wikidata OAuth grants
  • 20:41 logmsgbot: hashar updated /a/common to I186e3ade7: Update Wikidata OAuth grants
  • 20:27 awight: updated crm from c6a7129 to 5151d97
  • 20:23 hashar: Zuul / Jenkins overloaded for some reason again. Investigating.
  • 20:13 Krinkle: It seems Zuul is held up by something. Loads of jobs are 'queued', yet Jenkins is operating fine with all executors idling and an empty queue.
  • 20:09 awight: updated crm from 21b69b3 to c6a7129
  • 16:21 ottomata: initiating controlled shutdown of kafka broker analytics1022 for reracking
  • 16:18 hashar: Jenkins clearing /tmp on integration-slave1001 and 1002
  • 16:18 hashar: Jenkins: restarted gallium jenkins slave
  • 16:15 hashar: restarting Zuul
  • 16:12 ottomata: taking analytics1024 offline so it can be re-racked
  • 16:10 hashar: Jenkins got slightly overloaded for unknown reason. Will restart Zuul to clean some leaked file descriptors.
  • 15:55 hashar: Jenkins: gallium swapping :/
  • 15:32 logmsgbot: manybubbles synchronized wmf-config/CommonSettings.php 'SWAT deploy for Wikidata'
  • 15:06 Jeff_Green: dist-upgrade and reboot tantalum
  • 14:23 ottomata: stopping zookeeper on analytics1025 and shutting down, preparing to move it to Row D
  • 14:08 logmsgbot: hashar synchronized wmf-config/CommonSettings.php 'beta: vary udp2log destination by $wmfDatacenter 121056'
  • 12:32 manybubbles: rebuilding cirrus search indexes for all wikipedias with cirrus in preparation for a change requiring it in the release on Thursday
  • 12:23 apergos: restarted gerrit (slow or unresponsive, nothing obvious wrong)
  • 12:18 Nemo_bis: gerrit.wikimedia.org interface throws 503
  • 10:10 apergos: restarted gmetad on nickel
  • 03:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Mar 26 03:00:41 UTC 2014 (duration 0m 40s)
  • 02:07 logmsgbot: LocalisationUpdate failed (1.23wmf19) at 2014-03-26 02:06:59+00:00
  • 02:06 logmsgbot: LocalisationUpdate failed (1.23wmf18) at 2014-03-26 02:06:17+00:00

March 25

  • 23:27 logmsgbot: catrope synchronized php-1.23wmf19/extensions/Math/modules/VisualEditor/ve.ui.MWMathInspector.js 'Fix VE math inspector title'
  • 22:40 logmsgbot: spage synchronized php-1.23wmf19/extensions/Flow/Hooks.php 'Fix Flow notification preference in 1.23wmf19'
  • 22:22 logmsgbot: hoo synchronized php-1.23wmf19/extensions/Wikidata/ 'Update Wikidata to fix an exception within WikibaseClient (bug 63087)'
  • 22:10 Krinkle: Reloading Zuul to deploy Ib2abe3a000300
  • 22:09 logmsgbot: spage synchronized wmf-config/InitialiseSettings.php 'Enable Flow on mediawiki.org Compact Personal Bar BF talk page'
  • 22:07 logmsgbot: spage updated /a/common to I7ca3051f6: Enable Flow on Compact Personal Bar BF talk page
  • 21:49 awight: updated tools from 400b4f6 to 0eb485c
  • 20:07 mutante: DNS update - killing ms5
  • 19:55 mutante: pc1-3 - remove from puppet,salt,icinga,..
  • 19:48 mutante: restarted gerrit on ytterbium
  • 19:48 hashar: Gerrit back
  • 19:48 hashar: Gerrit 503 :-(
  • 19:06 logmsgbot: reedy synchronized docroot and w
  • 19:03 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.23wmf19
  • 18:40 logmsgbot: hoo synchronized php-1.23wmf19/extensions/Wikidata/ 'Update Wikidata, patch for Wikibase js config and revert entity selector patch'
  • 17:44 mutante: ms5 - 1091 days of uptime, but this was the last, shutdown -h now
  • 17:32 logmsgbot: catrope synchronized php-1.23wmf18/extensions/LocalisationUpdate 'LU rewrite'
  • 17:23 mutante: salt-keys: removed snapshots1-4, signed palladiums own salt key
  • 17:08 RoanKattouw: Syncing rebuilt l10ncache for 1.23wmf19, built with new LocalisationUpdate version
  • 16:53 awight: update tools from 2cfc441 to 400b4f6
  • 16:26 ottomata: restarting hadoop namenodes to bring in new net topology layout
  • 16:25 logmsgbot: catrope synchronized php-1.23wmf19/extensions/LocalisationUpdate 'Deploy rewrite for 1.23wmf19'
  • 16:20 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'New LocalisationUpdate config'
  • 15:59 mutante: DNS update - removing snapshots
  • 15:31 apergos: snapshot1-4 in pmtpa powered off, decom :-)
  • 09:27 paravoid: elevated 503 levels cause seems to be a single spambot with multiple IPs POSTing malformed requests
  • 03:09 springle: shutting down pc[123] for decom (pmtpa parser cache)
  • 02:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Mar 25 02:48:08 UTC 2014 (duration 48m 7s)
  • 02:19 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-25 02:19:52+00:00
  • 02:11 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-25 02:11:10+00:00
  • 00:07 logmsgbot: catrope synchronized php-1.23wmf19/extensions/Wikidata/
  • 00:07 logmsgbot: catrope synchronized php-1.23wmf19/extensions/VisualEditor
  • 00:06 logmsgbot: catrope synchronized php-1.23wmf18/extensions/VisualEditor

March 24

  • 23:57 mutante: ms5 - removed from icinga,puppet,storedconfigs,salt...
  • 23:51 ori: Switched CentralNotice to GeoIP cookie rather than bits.wm.o/geoiplookup script tags
  • 23:43 awight: update tools from de586ae to 2cfc441
  • 23:26 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I42e9c8c97: CentralNotice: Re-set $wgCentralGeoScriptURL to false'
  • 23:25 logmsgbot: ori updated /a/common to I42e9c8c97: CentralNotice: Re-set $wgCentralGeoScriptURL to false
  • 23:17 logmsgbot: ori synchronized php-1.23wmf19/extensions/Wikidata 'I6ffb304d3: Update Wikidata'
  • 22:57 bd808: Forced /srv/scap to update to c771a46 across the cluster
  • 20:07 subbu: deployed Parsoid fa03dd20 with deploy repo sha e4d28e7e
  • 19:55 mutante: DNS update - removing tampa search pools and searchidx2
  • 19:36 mutante: shutdown searchidx2
  • 19:23 ottomata: stopping hadoop service on analytics1020 and shutting down for move to Row D
  • 19:19 mutante: searchidx2 - revoked puppet cert,remove from puppet,icinga,salt...
  • 19:02 ottomata: stopping hadoop services on analytics1019 and shutting it down for move to Row D
  • 18:22 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 db1037 full steam'
  • 14:31 ottomata: stopping hadoop services and shutting down analytics1018
  • 03:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's5 repool db1037 warm up'
  • 02:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Mar 24 02:41:13 UTC 2014 (duration 41m 12s)
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-24 02:17:32+00:00

March 23

  • 02:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Mar 23 02:41:21 UTC 2014 (duration 41m 20s)
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-23 02:17:16+00:00
  • 02:09 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-23 02:09:42+00:00

March 22

  • 02:44 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Mar 22 02:43:56 UTC 2014 (duration 43m 55s)
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-22 02:18:55+00:00
  • 02:11 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-22 02:11:01+00:00
  • 01:33 K4-713: updated prod civi to 21b69b3
  • 00:17 K4-713: synchronized payments to 9227c8f

March 21

  • 22:13 K4-713: updated prod civicrm to 2b635c1f5cf7
  • 22:01 K4-713: updated prod civicrm to c736116
  • 19:23 mutante: torrus was broken. going through fix per https://wikitech.wikimedia.org/wiki/Torrus#Deadlock_problem
  • 18:14 mutante: DNS update - adding zero.wikimedia.org
  • 18:05 mutante: disabled and commented mobile2-5 in pmtpa pybal
  • 18:00 Krinkle: Reloading Zuul to deploy I20b4aa9159df7
  • 17:54 mutante: same for search_pool5, and setting search_prefix (search19/20) to disabled
  • 17:52 mutante: removing search_pool4 from pmtpa pybal
  • 17:50 mutante: removing owa1-3 from pmtpa pybal
  • 17:42 mutante: removing pmtpa https from pybal (ssl1-4)
  • 17:23 mutante: remove pmtpa bits (sq67-70) from pybal
  • 17:22 Krinkle: Running deleteEqualMessages.php on wuuwiki (bug 43917 comment 23)
  • 14:08 hashar: Jenkins: label labs slaves with hasJenkinsDebianGlue to build debian packages on them
  • 13:43 hashar: Jenkins: installing jenkins-debian-glue and misc::package-builder on labs slaves.
  • 12:02 akosiaris: upgrade jenkins-debian-glue to 0.8.1 on apt.wikimedia.org
  • 11:38 akosiaris: upgraded libmemcached packages on apt.wikimedia.org to libmemcached_1.0.17-1~wmf+precise2
  • 10:01 hashar: Jenkins: deleting pmtpa labs slaves integration-slave02 and integration-slave03. Replaced by eqiad instances integration-slave1001 and integration-slave1002.
  • 03:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 21 03:05:27 UTC 2014 (duration 5m 26s)
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf19) at 2014-03-21 02:34:39+00:00
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-21 02:12:39+00:00
  • 00:49 logmsgbot: ori synchronized wmf-config 'Ib539f96eb7: Increase the network performance sampling rate for MediaViewer'
  • 00:48 logmsgbot: ori updated /a/common to Ib0eb802c4: Fix typo in I86f5493d0
  • 00:38 logmsgbot: hoo synchronized wmf-config/ 'Fix typo <> udp, also Icdb5425 and I04e5f7f which weren't synced but look harmless'

March 20

  • 23:38 logmsgbot: catrope synchronized php-1.23wmf18/resources/mediawiki/mediawiki.inspect.js
  • 23:22 logmsgbot: catrope synchronized wmf-config/CommonSettings.php
  • 23:22 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php
  • 23:22 logmsgbot: catrope synchronized docroot/noc/createTxtFileSymlinks.sh
  • 23:22 logmsgbot: catrope updated /a/common to Ia08c65d40: Enable Flow on Hovercards Beta Features
  • 21:43 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'If51eda243: Follow up Id6222f4db to amend sort order in feed URL'
  • 21:43 logmsgbot: ori updated /a/common to If51eda243: Follow up Id6222f4db to amend sort order in feed URL
  • 21:20 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'Id6222f4d: Add RSS of Bugzilla query of open HHVM bugs to mediawikiwiki's whitelist'
  • 21:20 logmsgbot: ori updated /a/common to Id6222f4db: Add RSS of Bugzilla query of open HHVM bugs to mediawikiwiki's whitelist
  • 21:06 logmsgbot: reedy synchronized docroot and w
  • 21:03 logmsgbot: reedy updated /a/common to Iaa99d2162: Add Wikibase repoSiteName setting for client
  • 20:56 bd808: Updated scholarships.wikimedia.org to cb2ef4c (fix for bug 62464)
  • 20:14 mutante: DNS update - remove ssl1-4
  • 20:08 mutante: DNS update - remove sq67-70, former varnish testing
  • 19:35 akosiaris: created a 50G LV for /var/log on zirconium, stopped all services, moved data to it, mounted it and restarted all services
  • 19:28 logmsgbot: reedy synchronized wmf-config/ 'Wikibase config updates'
  • 19:22 logmsgbot: reedy Finished scap: Rebuild 1.23wmf19 l10n cache for wikibase (duration: 12m 01s)
  • 19:10 logmsgbot: reedy Started scap: Rebuild 1.23wmf19 l10n cache for wikibase
  • 19:08 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf19
  • 19:02 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf18
  • 19:00 mutante: disk full on zirconium - gzipping some etherpadlite.sql dump i found
  • 17:41 logmsgbot: reedy synchronized php-1.23wmf19 'Update Wikidata and WikimediaMessages'
  • 17:06 Krinkle: Reloading Zuul to deploy Ie800ed90b51c47d5a1
  • 16:58 mutante: repooling mw1163 (it's back in dsh as well)
  • 16:38 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki back to 1.23wmf18 till window
  • 16:05 bd808: Ran /usr/local/bin/sync-common && /usr/local/bin/scap-rebuild-cdbs on mw1163. Should not repool until it's back in the dsh group. Should be manually synced just before repooling.
  • 15:36 logmsgbot: reedy Finished scap: testwiki to 1.23wmf19 and build l10n cache (duration: 15m 32s)
  • 15:21 logmsgbot: reedy Started scap: testwiki to 1.23wmf19 and build l10n cache
  • 15:03 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/118328'
  • 15:00 bd808: logstash stopped ingesting logs at 2014-03-19T22:37:54.000Z.
  • 14:56 bd808: restarted logstash on logstash1001.eqiad.wmnet
  • 03:23 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Mar 20 03:23:37 UTC 2014 (duration 23m 35s)
  • 02:32 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-20 02:32:10+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-20 02:17:34+00:00
  • 01:26 Tim: gave Sam Reed access to racktables
  • 00:26 bd808: scap-slave rsync servers have "hosts allow = 10.0.0.0/16 10.64.0.0/22 10.64.16.0/22 10.64.32.0/22 208.80.152.0/22"; missing new 10.64.48.0/22 subnet
  • 00:21 bd808: Subnet for row D still not in rsync server config?
  • 00:21 bd808: scaps failed because "@ERROR: access denied to common from mw1202.eqiad.wmnet (10.64.48.34)"
  • 00:16 bd808: finished dsh -c -M -m mw1201,mw1202,mw1203,mw1208,mw1209,mw1210 -- '/usr/local/bin/sync-common; /usr/local/bin/scap-rebuild-cdbs'
  • 00:13 bd808: dsh -c -M -m mw1201,mw1202,mw1203,mw1208,mw1209,mw1210 -- '/usr/local/bin/sync-common; /usr/local/bin/scap-rebuild-cdbs'
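
The 00:21 and 00:26 entries above attribute the scap failure to the row D subnet missing from the rsync "hosts allow" list. A minimal sketch of that check, using only the addresses quoted in those entries; the helper name is_allowed is hypothetical, and this only illustrates the CIDR arithmetic, not how rsync itself evaluates the setting:

```python
import ipaddress

# Networks quoted in the 00:26 entry ("hosts allow" on the scap-slave rsync servers).
HOSTS_ALLOW = [
    "10.0.0.0/16",
    "10.64.0.0/22",
    "10.64.16.0/22",
    "10.64.32.0/22",
    "208.80.152.0/22",
]


def is_allowed(ip, networks):
    """Return True if ip falls inside any of the given CIDR networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(net) for net in networks)


# mw1202.eqiad.wmnet (10.64.48.34) is the host named in the rsync error.
print(is_allowed("10.64.48.34", HOSTS_ALLOW))                      # False
print(is_allowed("10.64.48.34", HOSTS_ALLOW + ["10.64.48.0/22"]))  # True
```

A /22 covers four consecutive /24 blocks, so 10.64.32.0/22 ends at 10.64.35.255 and does not reach 10.64.48.34.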

March 19

  • 23:57 logmsgbot: spage synchronized wmf-config 'Config change to enable extension Popups (Hovercards) on mediawikiwiki'
  • 23:29 logmsgbot: spage Finished scap: Retry Add Popups extension (Hovercards Beta Feature) to wmf17 and 18, not enabled yet (duration: 07m 15s)
  • 23:22 logmsgbot: spage Started scap: Retry Add Popups extension (Hovercards Beta Feature) to wmf17 and 18, not enabled yet
  • 23:22 logmsgbot: spage scap aborted: Add Popups extension (Hovercards Beta Feature) to wmf17 and 18, not enabled yet (duration: 69m 26s)
  • 23:20 bd808: restarted elasticsearch on logstash1002
  • 23:18 bd808: scap by spage looks hung with 50 hosts not finished rsyncing
  • 22:38 bd808: Manually ran sync-common and scap-rebuild-cdbs on mw1163
  • 22:33 mutante: depooling mw1163
  • 22:28 logmsgbot: ori synchronized wmf-config/mc.php 'Set serializer to php for production memcache'
  • 22:12 logmsgbot: spage Started scap: Add Popups extension (Hovercards Beta Feature) to wmf17 and 18, not enabled yet
  • 22:11 logmsgbot: spage updated /a/common to I3d59b8246: Add new Popups extension to list
  • 20:40 subbu: deployed Parsoid ff8c49e95d with deploy 2092317654e
  • 20:39 robh: parsoid service successfully restarted across hosts via dsh
  • 20:05 logmsgbot: yurik synchronized docroot/bits/WikipediaMobileFirefoxOS/
  • 20:03 mutante: shut down sq67-70
  • 20:00 akosiaris: disabled puppet on labsdb1004 until osm2pgsql completes
  • 19:21 mutante: sq67-70: disable puppet,revoke puppet certs,delete salt keys and stored configs (delete from icinga)
  • 19:15 mutante: shutting down ssl1-4
  • 18:59 bblack: all varnishes upgraded to 3.0.5plus-wmftest-wm5 (without restart, just adds header vmod)
  • 18:46 mutante: ssl1-4: puppet agent --disable, puppetstoredconfigclean, revoking puppet certs
  • 18:24 cmjohnson1: powering down mw1183 to reseat dimm
  • 17:56 cmjohnson1: mw1177 powering down to reseat DIMM
  • 17:42 awight: tools updated from 87dbe60 to de586ae
  • 17:32 mutante: DNS update - removing Tampa appservers
  • 17:15 awight: updated crm from f8b2dab to 8c8f0de
  • 13:51 hashar: Jenkins : applying crazy node/npm upgrade hack on Jenkins labs instances integration-slave1001 and integration-slave1002 ( ref: https://bugzilla.wikimedia.org/show_bug.cgi?id=61508#c2 and ops list)
  • 12:38 manybubbles: reindexed all group0 wikis now that we have the Cirrus SWAT deploy (from yesterday). Once the reindex is done we can deploy code to improve performance. Yummy. Starting on commons and eventually doing the rest of group1.
  • 11:24 hashar: Jenkins: all npm based jobs are broken due to nodejs self signed certificate being outdated on contint labs instances (see bug 61508 and mail to ops list)
  • 10:43 hashar: Jenkins pooling integration-slave1002.eqiad.wmflabs
  • 10:36 hashar: Jenkins attempting to bring up integration-slave1001.eqiad.wmflabs (aka migrating Jenkins slave nodes in labs from pmtpa to eqiad)
  • 09:13 logmsgbot: ori synchronized wmf-config/mc-labs.php 'I46a9d180b: Beta cluster MemcachedPeclBagOStuff: use PHP serialization'
  • 08:54 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 repool db1042'
  • 02:43 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Mar 19 02:43:28 UTC 2014 (duration 43m 27s)
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-19 02:12:42+00:00
  • 02:07 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-19 02:07:15+00:00
  • 00:36 logmsgbot: ori synchronized mediaviewer.dblist
  • 00:32 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'Ib83df6e31: Correct typo in $wmgMediaViewerLoggedIn var name'
  • 00:32 logmsgbot: ori updated /a/common to Ib83df6e31: Correct typo in $wmgMediaViewerLoggedIn var name
  • 00:19 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's4 depool db1042 while replag catches up'

March 18

  • 23:58 logmsgbot: maxsem synchronized php-1.23wmf18/extensions/MobileFrontend/MobileFrontend.php 'https://gerrit.wikimedia.org/r/119416'
  • 23:42 logmsgbot: ori synchronized wmf-config 'Ie7597f5f9: Follow-up: Ifeda5996385 – 'mediaviwerpilot.dblist' is ugly'
  • 23:42 logmsgbot: ori updated /a/common to Ie7597f5f9: Follow-up: Ifeda5996385 – 'mediaviwerpilot.dblist' is ugly
  • 23:38 logmsgbot: ori synchronized wmf-config 'I936d5abe3: Bump geosearch radius to 20km on Wikivoyage'
  • 23:38 logmsgbot: ori updated /a/common to I936d5abe3: Bump geosearch radius to 20km on Wikivoyage
  • 23:37 logmsgbot: ori synchronized wmf-config 'Ifeda59963: Add MMV feature flags for beta and pilot sites'
  • 23:36 logmsgbot: ori updated /a/common to Ifeda59963: Add MMV feature flags for beta and pilot sites
  • 18:28 logmsgbot: bd808 Purged l10n cache for 1.23wmf16
  • 18:21 logmsgbot: bd808 synchronized docroot/bits 'bits/static-current to 1.23wmf18; static-stable to 1.23wmf17'
  • 18:17 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.23wmf18
  • 17:16 MaxSem: Rebuilding GeoData index
  • 16:13 K4-713: updated fraud filter rules on payments
  • 15:35 logmsgbot: manybubbles synchronized php-1.23wmf18/extensions/CirrusSearch/
  • 15:04 logmsgbot: reedy updated /a/common to I72a7751f8: Let AbuseFilter block users on Spanish Wikivoyage
  • 14:37 logmsgbot: reedy synchronized wmf-config/
  • 14:29 Reedy: Create EducationProgram and Translate tables on legalteamwiki
  • 13:31 logmsgbot: hoo synchronized wmf-config/InitialiseSettings.php 'Enable GuidedTours on Wikidata'
  • 08:33 ori: 5xx resps spiked between 6:15 and 6:35 UTC; lvs1001 SSH check flapped between 6:48 and 6:56 UTC.

March 17

  • 22:03 springle: xtrabackup clone db1005 to db1037
  • 21:46 springle: synchronized wmf-config/db-eqiad.php 's1 depool db1037'
  • 20:09 gwicke: deployed Parsoid d0f0080a
  • 15:13 logmsgbot: maxsem synchronized wmf-config/ 'https://gerrit.wikimedia.org/r/#/c/114634/'
  • 14:24 apergos: disabling puppet on dataset2 for shuffling data sources around
  • 13:41 manybubbles: rebuilding CirrusSearch index for enwiki - its mapping is out of date, making the search results worse.
  • 02:01 logmsgbot: LocalisationUpdate failed: mwversionsinuse returned empty list

March 16

  • 17:27 hoo: ran rebuildEntityPerPage.php on wikidata to rebuild the epp table for 466 items
  • 10:19 matanya: shop.wikimedia.org is down - serves 503
  • 10:06 apergos: re-enabled puppet on dataset2, testing concluded
  • 07:18 apergos: disabled puppet on dataset2 for testing
  • 03:38 Reedy: Archived 2013 SAL
  • 03:18 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Mar 16 03:18:53 UTC 2014 (duration 53m 29s)
  • 02:44 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-16 02:44:36+00:00
  • 02:35 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-16 02:35:04+00:00
  • 02:28 Reedy: Running l10nupdate (slightly hacked version) manually
  • 02:04 logmsgbot: LocalisationUpdate failed: mwversionsinuse returned empty list

March 15

  • 14:55 ori: promoted self (User:Ori.livneh) to sysop/'crat on hewikisource to try and install a solution for 60939; then revoked both some 20 minutes later.
  • 14:54 ori: restarted morebots; too lazy to check logs to try and ascertain why and where it went

March 14

  • 23:54 awight: update crm from 80a0bfcbb2d380d42e995438e6a266c0b2c41ae2 to 8a1ecee511abefb46a06b980cc05bf630d145a2c
  • 23:41 K4-7131: adjusted anti-fraud rules on payments cluster
  • 22:59 awight: updated crm from 9388148fa7c1f4aeb1117c1bf50fc836ecce6591 to 80a0bfcbb2d380d42e995438e6a266c0b2c41ae2
  • 22:14 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php
  • 22:13 logmsgbot: demon synchronized wmf-config/CommonSettings.php
  • 22:13 logmsgbot: demon synchronized noncirrus.dblist
  • 21:49 awight: updated crm from 88fbebde7a4eaea22472812f4167235e3b46260f to 9388148fa7c1f4aeb1117c1bf50fc836ecce6591
  • 20:47 awight: crm updated from 79a88e00b3c67879ab167b219bf2ec66682fc1c0 to 88fbebde7a4eaea22472812f4167235e3b46260f
  • 20:07 mutante: shut down solr3, almost forgot
  • 19:25 mutante: shut down solr1/2
  • 19:24 jgage: jgage ran kafka preferred-replica-election on analytics1021 to rebalance
  • 18:43 mutante: solr[12] - disable puppet, puppetstoredconfigclean, remove from icinga
  • 11:10 logmsgbot: maxsem synchronized php-1.23wmf18/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/118681'
  • 11:09 logmsgbot: maxsem synchronized php-1.23wmf17/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/118681'
  • 10:01 logmsgbot: hashar synchronized wmf-config/InitialiseSettings.php 'adding Amsterdam Museum to the wgCopyUploadsDomains 118342'
  • 09:57 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 db1062 full steam'
  • 09:26 akosiaris: upgraded php5 on apt.wikimedia.org to php5_5.3.10-1ubuntu3.10+wmf1.
  • 06:58 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 pool db1062, warm up'
  • 03:51 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'small wikis done building'
  • 03:35 logmsgbot: ori synchronized php-1.23wmf17/includes/resourceloader/ResourceLoaderStartUpModule.php 'Ibeda834e9: Emit $wgSearchType as JavaScript config variable'
  • 03:35 logmsgbot: ori synchronized php-1.23wmf18/includes/resourceloader/ResourceLoaderStartUpModule.php 'Ibeda834e9: Emit $wgSearchType as JavaScript config variable'
  • 03:29 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1034, xtrabackup clone to db1062'
  • 03:20 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Mar 14 03:20:12 UTC 2014 (duration 20m 11s)
  • 02:48 logmsgbot: ori synchronized php-1.23wmf17/includes/resourceloader/ResourceLoaderStartUpModule.php 'Emit as JavaScript config variable'
  • 02:47 logmsgbot: ori synchronized php-1.23wmf18/includes/resourceloader/ResourceLoaderStartUpModule.php 'Emit as JavaScript config variable'
  • 02:47 logmsgbot: LocalisationUpdate completed (1.23wmf18) at 2014-03-14 02:47:13+00:00
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-14 02:12:53+00:00

March 13

  • 23:56 K4-713: synchronized payments to 56ba4e5a39ae0156
  • 23:56 mutante: shutting down ersch (former poolcounter)
  • 23:48 mutante: ersch: disable puppet, remove from monitoring
  • 23:28 logmsgbot: hoo synchronized php-1.23wmf17/extensions/Wikidata/ 'Fix a fatal error in ContentRetriever'
  • 23:19 logmsgbot: krinkle synchronized php-1.23wmf18/extensions/VisualEditor/ '52800602b2b487'
  • 23:16 mutante: tmh1,tmh2: removed from monitoring, shutting down
  • 23:16 logmsgbot: demon synchronized php-1.23wmf18/extensions/CirrusSearch/
  • 23:12 logmsgbot: demon synchronized php-1.23wmf17/extensions/CirrusSearch/
  • 23:09 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Moving all small wikis into cirrus'
  • 23:08 robh: restarting parsoid service
  • 23:04 logmsgbot: hoo synchronized php-1.23wmf18/extensions/Wikidata/ 'Fix a fatal error in ContentRetriever (for testwikidata first)'
  • 23:02 gwicke: deployed Parsoid 004c7acc, second attempt after trebuchet no-op the last few times
  • 22:58 mutante: disabling puppet on tmh[12]
  • 18:51 mwalker: updating civicrm from 2353dbcda410645e02dd9f049ff9e1eb341d2f16 to 79a88e00b3c67879ab167b219bf2ec66682fc1c0 for thank you message updates
  • 18:47 csteipp: fix deployed for bug 62497
  • 18:46 bd808: mw1201,mw1202,mw1203,mw1208,mw1209,mw1210 in mediawiki-installation dsh group on tin; not sure why they didn't get the scap request
  • 18:38 bd808|deploy: running sync-common on mw1202,mw1203,mw1208,mw1209,mw1210 via dsh
  • 18:37 bd808|deploy: ran sync-common on mw1201 manually
  • 18:31 bd808|deploy: mw1201, mw1202, mw1203, mw1208, mw1209, mw1210 didn't get new branch during scap
  • 18:14 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf18
  • 18:06 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf17
  • 18:05 manybubbles: rebuilding search index for itwiki
  • 17:06 bd808|deploy: bits symlinks for 1.23wmf6 to 1.23wmf10 deleted
  • 17:05 logmsgbot: bd808 Finished scap: testwiki to php-1.23wmf18 and rebuild l10n cache (duration: 16m 14s)
  • 16:59 apergos: restarted gerrit which had been slow
  • 16:55 robh: gerrit is just really slow, but not quite down.
  • 16:49 logmsgbot: bd808 Started scap: testwiki to php-1.23wmf18 and rebuild l10n cache
  • 16:20 paravoid: deactivating Tele2 from eqiad, blackholing traffic
  • 16:08 paravoid: fixing pybal for mw1208/mw1209/mw1210 move; same issue as last time
  • 15:45 paravoid: announcing only Tampa's /23 from pmtpa/sdtpa
  • 15:04 ^d: restarting gerrit process, ytterbium was swapping
  • 12:40 cmjohnson1: shutting down and relocating mw1208, mw1209 and mw1210 to row D
  • 08:49 paravoid: API slowness (affecting Parsoid/VE, among others) there since ~12:00 UTC found and fixed
  • 07:42 RoanKattouw: Reverted API pybal weights back to original values; apparently makes sense given amount of memory
  • 07:05 RoanKattouw: Changed pybal weights for eqiad API cluster to # of CPUs on each machine; weights were backwards (machines with fewer CPUs had higher weights)
  • 03:23 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Mar 13 03:23:13 UTC 2014 (duration 23m 12s)
  • 02:33 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-13 02:33:54+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-13 02:17:46+00:00
  • 00:14 awight: updated crm from 445f1f0ce489e65bf3765a23403f5b52b77f23cf to 2353dbcda410645e02dd9f049ff9e1eb341d2f16

March 12

  • 23:49 logmsgbot: mwalker synchronized php-1.23wmf17/extensions/CentralNotice/
  • 23:49 logmsgbot: mwalker synchronized php-1.23wmf16/extensions/CentralNotice/
  • 23:20 logmsgbot: mwalker synchronized php-1.23wmf17/extensions/CentralNotice
  • 23:19 logmsgbot: mwalker synchronized php-1.23wmf16/extensions/CentralNotice
  • 21:57 logmsgbot: bd808 Finished scap: another no-diff scap to test script changes (duration: 03m 25s)
  • 21:54 logmsgbot: bd808 Started scap: another no-diff scap to test script changes
  • 21:54 ottomata: initiated controlled shutdown of an21, promoting an22 to leader of all partitions
  • 21:51 logmsgbot: bd808 Finished scap: no-diff scap to test script changes (duration: 21m 42s)
  • 21:30 logmsgbot: bd808 Started scap: no-diff scap to test script changes
  • 21:14 mutante: welcome to root, Chase, added key to root-auth-keys
  • 20:05 Coren: restarting parsoids as requested by gwicke
  • 20:02 gwicke: deployed Parsoid 004c7acc with deploy f97820a2; restart todo
  • 19:12 logmsgbot: yurik synchronized php-1.23wmf16/extensions/ZeroRatedMobileAccess/
  • 19:11 logmsgbot: yurik synchronized php-1.23wmf17/extensions/ZeroRatedMobileAccess/
  • 19:06 mutante: graceful apache on gallium
  • 17:35 logmsgbot: yurik synchronized php-1.23wmf17/extensions/ZeroRatedMobileAccess/
  • 17:27 logmsgbot: yurik synchronized php-1.23wmf16/extensions/ZeroRatedMobileAccess/
  • 17:16 awight: updated crm from f0837090e918b7016c07cb366b33c8d8f0d1c661 to 445f1f0ce489e65bf3765a23403f5b52b77f23cf
  • 17:02 logmsgbot: reedy synchronized wmf-config/ 'noop'
  • 16:56 logmsgbot: reedy synchronized docroot and w
  • 16:56 logmsgbot: reedy synchronized wmf-config/ 'I6f1f3f8af5b97aa0e537fbae308ce27b28071894'
  • 16:09 ottomata: initiating controlled shutdown of analytics1021 kafka broker to do some load testing and also fix runtime java version
  • 15:56 cmjohnson1: ms-be1005 going down to fix mgmt
  • 15:06 hoo: syncing to mw120[1-3] failed
  • 14:58 logmsgbot: hoo synchronized php-1.23wmf17/extensions/CentralAuth/ 'Fix global account deletion'
  • 14:37 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 db1063 full steam'
  • 14:24 paravoid: deploying new swift ring @ eqiad, setting weight from 100 to 2000 on all disks
  • 13:01 cmjohnson1: shutting down and relocating mw1201, mw1202, mw1203 to d5-eqiad
  • 10:41 logmsgbot: maxsem synchronized wmf-config/mobile.php 'https://gerrit.wikimedia.org/r/118227'
  • 10:17 springle: started s4 dump for toolserver on db72 /a
  • 09:52 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 pool db1063, warm up'
  • 09:44 hashar: rerestarting Jenkins
  • 09:16 hashar: kill -9 of Jenkins since it is unresponsive
  • 09:13 hashar: restarting Jenkins
  • 09:12 hashar: Jenkins broken again! Good morning.
  • 06:18 springle: xtrabackup clone db1018 to db1063
  • 05:38 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 pool db1061, warm up'
  • 04:52 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 db1051 warm up'
  • 02:43 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Mar 12 02:42:59 UTC 2014 (duration 42m 58s)
  • 02:32 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 drop load during xtrabackup clone db1051 to db1061'
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-12 02:18:12+00:00
  • 02:10 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-12 02:10:25+00:00

March 11

  • 19:36 mutante: re-deleting salt keys for pmtpa appservers
  • 19:03 mutante: shut down mw86-mw125 (sdtpa row A, A5)
  • 18:52 mutante: shut down mw28-mw57
  • 18:35 logmsgbot: bd808 purged l10n cache for 1.23wmf15
  • 18:33 logmsgbot: bd808 purged l10n cache for 1.23wmf14
  • 18:08 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.23wmf17
  • 18:02 logmsgbot: bd808 updated /a/common to I06d07cc3e: beta: disable memcached across datacenters
  • 17:31 mutante: shut down srv284-srv301 (sdtpa row B, B5)
  • 16:08 cmjohnson1: attempting to fix ps1-b5 and ps1-b6
  • 14:03 logmsgbot: reedy synchronized wmf-config/
  • 14:02 logmsgbot: reedy synchronized database lists files:
  • 13:54 logmsgbot: reedy synchronized wmf-config/CommonSettings.php 'I208d51b5db031d35518453e2b9de096f7f53f7a0'
  • 10:37 MaxSem: Manually disabled old broken job queue cronjobs on hume
  • 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Mar 11 02:57:11 UTC 2014 (duration 57m 10s)
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-11 02:22:39+00:00
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-11 02:12:35+00:00

March 10

  • 23:30 ^d: kicking gerrit to pick up bugfix.
  • 23:07 mutante: shutting down srv258-srv270
  • 22:18 bd808: Two instances of logstash were running on logstash1001; Killed both and started service again
  • 21:55 bd808: Restarted logstash on logstash1001; new events flowing in again now
  • 21:49 K4-713: synchronized payments to 5d20972.
  • 21:47 bd808: ganglia monitoring for elasticsearch on logstash cluster seems broken. Caused by 1.0.x upgrade having not happened there yet?
  • 21:46 bd808: Restarted elasticsearch on logstash1003; it was JVM heap thrashing at 98% heap used.
  • 21:44 RobH: arsenic reclaim per rt6522, ignore alerts
  • 21:40 sbernardin: ms-be5 swapping failed disk
  • 21:34 K4-713: synchronized payments cluster to 01f7af8
  • 21:27 bd808: No new data in logstash since 14:56Z. Bryan will investigate.
  • 20:26 gwicke: Coren fixed up the Parsoid deploy by running "salt-run deploy.restart 'parsoid/deploy' '10%'" from the salt master as a work-around for bugzilla:61882
  • 20:18 gwicke: deployed Parsoid 681f7b8d2 using deploy 77d17489; service restart incomplete due to bugzilla:61882
  • 19:43 logmsgbot: reedy synchronized wmf-config/ 'Use Vips for images over 20MP'
  • 19:17 ^d: enwiki search indexes don't look happy. sporadic reports of "search request timed out" from users, cpu usage pretty high. possibly unrelated bug logs getting spammed with: "Retry, temporal error for query..."
  • 18:50 awight: updated crm from b85722a7c14af6ec1b50d2cddcd9f84d8c88da3e to a1ca31664f771a48963425893b2b927a5148119b
  • 18:27 awight: crm updated from 1aa34fd565e818466515b70c6a00af2e30e47ce2 to b85722a7c14af6ec1b50d2cddcd9f84d8c88da3e
  • 18:11 logmsgbot: krinkle synchronized php-1.23wmf17/resources/mediawiki.api/mediawiki.api.watch.js 'I2ac9e0da0f1c825'
  • 18:07 logmsgbot: demon synchronized php-1.23wmf17/extensions/Elastica/Elastica.php 'I forgot how to use git'
  • 18:03 logmsgbot: demon synchronized php-1.23wmf17/extensions/Elastica/Elastica.php
  • 17:37 andrewbogott: upgrading mediawiki on wikitech
  • 16:33 hashar: restarted Zuul on gallium : leaked file descriptors (fixed upstream)
  • 15:43 logmsgbot: hoo synchronized multiversion/MWScript.php 'Path typo fix: I6a05447'
  • 15:05 logmsgbot: hoo synchronized wmf-config/session-labs.php 'Syncing labs file for consistency I5e1a3242'
  • 15:05 logmsgbot: hoo synchronized wmf-config/InitialiseSettings.php 'Enable Extension:GuidedTour on testwikidatawiki'
  • 14:57 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 14:56 logmsgbot: reedy updated /a/common to Ibcbd10044: Disable and remove ContactPageFundraiser
  • 14:44 logmsgbot: reedy synchronized wmf-config/ 'Remove ContactPageFundraiser, SkinPerPage, FormPreloadPostCache and skins used on foundationwiki'
  • 14:26 ottomata: put elastic1007 and elastic1013-1016 back in main elasticsearch pool
  • 09:11 hashar: Jenkins restarted and processing jobs again
  • 09:05 hashar: stopping Jenkins
  • 09:01 hashar: Jenkins unresponsive for some reason yeah!!!
  • 02:57 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Mar 10 02:57:32 UTC 2014 (duration 57m 31s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-10 02:25:14+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-10 02:14:08+00:00

March 9

  • 19:25 awight: crm updated from f06bd7920921205e97a142cec19f1096b9761768 to 1aa34fd565e818466515b70c6a00af2e30e47ce2

March 8

  • 03:49 mutante: shut down srv250-srv247 (this was sdtpa row D-D2)
  • 03:41 mutante: shutting down srv193,srv235-srv247,srv248,249
  • 03:28 andrewbogott: upgrading nova-compute on virt100[1-6]
  • 03:15 mutante: shut down Tampa mw hosts 21-27,49,58,75-85 (that is sdtpa row D:D1)
  • 03:04 mutante: shut down Tampa mw hosts 1-20
  • 02:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Mar 8 02:55:55 UTC 2014 (duration 55m 54s)
  • 02:25 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-08 02:25:00+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-08 02:14:25+00:00

March 7

  • 23:36 logmsgbot: ori updated /a/common to If2530281b: Order branch directories as version numbers
  • 22:13 mutante: deleting salt keys for all Tampa app servers removed from puppet
  • 21:54 hashar: Jenkins restarted
  • 21:49 hashar: restarting Jenkins it is broken
  • 21:45 mutante: killing all tampa appservers from puppetstoredconfigs
  • 21:41 mutante: disabling puppet agent on all tampa appservers
  • 21:15 logmsgbot: hoo synchronized wmf-config/InitialiseSettings-labs.php 'Syncing beta-only change for consistency'
  • 20:27 mutante: killing mw1-16 from puppet stored configs, icinga,..
  • 20:26 mutante: revoking puppet certs for Tampa appservers
  • 20:14 logmsgbot: catrope synchronized php-1.23wmf16/extensions/VisualEditor/modules/ve-mw/dm/nodes/ve.dm.MWBlockImageNode.js 'Fix image corruption bug'
  • 20:13 logmsgbot: catrope updated /a/common/php-1.23wmf16 to I4a10768ec: Update VisualEditor to wmf16 branch for cherry-pick
  • 18:17 bd808: Restored pre Ic56177a versions of wmf-config/*pmtpa* config files to mw31 again. Something wiped them out since 20:23Z yesterday even though "mw31" is not found in any dsh group files on tin.
  • 17:53 mutante: restarted squid on brewster
  • 04:22 springle: s4 commonswiki schema changes slave by slave, alter revision legacy primary key to match current tables.sql definition
  • 03:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-07 03:31:43+00:00
  • 02:43 logmsgbot: LocalisationUpdate completed (1.23wmf17) at 2014-03-07 02:43:40+00:00
  • 02:21 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-07 02:21:46+00:00
  • 02:14 manybubbles: [Elasticsearch upgrade] done. we'll take a while to catch up on jobs that piled up during the upgrade, but we'll get them in time.
  • 02:10 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Turn Cirrus back on for all wikis how it was before'
  • 02:09 logmsgbot: demon synchronized wmf-config/jobqueue-eqiad.php 'Turn Cirrus jobs back on'
  • 02:04 manybubbles: [Elasticsearch upgrade] restoring more sane recovery speed
  • 01:39 logmsgbot: mholmquist synchronized php-1.23wmf17/extensions/UniversalLanguageSelector/UniversalLanguageSelector.hooks.php 'Actually gate the beta feature for ULS'
  • 01:29 manybubbles: [Elasticsearch upgrade] temporarily raising recovery speed
  • 01:29 logmsgbot: jgonera synchronized php-1.23wmf16/extensions/MobileFrontend/ 'Touch MobileFrontend.i18n.php to update RL cache'
  • 01:19 logmsgbot: mholmquist synchronized wmf-config/InitialiseSettings.php 'Fix James_F's commit, follow-up, should gate ULS beta feature'
  • 01:18 logmsgbot: mholmquist updated /a/common to I48e98d28f: Follow-up: Icf0bef96306661 – missing file(!) from commit
  • 01:18 logmsgbot: mholmquist Finished scap: (no message) (duration: 10m 55s)
  • 01:07 rdwrer: That scap was for ULS, VE, and MobileFrontend fixes and updates.
  • 01:07 logmsgbot: mholmquist Started scap: (no message)
  • 01:03 manybubbles: [Elasticsearch upgrade] Reenabling puppet
  • 01:03 hashar: Jenkins back up
  • 01:03 hashar: Jenkins backup
  • 01:03 manybubbles: [Elasticsearch upgrade] All primary shards have started. Waiting on secondary.
  • 01:01 manybubbles: [Elasticsearch upgrade] Wait for the cluster to recover.
  • 01:00 hashar: killed wrong jenkins process (+1 for 2am fix up). Restarting jenkins
  • 00:59 manybubbles: [Elasticsearch upgrade] Verifying versions
  • 00:59 manybubbles: [Elasticsearch upgrade] Starting Elasticsearch
  • 00:58 hashar: Killing a duplicate Jenkins java process on gallium (init.d script sucks, I really need to get it fixed one day)
  • 00:52 logmsgbot: mholmquist updated /a/common to Iad8c84a7d: Don't use += with $wgJobTypesExcludedFromDefaultQueue
  • 00:51 manybubbles: [Elasticsearch upgrade] Upgrading Elasticsearch
  • 00:49 manybubbles: [Elasticsearch upgrade] Shutting down Elasticsearch
  • 00:48 manybubbles: [Elasticsearch upgrade] Turning off shard reallocation so we don't thrash while Elasticsearch shuts down
  • 00:47 manybubbles: [Elasticsearch upgrade] Disabling puppet so it doesn't restart Elasticsearch while we're upgrading it
  • 00:44 manybubbles: [Elasticsearch upgrade] Elasticsearch is now quiescent
  • 00:43 mutante: gracefull'ing rogue apaches, mw1131,mw1189,mw1190,mw1215
  • 00:40 mutante: gracefull'ing rogue apaches, mw1070,mw1089,mw1104,mw1111
  • 00:37 logmsgbot: demon synchronized wmf-config/jobqueue-eqiad.php 'Fixing $wgJobTypesExcludedFromDefaultQueue config'
  • 00:34 manybubbles: [Elasticsearch upgrade] Running puppet everywhere to make sure we have the newest config
  • 00:33 mutante: graceful mw1040
  • 00:03 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php 'Turn Cirrus off for the duration of the upgrade'
  • 00:02 logmsgbot: manybubbles synchronized wmf-config/jobqueue-eqiad.php 'Pausing Cirrus jobs for the duration of the upgrade.'
  • 00:02 manybubbles: Starting Elasticsearch upgrade

March 6

  • 22:20 ^d: restarting jenkins on gallium. It's totally hung and nothing's getting done. Jobs will probably need retriggering.
  • 21:23 bd808: "No space left on device" errors from snapshot1004.eqiad.wmnet during scap
  • 21:21 logmsgbot: bd808 Finished scap: php-1.23wmf17 l10n cache rebuild (duration: 11m 20s)
  • 21:13 bblack: all imagemagick-related packages updated to latest on image/video scalers
  • 21:10 logmsgbot: bd808 Started scap: php-1.23wmf17 l10n cache rebuild
  • 20:31 hoo: Syncing to snapshot[1-4] failed
  • 20:28 logmsgbot: hoo synchronized php-1.23wmf17/extensions/Wikidata/ 'Update Wikidata to fix an uncaught exception in claim html formatting'
  • 20:23 bd808: Restored pre Ic56177a versions of wmf-config/*pmtpa* config files to srv270, mw31 and mw40
  • 20:03 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf17
  • 19:48 bd808|deploy: on mw31, mw40 and srv270
  • 19:47 bd808|deploy: Touched wmf-config/PoolCounterSettings-pmtpa.php and jobqueue-pmtpa.php as more follow up to I332c0e8
  • 19:43 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf16
  • 19:36 logmsgbot: bd808 synchronized wmf-config/abusefilter.php 'Simplify the AbuseFilter configuration Ia472af8'
  • 19:32 cmjohnson1: ms-be1005 down for update
  • 19:11 mutante: touch /usr/local/apache/common-local/wmf-config/db-ptmpa.php on srv270,mw40,mw31 to fix Apache errors, details in change 117244
  • 18:57 akosiaris: lvresize -L +40G terbium/root, resize2fs /dev/mapper/terbium-root. See RT #6984
  • 18:21 logmsgbot: bd808 Finished scap: testwiki to php-1.23wmf17 and rebuild l10n cache (try #2) (duration: 15m 22s)
  • 18:09 mutante: shut down sockpuppet - bye for good
  • 18:06 logmsgbot: bd808 Started scap: testwiki to php-1.23wmf17 and rebuild l10n cache (try #2)
  • 17:57 logmsgbot: bd808 scap aborted: testwiki to php-1.23wmf17 and rebuild l10n cache (duration: 00m 41s)
  • 17:56 logmsgbot: bd808 Started scap: testwiki to php-1.23wmf17 and rebuild l10n cache
  • 17:20 mutante: sockpuppet - disabling puppet,disabling monitoring,remove from stored configs,revoke puppet cert,delete salt key
  • 16:18 logmsgbot: reedy synchronized wmf-config/
  • 16:17 logmsgbot: reedy synchronized docroot and w
  • 15:29 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 15:28 logmsgbot: reedy synchronized database lists files: Ia651454e773e6f26f84c334b51fa64cb3dd44762
  • 15:19 hashar: Jenkins: added phpunit/phpcs/kss on the labs slaves. Flagged them with label hasPhpUnit hasPhpcs
  • 14:50 logmsgbot: reedy synchronized wmf-config/ 'legalteamwiki config and touch of IS'
  • 14:48 logmsgbot: reedy updated /a/common to I495e02449: Initial setup for legalteamwiki
  • 14:48 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: add legalteamwiki
  • 14:46 logmsgbot: reedy synchronized database lists files:
  • 14:35 logmsgbot: hoo synchronized wikidataclient.dblist 'Setup test.wikidata as repo for test2 and test.wikipedia I6f4c512'
  • 14:34 logmsgbot: hoo synchronized wmf-config/ 'Setup test.wikidata as repo for test2 and test.wikipedia I6f4c512'
  • 09:43 logmsgbot: faidon synchronized wmf-config/CommonSettings.php 'revert wgCentralGeoScriptURL to false'
  • 09:42 logmsgbot: faidon updated /a/common to Id235690c6: Revert "CentralNotice: set $wgCentralGeoScriptURL to false"
  • 09:37 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 db1034 full steam'
  • 07:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 db1010 full steam, s1 db1034 warm up'
  • 04:19 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 repool db1010 warm up'
  • 03:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-06 03:34:41+00:00
  • 03:30 springle: s1 xtrabackup clone db1049 to db1034
  • 03:29 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1049'
  • 03:20 springle: db1034 testing /a ext4 noatime,barrier=0
  • 02:51 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-06 02:50:59+00:00
  • 02:26 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-03-06 02:26:16+00:00
  • 01:47 springle: s6 xtrabackup clone db1022 to db1010
  • 00:20 logmsgbot: catrope synchronized php-1.23wmf16/extensions/VisualEditor/modules/ve-mw/dm/nodes/ve.dm.MWBlockImageNode.js 'Fix 2-pixel image bug'
  • 00:20 logmsgbot: catrope updated /a/common/php-1.23wmf16 to Id675e756a: Update VisualEditor extension, for the right cherry-pick this time
  • 00:11 logmsgbot: catrope updated /a/common/php-1.23wmf16 to I706751cd2: Update VisualEditor to wmf16 branch for cherry-pick
  • 00:04 logmsgbot: anomie synchronized php-1.23wmf16/includes/htmlform/ 'Backport gerrit:117038 to fix regression'
  • 00:03 logmsgbot: anomie synchronized php-1.23wmf15/includes/htmlform/ 'Backport gerrit:117038 to fix regression'
  • 00:03 ^d: restarted gitblit service on antimony

March 5

  • 23:21 logmsgbot: bd808 Finished scap: no-diff scap to test script changes (duration: 27m 25s)
  • 22:53 logmsgbot: bd808 Started scap: no-diff scap to test script changes
  • 22:22 bd808: scap broken; working on a fix
  • 22:21 logmsgbot: bd808 scap failed: AttributeError 'int' object has no attribute 'find' (duration: 00m 00s)
  • 21:23 mutante: killing rhodium from puppet stored configs and icinga, was already removed from site.pp in change 115638
  • 20:48 mutante: starting NTP on analytics1004
  • 20:46 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Wikiquote Cirrus indexes done building, Beta for all!'
  • 20:10 mutante: installing php and security upgrades on bast1001
  • 20:09 mutante: installing security upgrades on iron
  • 18:27 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'Ic410bd788a: CentralNotice: set $wgCentralGeoScriptURL to false'
  • 18:26 logmsgbot: ori updated /a/common to Ic410bd788: CentralNotice: set $wgCentralGeoScriptURL to false
  • 18:24 awight: updated crm from aa3cd54e1ebf21bc7b2c34e2431a4af32dac3b32 to f06bd7920921205e97a142cec19f1096b9761768
  • 18:24 awight: updated crm from 648337cad8d465b2e03421aac59bc1117a797fd0 to aa3cd54e1ebf21bc7b2c34e2431a4af32dac3b32
  • 18:18 ori: Cookie-based geolocation deployed to all text varnishes
  • 17:20 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Cirrus for all the wikiquotes'
  • 16:31 cmjohnson1: carbon going down
  • 16:16 akosiaris: upgraded php5 packages, php5-wmerrors and libmemcached packages on beta in preparation for full cluster upgrade. This will make puppet unhappy.
  • 16:04 cmjohnson1: shutting down mw1165 swapping DIMM
  • 14:39 ottomata: starting controlled shutdown (and restart) of kafka broker analytics1022 to bring in new replica.lag settings
  • 14:31 ottomata: starting controlled shutdown of kafka broker an21 to reload new replica.lag settings
  • 11:12 hashar: jenkins back
  • 11:11 logmsgbot: ori updated /a/common to I5277d2451: Update Schema:Echo revision to 7731316
  • 11:08 hashar: restarting Jenkins which has starving threads
  • 10:03 hashar: Jenkins: gallium overloaded with a few java threads taking 100% CPU starving the box :(
  • 05:21 Ryan_Lane: finished switch from the perl git-deploy to trigger
  • 04:17 Ryan_Lane: switching out git-deploy perl frontend for trebuchet trigger
  • 03:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-05 03:41:14+00:00
  • 03:37 springle: messing with innodb compression on db1007
  • 02:54 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-05 02:54:00+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-03-05 02:27:46+00:00
  • 00:24 logmsgbot: ori synchronized php-1.23wmf15/extensions/CentralNotice 'Update CentralNotice to tip of wmf_deploy for I7d8259fc4'
  • 00:17 logmsgbot: ori synchronized php-1.23wmf16/extensions/CentralNotice 'Update CentralNotice to tip of wmf_deploy for I7d8259fc4'

March 4

  • 23:09 mutante|away: DNS update - removing 'zhen'
  • 23:06 logmsgbot: bsitu synchronized wmf-config/CommonSettings.php 'Update Flow cache key to 3.0'
  • 23:06 logmsgbot: bsitu updated /a/common to I566e00bcc: Update flow cache version number
  • 22:58 logmsgbot: ebernhardson synchronized php-1.23wmf15/extensions/Flow/
  • 22:55 logmsgbot: ebernhardson synchronized php-1.23wmf16/extensions/Flow/
  • 22:31 ori: GeoIP cookie enabled on text frontend varnish on cp1066. Appears to work well. If it causes issues, roll back by reverting I7e5ca8e54 and running 'service varnish-frontend restart'.
  • 22:13 logmsgbot: ori synchronized wmf-config/CommonSettings-labs.php 'Syncing lab config change I5ca0adf39 to prod for cluster consistency'
  • 22:12 logmsgbot: ori updated /a/common to I5ca0adf39: Labs: set $wgCentralGeoScriptURL to false for GeoIP cookie testing
  • 20:45 hashar: operations/apache-config.git now has a "betacluster" branch to host the beta cluster apache configuration files bug 56395
  • 20:39 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'Actually fix VE on officewiki this time'
  • 20:38 logmsgbot: catrope updated /a/common to I7ff564e4a: Really actually fix VE on private wikis
  • 19:22 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: group1 to 1.23wmf16
  • 19:09 ottomata: depooled elastic1007 and elastic1013-1016 per request from manybubbles
  • 19:02 logmsgbot: bd808 updated /a/common to I62b32888b: Create an autopatrolled group on itwikiquote
  • 18:46 ottomata: s/analytics/elastic in previous log message
  • 18:46 ottomata: bringing analytics1007 and analytics1013-1016 into elasticsearch cluster
  • 17:21 manybubbles: rebuilding the search index on commons after failing yesterday now that I've deployed a fix to the timeout issue
  • 17:18 logmsgbot: manybubbles synchronized php-1.23wmf16/extensions/CirrusSearch/ '3 small fixes including search timeouts'
  • 16:13 hashar: gallium : killed leftover jenkins instance bug 51817 "Jenkins init script is crap"
  • 16:07 hashar: gallium has been sent to swap
  • 15:41 akosiaris: upgraded php5 packages, php5-wmerrors package and libmemcached11 on mw1017. This will make puppet and the corresponding icinga check unhappy.
  • 15:05 hashar: reenabling puppet on gallium
  • 14:53 andrewbogott: switched wikitech to allow eqiad access, turned off pmtpa instance creation
  • 14:52 hashar: stopping puppet on gallium to play with apache configuration
  • 13:51 logmsgbot: hoo synchronized wmf-config/InitialiseSettings.php 'I62b3288'
  • 11:19 hashar: Restarting Jenkins, it is stalled ....
  • 11:12 hashar: Jenkins web service unavailable, investigating. Builds should not be affected though since they don't use the web service (but gearman)
  • 03:29 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-04 03:29:53+00:00
  • 02:46 logmsgbot: hoo synchronized multiversion/activeMWVersions.php 'Comment only change I5e68518'
  • 02:43 logmsgbot: hoo synchronized multiversion/activeMWVersions.php 'Comment only change {{Gerrit'
  • 02:43 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-04 02:43:17+00:00
  • 02:28 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-03-04 02:28:16+00:00
  • 02:25 springle: schema change bug 31397 afl_namespace, slave by slave
  • 01:32 logmsgbot: catrope synchronized php-1.23wmf16/resources/oojs-ui/oojs-ui.js 'oojs-ui fixes'
  • 01:10 mutante: shutting down 'zhen' permanently
  • 00:49 mutante: zhen - disable puppet,revoke puppet cert,delete salt key,delete stored configs, disable monitoring...
  • 00:25 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'Fix VE on officewiki'
  • 00:21 logmsgbot: catrope synchronized php-1.23wmf16/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWMediaInsertDialog.js 'https://gerrit.wikimedia.org/r/#/c/116211/'

March 3

  • 23:16 mutante: zhen - stopping redis-server, preparing for decom
  • 22:46 mutante: powercycling ms-be1003
  • 22:33 mutante: DNS update - removing yvon
  • 22:21 mutante: yvon - deleted puppet stored configs, removed from icinga, shutdown -h now, kthxbye yvon
  • 22:08 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'Plumbing for $wmgUseParsoid'
  • 22:07 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'Add $wmgUseParsoid (true for all non-private wikis)'
  • 21:19 ottomata: restarting varnishkafka everywhere to pick up librdkafka 0.8.3 upgrade
  • 21:07 mutante: restarting parsoid with salt-run deploy.restart 'parsoid/deploy' '10%'
  • 21:06 gwicke: deployed Parsoid 98936e with deploy b070bcc
  • 20:56 mutante: yvon - disabling puppet,stopping services,revoking puppet cert,delete salt key
  • 20:42 ottomata: upgrading to librdkafka 0.8.3-1 on hosts running varnishkafka
  • 20:38 mutante: yvon - disable notifications/schedule downtime for host and all services
  • 20:35 mutante: upgrading openssl on iron
  • 19:15 logmsgbot: mlitn synchronized wmf-config/ 'Remove ArticleFeedbackv5 from Wikimedia wikis'
  • 19:09 logmsgbot: mlitn updated /a/common to Ia9f91e560: Removed unused SwiftCloudFiles extension
  • 18:01 manybubbles: performing Cirrus reindex for commons - it's needed one for a long time and I've been too distracted to give it the attention it deserves
  • 17:58 logmsgbot: mholmquist synchronized php-1.23wmf15/extensions/Wikidata/extensions/Wikibase/client/includes/scribunto/ 'Fix for scribunto/wikibase integration'
  • 17:56 logmsgbot: mholmquist synchronized php-1.23wmf16/extensions/Wikidata/extensions/Wikibase/client/includes/scribunto/ 'Fix for scribunto/wikibase integration'
  • 17:54 logmsgbot: mholmquist synchronized php-1.23wmf16/extensions/MultimediaViewer/ 'Fix JS errors in latest MMV code'
  • 04:28 springle: s3 testing semi-synchronous replication
  • 03:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-03 03:08:34+00:00
  • 02:37 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-03 02:37:24+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-03-03 02:20:49+00:00

March 2

  • 17:30 apergos: powercycled sodium, swapdeath
  • 02:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-02 02:08:45+00:00
  • 02:03 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-02 02:03:00+00:00
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-03-02 02:02:12+00:00

March 1

  • 15:56 logmsgbot: reedy synchronized php-1.23wmf16/includes/specials/SpecialRecentchanges.php 'Id323e3b7daced1e7b6b1e1add4e9e1bf7df05e4e'
  • 03:44 Coren: Disabled puppet on pmtpa tool labs; ignore puppet whining.
  • 03:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-03-01 03:07:02+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-03-01 02:27:01+00:00
  • 02:25 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-03-01 02:25:01+00:00
  • 01:48 mwalker: another update to civi; this time from b67391cf5f4c288c50497e7fb7ef5a28f6dced9b to 648337cad8d465b2e03421aac59bc1117a797fd0
  • 00:54 mwalker: updated fundraising civicrm from e2727e7b293e7a20cb87a9bfc6b52f545ba7b548 to a3166ddc53bc326b318cf40d9db69bc5a60cf16b for thank you messages (and sender)

February 28

  • 23:51 logmsgbot: maxsem synchronized php-1.23wmf16/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/#/c/116174/'
  • 23:46 logmsgbot: maxsem synchronized php-1.23wmf16/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/#/c/116174/'
  • 23:19 logmsgbot: aaron synchronized wmf-config 'Removed unused SwiftCloudFiles extension'
  • 20:15 RobH: and i fixed it too, huzzah
  • 20:14 RobH: i killed bz during puppet run on server, fixing now
  • 20:12 RobH: fixing cert errors on etherpad.w.o, sorry if folks have service interruption
  • 18:33 Krinkle: Reloading zuul to deploy Ic3aeb6a5f13086b108
  • 14:15 logmsgbot: anomie synchronized php-1.23wmf15/includes/htmlform/HTMLFormField.php 'Backport fix for bug 61942 to 1.23wmf15 (reedy already did wmf16 last night)'
  • 09:39 hashar: Jenkins restarted.
  • 09:32 hashar: restarting Jenkins, registration of some jobs is broken :(
  • 08:58 andrewbogott: disabled puppet on labstore1001 to allow unattended file copies
  • 03:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-28 03:50:40+00:00
  • 03:06 logmsgbot: reedy updated /a/common to Icb3159198: Fix arrays for $wgContactConfig
  • 03:06 logmsgbot: reedy synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 03:05 logmsgbot: LocalisationUpdate completed (1.23wmf16) at 2014-02-28 03:05:28+00:00
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-28 02:34:35+00:00
  • 02:26 andrewbogott: rebuilt ldap indexes on virt1000
  • 02:25 springle_: restarted apache on virt0
  • 01:38 mutante: allowing group mwupld write in ./releases/1.19 and 1.21, the other dirs were already like that
  • 01:26 mutante: fixing permissions for 1.22.3 files on caesium, let mwupld group own them like all the other files
  • 01:12 mutante: pdf1 - revoke puppet cert, kill from stored configs,...
  • 01:04 mutante: pdf1 - disable monitoring - downtime until ∞
  • 00:06 rdwrer: By what can only be described as kicking-down-the-door-style deployment, mwalker and I managed to deploy four FundraisingChart Jenkins jobs after about 15 tries each.
  • 00:02 mutante: restarting gitblit on antimony
  • 00:02 logmsgbot: reedy synchronized wmf-config/

February 27

  • 23:56 logmsgbot: reedy synchronized wmf-config/
  • 23:55 logmsgbot: reedy updated /a/common to Iade498a40: Get rid of remaining references to $wmfExtendedVersionNumber
  • 23:50 logmsgbot: reedy synchronized wmf-config/
  • 23:45 logmsgbot: reedy updated /a/common to Ibe70ebb6d: Remove ContactPageFundraiser from testwiki and donatewiki
  • 23:38 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Remove ContactPageFundraiser from testwiki and donatewiki'
  • 23:38 mutante: streber - revoke puppet cert and salt key
  • 23:36 mutante|away: DNS update - removing streber (RT 2186)
  • 21:03 logmsgbot: aude synchronized php-1.23wmf15/extensions/EducationProgram
  • 20:55 logmsgbot: aude synchronized php-1.23wmf15/extensions/Wikidata
  • 20:51 mutante: DNS update - removing locke and squidlog which was a locke cname, locke is dead
  • 20:50 logmsgbot: aude synchronized php-1.23wmf16/extensions/EducationProgram
  • 20:23 logmsgbot: bd808 Finished scap: full scap; rebuild php-1.23wmf16 l10n cache (duration: 06m 21s)
  • 20:16 logmsgbot: bd808 Started scap: full scap; rebuild php-1.23wmf16 l10n cache
  • 20:09 bd808|deploy: scap-rebuild-cdbs failed with "Could not open directory '/upstream'."
  • 20:09 cmjohnson1: messing with serial connections ps1-b5 and ps1-b6-eqiad
  • 20:08 logmsgbot: bd808 Finished scap: rebuild php-1.23wmf16 l10n cache (duration: 10m 20s)
  • 20:08 logmsgbot: bd808 scap-rebuild-cdbs failed on 422 hosts
  • 19:58 logmsgbot: bd808 Started scap: rebuild php-1.23wmf16 l10n cache
  • 19:51 logmsgbot: aaron rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf16
  • 19:44 bd808|deploy: The problem with me and sync-wikiversions is my ssh-agent. It croaks when all 400+ connections happen at once.
  • 19:30 logmsgbot: aaron rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 19:28 bd808|deploy: sync-wikiversions + dsh still doesn't like my ssh key, but I can ssh to mw1010 from tin
  • 19:26 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf15 (third try)
  • 19:25 bd808|deploy: sync-wikiversions + dsh doesn't like my ssh key
  • 19:24 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf15 (second try)
  • 19:23 logmsgbot: bd808 rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf15
  • 19:23 AaronSchulz: Manually purged the job queue aggregator
  • 18:29 logmsgbot: bd808 Finished scap: testwiki to php-1.23wmf16 and rebuild l10n cache (duration: 20m 22s)
  • 18:09 logmsgbot: bd808 Started scap: testwiki to php-1.23wmf16 and rebuild l10n cache
  • 15:55 bd808: /srv/scap, not /src/scap
  • 15:55 bd808: Eleven mediawiki-installation dsh group hosts have stale /src/scap checkouts: fenari, mw1010, mw1066, mw1091, mw1107, mw1143, mw1150, mw1189, mw1204, mw1205, mw43
  • 15:40 akosiaris: disabled puppet on carbon, testing some autoinstall stuff
  • 13:35 Krinkle: Deploy integration/slave-scripts Icd2b25fd882b7953ee
  • 07:22 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1034'
  • 06:11 Ryan_Lane2: testing salt restarts of parsoid in batches of 10%
  • 03:11 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-27 03:11:24+00:00
  • 02:38 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-27 02:37:59+00:00
  • 02:28 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1055'
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-27 02:18:24+00:00
  • 01:31 logmsgbot: bd808 Finished scap: no-diff scap to test script changes (duration: 16m 44s)
  • 01:14 logmsgbot: bd808 Started scap: no-diff scap to test script changes
  • 00:56 mwalker: updated fundraising tools on lutetium from 7c2e15b006fec25fd1881c67f6bce9e406759535 to 87dbe60d2557a38553ae2bb602f777b4096915b8
  • 00:44 logmsgbot: aude synchronized php-1.23wmf15/extensions/Wikidata 'Fix references display, for real'
  • 00:34 logmsgbot: aude synchronized php-1.23wmf15/extensions/Wikidata 'Fix display of references'
  • 00:18 logmsgbot: mflaschen synchronized wmf-config/InitialiseSettings.php 'Deploy GettingStarted and GuidedTour to additional wikis. Add category-based suggestions to svwiki'

February 26

  • 22:57 greg-g: RobH set permissions on /srv/scap/docs directory to 2775 (was 2755)
  • 22:15 hashar: Jenkins restarted (yeah it is that fast tonight!!!!!)
  • 22:05 hashar: Restarting Jenkins, it exploded.
  • 22:01 hashar: Jenkins web interface died around Feb 24th 6pm UTC (private bug 61964). Impact is that one can't see the job details nor the console logs. Jobs are still triggered though since Zuul triggers them using a Gearman bus (yeah!!!). Way to solve it: restart Jenkins :-(
  • 21:25 gwicke: deployed Parsoid 804ead03 with deploy e22f0a0f76
  • 21:18 mutante: restarting parsoid on wtp1016
  • 19:47 mutante: DNS update - remove db9, already shutdown, (left mgmt)
  • 18:47 cmjohnson1: es1006 swapping failed disk
  • 17:33 ottomata: upgraded to librdkafka1 0.8.3 on cp3019, restarting varnishkafka
  • 16:39 paravoid: disabling puppet on sodium
  • 15:26 ottomata: rebooting analytics1004
  • 04:17 ori: enabling geo_cookie on cp1066 caused general protection fault, so reverted and restarted.
  • 03:14 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-26 03:14:37+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-26 02:41:01+00:00
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-26 02:12:36+00:00
  • 01:52 logmsgbot: aaron synchronized php-1.23wmf15/includes/filebackend '5a7a77cf3fd118bc70aa79993b85fc5e737d7526'
  • 00:07 mutante: restarting gitblit on antimony

February 25

  • 22:20 logmsgbot: ebernhardson synchronized php-1.23wmf15/extensions/Flow
  • 22:13 logmsgbot: ebernhardson synchronized wmf-config/Wikibase.php
  • 20:22 awight: updated crm from 37bb48ea2cf2bff656e97b8455115d38fa5c3883 to e2727e7b293e7a20cb87a9bfc6b52f545ba7b548
  • 19:47 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'I5a2b7b360be808e4780f14dda375af17930dec97'
  • 19:37 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: non wikipedias to 1.23wmf15
  • 19:20 logmsgbot: reedy synchronized php-1.23wmf15/extensions/Wikidata/
  • 18:49 RobH: cp4009 came back from errors after power removal rt6890
  • 18:12 Jeff_Green: patched OTRS for XSS vulnerability
  • 15:50 logmsgbot: reedy updated /a/common to Ib45270536: close ukwikimedia
  • 13:44 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 13:40 logmsgbot: reedy synchronized database lists files:
  • 10:43 hashar: Jenkins setting email-ext notification content type to HTML
  • 08:42 hashar: wrong channel, I am not upgrading any production varnishes but the beta cluster ones.
  • 08:42 hashar: deployment-prep Upgrading all varnishes.
  • 03:38 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-25 03:38:50+00:00
  • 03:06 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 direct api traffic to db1043'
  • 02:53 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-25 02:53:09+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-25 02:27:26+00:00
  • 02:23 awight: crm updated from c2e57365220e3ddbdea8cf99a052adbfa7a5bac8 to 37bb48ea2cf2bff656e97b8455115d38fa5c3883
  • 02:14 awight: updated crm from 41dce289bc15ea1ca638c37b29ff2e3e709a2251 to c2e57365220e3ddbdea8cf99a052adbfa7a5bac8
  • 00:30 logmsgbot: catrope synchronized php-1.23wmf15/extensions/VisualEditor/modules/ve-mw/ui/pages/ve.ui.MWSettingsPage.js 'Fix adding redirects in VE'
  • 00:28 logmsgbot: catrope synchronized php-1.23wmf15/includes/filebackend/FileBackendStore.php 'd52a8af6 for real this time'
  • 00:27 logmsgbot: catrope synchronized php-1.23wmf14/includes/filebackend/FileBackendStore.php 'efb1e99fd for real this time'
  • 00:23 mutante: Bugzilla - replacing custom template with existing hook (gerrit 114145)
  • 00:17 mutante: deploying new hook in Bugzilla's edit.html.tmpl for bug 36064
  • 00:09 mutante: restarting gitblit on antimony
  • 00:08 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'Fix popup video size by ordering transcode settings properly'
  • 00:08 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'Enable VE in the Recherche namespace on frwikiversity'
  • 00:07 logmsgbot: catrope synchronized visualeditor-default.dblist 'Enable VE by default on ptwikibooks'

February 24

  • 23:35 mutante: DNS update - removing kaulen
  • 23:33 RobH: all neon services no longer using wildcard, and wildcard shredded off system
  • 23:18 RobH: updating ishmael to use its own ssl cert
  • 23:01 RobH: icinga and icinga-admin now using their own certs
  • 22:55 mutante: kaulen - revoke puppet cert, revoke salt key, stored configs,...
  • 22:50 mutante: kaulen - shutdown -h now
  • 21:58 RobH: updated blog.w.o to wp3.8.1
  • 21:51 logmsgbot: demon synchronized php-1.23wmf15/extensions/Wikidata 'Updating wikidata build to fix test.wikidata'
  • 21:50 logmsgbot: aaron synchronized php-1.23wmf15/includes/filebackend 'd52a8af6c2f730d017e87e7217b6a0b299ab85be'
  • 21:49 logmsgbot: aaron synchronized php-1.23wmf15/includes/filebackend 'efb1e99fdf1e91bef4fc086b945b7933049e2a50'
  • 21:44 mutante: kaulen - stopping services, disabling monitoring
  • 21:26 RobH: techblog.w.o redirects now work without certificate errors
  • 21:24 RobH: updating blog apache configs to use techblog.w.o https cert
  • 21:23 gwicke: deployed Parsoid 51c71eb / deploy b684fea
  • 21:17 mutante: restarting parsoid on wtp1002
  • 20:33 bd808: Restarted elasticsearch on logstash1002 in attempt to clear stuck reallocations likely caused by OOM while running recovery
  • 19:29 Coren: remmoting virt1001 (sick stuck on bad mounts)
  • 17:06 bd808: Logs on logstash1003 show elasticsearch split brain starting at 2014-02-23T00:00:12. logstash1001 and logstash1003 both thought they were master. logstash1001 not responding to logstash1003's requests to become authoritative.
  • 16:38 bd808: Logstash elasticsearch split-brain resulted in loss of all logs for 2014-02-24 from 00:00Z to ~16:30Z
  • 16:16 bd808: Restarted elasticsearch on logstash1001
  • 03:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-24 03:27:55+00:00
  • 02:44 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-24 02:44:44+00:00
  • 02:31 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-24 02:31:50+00:00
  • 01:16 logmsgbot: tstarling updated /a/common/php-1.23wmf15 to I268599be9: [1.23wmf15] Make SiteStats (re)initializing more sane
  • 01:16 logmsgbot: tstarling synchronized php-1.23wmf14/includes/SiteStats.php

February 23

  • 20:29 Tim: updated ss_active_users on plwiki master to not be -1
  • 20:14 springle: killed SiteStatsInit from both wikiuser and wikiadmin on all s2 slaves
  • 20:01 Tim: killed SiteStatsInit queries on db1060
  • 19:57 logmsgbot: tstarling synchronized php-1.23wmf15/includes/SiteStats.php
  • 19:56 logmsgbot: tstarling synchronized php-1.23wmf14/includes/SiteStats.php
  • 19:48 RobH: operations folks are looking into site issues at present
  • 19:37 greg-g: < paravoid> something that has to do with SiteStatsInit, probably
  • 19:33 greg-g: < paravoid> it's all plwiki
  • 19:32 greg-g: < paravoid> tons of SELECT /* SiteStatsInit::edits */ COUNT(*) FROM `revision` LIMIT 1
  • 19:32 greg-g: < paravoid> it's s2
  • 02:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-23 02:08:36+00:00
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-23 02:02:44+00:00
  • 02:01 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-23 02:01:56+00:00

February 22

  • 03:16 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-22 03:16:14+00:00
  • 02:35 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-22 02:35:25+00:00
  • 02:21 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-22 02:21:50+00:00
  • 02:07 Coren: undid the cert change on the virt0 LDAP; this has subtle impact in some other places because of the RapidSSL cert and will need planning.
  • 01:05 Coren: Shutting down LDAP briefly on virt0 for a config switch

February 21

  • 22:12 mwalker: updated civicrm from eb3536eb32cbc7400e4e5884d56fbf104e38fc2b to 41dce289bc15ea1ca638c37b29ff2e3e709a2251 for thank you templates
  • 21:40 bd808: mw1047 and mw1079 errors cleared after apache-graceful
  • 21:29 mutante: graceful'ing apache on mw1047 and mw1079 by request
  • 21:25 bd808: mw1047 and mw1079 throwing PHP exception that looks like APC corruption
  • 20:35 logmsgbot: bd808 Finished scap: no-diff scap; recording asciicast (duration: 03m 13s)
  • 20:31 logmsgbot: bd808 Started scap: no-diff scap; recording asciicast
  • 18:57 logmsgbot: catrope synchronized php-1.23wmf15/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.ViewPageTarget.js 'touch'
  • 18:57 logmsgbot: catrope synchronized php-1.23wmf15/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js 'touch'
  • 18:56 logmsgbot: catrope synchronized php-1.23wmf14/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.ViewPageTarget.js 'touch'
  • 18:56 logmsgbot: catrope synchronized php-1.23wmf14/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js 'touch'
  • 18:55 bd808: The 4 hosts that failed scap-rebuild-cdbs were snapshot[1234]; can we pull them from mediawiki-installation dsh group?
  • 18:54 logmsgbot: bd808 Finished scap: no-diff scap to test script changes; expect l10n updates (duration: 13m 38s)
  • 18:54 logmsgbot: bd808 scap-rebuild-cdbs failed on 4 hosts
  • 18:50 bd808: The 4 hosts that failed scap-1 were snapshot[1234]; all have old/bad python installs
  • 18:49 logmsgbot: bd808 scap-1 failed on 4 hosts
  • 18:40 logmsgbot: bd808 Started scap: no-diff scap to test script changes; expect l10n updates
  • 18:35 bd808: Forced update of /svr/scap to 6203585 across cluster
  • 18:24 ottomata: initiating kafka preferred replica election to rebalance partition leaders
  • 18:19 bblack: cp1054 healthy now, rebuilding persistent cache from scratch there...
  • 15:30 Jeff_Green: dist-upgrade and reboot boron
  • 13:29 akosiaris: just resized 208.80.155.64/26 to 208.80.155.64/28. This is Sandbox1-b-eqiad subnet. dickson.freenode.net needs to have its netmask changed. I will talk with coren, mutante
  • 10:05 logmsgbot: ori updated /a/common to I10170d77c: Set $wmfExtendedVersionNumber = $wmfVersionNumber
  • 09:53 logmsgbot: ori synchronized php-1.23wmf14/extensions/MultimediaViewer/resources/mmv/mmv.performance.js 'I41b6e975353: Backport fix for stats.bandwidth == Infinity'
  • 07:32 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 db1027 full steam'
  • 07:24 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 repool db1027 warm up'
  • 07:01 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 depool db1027 for upgrade'
  • 05:48 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1056'
  • 05:33 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1056 schema change'
  • 05:30 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1051'
  • 05:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1051 schema change'
  • 04:38 Coren: rebooting labnet1001 -> nova-network seems quite dead
  • 03:48 logmsgbot: reedy updated /a/common to I8252f5a09: Switched from using dat to json files for wikiversions
  • 03:47 logmsgbot: reedy synchronized docroot/noc/conf/index.php
  • 03:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-21 03:45:24+00:00
  • 03:36 logmsgbot: reedy synchronized docroot and w
  • 03:34 logmsgbot: reedy synchronized docroot and w
  • 03:33 logmsgbot: reedy synchronized wikiversions-labs.json
  • 03:32 logmsgbot: reedy synchronized wikiversions.json
  • 03:31 logmsgbot: reedy synchronized wikiversions.cdb
  • 03:29 logmsgbot: reedy updated /a/common to Ic27cb9581: Fix CDB file generation
  • 03:03 logmsgbot: ori updated /a/common to Ic804421a6: Change wikiversions to use json
  • 02:57 logmsgbot: LocalisationUpdate completed (1.23wmf15) at 2014-02-21 02:57:44+00:00
  • 02:26 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-21 02:26:42+00:00
  • 01:18 springle: online schema change in progress gerrit 92037 user_password_expires
  • 00:25 logmsgbot: ebernhardson synchronized php-1.23wmf15/extensions/Flow/
  • 00:23 logmsgbot: ebernhardson synchronized php-1.23wmf14/extensions/Flow/
  • 00:17 Krinkle: Reloading zuul to deploy I37ce89455724ed15
  • 00:12 logmsgbot: maxsem synchronized php-1.23wmf14/extensions/ConfirmEdit/
  • 00:10 logmsgbot: maxsem synchronized php-1.23wmf14/includes/api/ApiCreateAccount.php
  • 00:07 logmsgbot: maxsem synchronized php-1.23wmf15/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/114644'
  • 00:05 logmsgbot: reedy Finished scap: no op scap is no op? (duration: 39m 23s)

February 20

  • 23:42 Krinkle: Reloading zuul to deploy I15c43ec458b053a9
  • 23:28 MaxSem: No it's not, you're adding to global warming
  • 23:26 logmsgbot: reedy Started scap: no op scap is no op?
  • 23:22 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'I62011ada8: Update Schema:Echo to revision 7572295'
  • 23:22 logmsgbot: ori updated /a/common to I62011ada8: Update Schema:Echo to revision 7572295
  • 22:55 logmsgbot: ori synchronized php-1.23wmf14/extensions/Echo 'Update Schema:Echo revision to r7572295 (bug 61698)'
  • 22:54 logmsgbot: ori synchronized php-1.23wmf15/extensions/Echo 'Update Schema:Echo revision to r7572295 (bug 61698)'
  • 22:46 logmsgbot: csteipp synchronized php-1.23wmf14/includes 'bugs 60771, 61346, 61362'
  • 22:40 logmsgbot: csteipp synchronized php-1.23wmf15/includes 'bugs 60771, 61346, 61362'
  • 22:30 logmsgbot: reedy synchronized wmf-config/
  • 22:30 logmsgbot: reedy synchronized database lists files:
  • 22:20 Reedy: Created EducationProgram tables on nlwiki
  • 21:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf15
  • 21:16 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf14
  • 21:08 logmsgbot: bd808 Finished scap: testwiki to 1.23wmf15 and build l10n cache (take 5) (duration: 58m 26s)
  • 20:10 logmsgbot: reedy updated /a/common to I5cb2f67ea: Add symlinks
  • 20:10 logmsgbot: bd808 Started scap: testwiki to 1.23wmf15 and build l10n cache (take 5)
  • 19:49 Reedy: I'm making a note here: HUGE SUCCESS. testwiki is on 1.23wmf15
  • 19:36 Reedy: Running sync-common on mw1017 for testwiki in the meantime
  • 19:31 logmsgbot: reedy scap aborted: testwiki to 1.23wmf15 and build l10n cache (take 4) (duration: 06m 29s)
  • 19:25 logmsgbot: reedy Started scap: testwiki to 1.23wmf15 and build l10n cache (take 4)
  • 19:24 logmsgbot: reedy scap aborted: testwiki to 1.23wmf15 and build l10n cache (take 3) (duration: 03m 29s)
  • 19:21 logmsgbot: reedy Started scap: testwiki to 1.23wmf15 and build l10n cache (take 3)
  • 19:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Stuff back to their deployed versions
  • 19:12 logmsgbot: reedy scap failed: CalledProcessError Command '['sudo', '-u', 'mwdeploy', '/usr/bin/rsync', '-a', '--delete-delay', '--delay-updates', '--compress', '--delete', '--exclude=**/.svn/lock', '--exclude=**/.git/objects', '--exclude=**/.git/**/objects', '--exclude=**/cache/l10n/*.cdb', '--no-perms', 'tin.eqiad.wmnet::common', '/usr/local/apache/common-local']' returned non-zero exit status 23 (duration: 02m 01s)
  • 19:10 logmsgbot: reedy Started scap: testwiki to 1.23wmf15 and build l10n cache (take 2)
  • 19:10 logmsgbot: reedy scap aborted: testwiki to 1.23wmf15 and build l10n cache (duration: 131m 36s)
  • 19:08 Reedy: For those still following along, scap is still running
  • 18:42 cmjohnson1: flipping the switch to off in tampa
  • 18:39 awight: crm updated from 07c1943088355bd262786d7134763549c8070ceb to eb3536eb32cbc7400e4e5884d56fbf104e38fc2b
  • 17:42 bd808: mw1151 & mw1152 throwing PHP Fatal error that looks to be APC or other cache corruption
  • 16:58 logmsgbot: reedy Started scap: testwiki to 1.23wmf15 and build l10n cache
  • 16:56 logmsgbot: reedy updated /a/common to I19a8c3298: Remove UserThrottle extension and log group
  • 03:26 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-20 03:26:31+00:00
  • 02:41 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-20 02:41:15+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-20 02:22:04+00:00
  • 02:17 awight: update crm from 20684a7e5dd4270e615ca52e6a17b25e880e47c4 to 07c1943088355bd262786d7134763549c8070ceb
  • 01:04 bd808: mw-update-l10n test on tin completed
  • 01:01 bd808: Testing changes to mw-update-l10n on tin
  • 00:51 bd808: mw-update-l10n test on tin completed successfully
  • 00:47 bd808: Testing changes to mw-update-l10n on tin
  • 00:16 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Wikiversities and itwikiquote get Cirrus as beta'
  • 00:10 logmsgbot: mholmquist synchronized wmf-config/CommonSettings.php 'Set MMV network performance sampling level to 0.1% on production wikis'
  • 00:09 logmsgbot: mholmquist synchronized wmf-config/CommonSettings-labs.php 'Set MMV network performance sampling level to 100% on labs'
  • 00:07 logmsgbot: mholmquist updated /a/common to I1319008cb: Start sampling detailed network performance for Multimedia Viewer

February 19

  • 23:47 paravoid: restarting gmetad on nickel
  • 23:10 mwalker: updating Fundraising SmashPig from 8a5e194a1c4827fc4fc75a2d6b298d3c30112d90 to 2fdf982b20f1cbeaf9f57af64ef21b5b69a36f6e for XFF logging
  • 21:41 logmsgbot: bd808 Finished scap: no-diff scap to test script changes (duration: 04m 52s)
  • 21:37 bd808: during scap: snapshot3: ImportError: No module named argparse
  • 21:36 logmsgbot: bd808 Started scap: no-diff scap to test script changes
  • 21:31 logmsgbot: aaron synchronized php-1.23wmf14/includes/filebackend/SwiftFileBackend.php '10ba3d7caa1e4bb4d521384bebbf42976cea4a22'
  • 21:31 logmsgbot: bd808 Finished scap: no-diff scap to test script changes (duration: 52m 06s)
  • 21:10 bd808: During scap: snapshot2: rsync error: timeout in data send/receive (code 30) at io.c(137) [sender=3.0.9]
  • 21:07 gwicke: deployed Parsoid deploy c73ea9d3 and code 76e9b66
  • 21:02 LeslieCarr: fixing firewall bastion to payments deny clause to actually deny
  • 20:59 ottomata: initiating controlled shutdown of analytics1022 kafka broker in order to reload configs for replica.lag.max.messages
  • 20:39 logmsgbot: bd808 Started scap: no-diff scap to test script changes
  • 20:19 ottomata: controlled shutdown of analytics1021 Kafka broker in order to reload configs for replica.lag.max.messages
  • 17:15 ^d: creating elasticsearch indexes for all wikiversities, may see some intermittent icinga spam as shards rebalance
  • 17:09 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'wikiversities and itwikiquote getting cirrus'
  • 17:08 logmsgbot: demon synchronized php-1.23wmf14/extensions/Elastica 'Elastica to master'
  • 17:07 logmsgbot: demon synchronized php-1.23wmf14/extensions/CirrusSearch 'Cirrus to master'
  • 17:07 logmsgbot: demon synchronized php-1.23wmf13/extensions/Elastica 'Elastica to master'
  • 17:06 logmsgbot: demon synchronized php-1.23wmf13/extensions/CirrusSearch 'Cirrus to master'
  • 06:53 springle: s5 slaves online reindexing wikidatawiki wb_terms
  • 06:01 springle: flowdb schema changes gerrit 111671
  • 04:58 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Turn on checkDelay for cirrus links update secondary jobs'
  • 03:23 andrewbogott: testing the log by logging a test
  • 02:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-19 02:55:36+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-19 02:27:49+00:00
  • 02:03 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-19 02:03:41+00:00
  • 00:04 mutante: restarting gitblit on antimony

February 18

  • 23:42 mutante: shutting down locke - killing 757 days of uptime and one more Tampa classic host
  • 23:38 mutante: locke - disable puppet, puppetstoredconfigclean on master, revoke puppet cert and salt key..
  • 22:59 logmsgbot: ebernhardson synchronized wmf-config/CommonSettings.php
  • 22:35 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php
  • 22:25 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php
  • 22:23 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php
  • 22:20 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php
  • 22:17 logmsgbot: ebernhardson synchronized php-1.23wmf14/extensions/Flow
  • 22:10 logmsgbot: ebernhardson synchronized php-1.23wmf13/extensions/Flow/includes/Data/RevisionStorage.php
  • 20:35 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Elastica is always included now'
  • 20:21 ottomata: upgraded librdkafka1 to 0.8.3 on cp1056, restarting varnishkafka there
  • 19:55 logmsgbot: aaron synchronized php-1.23wmf14/includes/filebackend/SwiftFileBackend.php '58fa613a75c2730cbf8f60e9e3f283a3f043f00b'
  • 19:45 ottomata: repooling cp3022 into bits esams. varnishkafka has emptied its outbuf since last night
  • 19:40 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: All non wikipedias to 1.23wmf14
  • 19:39 logmsgbot: reedy updated /a/common to I641a25ef9: Symlink in the extension-list files
  • 19:28 Krinkle: Jenkins jobs for npm are broken because the new integration-slave02 and integration-slave03 instances have SSL issues (different npm version and no certificates). And integration-slave01 (which was working) was deleted.
  • 04:11 logmsgbot: reedy synchronized docroot/noc/
  • 03:37 logmsgbot: reedy updated /a/common to Ifd09130b4: Ignore PhpStorm files
  • 02:53 springle: reindexing s1 slaves abuse_filter_log
  • 02:49 awight: payments updated from 9b936320b797bd01c4e61b1cd7c2e15b0820a24b to fe302a89e718dce7917acefb8c762ddc1c19c028
  • 02:48 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-18 02:48:14+00:00
  • 02:42 awight: rollback payments from 9e4d8b29581e2465d1acde8d7c2377fa6a8522a6 to 9b936320b797bd01c4e61b1cd7c2e15b0820a24b
  • 02:38 awight: rollback payments from ce6233998f4bc0266c2e027c44620a8ba9984681 to 9e4d8b29581e2465d1acde8d7c2377fa6a8522a6
  • 02:26 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-18 02:26:05+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-18 02:14:08+00:00

February 17

  • 20:56 ottomata: depooling cp3022.esams.wikimedia.org to investigate varnishkafka issues
  • 16:15 hashar: Jenkins deleting slave integration-slave01 (had only 2 CPU)
  • 16:14 hashar: Jenkins added two labs slaves with 4 CPU: integration-slave02 and integration-slave03
  • 08:46 hashar: Upgrading Jenkins, half an hour downtime
  • 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-17 03:19:28+00:00
  • 02:37 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-17 02:37:40+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-17 02:27:22+00:00

February 16

  • 19:32 logmsgbot: aaron synchronized php-1.23wmf13/includes/filebackend/SwiftFileBackend.php 'e14a87489d9f65fec85347c8e4a7825576f15be6'
  • 16:03 ottomata: restarted varnishkafka on esams bits varnishes
  • 15:58 ottomata: restarted varnishkafka on cp3019
  • 15:47 ottomata: starting kafka leader replica election to balance production load across both brokers evenly. Not yet sure why analytics1022 was the leader for all toppars…
  • 10:48 apergos: for the record, after the reboot I added back the 10.0.0.45 and ran start-nfs, still not happy
  • 10:11 ori: labstore4. dmesg: XFS (dm-0): xfs_log_force: error 5 returned. Rebooting.
  • 08:07 matanya: Labs NFS Issues: cannot open directory .: Stale NFS file handle XFS seems broken again
  • 02:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-16 02:42:25+00:00
  • 02:20 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-16 02:20:57+00:00
  • 02:11 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-16 02:11:05+00:00

February 15

  • 22:30 logmsgbot: reedy updated /a/common to Id33b8287c: Remove 1.23wmf1 through 1.23wmf5
  • 22:27 logmsgbot: reedy updated /a/common to I04d387adf: s1 substitute db1034 for db1055 during schema changes
  • 14:50 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 substitute db1034 for db1055 during schema changes'
  • 05:14 springle: xtrabackup clone db1055 to db1010
  • 03:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-15 03:31:58+00:00
  • 02:50 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-15 02:50:08+00:00
  • 02:29 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-15 02:29:05+00:00
  • 02:08 awight_: updated payments from 9e4d8b29581e2465d1acde8d7c2377fa6a8522a6 to ce6233998f4bc0266c2e027c44620a8ba9984681
  • 01:41 awight_: updated payments from 9b936320b797bd01c4e61b1cd7c2e15b0820a24b to 9e4d8b29581e2465d1acde8d7c2377fa6a8522a6
  • 00:55 logmsgbot: ori updated /a/common to I7e014f29e: Handle empty lines gracefully
  • 00:52 awight_: crm rolled back from 9385167c077ccef4c52db2f479b39e5907378b61 to 20684a7e5dd4270e615ca52e6a17b25e880e47c4
  • 00:50 awight_: update crm from 20684a7e5dd4270e615ca52e6a17b25e880e47c4 to 9385167c077ccef4c52db2f479b39e5907378b61
  • 00:44 logmsgbot: ori updated /a/common to Id87a90474: Remove unused third/fourth parameters for wikiversions

February 14

  • 23:59 logmsgbot: aaron synchronized php-1.23wmf13/extensions/Math '9e75a1b'
  • 23:31 Coren: tools Rebooted labstore4 -- XFS done got broken agun
  • 22:53 logmsgbot: kaldari synchronized php-1.23wmf13/extensions/VectorBeta/ 'sync update for VectorBeta on wmf13'
  • 22:30 logmsgbot: kaldari synchronized php-1.23wmf14/extensions/VectorBeta/ 'sync update for VectorBeta on wmf14'
  • 22:22 bd808: manually applied Ie56d3a5 on logstash100[123] hosts and restarted gmond
  • 22:22 bd808: enabled puppetd on logstash1002
  • 21:46 bd808: disabled puppetd on logstash1002 to test ganglia monitor fix
  • 21:43 bd808: logstash fatalmonitor dashboard working again after restarts to backend
  • 21:30 bd808: Upgraded and restarted elasticsearch on logstash1003
  • 21:29 bd808: Upgraded and restarted elasticsearch on logstash1001
  • 21:28 bd808: Upgraded and restarted elasticsearch on logstash1002
  • 21:25 awight_: crm updated from bd973432c957b6253f82b7b0590251402e75a0de to 20684a7e5dd4270e615ca52e6a17b25e880e47c4
  • 20:46 Reedy: ULSFO down, traffic to Asia etc affected. Being worked on
  • 20:43 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Revert Add local interwiki for metawiki'
  • 20:09 mwalker: updated payments wiki from bad46aad4b28435d10570aac8173a7f8dca8751d to 9b936320b797bd01c4e61b1cd7c2e15b0820a24b for version stamp and other 1.22.3 changes
  • 20:01 logmsgbot: ori synchronized php-1.23wmf13/extensions/NavigationTiming 'Update NavigationTiming for schema revision to 7494934'
  • 19:59 logmsgbot: ori synchronized php-1.23wmf14/extensions/NavigationTiming 'Update NavigationTiming for schema revision to 7494934'
  • 19:55 bd808: during scap test snapshot[1234] reported "sudo: no tty present and no askpass program specified"
  • 19:50 logmsgbot: bd808 finished scap: no-diff scap to test script changes (duration: 25m 26s)
  • 19:28 bd808: no-diff scap updated 366 JSON l10n files
  • 19:25 logmsgbot: bd808 started scap: no-diff scap to test script changes
  • 19:11 bd808: Updating scap on mediawiki-installation dsh hosts
  • 19:09 logmsgbot: aaron synchronized wmf-config/mc.php 'Set retry_timeout to -1 for memcached in eqiad only'
  • 19:02 bd808: Updated /src/scap on tin to b2d8042
  • 18:29 logmsgbot: aaron synchronized wmf-config/CommonSettings.php 'Set wgMathDisableTexFilter to fix performance regression'
  • 18:24 bd808: Starting ganglia-monitor on logstash1001. Filed bug 61384 about problem in elasticsearch_monitoring.py affecting the logstash cluster.
  • 17:53 bd808: Running gmond in foreground on logstash1001 to debug elasticsearch reporting
  • 17:11 bd808: Restarted ganglia-monitor on logstash1001 to see if that makes the elasticsearch and redis metrics show up in ganglia
  • 16:15 cmjohnson1: db1034 swapping cables
  • 15:26 manybubbles: aborted reindex due to https://bugzilla.wikimedia.org/show_bug.cgi?id=61377
  • 14:25 manybubbles: beginning cirrus reindex of all wikipedias running cirrus except enwiki
  • 14:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 db1056 full steam'
  • 13:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1056 warm up'
  • 13:39 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's6 pool db1030 depool db1010'
  • 13:30 apergos: powercycling ms-be1005, unresponsive even on mgmt console
  • 11:49 akosiaris: restarted parsoid on wtp1001
  • 11:47 akosiaris: depooled wtp1004
  • 11:44 akosiaris: restart parsoid on wtp1022.
  • 10:50 mark: restarted varnish backend on amssq57
  • 06:08 Krinkle: Reloading Zuul to deploy Ie02531143511f418a6
  • 05:29 logmsgbot: ori synchronized wmf-config/mc.php 'Iac9f51209: Revert "Set Memcached retry_timeout to -1"'
  • 05:28 logmsgbot: ori updated /a/common to Iac9f51209: Revert "Set Memcached retry_timeout to -1"
  • 04:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-14 04:05:23+00:00
  • 03:43 springle: restarted opendj on virt0
  • 03:20 logmsgbot: LocalisationUpdate completed (1.23wmf14) at 2014-02-14 03:20:37+00:00
  • 03:00 logmsgbot: mholmquist synchronized php-1.23wmf14/extensions/MultimediaViewer/resources/mmv/ui/mmv.ui.metadataPanel.js 'Fix for arrow keys in MultimediaViewer'
  • 02:34 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1056 for pt-table-sync bug 61319'
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-14 02:27:02+00:00
  • 02:24 logmsgbot: aaron synchronized wmf-config/mc.php 'Set Memcached retry_timeout to -1'
  • 01:15 springle: xtrabackup clone db1010 to db1030
  • 01:09 ori: restarting EventLogging on vanadium
  • 00:53 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'x1 depool db1030 for maintenance'
  • 00:41 logmsgbot: mflaschen synchronized php-1.23wmf14/tests/qunit/suites/resources/mediawiki/mediawiki.jqueryMsg.test.js 'Sync for GENDER fix to jQueryMsg'
  • 00:40 logmsgbot: mflaschen synchronized php-1.23wmf14/resources/mediawiki/ 'Sync for GENDER fix to jQueryMsg'
  • 00:36 logmsgbot: mflaschen synchronized php-1.23wmf13/tests/qunit/suites/resources/mediawiki/mediawiki.jqueryMsg.test.js 'Sync for GENDER fix to jQueryMsg'
  • 00:34 logmsgbot: mflaschen synchronized php-1.23wmf13/resources/mediawiki/ 'Sync for GENDER fix to jQueryMsg'
  • 00:16 logmsgbot: mholmquist synchronized php-1.23wmf14/extensions/MultimediaViewer/resources/mmv/mmv.lightboxinterface.js 'Fix for arrow keys in MultimediaViewer'
  • 00:14 logmsgbot: ebernhardson synchronized php-1.23wmf13/extensions/Flow
  • 00:07 logmsgbot: mholmquist synchronized php-1.23wmf13/extensions/MultimediaViewer/resources/mmv/mmv.lightboxinterface.js 'Fix for arrow keys in MultimediaViewer'

February 13

  • 23:41 mutante: restarting apache on zirconium, moved BZ site first and deleted old site
  • 23:36 logmsgbot: ori synchronized php-1.23wmf13/extensions/EventLogging 'Update EventLogging for I3de7c406f: Have mtime as calculated by startup module increase on schema change'
  • 23:34 logmsgbot: ori synchronized php-1.23wmf14/extensions/NavigationTiming 'Update NavigationTiming to master for Ic0c9060c5: Don't log 'desktop-beta' as mobileMode'
  • 23:33 logmsgbot: ori synchronized php-1.23wmf13/extensions/NavigationTiming 'Update NavigationTiming to master for Ic0c9060c5: Don't log 'desktop-beta' as mobileMode'
  • 23:28 logmsgbot: reedy synchronized wmf-config/
  • 23:28 logmsgbot: reedy synchronized database lists files:
  • 23:04 Reedy: Created Translate tables on otrs_wikiwiki
  • 22:56 logmsgbot: ori synchronized php-1.23wmf13/extensions/WikimediaEvents 'Update WikimediaEvents to master for If3d214319: Don't log NewEditorEdit for anons'
  • 22:54 logmsgbot: ori synchronized php-1.23wmf14/extensions/NavigationTiming 'Update NavigationTiming to master for I4aa367e96: Round 'mediaWikiLoadComplete' to comply with schema'
  • 22:51 logmsgbot: ori synchronized php-1.23wmf13/extensions/NavigationTiming 'Update NavigationTiming to master for I4aa367e96: Round 'mediaWikiLoadComplete' to comply with schema'
  • 22:33 logmsgbot: reedy synchronized wmf-config/ 'All of the changes'
  • 22:12 ori: restarting EL on vanadium
  • 21:09 mark: Reenabled OSPF on cr2-knams:xe-1/1/0.0
  • 21:00 mark: Disabling OSPF on cr2-knams:xe-1/1/0.0
  • 20:23 logmsgbot: reedy synchronized php-1.23wmf14/extensions/FlaggedRevs 'Fix syntax errors'
  • 20:15 MaxSem: Rebuilding GeoData index
  • 19:14 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf14
  • 19:07 Reedy: mw1163: ssh: connect to host mw1163 port 22: Connection timed out
  • 19:07 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf13, testwiki back to 1.23wmf13 too
  • 19:05 Reedy: mw1094 is segfaulting too, but not so much
  • 19:04 mwalker: updated civicrm from 935b6dade2c230558c12538ae61dcb3ef18b3efd to bd973432c957b6253f82b7b0590251402e75a0de for some new submodules and thank you updates
  • 19:04 Reedy: mw1185 is segfaulting a lot
  • 18:59 logmsgbot: reedy finished scap: testwiki to 1.23wmf14 and build l10n cache (duration: 41m 59s)
  • 18:17 logmsgbot: reedy started scap: testwiki to 1.23wmf14 and build l10n cache
  • 18:17 logmsgbot: reedy scap aborted: testwiki to 1.23wmf14 and build l10n cache (duration: 04m 05s)
  • 18:13 logmsgbot: reedy started scap: testwiki to 1.23wmf14 and build l10n cache
  • 18:09 logmsgbot: reedy scap failed: CalledProcessError Command '/usr/local/bin/mw-update-l10n' returned non-zero exit status 1 (duration: 03m 04s)
  • 18:06 logmsgbot: reedy started scap: testwiki to 1.23wmf14 and build l10n cache
  • 18:06 logmsgbot: reedy scap aborted: testwiki to 1.23wmf14 and build l10n cache (duration: 00m 04s)
  • 18:05 logmsgbot: reedy started scap: testwiki to 1.23wmf14 and build l10n cache
  • 17:58 logmsgbot: reedy updated /a/common to I5cde3917b: db1056 and db1036 full steam
  • 15:17 manybubbles: I'll start both this evening, time permitting.
  • 15:17 manybubbles: all of this week's Cirrus index updates are done except those for the wikipedias to which cirrus is deployed and commons.
  • 04:41 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1056 db1036 full steam'
  • 04:10 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1036 as slave'
  • 03:51 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1056 as slave'
  • 03:28 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-13 03:28:37+00:00
  • 02:57 mutante: DNS update - Bugzilla TTL back to 1H, migration over
  • 02:46 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-13 02:46:53+00:00
  • 02:33 springle_: db9 mysqld stopped for decom, db1001 slave stopped
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-13 02:27:05+00:00
  • 02:04 Krinkle: Jenkins has been unresponsive to urls that retrieve build results for the past few hours (e.g. https://integration.wikimedia.org/ci/job/mediawiki-core-jslint/11129/console)
  • 01:51 ^d: gerrit: turned BZ plugin back on since downtime is over. No less than 5 pings to do so ;-)
  • 01:36 awight: tools updated from ad90b56ab7e71142a7e98825bad3febe0823920a to 946011f1db995ee73794c6bfa2c40bf8628f960d
  • 00:56 mutante: Welcome to Bugzilla 4.4.1 in eqiad, served by puppet module bugzilla on zirconium
  • 00:55 mutante: DNS update - switching Bugzilla to zirconium
  • 00:50 springle: xtrabackup clone db1018 to db1036
  • 00:19 awight: checkout new phpmailer and drush
  • 00:15 awight: checkout twig and DonationInterface into new dirs

February 12

  • 23:53 awight: updated tools from 15fd20f80c52470033e09220a124e0d43bc42fd4 to ad90b56ab7e71142a7e98825bad3febe0823920a
  • 23:46 awight: tools updated from 7c2e15b006fec25fd1881c67f6bce9e406759535 to 15fd20f80c52470033e09220a124e0d43bc42fd4
  • 23:45 awight: tools updated from 7c4114f3db978996a460d154dcd5da2bd0ffb48d to 7c2e15b006fec25fd1881c67f6bce9e406759535
  • 23:29 springle: xtrabackup clone db1049 to db1056
  • 23:27 bd808: mw1185 continues to segfault at an average rate of ~800/hr for the last 48 hours
  • 23:06 mutante: checksetup.pl on zirconium doing db upgrades ..
  • 22:47 ^d: gerrit: disabled bz plugin during bz maintenance, spamming errors since it can't connect
  • 22:45 rdwrer: deployed change to jsduck jobs that will cause them to fail more often (but in a good way)
  • 22:39 mutante: Bugzilla in scheduled maintenance/upgrade period - be back in a bit
  • 22:31 logmsgbot: catrope synchronized php-1.23wmf13/extensions/VisualEditor/ 'cherry-picks'
  • 22:22 logmsgbot: catrope synchronized wmf-config/CommonSettings.php 'plumbing for as-yet-unused TemplateDataUseGUI setting'
  • 22:20 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'secondary tabs and disableforanons for eswiki'
  • 22:19 logmsgbot: catrope synchronized visualeditor-default.dblist 'VE no longer default on eswiki'
  • 22:17 logmsgbot: tstarling synchronized docroot/noc/conf/highlight.php
  • 22:17 logmsgbot: catrope synchronized visualeditor.dblist 'Enable VE on frwikiversity'
  • 22:17 logmsgbot: tstarling updated /a/common to I24ff4d72a: Replace easter egg by a more explaining message
  • 22:09 mutante: root@wtp1019:~# service parsoid restart
  • 22:07 mutante: restarting parsoids
  • 21:15 gwicke: updated parsoid to deploy repo 77f4aaf2 and code repo 96c1274
  • 21:02 mutante: DNS update - switch bz.wp to cluster redirect
  • 21:00 logmsgbot: csteipp finished scap: bug 60771 (duration: 36m 58s)
  • 20:56 mwalker: updating civicrm from 99463ceb353b452a82a7647baa88b59f65e981ea to 935b6dade2c230558c12538ae61dcb3ef18b3efd to try the queueing fix again
  • 20:23 logmsgbot: csteipp started scap: bug 60771
  • 20:13 mutante|away: DNS update - lowering bugzilla TTL
  • 19:48 manybubbles: when elastic1007 came back Elasticsearch wasn't able to balance shards to my liking so I'm forcing it to shuffle them some by temporarily lowering the disk thresholds to 75/80 instead of 85/95
  • 18:58 logmsgbot: yurik synchronized php-1.23wmf13/extensions/ZeroRatedMobileAccess/
  • 18:55 logmsgbot: yurik synchronized php-1.23wmf12/extensions/ZeroRatedMobileAccess/
  • 17:45 bd808: Updated scholarships.wm.o to fbffb96; l10n updates
  • 17:26 ottomata: purged oracle java 6 from analytics1012 and restarted hadoop daemons
  • 17:15 logmsgbot: demon synchronized php-1.23wmf13/extensions/CirrusSearch 'Performance fixes'
  • 17:14 logmsgbot: demon synchronized php-1.23wmf12/extensions/CirrusSearch 'Performance fixes'
  • 17:14 ottomata: stopping hadoop daemons on an12, oracle java 6 is running there, don't know why yet, apt says it is not installed….grrr
  • 16:51 awight: updated crm from 738bce68b56846ee3d493308629beaf743aa8653 to 99463ceb353b452a82a7647baa88b59f65e981ea (revert)
  • 14:59 manybubbles|away: just finished in place (Cirrus) reindex of all group1 wikis except commons
  • 08:50 akosiaris: restarted parsoid on wtp1004. After https://gerrit.wikimedia.org/r/112640 it was for some reason listening on 56547 and not 8000. The restart fixed it
  • 07:20 logmsgbot: aaron synchronized php-1.23wmf13/includes/filebackend/SwiftFileBackend.php '288941aa53adffb7de1af8ff4fbae4c1f0c26937 and 1bb7ef5a7816a57b38caa4a302fbff3e53f16d16'
  • 06:28 Tim: restarted apache on mw1184
  • 06:05 logmsgbot: tstarling synchronized php-1.23wmf13/includes/specials/SpecialUserrights.php
  • 04:48 Jeff_Green: disabling https/smtpd monitoring on iodine
  • 04:47 Jeff_Green: disabling mysql monitoring on db48 (prep before starting OTRS upgrade)
  • 03:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-12 03:19:05+00:00
  • 02:37 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-12 02:37:38+00:00
  • 02:35 mwalker: updated fundraising smashpig from 44e8ad85553bfcb4eeb46f67c3c534c5a7c4f8ad to 8a5e194a1c4827fc4fc75a2d6b298d3c30112d90 for queue handling updates
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-12 02:18:44+00:00
  • 02:02 mwalker: updating fundraising civicrm from 2c152494487f7a212fca1f501291e050873e6738 to 738bce68b56846ee3d493308629beaf743aa8653 for reporting fix
  • 00:36 mwalker: updated fundraising civicrm from 06f7e4d6d6c2653f1d4aef1e8b8e6293a82b39ef to 2c152494487f7a212fca1f501291e050873e6738 for the eventual removal of the _badmsg queues
  • 00:03 springle: db1050 alive. do not repool.

February 11

  • 23:17 logmsgbot: ebernhardson synchronized php-1.23wmf13/extensions/Flow
  • 23:09 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php
  • 23:04 logmsgbot: ebernhardson synchronized wmf-config/InitialiseSettings.php
  • 22:46 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/112576'
  • 22:38 logmsgbot: ebernhardson synchronized wmf-config/
  • 22:33 logmsgbot: ebernhardson synchronized php-1.23wmf13/extensions/Flow
  • 22:32 logmsgbot: ebernhardson synchronized php-1.23wmf12/extensions/Flow
  • 22:23 logmsgbot: ebernhardson synchronized wmf-config/
  • 22:19 logmsgbot: maxsem finished scap: Guess what? MobileApp! (duration: 42m 13s)
  • 21:56 manybubbles: starting reindex of Cirrus indexes for wikis that got the new deployment today
  • 21:37 logmsgbot: maxsem started scap: Guess what? MobileApp!
  • 21:06 bblack: libav-tools (libavcodec53, libavformat53, etc) upgraded to 0.8.10-0ubuntu0.12.04.1 on all applicable toollabs hosts
  • 20:15 bblack: libav-tools (libavcodec53, libavformat53, etc) upgraded to 0.8.10-0ubuntu0.12.04.1 on all image/video scalers
  • 20:04 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: Move group1 wikis to wmf13
  • 20:00 logmsgbot: demon updated /a/common to I7530d4456: Revert "Non wikipedias to 1.23wmf12"
  • 19:56 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: Move group2 wikis back to wmf12 where they belong
  • 19:28 logmsgbot: demon synchronized wikiversions.cdb 'manually syncing cuz sync-wikiversions is busted for me'
  • 19:28 logmsgbot: demon synchronized wikiversions.dat 'manually syncing cuz sync-wikiversions is busted for me'
  • 19:26 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias to 1.23wmf12
  • 19:01 logmsgbot: reedy updated /a/common to Iecde9fe5a: Set enwiki back to 'All articles needing copy edit'
  • 18:50 logmsgbot: mflaschen synchronized wmf-config/InitialiseSettings.php 'Sync InitialiseSettings.php for Growth deploy after fixing enwiki cat'
  • 18:13 logmsgbot: mflaschen synchronized wmf-config/InitialiseSettings.php 'Sync InitialiseSettings.php for Growth deploy'
  • 18:13 logmsgbot: mflaschen synchronized wmf-config/CommonSettings.php 'Sync CommonSettings.php for Growth deploy'
  • 18:06 logmsgbot: mflaschen synchronized php-1.23wmf13/extensions/GettingStarted/ 'Sync GettingStarted on wmf13 again to fix our foreachwiki script'
  • 18:05 logmsgbot: mflaschen synchronized php-1.23wmf12/extensions/GettingStarted/ 'Sync GettingStarted on wmf12 again to fix our foreachwiki script'
  • 17:48 logmsgbot: mflaschen synchronized php-1.23wmf13/extensions/GettingStarted/ 'Sync GettingStarted on wmf13 for i18n rollout'
  • 17:46 logmsgbot: mflaschen synchronized php-1.23wmf12/extensions/GettingStarted/ 'Sync GettingStarted on wmf12 for i18n rollout'
  • 16:18 mutante: revoking puppet cert for locke
  • 16:00 mutante: locke is being decom'ed momentarily (purchase date 2006-12-04, heh)
  • 15:56 bd808: mw1094 segfaulting since 2014-02-11T22:42. Current rate ~45/hr.
  • 15:55 bd808: mw1185 continues to segfault at a rate of ~1000/hr
  • 15:48 mutante: DNS update - removing harmon
  • 08:39 logmsgbot: nikerabbit synchronized wmf-config/CommonSettings.php 'uls prep'
  • 08:38 logmsgbot: nikerabbit synchronized wmf-config/InitialiseSettings.php 'uls prep'
  • 08:10 logmsgbot: nikerabbit updated /a/common to Ib325d4aa9: Update ULS config
  • 07:47 gwicke: deployed parsoid hotfix to avoid recursive log spew filling up the logs
  • 03:39 RoanKattouw: Freed up disk space on wtp* by blanking /var/log/parsoid/parsoid.log
  • 03:29 RoanKattouw: Managed to restart Parsoid cleanly in the end. Turns out dsh -g parsoid service restart parsoid doesn't work but dsh -cM -g parsoid /etc/init.d/parsoid restart does work
  • 03:23 RoanKattouw: Doing rolling restart of the Parsoid cluster
  • 03:20 RoanKattouw: Copying wtp1005's Parsoid log for future reference
  • 02:43 Tim: restarting apache on servers with workers in a futex wait: mw1056,mw1082,mw1189,mw1196,mw1199,mw1201,mw1202,mw1203,mw1204,mw1208
  • 02:40 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-11 02:40:11+00:00
  • 02:19 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-11 02:19:13+00:00
  • 02:10 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-11 02:10:38+00:00
  • 02:02 awight: update crm from 7e5786aaf71363c04e5c766e6b403fa4767fe51a to 06f7e4d6d6c2653f1d4aef1e8b8e6293a82b39ef
  • 01:25 mwalker: updated fundraising civicrm from d96164f76877d75ab97c02e6b47449f7a45b31b3 to 7e5786aaf71363c04e5c766e6b403fa4767fe51a for unsubscribe and quick search changes
  • 00:56 ^demon: gerrit upgraded from 2.8.1 stable to 2.8.1-1-g83098d0 (custom build) to work around mysql issue pending upstream release.
  • 00:34 logmsgbot: maxsem finished scap: MobileApp deployment (duration: 28m 19s)
  • 00:06 logmsgbot: maxsem started scap: MobileApp deployment
  • 00:04 logmsgbot: maxsem scap aborted: MobileApp deployment (duration: 06m 33s)

February 10

  • 23:58 logmsgbot: maxsem started scap: MobileApp deployment
  • 23:03 mutante: all parsoid machines redeployed per gwicke's
  • 23:00 bd808: mw1185 segfaulting starting at 22:39Z. ~240 occurrences in last 20 minutes
  • 22:59 logmsgbot: aaron synchronized php-1.23wmf13/includes/db/LoadBalancer.php '8f6471e04ce0f33c64c090cbe5561deed82f60ee'
  • 22:59 springle: restarting db1050 for investigation
  • 22:45 mutante: restarting parsoid on wtp1008
  • 22:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'sync proper non-hot depool db1050'
  • 22:39 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'move s1 vslow dump'
  • 22:36 logmsgbot: maxsem synchronized wmf-config 'https://gerrit.wikimedia.org/r/112597'
  • 22:34 mark: Power cycled ms-be1001
  • 22:32 springle: pt-kill jobs on s1 slaves killing anything sleeping longer than 10s
  • 22:28 springle: killed thousands of broken connections on s1 slaves in Sleep state
  • 22:25 logmsgbot: maxsem scap aborted: Extension:MobileApp deployment (duration: 14m 41s)
  • 22:23 matanya: big dberror spike. "Error connecting" to various ips from various ips
  • 22:17 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1050 crashed, depool'
  • 22:12 mutante: fixing broken parsoid deploy on wtp*, one by one
  • 22:10 logmsgbot: maxsem started scap: Extension:MobileApp deployment
  • 22:02 mutante: wtp1016 - delete deployment/parsoid, salt-call fetch/checkout.., restart parsoid
  • 21:57 gwicke: unsuccessful Parsoid deploy as trebuchet failed to update the submodule with the parsoid source, need trebuchet bug fix
  • 21:21 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'touch'
  • 21:11 logmsgbot: catrope synchronized visualeditor-default.dblist 'fix missing entries'
  • 20:14 mutante: harmon - revoke puppet cert,disable puppet,disable icinga notifications, shutting down
  • 20:09 mutante: harmon - removing from puppet stored configs, complete decom, unused Tampa spare
  • 17:45 mwalker: updated civicrm from 97a5146124168096148b6167e2968052b3dda468 to d96164f76877d75ab97c02e6b47449f7a45b31b3 for thank you translations
  • 16:28 manybubbles: correction: done with link count update for cirrus
  • 16:28 manybubbles: done with links count update for cirurs
  • 15:58 hashar: Jenkins: migrating labs jenkins-deploy user homedir from /home/jenkins-deploy (GlusterFS) to local directories under /mnt/home/jenkins-deploy to avoid GlusterFS and race conditions between instances. bug 61144
  • 15:53 manybubbles: reindex went well. performing a links recount so we can push more code changes next week safely.
  • 15:28 manybubbles: reindexing phase 0 wikis after Cirrus deploy last Thursday
  • 12:20 hashar: Jenkins: deleted /srv/slave-scrips from old jenkins servers, everything should now use /srv/deployment/integration/slave-scripts
  • 03:41 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 's2 switch master to db1024 (pmtpa)'
  • 03:40 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 switch master to db1024 (eqiad)'
  • 03:03 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 'prepare for s2 master rotation db1036 to db1024 (pmtpa)'
  • 03:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'prepare for s2 master rotation db1036 to db1024 (eqiad)'
  • 02:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1060 warm up'
  • 02:33 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1002, depool db1060 schema changes'
  • 02:31 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-10 02:31:08+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-10 02:15:59+00:00
  • 02:08 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-10 02:08:14+00:00
  • 02:07 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1002 schema changes'

February 9

  • 02:34 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-09 02:34:15+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-09 02:17:56+00:00
  • 02:09 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-09 02:09:57+00:00

February 8

  • 21:00 logmsgbot: aaron synchronized php-1.23wmf12/extensions/Math 'd96b29ca8b17f35e7068f0d3a16b5e2644e084f9'
  • 20:41 logmsgbot: aaron synchronized php-1.23wmf13/extensions/Math '4844f52139593f4a324bf99b74d7abb91aac2e54'
  • 09:31 apergos: powercycling ms-be1001, 'soft lockup cpu stuck' on console, no login prompt
  • 02:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-08 02:47:46+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-08 02:27:14+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-08 02:13:35+00:00
  • 00:57 Nemo_bis: Job queue rather long (400k on en.wiki), OTRS reports of password resets not being delivered (bug 43936?), almost no jobs run today according to gdash

February 7

  • 22:22 mwalker: updating payments wiki from 5f042ece4fa9d19293415ed41536ef93392c44ab to bad46aad4b28435d10570aac8173a7f8dca8751d for minfraud... again
  • 21:46 logmsgbot: catrope synchronized php-1.23wmf13/extensions/MobileFrontend/includes/MobileFrontend.hooks.php 'Use feature flag for Minerva Beta Feature'
  • 21:45 logmsgbot: catrope synchronized php-1.23wmf13/extensions/MobileFrontend/MobileFrontend.php 'Feature flag for Minerva Beta Feature'
  • 21:34 mwalker: updated paymentswiki from fac21c9d394f6d7c9da172cee18c805304c8fe1b to 5f042ece4fa9d19293415ed41536ef93392c44ab for Amazon Recurring and Minfraud health check
  • 19:27 mutante: running puppet on db9,pc1,pc2 etc, fixes mysqld process monitoring
  • 19:04 mutante: revoked grosley's puppet cert
  • 18:49 akosiaris: reenable puppet on mchenry as well as the exim cronjob.
  • 18:27 akosiaris: temporarily disabling puppet on mchenry and then disable collect_exim_stats_via_gmetric cron. Seems like mchenry has no ganglia at all
  • 13:10 paravoid: staggered restart of cp4xxx localssl, to deploy Ie94ccc (committed Oct 29th)
  • 03:52 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-07 03:52:20+00:00
  • 03:50 Ryan_Lane: removing myself from ops and wmf groups
  • 03:12 logmsgbot: LocalisationUpdate completed (1.23wmf13) at 2014-02-07 03:12:27+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-07 02:27:34+00:00
  • 01:39 bd808: segfaults on mw1215 stopped after graceful restart
  • 01:22 ori: graceful'd mw1215
  • 01:19 bd808: mw1215 logging 10-30 segfaults per minute since 19:56
  • 00:54 logmsgbot: kaldari synchronized php-1.23wmf12/extensions/MobileFrontend 'syncing MobileFrontend make sure all the js is up to date'
  • 00:49 ori: graceful'd mw1142
  • 00:43 logmsgbot: krinkle synchronized php-1.23wmf12/extensions/VisualEditor 'I1cc789596dd (re-sync, forgot to update inner submodule)'
  • 00:31 logmsgbot: krinkle synchronized php-1.23wmf12/extensions/VisualEditor 'I1cc789596dd'
  • 00:29 logmsgbot: krinkle synchronized php-1.23wmf13/extensions/VisualEditor 'I156b24551a40'
  • 00:21 logmsgbot: ebernhardson synchronized php-1.23wmf12/extensions/Flow/ 'LD two patches to Flow - 1.23wmf12'
  • 00:21 logmsgbot: ebernhardson synchronized php-1.23wmf12/extensions/Echo/ 'LD two patches to Echo - 1.23wmf12'
  • 00:15 logmsgbot: ebernhardson synchronized php-1.23wmf13/extensions/Flow/ 'LD two patches to Flow-1.23wmf13'
  • 00:14 logmsgbot: ebernhardson synchronized php-1.23wmf13/extensions/Echo/ 'LD two patches to Echo-1.23wmf13'
  • 00:10 logmsgbot: maxsem synchronized wmf-config/squid.php 'https://gerrit.wikimedia.org/r/111927'

February 6

  • 23:15 mwalker: updating fundraising civicrm for thankyou messages from 309a69545b4c89b323bffc500f71bdee82c35e42 to 97a5146124168096148b6167e2968052b3dda468
  • 23:03 Reedy: running sync-common on mw1142
  • 22:41 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 db1055 resume groupLoadsByDB'
  • 22:28 logmsgbot: demon synchronized php-1.23wmf13/extensions/Math 'Unfataling wmf13 Math too'
  • 22:28 logmsgbot: demon updated /a/common/php-1.23wmf13 to I5ba48ca51: Reverting Math to known-good 2b8534793fad9db18fcdb9621dc8d79ff36fdeb1
  • 22:16 logmsgbot: demon synchronized php-1.23wmf12/extensions/Math 'New math causes cache stampedes but without fatals'
  • 21:54 logmsgbot: demon synchronized php-1.23wmf13/extensions/Math 'Revert to known-good 2b85347 from wmf11'
  • 21:52 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedia back to 1.23wmf12, group0 to 1.23wmf13, Math reverted to 1.23wmf11 state in both branches
  • 21:50 logmsgbot: demon updated /a/common/php-1.23wmf13 to I233eaf25c: Revert "Add sequence support for externallinks table"
  • 21:40 logmsgbot: demon updated /a/common/php-1.23wmf12 to I91e982773: Update Echo and Flow
  • 21:39 logmsgbot: demon synchronized php-1.23wmf12/extensions/Math/ 'Revert to known-good 2b85347 from wmf11'
  • 21:17 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1009 warm up'
  • 19:45 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Revert earlier version changes
  • 19:31 logmsgbot: reedy started scap: run 2, should be a noop
  • 19:28 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf13
  • 19:13 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf12
  • 18:36 logmsgbot: demon synchronized wmf-config/CirrusSearch-labs.php 'Enabled interwiki searches for beta -- no-op in prod'
  • 17:45 paravoid: depooling mw1165
  • 17:42 Reedy: mw1165 is segfaulting a lot
  • 17:18 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki back to 1.23wmf12 till window
  • 17:17 logmsgbot: reedy finished scap: testwiki to 1.23wmf13 and build l10n cache (duration: 69m 51s)
  • 16:50 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 16:48 Nemo_bis: #Wikipedia and #Wikimedia wikis have been partly down for about 10 min, now recovered/ing
  • 16:41 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 16:07 logmsgbot: reedy started scap: testwiki to 1.23wmf13 and build l10n cache
  • 16:04 logmsgbot: reedy started scap: testwiki to 1.23wmf13 and build l10n cache
  • 16:00 logmsgbot: reedy started scap: testwiki to 1.23wmf13 and build l10n cache
  • 15:57 hashar: restarted jenkins by mistake :-(
  • 15:28 logmsgbot: reedy updated /a/common to I1e5ea52ae: s2 increase db1024 load after warm up
  • 13:13 paravoid: US/Canada pacific states being served by ulsfo
  • 12:43 paravoid: pointing the rest of East Asia (except CN) to ulsfo
  • 11:14 hashar: jenkins: added label hasTox on integration-slave01.pmtpa.wmflabs. Will let us run tox based Jenkins jobs there.
  • 11:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 increase db1024 load after warm up'
  • 07:29 Ryan_Lane: redeploying parsoid/deploy on wtp*
  • 07:01 springle: xtrabackup clone db1018 to db1009 (take #2)
  • 06:47 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 pool db1024 warm up'
  • 05:48 ori: varnish on cp1054: CPU wait spiked at 05:27. dmesg|tail: XFS: possible memory allocation deadlock in kmem_alloc. not investigating further.
  • 05:44 logmsgbot: ori finished scap: no new code. testing scap changes. (again.) (duration: 05m 00s)
  • 05:39 logmsgbot: ori started scap: no new code. testing scap changes. (again.)
  • 05:15 springle: restart labsdb1003 mariadb instances
  • 05:09 Ryan_Lane: scratch that, will redeploy parsoid/deploy in about an hour
  • 05:08 Ryan_Lane: redeploying parsoid/deploy on wtp*
  • 04:28 logmsgbot: ori finished scap: no new code. testing scap changes. (duration: 04m 35s)
  • 04:24 logmsgbot: ori started scap: no new code. testing scap changes.
  • 03:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-06 03:22:26+00:00
  • 03:10 springle: xtrabackup clone db1018 to db1024
  • 02:45 springle: xtrabackup clone db1034 to db1009
  • 02:44 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-02-06 02:44:10+00:00
  • 02:23 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1009 schema changes'
  • 02:21 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-06 02:21:47+00:00
  • 02:01 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 repool db1034 warm up'
  • 01:06 mutante: apt-get remove libnet-stomp-perl on neon, i just removed that from puppet but didn't think it should stay in as an "absent" package forever
  • 01:00 logmsgbot: ori finished scap: no-diff scap to test script changes (duration: 02m 34s)
  • 00:58 logmsgbot: ori started scap: no-diff scap to test script changes
  • 00:44 ebernhardson: finished deploying Special:Notifications fix to Echo and Flow
  • 00:44 logmsgbot: ebernhardson synchronized php-1.23wmf11/extensions/Flow/ 'Update flow for Special:Notifications fix'
  • 00:43 logmsgbot: ebernhardson synchronized php-1.23wmf11/extensions/Echo/ 'Update echo for Special:Notifications fix'
  • 00:38 logmsgbot: ebernhardson synchronized php-1.23wmf12/extensions/Flow/ 'Update flow for Special:Notifications fix'
  • 00:37 logmsgbot: ebernhardson synchronized php-1.23wmf12/extensions/Echo/ 'Update echo for Special:Notifications fix'
  • 00:04 logmsgbot: ori finished scap: (no message) (duration: 29m 26s)
  • 00:04 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files:

February 5

  • 23:34 logmsgbot: ori started scap: (no message)
  • 22:11 ottomata: stopping puppet on analytics1021. Trying to get it to catch up on replica lag
  • 20:13 mwalker: updated fundraising civicrm from ec1e1fc75193ac8c0e6237fc0b80f58b3f8d55fc to 309a69545b4c89b323bffc500f71bdee82c35e42 for logging updates
  • 18:57 logmsgbot: yurik synchronized php-1.23wmf12/extensions/ZeroRatedMobileAccess/
  • 18:49 logmsgbot: yurik synchronized php-1.23wmf11/extensions/ZeroRatedMobileAccess/
  • 18:38 logmsgbot: yurik synchronized docroot/bits/WikipediaMobileFirefoxOS
  • 14:10 paravoid: ms-be1002/sdd: megacli -DiscardPreservedCache, -CfgEachDskRaid0, puppet run
  • 13:57 paravoid: remove XO-Level3 avoided as-path from cr1/2-eqiad despite no ticket reply; seems to work now
  • 13:57 hashar: compressing Jenkins console logs on gallium.wikimedia.org using gzip -9
  • 08:47 logmsgbot: ori finished scap: (no message) (duration: 03m 31s)
  • 08:47 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 08:44 logmsgbot: ori started scap: (no message)
  • 08:17 Ryan_Lane: restart parsoid on wtp1012
  • 08:13 Ryan_Lane: stopping parsoid on wtp1012 shortly
  • 07:53 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 07:51 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 07:15 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 06:59 logmsgbot: ori finished scap: (no message) (duration: 00m 35s)
  • 06:59 logmsgbot: ori rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 06:59 logmsgbot: ori started scap: (no message)
  • 05:52 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 'update pmtpa config for s1 master change'
  • 05:47 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'switch s1 master to db1052'
  • 05:32 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'prepare for s1 master rotation db1056 to db1052'
  • 05:18 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-02-05 05:18:46+00:00
  • 05:07 ori: Last sync-file: connect to host mw1163 port 22: Connection timed out
  • 05:07 logmsgbot: ori synchronized README 'Ensuring that sync-file works after Ia210f3ced'
  • 05:04 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-05 05:04:13+00:00
  • 04:23 logmsgbot: LocalisationUpdate failed: git pull of extensions failed
  • 03:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 depool db1055 schema changes'
  • 03:49 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's1 repool db1043 warm up'
  • 02:01 logmsgbot: LocalisationUpdate failed: git pull of extensions failed
  • 01:54 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's2 depool db1034 for schema changes'
  • 00:56 logmsgbot: springle synchronized wmf-config/db-eqiad.php 's3 db1027 full steam'
  • 00:51 springle: schema changes db1047 s1-analytics-slave repl stopped
  • 00:21 logmsgbot: krinkle synchronized php-1.23wmf12/extensions/VisualEditor 'I1214378b5452b37'
  • 00:20 logmsgbot: krinkle synchronized php-1.23wmf11/extensions/VisualEditor 'Idfbbf2e43a7de'
  • 00:16 logmsgbot: kaldari synchronized php-1.23wmf12/extensions/VectorBeta

February 4

  • 23:39 logmsgbot: ori started scap: Test of scap modifications; no changes going out.
  • 23:38 logmsgbot: ori started scap: Test of scap modifications; no changes going out.
  • 23:24 logmsgbot: ori started scap: Test of scap modifications; no changes going out.
  • 22:34 rdwrer: deployed mwalker's chicken-out commit making sphinx non-voting, to gallium
  • 22:31 logmsgbot: ori updated /a/common to Ia15a12587: Revert "Add 'scap' submodule"
  • 22:03 logmsgbot: bsitu synchronized php-1.23wmf12/extensions/Flow 'Update Flow'
  • 22:03 logmsgbot: aaron synchronized php-1.23wmf12/includes/specials/SpecialActiveusers.php '6e1fd797c58c4ce01c19c57d9ffe06b13acc816a'
  • 21:49 manybubbles1: rebuilding search indexes for non-wikipedias after cirrus update on the train went to them earlier today.
  • 21:30 logmsgbot: bsitu synchronized php-1.23wmf11/extensions/Flow 'Update Flow'
  • 20:52 logmsgbot: aaron synchronized php-1.23wmf12/includes 'faf18db37b0553bd4f42b72d24cc6f7f297a0b5f'
  • 19:10 rdwrer: deployed zuul change for fundraising's very own mwalker
  • 19:08 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Update non wikipedias to 1.23wmf12
  • 18:36 ottomata: disabled puppet on cp3019 again, trying batch.num.messages there (just want these flappy alerts to be quiet!)
  • 17:43 logmsgbot: demon synchronized php-1.23wmf11/extensions/LiquidThreads/pages/TalkpageView.php
  • 17:43 logmsgbot: demon synchronized php-1.23wmf12/extensions/LiquidThreads/pages/TalkpageView.php
  • 17:31 ottomata: reenabling puppet on cp3021, this reverts the batch.num.messages varnishkafka.conf change, let's see if cp3021 starts dropping again tomorrow
  • 17:29 ottomata: reenabling puppet on cp3019
  • 17:05 hashar: Jenkins: updated bin/multigit.sh script to point to zuul.eqiad.wmnet instead of non working integration.wikimedia.org
  • 13:51 logmsgbot: reedy updated /a/common to I8a1341149: increase db1059 load (96G ram compared to 64G for s4 siblings)
  • 11:17 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1059 LB increase'
  • 11:08 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1027 in s3, warm up'
  • 03:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-04 03:41:31+00:00
  • 02:46 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-04 02:46:12+00:00
  • 02:25 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-02-04 02:25:34+00:00
  • 02:01 logmsgbot: krinkle synchronized php-1.23wmf12/extensions/VisualEditor/modules/ve-mw/ui/inspectors/ve.ui.MWGalleryInspector.js 'touch for scap i18n race condition breakage'
  • 01:58 logmsgbot: krinkle synchronized php-1.23wmf11/extensions/VisualEditor/modules/ve-mw/ui/inspectors/ve.ui.MWGalleryInspector.js 'touch for scap i18n race condition breakage'
  • 01:47 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'in s1: db1049 to full steam, repool db1050, depool db1043 for schema changes'
  • 01:45 logmsgbot: krinkle synchronized php-1.23wmf11/extensions/VisualEditor/modules/ve-mw/ui/inspectors/ve.ui.MWExtensionInspector.js 'touch for scap i18n race condition breakage'
  • 01:40 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1059, warm up'
  • 01:03 logmsgbot: mwalker finished scap: Krinkle, Mwalker and superm401 LD super scap (duration: 26m 24s)
  • 00:41 logmsgbot: mwalker started scap: Krinkle, Mwalker and superm401 LD super scap
  • 00:03 logmsgbot: jgonera finished scap (duration: 23m 55s)

February 3

  • 23:44 logmsgbot: jgonera started scap
  • 23:37 logmsgbot: jgonera synchronized php-1.23wmf11/
  • 23:32 logmsgbot: aaron synchronized php-1.23wmf12/includes/filebackend/FileOpBatch.php 'aa3cb6330324ff18ce35fbb68f67b42035de3496'
  • 23:29 logmsgbot: maxsem synchronized wmf-config/mobile.php 'https://gerrit.wikimedia.org/r/110087'
  • 23:27 logmsgbot: aaron synchronized php-1.23wmf12/includes/filebackend/SwiftFileBackend.php '1eb5600fe6518071b4c3d2eaa980f70143451be0'
  • 23:09 springle: xtrabackup clone db1035 to db1027
  • 22:40 springle: xtrabackup clone db1042 to db1059
  • 22:26 bd808: Forced puppet run on logstash1001 so I could watch the logs as it restarted
  • 22:24 bd808: puppetd --enable on logstash1001
  • 21:34 logmsgbot: bsitu synchronized wmf-config/InitialiseSettings.php 'Enable Flow on two enwiki WikiProject pages'
  • 21:29 ottomata: stopping puppet on cp3021 to troubleshoot varnishkafka delivery errors with a different change than cp3019
  • 21:27 bsitu: that was not our change, it's previous commit. Ours is "Enable Flow on two enwiki WikiProject pages"
  • 21:25 logmsgbot: bsitu updated /a/common to I365611ca3: Bump wgCacheEpoch for Wikidata
  • 21:06 ottomata: stopping puppet on cp3019 to investigate varnishkafka errors
  • 20:46 bd808: Disabled puppet on logstash1001 to work around bug 60772 with local config hack
  • 20:41 logmsgbot: reedy synchronized wmf-config/
  • 20:30 logmsgbot: reedy synchronized php-1.23wmf12/extensions/Wikidata/
  • 20:28 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1049, warm up'
  • 20:15 logmsgbot: awight synchronized php-1.23wmf11/extensions/EducationProgram
  • 20:11 Reedy: mw1163 seems to be down. No ping/ssh response
  • 20:06 logmsgbot: awight synchronized php-1.23wmf12/extensions/EducationProgram
  • 19:51 Reedy: fixed user_last_timestamp column on wikis where it was not "varbinary(14) NULL default NULL" - bug 49196
  • 19:38 awight: crm updated from 95665543297b4f6016e443bcd2194fa5e0f0c96f to ec1e1fc75193ac8c0e6237fc0b80f58b3f8d55fc
  • 19:34 manybubbles1: removed elastic1007 from the elasticsearch cluster because it crashed again unexpectedly
  • 19:33 gwicke: deployed parsoid deploy 855eeb06 with source 4d5d9a0a
  • 19:11 Reedy: Added org_last_active_date column to ep_orgs on all wikis running EducationProgram - bug 60775
  • 18:19 bd808: Disabled puppet on logstash1001 to debug filters
  • 18:16 logmsgbot: reedy synchronized multiversion/
  • 18:15 logmsgbot: reedy updated /a/common to Ie5e46a9fe: getMWVersion.php prints its result, don't print it here too
  • 18:12 paravoid: setting cr1-ulsfo xe-0/0/3 (GTT) down for redundancy testing
  • ... morebots / wm-bot were away ...
  • 16:07 maxsem synchronized wmf-config/Wikibase.php 'https://gerrit.wikimedia.org/r/#/c/110955/'
  • ... morebots / wm-bot were away ...
  • 10:07 apergos: powercycled mw1054 and mw1073, both hung even at mgmt console; 1073 had / fs errors on reboot, repaired
  • ... morebots / wm-bot were away ...
  • 09:08 logmsgbot: hashar synchronized docroot/noc/conf/highlight.php 'reinstaure easter egg'
  • 09:07 logmsgbot: hashar synchronized docroot/noc/conf/highlight.php 'reinstaure easter egg'
  • ... morebots / wm-bot were away ...
  • 06:55 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1049 for schema changes'
  • 05:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1051, warm up'
  • 03:40 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1051 for schema changes'
  • 02:15 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-03 02:15:14+00:00
  • 02:06 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-03 02:06:16+00:00
  • 02:06 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-02-03 02:03:45+00:00
  • 02:06 Reedy: wmflabs bastion /home is giving "Read-only file system"
  • 01:27 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 'rotate s4 master db1059 to db1040'
  • 01:26 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'rotate s4 master db1059 to db1040'
  • 01:08 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'prepare for s4 master rotation db1059 to db1040'

February 2

  • 22:01 logmsgbot: ori synchronized tests/noc-conf/highlightTest.php 'I8352c4cfc: Fix test broken by Idda8cff80 (2/2)'
  • 21:59 logmsgbot: ori synchronized docroot/noc/conf/highlight.php 'I8352c4cfc: Fix test broken by Idda8cff80 (1/2)'
  • 21:58 logmsgbot: ori updated /a/common to I8352c4cfc: Fix test broken by Idda8cff80
  • 21:40 logmsgbot: ori synchronized docroot/noc/conf/highlight.php 'Idda8cff80: Replace easter egg by a more explaining message'
  • 21:37 logmsgbot: ori updated /a/common to Idda8cff80: Replace easter egg by a more explaining message
  • 20:20 awight_: enabling campaigns: C14_enWW_frm_FR C13_wpdr_enSG_FR C13_wpnd_mlotWW_FR C13_wpnd_zhCN_FR C13_wpnd_mlWW_FR
  • 19:03 awight_: disabled campaigns: C14_enWW_frm_FR C13_wpdr_enSG_FR C13_wpnd_mlotWW_FR C13_wpnd_zhCN_FR C13_wpnd_mlWW_FR
  • 18:15 awight_: stomp updated from 0fa5283 to 361c219
  • 15:13 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1037 and db1004'
  • 10:53 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1037, schema changes'
  • 09:52 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1004, schema changes'
  • 05:47 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1011, warm up'
  • 03:51 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-02 03:50:59+00:00
  • 03:11 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-02 03:11:33+00:00
  • 02:44 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-02-02 02:44:17+00:00

February 1

  • 20:20 logmsgbot: ori synchronized php-1.23wmf11/extensions/FeaturedFeeds 'I8a600d37a: Reject FeedItem timestamps set too far in the future'
  • 20:19 ori: sync-dir: mw1054: No route to host; mw1073: Connection timed out
  • 20:18 logmsgbot: ori synchronized php-1.23wmf12/extensions/FeaturedFeeds 'I8a600d37a: Reject FeedItem timestamps set too far in the future'
  • 12:21 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1052 increase LB'
  • 11:26 paravoid: restarting gdnsd on ns0/1/2 to reload maxmind geoip databases (FOSDEM geolocated esams, among others ;))
  • 02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-02-01 02:25:34+00:00
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-02-01 02:17:03+00:00
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-02-01 02:02:04+00:00

January 31

  • 23:04 awight_: updated tools to 7c4114f
  • 23:03 awight_: updated crm from 99c1702 to 9566554
  • 21:47 logmsgbot: ori finished scap: scap 1.23wmf12 for wikidata i18n changes (duration: 14m 26s)
  • 21:37 logmsgbot: demon synchronized php-1.23wmf12/extensions/LiquidThreads/pages/TalkpageView.php 'Fix for broken search bar'
  • 21:36 logmsgbot: demon synchronized php-1.23wmf11/extensions/LiquidThreads/pages/TalkpageView.php 'Fix for broken search bar'
  • 21:35 logmsgbot: ori started scap: scap 1.23wmf12 for wikidata i18n changes
  • 21:29 logmsgbot: ori synchronized php-1.23wmf12/extensions/Wikidata 'Iab3e51adf, I1364bfc86, I25dbed450: bug-fixes for 60670'
  • 20:40 mwalker|alt: updating civicrm from 600e09e to deploy master for thank you messages
  • 20:04 ori: stopping sysv parsoid service on wtp1015 to test I74ba5f649
  • 16:55 ottomata: rebooting analytics1003
  • 16:43 ottomata: rebooting an03 to test kafka ulimits
  • 16:22 ottomata: reenabling puppet on cp3019
  • 16:01 ottomata: stopping puppet on cp3019 to experiment with varnishkafka buffer levels
  • 14:45 ottomata: analytics1022 back up with higher nofile ulimit, now handling all kafka traffic, analytics1021 won't boot
  • 14:10 ottomata: upping nofile open file limit on analytics1021 and analytics1022, rebooting
  • 06:59 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1052, warm up'
  • 06:51 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1020 full steam. depool db1011 for schema changes'
  • 05:03 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1052 for schema changes'
  • 04:57 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1020, warm up'
  • 03:40 logmsgbot: tstarling finished scap: mostly no-op, did a reset to revert the wikidata changes (duration: 05m 00s)
  • 03:37 logmsgbot: tstarling started scap: mostly no-op, did a reset to revert the wikidata changes
  • 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-31 02:07:39+00:00
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-01-31 02:02:37+00:00
  • 02:01 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-31 02:01:50+00:00
  • 01:25 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1050 for schema changes'
  • 00:19 springle: xtrabackup clone db1042 to db1020
  • 00:19 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-31 00:19:28+00:00
  • 00:10 logmsgbot: LocalisationUpdate completed (1.23wmf12) at 2014-01-31 00:10:17+00:00

January 30

  • 23:40 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-30 23:40:27+00:00
  • 23:14 logmsgbot: demon synchronized wmf-config/Wikibase.php 'I50dfdc42: Enable quantity values'
  • 22:59 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: wikidatawiki to wmf12
  • 22:36 RobH: powercycling mw1061, system frozen
  • 22:35 RobH: rebooting frozen systems mw1057, mw1058, mw1060
  • 22:22 RobH: powercycling mw1039 as it's also crashed
  • 22:19 RobH: mw1033 crashed, powercycling
  • 22:14 RobH: mw1036 crashed, unresponsive to console or ssh, rebooting
  • 21:09 logmsgbot: reedy synchronized php-1.23wmf12/extensions/Wikidata/
  • 20:38 logmsgbot: reedy finished scap: Scap take 2 for 1.23wmf12 (duration: 18m 08s)
  • 20:22 logmsgbot: reedy started scap: Scap take 2 for 1.23wmf12
  • 20:16 logmsgbot: reedy synchronized php-1.23wmf12/extensions/ContactPageFundraiser
  • 20:04 logmsgbot: reedy synchronized php-1.23wmf12/extensions/ContactPageFundraiser/ContactPage.php
  • 19:55 Reedy: Update indexes on wb_terms for testwikidatawiki https://gerrit.wikimedia.org/r/#/c/99660
  • 19:52 Reedy: Changed wb_items_per_site.ips_row_id and wb_terms.term_row_id to BIGINT on testwikidatawiki
  • 19:43 logmsgbot: reedy synchronized php-1.23wmf12/includes/api/ApiQueryRevisions.php 'bug 60635'
  • 19:36 logmsgbot: reedy synchronized php-1.23wmf12/extensions/PdfHandler
  • 19:31 manybubbles: rebuilding search index on test2wiki went perfectly. proceeding with test, testwikidatawiki, and mediawikiwiki
  • 19:29 manybubbles: rebuilding search index for test2wiki and checking that everything is sane
  • 19:25 awight_: updated crm from 2516fe6 to 600e09e
  • 19:20 logmsgbot: reedy synchronized wmf-config/
  • 19:20 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.23wmf12
  • 19:19 logmsgbot: reedy updated /a/common to Ie5c11239f: Update php symlink to php-1.23wmf11
  • 19:08 Reedy: testwiki back to 1.23wmf11 even
  • 19:05 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: all wikipedias to 1.23wmf11. testwiki back to 1.23wmf10
  • 19:02 logmsgbot: reedy finished scap: testwiki to 1.23wmf12 and build l10n cache (duration: 18m 45s)
  • 19:01 awight_: crm updated from 9dc1887 to 2516fe6
  • 18:57 ^d: jenkins: aborted another one of those pywiki jobs, was starting to back things up again. this job is broken methinks.
  • 18:46 logmsgbot: reedy started scap: testwiki to 1.23wmf12 and build l10n cache
  • 18:42 logmsgbot: reedy synchronized docroot and w
  • 18:42 logmsgbot: reedy synchronized php-1.23wmf12 'staging'
  • 18:32 ^d: jenkins: killed job, queue seems to be trying to catch up now
  • 18:31 ^d: jenkins backed up due to pywikibot-core-test job stuck again
  • 18:25 awight_: crm updated from 7c67502 to 9dc1887
  • 17:51 logmsgbot: reedy updated /a/common to Id0633e5a0: depool db1020 for schema changes
  • 17:41 cmjohnson1_: labnet1001 down for card install
  • 17:26 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Id0dea6e4: Revert "New extra language for wikidata: Ottoman Turkish (ota)"'
  • 17:20 awight_: crm updated from 704a029 to 7c67502
  • 04:09 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1020 for schema changes'
  • 03:21 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1042'
  • 02:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-30 02:09:15+00:00
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-30 02:02:54+00:00
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-30 02:02:05+00:00
  • 01:04 awight: crm updated from f71526f to e0dbcef
  • 00:52 awight: rollback crm to f71526f
  • 00:51 awight: crm updated from f71526f to 37c6dc5
  • 00:40 awight: rollback crm to f71526f
  • 00:39 awight: updated crm from f71526f to fa36e6c
  • 00:18 awight: updated payments from 3cda905 to fac21c9

January 29

  • 23:58 RobH: all lists.w.o updates done and now on individual certificate
  • 23:26 awight: rollback payments to 3cda905
  • 23:14 awight: updated payments from 65d6fa8 to bcf5ec4
  • 22:41 awight: updating payments from 3cda905 to 65d6fa8
  • 22:12 RobH: otrs now using its own cert per rt 6702, confirmed working chain
  • 21:48 RobH: svn.w.o on own cert, confirmed chain is properly functioning
  • 21:45 RobH: updating svn to use own cert, service disruption may (but shouldn't) occur
  • 19:49 cmjohnson1: tungsten replacing failing disk at slot 10
  • 19:42 cmjohnson1: db1002 replacing disk at slot 1
  • 18:53 manybubbles: shutting down searchidx1001 for hardware fix
  • 17:27 logmsgbot: demon synchronized wmf-config/db-eqiad.php 'Removing underscores from class names'
  • 17:27 logmsgbot: demon synchronized wmf-config/db-labs.php 'Removing underscores from class names'
  • 17:26 logmsgbot: demon synchronized wmf-config/db-pmtpa.php 'Removing underscores from class names'
  • 17:06 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Enable Cirrus for huwiki + some shard config'
  • 17:05 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Turn commons file searching back on'
  • 16:53 LeslieCarr: aggregated labnet1001 secondary port
  • 14:50 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1040 LB full steam'
  • 14:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1040, warm up'
  • 12:06 andrewbogott: deployed (broken!) havana/nova on virt1000, 1001, 1002, 1003, labnet1001. Should be safe, but any recent breakage on virt1000 is most likely a side-effect.
  • 09:15 paravoid: reenable ospfv3 on the eqiad/esams link
  • 08:54 paravoid: applying NTP access lists on cr{1,2}-{esams,knams,eqiad,pmtpa,sdtpa,ulsfo}, csw2-esams, pfw1-eqiad
  • 07:45 springle: powercycled unresponsive db1042, /a tank data mount failed on boot, vgchange -a y + mount + xfs_check. still investigating
  • 06:51 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1042 for schema changes'
  • 02:58 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-29 02:58:21+00:00
  • 02:31 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-29 02:31:30+00:00
  • 02:16 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-29 02:16:57+00:00
  • 00:34 logmsgbot: reedy updated /a/common to I2294bac73: Throttle now handles IP ranges.
  • 00:34 ori: another recurrent error in antimony:/var/log/upstart/gitblit.log : "org.eclipse.jgit.api.errors.JGitInternalException: Garbage collection failed." repeats for each repository. traces: <http://p.defau.lt/?VKPpRReIO7aGDxmHfSJeWw>
  • 00:34 logmsgbot: reedy synchronized wmf-config/missing.php
  • 00:28 ori: gitblit on antimony crashed with org.eclipse.jetty.io.EofException. trace: <http://p.defau.lt/?1E8Wwj7_089XS6dtu6nqfA>. lots of java.lang.NullPointerException due to malformed URLs, but these appear to happen continuously.
  • 00:20 logmsgbot: reedy synchronized wmf-config/

January 28

  • 23:35 Reedy: Created FlaggedRevs tables on cewiki
  • 22:27 logmsgbot: reedy updated /a/common to I6b16562d0: Rename phase1 dblist to group0
  • 22:19 logmsgbot: reedy synchronized database lists files:
  • 22:19 logmsgbot: reedy synchronized docroot and w
  • 20:31 RobH: fixing virt0 wikitech cert, wikitech may restart
  • 19:46 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 19:45 logmsgbot: reedy synchronized database lists files:
  • 19:28 logmsgbot: reedy synchronized php-1.23wmf11/extensions/Wikibase
  • 19:23 logmsgbot: reedy updated /a/common to I4abcbb866: Non Wikipedias to 1.23wmf11
  • 19:22 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non Wikipedias to 1.23wmf11
  • 18:40 logmsgbot: reedy updated /a/common to Ic2ea727b4: Fixup default file permissions
  • 18:36 logmsgbot: reedy updated /a/common to If45e33ae9: Move re-usable code from checkoutMediaWiki to checkoutMediaWiki.php
  • 18:20 logmsgbot: reedy updated /a/common to Iac50ca3fb: Deploy Extension:MobileApp to betalabs
  • 17:33 cmjohnson1: replacing ethernet cable db1024 rt6672
  • 11:45 springle: running fresh s5 dump for toolserver on db73
  • 10:41 mutante: professor: revoked puppet cert, deleted salt minion key, powering down. bye bye
  • 10:34 mutante: disabled host and service checks/notifications for professor. running puppet on icinga
  • 10:28 mutante: removed professor from dsh,puppet,puppetmaster: on palladium Killing professor.pmtpa.wmnet...done.
  • 04:18 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1040 for schema changes'
  • 04:18 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 'depool db1040 for schema changes'
  • 03:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-28 03:32:36+00:00
  • 03:11 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 'rotate s6 master, demote db1027, promote db1023'
  • 03:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'rotate s6 master, demote db1027, promote db1023'
  • 02:57 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1006 to LB 400. prep db1027 for s6 master rotation'
  • 02:53 springle: wikitech /a full again as per Ryan Lane email to ops@ on 2014-01-14. Deleted two oldest backup sets *2014012[01]*.
  • 02:52 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-28 02:52:02+00:00
  • 02:49 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-28 02:25:44+00:00
  • 01:43 ^d: gerrit back up, running v2.8.1 stable now
  • 01:36 ^d: gerrit down for upgrade
  • 01:26 logmsgbot: bsitu synchronized php-1.23wmf11/extensions/Flow 'Update Flow with some special contribs cherry-picks'
  • 01:19 logmsgbot: ori synchronized php-1.23wmf10/extensions/WikimediaEvents
  • 01:18 logmsgbot: ori synchronized php-1.23wmf11/extensions/WikimediaEvents
  • 01:05 logmsgbot: ori synchronized php-1.23wmf10/extensions/WikimediaEvents 'Ifc697cbe6: Revert I829790cd5, removing module storage logging'
  • 00:56 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1006, warm up'
  • 00:56 logmsgbot: ori synchronized php-1.23wmf11/extensions/WikimediaEvents 'Ifc697cbe6: Revert I829790cd5, removing module storage logging'

January 27

  • 22:02 logmsgbot: csteipp synchronized php-1.23wmf10/includes 'bug 60339'
  • 21:58 logmsgbot: csteipp synchronized php-1.23wmf11/includes 'bug 60339'
  • 21:08 logmsgbot: csteipp synchronized php-1.23wmf10/extensions/PdfHandler 'bug 60339'
  • 21:07 logmsgbot: csteipp synchronized php-1.23wmf11/extensions/PdfHandler 'bug 60339'
  • 21:02 ori: dns update
  • 20:54 ori: reload-vcl on cp1043: Backend host '"holmium.eikimrfis.org"' could not be resolved to an IP address: Name or service not known. (culprit: https://gerrit.wikimedia.org/r/#/c/109008/)
  • 20:35 logmsgbot: aaron synchronized php-1.23wmf11/includes/actions/InfoAction.php '5f94782'
  • 19:12 logmsgbot: reedy synchronized docroot/bits/favicon/commons.ico
  • 19:08 logmsgbot: reedy synchronized wmf-config/
  • 19:07 logmsgbot: ori synchronized php-1.23wmf10/extensions/WikimediaShopLink 'I21e34fe0f: Update WikimediaShopLink to master for PHP link insert'
  • 19:05 logmsgbot: ori synchronized php-1.23wmf11/extensions/WikimediaShopLink 'I21e34fe0f: Update WikimediaShopLink to master for PHP link insert'
  • 18:45 ori: cp1054 appears to have recovered, ops to investigate
  • 18:37 ori: dmesg on cp1054 [16167206.414548] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
  • 18:35 ori: varnishd saturating cpu on cp1054
  • 17:26 logmsgbot: csteipp synchronized php-1.23wmf10/extensions/TimedMediaHandler 'bug 56699 refix'
  • 13:35 springle: xtrabackup clone db1022 to db1006
  • 13:26 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1006'
  • 10:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1015, warm up'
  • 07:58 springle: xtrabackup clone db1022 to db1015
  • 07:48 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1015'
  • 07:00 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1023'
  • 04:10 springle: xtrabackup clone db1022 to db1023
  • 03:16 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1010, depool db1023'
  • 03:09 andrewbogott: I'm about to do a bunch of experimental bullshit in the production puppet repo, because I can't test network topology in labs.
  • 02:44 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-27 02:44:23+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-27 02:22:27+00:00
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-27 02:12:03+00:00

January 26

  • 21:42 MaxSem: Recreating GeoData index
  • 06:41 logmsgbot: reedy updated /a/common to I3f9759db9: Let admins add users to new groups on zhwikivoyage
  • 03:01 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-26 02:02:40+00:00
  • 02:01 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-26 02:01:52+00:00

January 25

  • 07:47 ori: auth-dns update for I0a3bf9967: Add carbon-relay & statsd service aliases
  • 03:58 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 03:50 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'Set wgVariantArticlePath for zhwikivoyage'
  • 03:49 logmsgbot: reedy updated /a/common to If943ca6e3: repool db1018
  • 02:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-25 02:45:51+00:00
  • 02:26 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-25 02:26:16+00:00
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-25 02:14:21+00:00
  • 00:43 awight: payments updated from ad67f1c to b458c8c
  • 00:43 awight: legacy-listener updated from 1ebcfa8 to 5e18193
  • 00:42 awight: SmashPig updated from 3a2a746 to d0d59d5
  • 00:42 awight: tools updated from 7411ecf to d67b054

January 24

  • 21:52 RobH: old outdated (had already been revoked) star.wikimedia.org cert removed from fenari
  • 21:22 RobH: confirmed no services using old star.wikimedia.org cert on zirconium, shredded cert
  • 21:07 RobH: removing wildcard cert from formey, which appears to be a now defunct test host per rt6134.
  • 07:18 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1018'
  • 03:02 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-24 03:02:40+00:00
  • 02:34 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-24 02:34:30+00:00
  • 02:31 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1018 for schema changes'
  • 02:26 springle: xtrabackup clone db1022 to db1010
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-24 02:18:35+00:00
  • 02:08 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1010 for schema changes'

January 23

  • 23:39 hashar: Zuul fixed up by killing a stuck job. That somehow prevented Zuul from triggering more jobs :/
  • 23:33 hashar: killed job https://integration.wikimedia.org/ci/job/pywikibot-core-tests/412/
  • 23:32 hashar: jenkins / zuul broken somehow with jobs triggered but not being run in jenkins. Looking
  • 22:20 paravoid: testing ulsfo redundancy, ignore possible ulsfo alerts
  • 20:45 paravoid: setting new ulsfo-eqiad transport
  • 20:45 paravoid: setting up new transit for ulsfo
  • 20:19 RobH: further tweaks on gallium, had to fully disable all 443 use, should all be ok now
  • 19:51 RobH: gallium has had a few patchsets applied recently for removing https support on it, apache restarted, etc.
  • 18:30 hashar: Gerrit: granted jenkins-mwext-sync user the ability to CR+2, V+2 and submit changes on mediawiki/extensions.git repository
  • 18:08 cmjohnson1: lutetium replacing failing disk at slot 8
  • 18:07 cmjohnson1: ms-be1008 replacing failing disk slot 11
  • 14:14 mutante: fixing cron spam from planet, was caused by wrong permissions on logs, did somebody run update as root? fixed, but please use sudo -u planet
  • 09:27 ori: restarting carbon on tungsten
  • 05:33 logmsgbot: reedy updated /a/common to Iab1639fd8: Add meta charset
  • 05:33 logmsgbot: reedy synchronized docroot/noc/conf/index.php
  • 05:30 logmsgbot: reedy synchronized docroot/noc/conf/index.php
  • 05:30 logmsgbot: reedy updated /a/common to Ie8d6d2266: Fix broken a tags
  • 04:10 logmsgbot: reedy synchronized docroot/noc/conf/index.php
  • 04:07 logmsgbot: reedy synchronized multiversion/
  • 04:05 logmsgbot: reedy synchronized docroot and w
  • 04:04 logmsgbot: reedy updated /a/common to I06964b159: dirname( __FILE__ ) to __DIR__
  • 02:18 logmsgbot: reedy synchronized php-1.23wmf11/extensions/EducationProgram/
  • 02:14 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I0dbeff07d: : true everywhere'
  • 02:13 logmsgbot: ori updated /a/common to I0dbeff07d: $wgResourceLoaderStorageEnabled: true everywhere
  • 02:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-23 02:09:57+00:00
  • 02:03 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-23 02:03:47+00:00
  • 02:02 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I16738b6ef: : true everywhere except enwiki'
  • 02:02 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-23 02:02:42+00:00
  • 02:02 logmsgbot: ori updated /a/common to I16738b6ef: $wgResourceLoaderStorageEnabled: true everywhere except enwiki
  • 01:08 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'I321ce06ae: Enable module storage on it, de, es and nlwikis'
  • 01:08 logmsgbot: ori updated /a/common to I321ce06ae: Enable module storage on it, de, es and nlwikis
  • 00:56 hasharMeeting: integration.wikimedia.org and doc.wikimedia.org have been successfully migrated behind the misc varnish
  • 00:52 ori: doing staggered graceful restarts on bits app servers for I9e311c98f
  • 00:37 hasharMeeting: migrating (with RobH) integration.wikimedia.org behind misc varnish
  • 00:24 hasharMeeting: restarted zuul again
  • 00:17 hasharMeeting: upgrading Zuul (lightning deploy of doom)

January 22

  • 23:40 hasharMeeting: Going to upgrade Zuul, need a patch to make it send Cache-Control: no-cache so we can migrate Zuul service behind misc varnish (being done with RobH)
  • 22:58 ottomata: restarted icinga (a while ago) and puppet freshness checks are all boogery(?) will check up on this tomorrow
  • 22:51 RobH: doc.wikimedia.org updates complete, once dns re-propagates it should be again reachable over https
  • 22:43 RobH: doc.wikimedia.org and doc.mediawiki.org relocating behind misc-web-lb. dns ttl was set to 5 minutes yesterday, so this change should result in less than 5 minutes of unreachability by anyone
  • 22:09 logmsgbot: reedy updated /a/common to Ie95fdf8e9: Adjust linkpurge and renderfile limits
  • 21:26 matanya: git.wikimedia.org is back up
  • 20:44 RobH: all plugins updated except caching cuz i hate it and i plan to move it behind misc-web-lb
  • 20:42 RobH: blog updated
  • 19:40 matanya: git.wikimedia.org is down
  • 19:32 logmsgbot: reedy synchronized wmf-config/
  • 17:40 RobH: virt0 puppet agent re-enabled and resumed normal service
  • 17:05 RobH: issues with change, don't re-enable puppet agent on virt0 yet
  • 16:59 RobH: rolling certificate changes to virt1000 and virt0. virt0 puppet agent disabled for moment
  • 16:33 Nemo_bis: still no data in Global_JobQueue_length ganglia graph on hume (3 days now)
  • 14:34 logmsgbot: csteipp synchronized php-1.23wmf10/includes/media 'bug60339'
  • 14:33 logmsgbot: csteipp synchronized php-1.23wmf11/includes/media 'bug60339'
  • 11:17 apergos: powercycled kaulen, it had died the swap death
  • 10:02 apergos: restarted mw1206 a little while ago, it was inaccessible even via mgmt
  • 06:31 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 05:37 ori: restarting hafnium for kernel upgrade
  • 03:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-22 03:33:21+00:00
  • 00:55 marktraceur: deployed config for jsduck jobs for MultimediaViewer
  • 00:22 logmsgbot: demon synchronized php-1.23wmf11/extensions/CirrusSearch 'Cirrus to master, with better job queues'
  • 00:21 logmsgbot: demon synchronized php-1.23wmf10/extensions/CirrusSearch 'Cirrus to master, with better job queues'
  • 00:18 logmsgbot: demon synchronized php-1.23wmf10/extensions/Elastica 'Elastica to master'
  • 00:15 logmsgbot: demon synchronized php-1.23wmf10/extensions/CirrusSearch 'Rolling back Cirrus -- fatals'
  • 00:11 logmsgbot: demon synchronized php-1.23wmf10/extensions/CirrusSearch 'Cirrus to master, with better job queues'

January 21

  • 23:55 marktraceur: deployed documentation jenkins jobs for MultimediaViewer
  • 23:52 awight: crm updated from e82d2c5 to f71526f
  • 20:24 cmjohnson1: removing iptable rules for loudon and erzurumi
  • 20:20 awight: updated listeners from dc21b3b to 5e18193
  • 20:06 awight: updated listeners from 888561a to dc21b3b
  • 19:51 awight: updated listeners from 7c5396c to 888561a
  • 19:42 logmsgbot: ori synchronized wmf-config/CommonSettings.php 'Ie9ff56755: Add wmgUniversalLanguageSelectorDefault (2/2)'
  • 19:42 logmsgbot: ori synchronized wmf-config/InitialiseSettings.php 'Ie9ff56755: Add wmgUniversalLanguageSelectorDefault'
  • 19:40 logmsgbot: ori updated /a/common to Ie9ff56755: Add 'wmgUniversalLanguageSelectorDefault'
  • 19:17 logmsgbot: ori synchronized php-1.23wmf10/extensions/UniversalLanguageSelector/Resources.php 'I05c76e478: Make ext.uls.mediawiki depend upon ext.uls.init'
  • 19:07 ori: sync-file: mw1206: ssh: connect to host mw1206 port 22: No route to host
  • 19:07 logmsgbot: ori synchronized php-1.23wmf11/extensions/UniversalLanguageSelector/Resources.php 'I05c76e478: Make ext.uls.mediawiki depend upon ext.uls.init'
  • 18:58 awight: updated SmashPig to 43a27bd
  • 17:06 logmsgbot: demon finished scap: No code updates, just rebuilding all i18n (duration: 58m 29s)
  • 16:12 logmsgbot: demon started scap: No code updates, just rebuilding all i18n
  • 11:59 twkozlowski: bits application servers back to normal since around 11:46 UTC
  • 11:32 ori: bits application server overload due to I71b70d8ee
  • 11:25 logmsgbot: ori synchronized php-1.23wmf10/extensions/UniversalLanguageSelector 'I71b70d8ee: Add user preference to enable ULS'
  • 11:24 logmsgbot: ori synchronized php-1.23wmf11/extensions/UniversalLanguageSelector 'I71b70d8ee: Add user preference to enable ULS'
  • 08:19 andrewbogott: For wikitech labs interface: deployed new servicegroup schema and supporting code; ran maintenance/transitionServiceGroupSchema.php
  • 03:47 logmsgbot: ori synchronized php-1.23wmf10/extensions/TimedMediaHandler/TimedMediaHandler.hooks.php 'Update TimedMediaHandler for I7a6da6c62'
  • 03:46 logmsgbot: ori synchronized php-1.23wmf11/extensions/TimedMediaHandler/TimedMediaHandler.hooks.php 'Update TimedMediaHandler for I7a6da6c62'
  • 02:50 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-21 02:50:24+00:00
  • 02:29 ori: restarting gitblit using new upstart job def (I3f32dedf1)
  • 02:28 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-21 02:28:40+00:00
  • 02:15 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-21 02:15:20+00:00

January 20

  • 13:51 akosiaris: finished virt1000 reinitialization from virt0
  • 13:51 akosiaris: started virt1000 reinitialization from virt0
  • 13:44 akosiaris: stopped, backed up virt1000 and virt0 and started them up. Preparing for reinitializing virt1000 from virt0
  • 11:13 akosiaris: ms-be1003 kernel log show first lockups around Jan 19 06:30 UTC being XFS related.
  • 11:00 ori: restarting Gitblit on antimony to test upstart script
  • 10:58 akosiaris: powercycling ms-be1003. Console full of messages BUG: soft lockup - CPU#stuck for #s
  • 02:40 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-20 02:40:25+00:00
  • 02:22 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-20 02:22:28+00:00
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-20 02:12:45+00:00
  • 02:12 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1022, db1039 full steam'
  • 01:39 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1039, warm up'

January 19

  • 18:59 logmsgbot: faidon synchronized wmf-config/InitialiseSettings.php 'switch wmgRC2UDPAddress back to ekrem'
  • 18:56 logmsgbot: faidon updated /a/common to I48a026eb8: Revert wmgRC2UDPAddress to ekrem
  • 18:48 paravoid: cr1-sdtpa/cr2-pmtpa: rolling back config to pre-Tampa outage
  • 11:01 ori: graceful'd apache on antimony to verify Ibd528f8f2
  • 07:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1022'
  • 07:04 springle: xtrabackup clone db1007 to db1039
  • 06:27 logmsgbot: springle synchronized wmf-config/db-pmtpa.php 's7 master update'
  • 05:01 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'promote db1033 s7 master'
  • 04:09 ebernhardson: cherry-picked security fix to Flow for php-1.23wmf11
  • 03:58 logmsgbot: ebernhardson synchronized php-1.23wmf11/extensions/Flow
  • 03:19 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1024'
  • 03:12 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1028'
  • 03:12 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-19 03:12:20+00:00
  • 02:58 p858snake|l: [09:59AEST] <bd808> !log Updated kibana to bef3db2
  • 02:56 scfc_de: Restarted morebots (production-logbot)

January 18

  • 15:38 springle: xtrabackup clone db1007 to db1028
  • 15:31 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1028'
  • 15:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1033 in s7'
  • 11:04 springle: xtrabackup clone db1007 to db1033
  • 10:52 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1041'
  • 10:35 paravoid: powercycling ms-be1012; down for 12h, console unresponsive
  • 03:52 Coren: marc (on neon)
  • 03:50 Coren: marc truncated /var/log/ganglia/ganglia_parser.log again. Check incoming email to ops-l
  • 03:39 logmsgbot: ori synchronized php-1.23wmf11/extensions/UniversalLanguageSelector 'Update ULS to master for I862b01e6b (@embed fix)'
  • 03:38 logmsgbot: ori synchronized php-1.23wmf11/extensions/UniversalLanguageSelector/resources/js/ext.uls.webfonts.js 'Update UniversalLanguageSelector to master for I2da436caa: Wait till rendering thread completion before applying webfonts (Bug: 59958)'
  • 03:32 logmsgbot: ori synchronized php-1.23wmf10/extensions/UniversalLanguageSelector 'Update ULS to master for I862b01e6b (@embed fix)'
  • 02:46 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-18 02:46:09+00:00
  • 02:26 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-18 02:26:05+00:00
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-18 02:12:58+00:00
  • 01:28 LeslieCarr: Leslie Carr is signing off. au revoir!
  • 01:23 Reedy: [01:18:21] <LeslieCarr> ok !log exporting no routes over XO transit in sdtpa

January 17

  • 22:12 awight: updated legacy paypal a474870 to 725055a
  • 21:52 ori: to restart morebots: ssh to tool labs, become morebots, then: qdel $(qstat | grep production | cut -d' ' -f 1) ; sleep 5 ; jstart -N production /usr/lib/adminbot/adminlogbot.py --config ./confs/production-logbot.py
  • 21:51 ori: restarted morebots
  • 21:51 LeslieCarr: removed myself from all mail aliases
  • 17:41 logmsgbot: faidon synchronized wmf-config/filebackend.php 'filebackend: disable pmtpa backend'
  • 17:38 logmsgbot: faidon updated /a/common to I0647eb641: filebackend: disable pmtpa backend
  • 17:21 hashar: pmtpa apparently unreachable
  • 16:14 logmsgbot: hashar synchronized wmf-config/InitialiseSettings.php 'touch, restore $oaiAgentRegex'
  • 16:14 logmsgbot: hashar synchronized wmf-config/CommonSettings.php 'restore $oaiAgentRegex'
  • 09:44 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'LB 'dump' s[2-7]'
  • 08:50 manybubbles: finished elasticsearch upgrade. we're now 0.90.10 all the way.
  • 08:50 ori: Replacing StatsD on tungsten with txStatsD; see commit I19ecf608d for rationale.
  • 08:50 ori: restarted morebots. morebots missed Sean's 'repool db1007, depool db1041' @ 6:26 UTC.
  • 08:46 ori: Replacing StatsD on tungsten with txStatsD; see commit I19ecf608d for rationale.
  • 05:43 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1007, depool db1041'
  • 05:00 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1007'
  • 04:03 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-17 04:03:24+00:00
  • 03:26 logmsgbot: LocalisationUpdate completed (1.23wmf11) at 2014-01-17 03:26:07+00:00
  • 02:32 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-17 02:32:01+00:00

January 16

  • 21:34 manybubbles: enabled disk space aware allocator on Elasticsearch cluster. now it won't do as many stupid things!
  • 21:00 logmsgbot: ori synchronized php-1.23wmf10/extensions/UniversalLanguageSelector/resources/js/ext.uls.webfonts.js 'Update UniversalLanguageSelector to master for I2da436caa: Wait till rendering thread completion before applying webfonts (Bug: 59958)'
  • 20:49 Coren: Deleted all but last 50K lines from /var/log/ganglia/ganglia_parser.log (never rotated, now at 11G) from neon
  • 20:46 Coren: Deleted pacct.0 on neon; activity still silly high
  • 19:30 logmsgbot: reedy synchronized wmf-config/
  • 19:22 logmsgbot: reedy synchronized database lists files:
  • 19:20 Reedy: Created zhwikivoyage echo tables on db1029
  • 19:18 LeslieCarr: reseating pem0 on cr2-ulsfo to try and clear problem
  • 19:13 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki, testwiki, testwikidatawiki and mediawikiwiki to 1.23wmf11
  • 19:12 ^d: dropped all cirrusSearchLinksUpdate jobs from enwiki job queue. Job queue back to ~330k entries, far better than the ~3mil
  • 19:06 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedias to 1.23wmf10
  • 18:50 paravoid: powercycling ms-be1002, down, console unresponsive
  • 18:44 manybubbles: starting Elasticsearch upgrade from 0.90.9 to 0.90.10
  • 18:01 logmsgbot: reedy synchronized php-1.23wmf11/includes/specials/SpecialWatchlist.php
  • 17:53 logmsgbot: reedy synchronized docroot and w
  • 17:45 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki back to 1.23wmf9 till window
  • 17:29 logmsgbot: reedy finished scap: testwiki to 1.23wmf11 and build l10n cache (duration: 23m 24s)
  • 17:08 logmsgbot: reedy started scap: testwiki to 1.23wmf11 and build l10n cache
  • 16:48 cmjohnson1: es1007 replacing disk slot 6
  • 16:47 cmjohnson1: tungsten replacing failing disk slot 3
  • 16:47 cmjohnson1: db1004 replacing failing disk slot 9
  • 16:43 logmsgbot: reedy synchronized php-1.23wmf11
  • 16:14 bblack: rebooting cp1055 to clear out mess from XFS/kmem_alloc bug
  • 15:48 cmjohnson1: disabling puppet on db29
  • 11:55 akosiaris: fixed the cp1065 puppet freshness check constantly bugging us and reenabled notifications after that
  • 09:19 logmsgbot: ori updated /a/common to Id13e614e5: repool db1042
  • 06:01 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1042'
  • 05:58 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1033'
  • 05:21 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1042'
  • 03:28 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at 2014-01-16 03:28:58+00:00
  • 03:00 Coren: Deleted all non-current pacct on neon; activity seems to have increased tenfold and /var/log is OOS
  • 02:51 logmsgbot: LocalisationUpdate completed (1.23wmf9) at 2014-01-16 02:51:29+00:00
  • 02:27 logmsgbot: LocalisationUpdate completed (1.23wmf10) at 2014-01-16 02:27:00+00:00
  • 01:30 logmsgbot: catrope synchronized php-1.23wmf10/extensions/VisualEditor 'Forgot to run git submodule update earlier'
  • 01:28 logmsgbot: catrope synchronized php-1.23wmf10/resources/startup.js 'touch'
  • 01:27 logmsgbot: catrope synchronized php-1.23wmf10/extensions/VisualEditor/modules/ve-mw/dm/nodes/ve.dm.MWTransclusionNode.js 'touch'
  • 01:26 logmsgbot: catrope synchronized php-1.23wmf9/extensions/VisualEditor/modules/ve-mw/dm/nodes/ve.dm.MWTransclusionNode.js 'touch'
  • 01:00 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend/SwiftFileBackend.php '1218cc7e9cba8342d0184523f1d7c1fec608e656'
  • 00:49 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'Config new Cirrus logs'
  • 00:43 logmsgbot: catrope synchronized php-1.23wmf10/extensions/VisualEditor 'Update VE for cherry-picks'
  • 00:43 logmsgbot: catrope synchronized php-1.23wmf9/extensions/VisualEditor 'Update VE for cherry-picks'
  • 00:29 logmsgbot: mflaschen synchronized php-1.23wmf10/extensions/GettingStarted/ 'Sync GettingStarted wmf10 for hotfix'
  • 00:29 logmsgbot: mflaschen synchronized php-1.23wmf9/extensions/GettingStarted/ 'Sync GettingStarted wmf9 for hotfix'
  • 00:16 logmsgbot: mflaschen synchronized wmf-config/InitialiseSettings.php 'Deploy GuidedTour to astwiki, fawiki, and ruwiki'
  • 00:07 Tim: rebooting cp1066 following XFS deadlock

January 15

  • 23:33 Tim: on cp1066: stopping varnish -- kswapd has gone crazy and is continuously flushing all buffers
  • 21:30 hashar: Jenkins: generation of MediaWiki documentation with doxygen has been broken since the Zuul upgrade. Fix: 107718
  • 21:17 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend/SwiftFileBackend.php 'bc2a0ddbff4c3e4bff8e931163fd3d1c0340a78e'
  • 20:50 ^d: gerrit: running gc on all repositories. think weekly cron to do this might be broken.
  • 20:49 manybubbles: deployment-prep updating elasticsearch to 0.90.10
  • 20:26 cmjohnson1: db1031 swapping failing disk slot 11
  • 19:37 cmjohnson1: replacing failing disk db1031 slot 8
  • 18:22 awight: crm updated from 43a7475df8461f59ab49d8a932ccd76257dc6d72 to e82d2c5e64e67bee9bfa1f703fc94602addea2cb
  • 17:18 ottomata: updated ganglios in apt to 1.3
  • 14:58 logmsgbot: reedy updated /a/common to I26f2f5322: db1040 to full steam
  • 14:48 cmjohnson1: replacing failing disk db1031 slot 6
  • 14:42 cmjohnson1: replacing failing disk on db1029 slot 8
  • 14:37 cmjohnson1: replacing failed disk ms1001
  • 12:42 logmsgbot: hashar synchronized wmf-config/InitialiseSettings.php 'adding '*.openbeelden.nl' to the wgCopyUploadsDomains array. 107138'
  • 09:08 hashar: jenkins: restarting Zuul to make sure it uses the new python-git
  • 09:02 hashar: jenkins: python-git package receiving an "update" 0.3.2~RC1-1~precise1 -> 0.3.2.RC1-1 (same version assuming we repackaged it somehow)
  • 05:13 Ryan_Lane: make that python-git
  • 05:12 Ryan_Lane: adding git-python to the apt repo. it comes from the saltstack ppa.
  • 04:30 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1040 to full steam'
  • 03:45 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1040'
  • 03:13 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jan 15 03:13:57 UTC 2014
  • 02:40 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Wed Jan 15 02:40:09 UTC 2014
  • 02:21 logmsgbot: LocalisationUpdate completed (1.23wmf10) at Wed Jan 15 02:21:14 UTC 2014
  • 00:39 Tim: on ssl1, testing CPU hotplug feature for possible power impact

January 14

  • 23:04 logmsgbot: maxsem synchronized wmf-config/CommonSettings.php 'https://gerrit.wikimedia.org/r/107499'
  • 22:20 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/107489'
  • 22:13 logmsgbot: maxsem synchronized wmf-config 'https://gerrit.wikimedia.org/r/107483'
  • 22:08 logmsgbot: maxsem synchronized wmf-config 'https://gerrit.wikimedia.org/r/107483'
  • 22:01 logmsgbot: maxsem finished scap: Preparing for TextExtracts deployment (duration: 30m 26s)
  • 21:32 logmsgbot: maxsem started scap: Preparing for TextExtracts deployment
  • 21:26 logmsgbot: reedy updated /a/common to Ie5fe8bd7e: Update interwiki cache
  • 21:25 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 21:24 logmsgbot: reedy synchronized wikidataclient.dblist
  • 21:22 logmsgbot: maxsem started scap
  • 21:02 logmsgbot: reedy updated /a/common to I24dd4a00e: Create Chinese Wikivoyage (zhwikivoyage)
  • 21:02 logmsgbot: reedy synchronized wmf-config/interwiki.cdb 'Updating interwiki cache'
  • 20:55 logmsgbot: reedy synchronized php-1.23wmf10/extensions/Wikibase/lib/resources/wikibase.Site.js 'touch'
  • 20:51 logmsgbot: reedy synchronized php-1.23wmf10/resources 'touch'
  • 20:36 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 20:35 logmsgbot: reedy synchronized database lists files:
  • 20:28 logmsgbot: reedy synchronized database lists files:
  • 20:27 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 20:26 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files:
  • 20:15 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php 'touch'
  • 20:15 logmsgbot: reedy synchronized database lists files: Enable Wikidata on wikisources
  • 20:12 logmsgbot: reedy synchronized wmf-config/
  • 19:27 logmsgbot: reedy synchronized php-1.23wmf10/includes/deferred/LinksUpdate.php
  • 19:06 ottomata: deployed nginx log format changes on ssl*
  • 19:05 logmsgbot: reedy synchronized php-1.23wmf10/extensions/Wikibase 'https://gerrit.wikimedia.org/r/#/c/107375'
  • 19:04 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Closed wikipedias to 1.23wmf10 too
  • 19:02 logmsgbot: reedy updated /a/common to Ic854d268b: All non wikipedias to 1.23wmf10
  • 19:01 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: All non wikipedias to 1.23wmf10
  • 18:14 Reedy: Created and populated sites and site_identifier tables on all wikisource projects
  • 17:22 logmsgbot: reedy updated /a/common to I8d1013484: db1049 to full steam
  • 15:04 hashar: Jenkins: uninstalled phpunit as provided by PEAR (pear uninstall pear.phpunit.de/PHPUnit). We are using Wikimedia deployment system now (integration/phpunit)
  • 14:38 springle: xtrabackup clone db1042 to db1040
  • 14:30 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1049 to full steam'
  • 11:41 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1049'
  • 05:20 springle: xtrabackup clone db1022 to db1040
  • 03:27 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jan 14 03:27:39 UTC 2014
  • 03:08 springle: xtrabackup clone db1050 to db1049
  • 03:04 springle: synchronized wmf-config/db-eqiad.php 'depool db1040 and db1049'
  • 01:19 logmsgbot: kaldari synchronized php-1.23wmf10/extensions/MobileFrontend 'updating MobileFrontend in wmf10'
  • 00:35 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1055 and db1023'
  • 00:32 logmsgbot: kaldari synchronized php-1.23wmf9/extensions/MobileFrontend 'updating MobileFrontend in wmf9'

January 13

  • 23:25 LeslieCarr: started indexing on searchidx1001 with " su -s /bin/bash -c "/a/search/lucene.jobs.sh inc-updater-start" lsearch"
  • 21:37 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Typofix in variable name'
  • 20:01 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend/SwiftFileBackend.php 'c1ab935f1307876a2127b17ad8ba1108f3e877b8'
  • 19:56 logmsgbot: aaron synchronized wmf-config/CommonSettings.php 'Prevent ParsoidCacheUpdateJobOnDependencyChange from running in the main loops'
  • 18:44 logmsgbot: aaron synchronized php-1.23wmf9/includes/job/jobs/HTMLCacheUpdateJob.php 'Remove live sleep() hack in html cache jobs'
  • 18:43 logmsgbot: aaron updated /a/common/php-1.23wmf9 to Ic44c352a8: Update MobileFrontend to wmf/1.23wmf9 tip
  • 18:35 logmsgbot: catrope synchronized wmf-config/InitialiseSettings.php 'touch'
  • 17:53 logmsgbot: catrope synchronized visualeditor-default.dblist 'Enable VE by default on phase 4 Wikipedias'
  • 17:06 ^d: cirrus indexes created for enwiki
  • 17:05 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php 'enwiki gets Cirrus as secondary index'
  • 17:04 logmsgbot: demon synchronized wmf-config/CirrusSearch-common.php 'Commonswiki + enwiki Cirrus settings'
  • 16:58 logmsgbot: mark synchronized wmf-config/PoolCounterSettings-eqiad.php
  • 16:58 logmsgbot: mark updated /a/common to I89b765424: Revert "Raise ArticleView pool size by 50%"
  • 15:12 logmsgbot: mark synchronized wmf-config/CommonSettings.php 'Revert Remove = true'
  • 15:11 logmsgbot: mark updated /a/common to Ib336530c8: Revert "Remove $wgCategoryTreeDynamicTag = true"
  • 14:45 akosiaris: revoked hooper.wikimedia.org in puppetCA, Salt, stored configs in puppet cleaned
  • 14:00 akosiaris: powering off hooper
  • 13:47 logmsgbot: mark synchronized wmf-config/PoolCounterSettings-eqiad.php 'Raise ArticleView pool queue size by 50%'
  • 13:46 logmsgbot: mark updated /a/common to I0442878ea: Raise ArticleView pool size by 50%
  • 12:47 akosiaris: started poolcounter on potassium
  • 12:46 mutante: starting poolcounter on helium
  • 12:45 MaxSem: that was https://bugzilla.wikimedia.org/show_bug.cgi?id=59993
  • 12:44 akosiaris: restarted poolcounter on potassium, helium after MaxSem's request
  • 10:26 hashar: upgrading packages on gallium and lanthanum
  • 05:18 ^d: enwiki reporting lsearchd hasn't updated in days. Cursory investigation says this is right. Nothing in searchidx1001's logs seems telling, yet.
  • 02:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jan 13 02:39:15 UTC 2014
  • 02:21 logmsgbot: LocalisationUpdate completed (1.23wmf10) at Mon Jan 13 02:21:29 UTC 2014
  • 02:11 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Mon Jan 13 02:11:41 UTC 2014

January 12

  • 19:53 ori: gerrit intermittently unresponsive; restarting.
  • 19:51 ori: gerrit lockup on ytterbium - dozens of gerrit procs busy-waiting on futex
  • 02:42 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jan 12 02:42:38 UTC 2014
  • 02:23 logmsgbot: LocalisationUpdate completed (1.23wmf10) at Sun Jan 12 02:23:36 UTC 2014
  • 02:12 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Sun Jan 12 02:12:44 UTC 2014
  • 00:04 logmsgbot: reedy synchronized wmf-config/ '//gerrit.wikimedia.org/r/107008'
  • 00:01 logmsgbot: reedy updated /a/common to I54a2246d7: db1060 to full steam

January 11

  • 21:19 ori: ytterbium: CPU %user spike starting 19:40, gerrit very slow.
  • 05:27 springle: xtrabackup clone db1022 to db1023
  • 05:11 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1060 to full steam'
  • 03:21 Krinkle: Reloading zuul to deploy I1053812d8acfe9
  • 02:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jan 11 02:47:58 UTC 2014
  • 02:28 logmsgbot: LocalisationUpdate completed (1.23wmf10) at Sat Jan 11 02:28:04 UTC 2014
  • 02:14 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Sat Jan 11 02:14:21 UTC 2014
  • 01:29 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1060 in s2'

January 10

  • 21:09 RobH: cp1043/1044 both updated and back online
  • 21:01 RobH: stopping nginx on cp1043 and tinkering with its star.wikimedia.org certificate stuff, cp1044 is still online to handle misc-web-lb
  • 19:06 logmsgbot: aaron synchronized wmf-config/filebackend.php 'Setup and enabled redisLockManager for all file backends in use'
  • 13:56 hashar: Gerrit: adding WikidataJenkins user to groups "Non-Interactive Users" and "Wikidata"
  • 11:52 mutante: welcome new software deployer Kartik from Language Engineering
  • 07:58 springle: xtrabackup clone db1018 to db1060
  • 05:55 springle: xtrabackup clone db1050 to db1055
  • 03:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jan 10 03:32:03 UTC 2014
  • 03:00 logmsgbot: spage synchronized php-1.23wmf10/extensions/Flow/includes/Repository/TreeRepository.php 'Flow cache key fix to 1.23wmf10'
  • 01:46 paravoid: swift: set_weight 0 to ms-be11/sde1, disk failed
  • 01:45 logmsgbot: maxsem synchronized php-1.23wmf10/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/#/c/106639/'
  • 01:41 logmsgbot: maxsem synchronized php-1.23wmf9/extensions/MobileFrontend/ 'https://gerrit.wikimedia.org/r/#/c/106639/'
  • 01:34 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'Reduce LB general traffic during reindexing/partitioning on groupLoadsBySection slaves'
  • 01:21 paravoid: restoring RT's mail interface; expect delayed RT email to arrive soon
  • 01:19 paravoid: apt: installing trusty's libio-socket-ssl-perl into precise-wikimedia; SNI-capable
  • 01:06 RobH: gerrit seems faster again for now
  • 01:02 RobH: gerrit is slowed down and memory is maxed out on system, restarting gerrit to see if it helps
  • 00:08 logmsgbot: maxsem synchronized wmf-config/InitialiseSettings.php 'https://gerrit.wikimedia.org/r/106629'

January 9

  • 22:28 logmsgbot: ori synchronized php-1.23wmf10/extensions/VisualEditor/ApiVisualEditor.php 'Update VisualEditor for cherry pick I5cc44c5ef35 (bug 59867)'
  • 22:19 logmsgbot: aaron synchronized php-1.23wmf10/includes/libs '1a5ac00f8991905d8fd643d53cb744024143c558'
  • 20:51 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend '94cb6f164ed2b9f7dbdf809e2d861ae04ab1df21'
  • 20:20 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend 'e45db51adeddcc71a97e6547a6f4ecaf5f320a8c'
  • 20:19 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend 'e45db51adeddcc71a97e6547a6f4ecaf5f320a8c'
  • 20:18 logmsgbot: aaron synchronized php-1.23wmf10/includes/filebackend 'e45db51adeddcc71a97e6547a6f4ecaf5f320a8c'
  • 20:08 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: mediawikiwiki to 1.23wmf10 as well
  • 20:07 logmsgbot: reedy finished scap (duration: 19m 42s)
  • 19:50 logmsgbot: reedy started scap
  • 19:41 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: phase1 wikis minus mediawikiwiki to 1.23wmf10
  • 19:38 logmsgbot: reedy started scap: Update 1.23wmf10 l10n cache
  • 19:34 logmsgbot: reedy started scap: Update 1.23wmf10 l10n cache
  • 19:33 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: 1.23wmf10 looks brokened
  • 19:30 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: phase1wikis to 1.23wmf10
  • 19:07 logmsgbot: reedy synchronized wmf-config/ 'AssertEdit only on 1.23wmf9'
  • 19:04 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Wikipedias to 1.23wmf9
  • 18:22 logmsgbot: reedy updated /a/common to Ib6f2c4d96: cleanup old Wikibase settings which are same as default now
  • 18:22 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: testwiki back to 1.23wmf9 till window
  • 18:20 logmsgbot: reedy finished scap: testwiki to 1.23wmf10 build l10n cache (duration: 31m 21s)
  • 17:53 logmsgbot: reedy started scap: testwiki to 1.23wmf10 build l10n cache
  • 17:47 logmsgbot: reedy synchronized php-1.23wmf10
  • 17:33 logmsgbot: reedy updated /a/common to Iaada76f43: pdf1 died, replace with pdf2
  • 17:31 logmsgbot: robh synchronized wmf-config/
  • 17:17 logmsgbot: reedy synchronized docroot and w
  • 17:16 logmsgbot: reedy updated /a/common to I5f2816468: Segregate watchlist and recentchangeslinked queries on all shards.
  • 17:14 logmsgbot: reedy synchronized php-1.23wmf10
  • 04:31 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'watchlist/recentchangeslinked LB on s[234567]'
  • 03:33 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jan 9 03:33:55 UTC 2014
  • 03:03 springle: partition logging tables on logpager slaves: s2 db1002, s4 db1004, s5 db1026, s6 db1040
  • 03:03 springle: wikitech down. restarted apache on virt0. phusion_passenger exception + MaxClients hit
  • 02:03 awight: crm updated from 1234a7f9deb8cb7a526b0d7e12bd6ee369b570a1 to 43a7475df8461f59ab49d8a932ccd76257dc6d72
  • 01:14 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'repool db1041'
  • 00:38 logmsgbot: ori synchronized php-1.23wmf8/extensions/EventLogging 'Update EventLogging to master'
  • 00:25 logmsgbot: maxsem synchronized php-1.23wmf8/extensions/Collection/ 'Shit hit fan'
  • 00:21 logmsgbot: maxsem synchronized php-1.23wmf8/extensions/Collection/ 'https://gerrit.wikimedia.org/r/106293'
  • 00:15 logmsgbot: maxsem synchronized wmf-config 'https://gerrit.wikimedia.org/r/106236'
  • 00:10 logmsgbot: mholmquist synchronized php-1.23wmf9/extensions/MultimediaViewer/
  • 00:08 logmsgbot: mholmquist synchronized php-1.23wmf8/extensions/MultimediaViewer/

January 8

  • 22:51 logmsgbot: aaron finished scap: timing test (beta) (duration: 17m 15s)
  • 22:36 logmsgbot: aaron started scap: timing test (beta)
  • 22:11 Jeff_Green: moved fundraising.wikimedia.org to 208.80.154.12, flipped DNS
  • 21:52 awight: payments updated from 1bb7171fbf1d3a304cbede927737c26db4c24afb to b7e3ddf438d457d02d54da232ad7c7ba1a2d8794
  • 18:50 LeslieCarr: removing all messages older than 24 hours from exim queue on sodium (all bounces)
  • 18:48 LeslieCarr: removing all messages older than 7 days from exim queue on sodium
  • 18:31 awight: crm updated from 2a8de10439f22762abc10d3d3c41a1020c866454 to 1234a7f9deb8cb7a526b0d7e12bd6ee369b570a1
  • 18:12 RobH: cp1043 and cp1044 have had the reissued star.wikimedia.org certificate put into place
  • 17:41 RobH: reissued old star.wikimedia.org certificate
  • 17:33 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php
  • 17:32 logmsgbot: demon synchronized php-1.23wmf8/extensions/CirrusSearch
  • 17:32 logmsgbot: demon synchronized php-1.23wmf9/extensions/CirrusSearch
  • 17:14 logmsgbot: demon synchronized wmf-config/InitialiseSettings.php
  • 17:14 logmsgbot: demon synchronized cirrus.dblist
  • 17:02 logmsgbot: demon synchronized wmf-config/CommonSettings.php 'No changes, code cleanup. Ia4910d22'
  • 16:57 logmsgbot: demon synchronized multiversion/ 'No changes, code cleanup. Ia4910d22'
  • 15:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'depool db1041, forced index query errors during partitioning test'
  • 13:54 hashar: restarting Zuul, it lost track of some jobs and kept changes in its queues for no real reason :/
  • 13:04 hashar: "upgrading" pep8 package 1.4.6-1 --> 1.4.6-1.1 (simply provides python3-pep8) RT #6420
  • 13:01 hashar: jenkins: migrating jobs git url from integration.wikimedia.org to zuul.eqiad.wmnet bug 59774 106116
  • 10:30 hashar: screwed gallium by doing a chown -R jenkins-slave:jenkins-slave on /srv/ that includes the Zuul git repositories :(
  • 03:26 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jan 8 03:26:40 UTC 2014
  • 03:00 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Wed Jan 8 02:27:28 UTC 2014
  • 01:56 halfwight: crm updated from 41e624c124dce0bd6e0d2a0f72e0cd91ee34f3fa to 2a8de10439f22762abc10d3d3c41a1020c866454
  • 01:25 logmsgbot: bsitu synchronized php-1.23wmf9/extensions/Flow 'Revert "Utilize BufferedCache in TreeRepository"'
  • 01:20 logmsgbot: reedy synchronized wmf-config/
  • 01:18 logmsgbot: kaldari synchronized php-1.23wmf9/extensions/MobileFrontend/
  • 01:17 logmsgbot: reedy updated /a/common to Ib7ed56216: Fix indenting in $wmgUseCategoryTree
  • 01:14 logmsgbot: reedy synchronized wmf-config/
  • 01:06 logmsgbot: reedy updated /a/common to I613b194f9: Remove $wgCategoryTreeDisableCache, leave as default of true
  • 01:06 logmsgbot: reedy synchronized php-1.23wmf9/extensions/CategoryTree/CategoryTreeFunctions.php 'rv'
  • 01:04 logmsgbot: reedy synchronized wmf-config/
  • 01:01 logmsgbot: reedy updated /a/common to Ic65d167b3: Use uca-cy collation on Welsh projects
  • 00:43 logmsgbot: reedy synchronized php-1.23wmf9/resources 'touch'
  • 00:41 logmsgbot: reedy synchronized php-1.23wmf9/extensions/CategoryTree/CategoryTreeFunctions.php
  • 00:35 logmsgbot: kaldari synchronized php-1.23wmf9/extensions/VectorBeta
  • 00:01 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php

January 7

  • 23:44 logmsgbot: reedy updated /a/common to I0e19ec875: All non wikipedias to 1.23wmf9
  • 23:28 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Non wikipedias back to 1.23wmf9
  • 23:14 logmsgbot: bsitu synchronized php-1.23wmf9/extensions/Flow 'Update Flow to master'
  • 23:07 logmsgbot: reedy updated /a/common to I2ec865afd: votewiki really doesn't need TMH for anything
  • 19:58 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Revert All non wikipedias to 1.23wmf9
  • 19:52 logmsgbot: csteipp synchronized php-1.23wmf8/includes 'bug 58699'
  • 19:47 ottomata: merged changes to eventlogging varnishncsa instances. user agent added, esams instances now send directly to vanadium instead of gadolinium relay
  • 19:46 logmsgbot: csteipp synchronized php-1.23wmf9/includes 'bug 58699'
  • 19:45 logmsgbot: csteipp synchronized php-1.23wmf9/img_auth.php 'bug 57016'
  • 19:00 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: All non wikipedias to 1.23wmf9
  • 18:17 logmsgbot: reedy synchronized wmf-config/CommonSettings.php 'Switch math to using version independent math binary'
  • 18:16 Reedy: Copied math binary to /usr/local/apache/uncommon/bin without version variant
  • 18:10 logmsgbot: reedy synchronized wmf-config/
  • 18:08 Reedy: Deleting old math binaries across mediawiki-installation
  • 17:53 logmsgbot: reedy updated /a/common to I98121adf9: db1006 back to full steam
  • 17:03 ottomata: added varnishkafka 1.0.1-1 to apt; installed on mobile varnishes
  • 16:09 akosiaris: upgrading php-wikidiff2 on test.wikimedia.org
  • 10:18 hashar: mediawiki-core-lint job is broken on gallium (can't clone repository)
  • 09:32 hashar: restarted Zuul with the Gearman backend. Jobs might end up being broken :(
  • 09:27 hashar: restarting Zuul and enabling Gearman plugin in Jenkins
  • 09:24 hashar: upgrading Zuul
  • 09:18 logmsgbot: nikerabbit finished scap: CLDR 24 plural rule backport (duration: 40m 10s)
  • 08:42 logmsgbot: nikerabbit started scap: CLDR 24 plural rule backport
  • 08:31 springle: logging table partitioning test on s7 db1041
  • 08:31 logmsgbot: nikerabbit synchronized php-1.23wmf9/extensions/UniversalLanguageSelector/lib/jquery.uls/src/jquery.uls.regionfilter.js 'ULS bugfix'
  • 08:29 logmsgbot: nikerabbit synchronized php-1.23wmf9/extensions/UniversalLanguageSelector/resources/css/ext.uls.css 'ULS bugfix'
  • 06:19 ebernhardson: Flow pages on mw were targeted by a spambot; deployed Ic096bfca35f3da5ebe508e224fecac6361753061 to put Flow in read-only mode.
  • 06:11 logmsgbot: ori synchronized php-1.23wmf9/extensions/Flow 'I00d32ecec: Use Container namespace'
  • 06:03 logmsgbot: ori synchronized php-1.23wmf9/extensions/Flow 'Ic096bfca3: Emergency spam prevention'
  • 04:10 logmsgbot: tstarling synchronized php-1.23wmf8/extensions/GlobalBlocking/SpecialGlobalBlock.php
  • 03:59 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1006 full steam'
  • 03:18 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'warm up db1006'
  • 03:16 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jan 7 03:16:06 UTC 2014
  • 01:41 logmsgbot: aaron finished scap: timing test (beta) (duration: 02m 28s)
  • 01:39 logmsgbot: aaron started scap: timing test (beta)
  • 01:37 logmsgbot: aaron finished scap: timing test (beta) (duration: 04m 39s)
  • 01:33 logmsgbot: aaron started scap: timing test (beta)
  • 01:19 logmsgbot: aaron finished scap: timing test (beta) (duration: 25m 28s)
  • 00:56 logmsgbot: aaron started scap: timing test (beta)

January 6

  • 23:54 logmsgbot: ori synchronized php-1.23wmf8/resources/mediawiki/mediawiki.js 'Ifa97d36d3a: Restore module storage experiment'
  • 23:54 logmsgbot: ori synchronized php-1.23wmf8/extensions/WikimediaEvents 'Ieef052279: Update WikimediaEvents for module storage exp.'
  • 23:52 logmsgbot: ori synchronized php-1.23wmf9/resources/mediawiki/mediawiki.js 'Ifa97d36d3a: Restore module storage experiment'
  • 23:51 logmsgbot: ori synchronized php-1.23wmf9/extensions/WikimediaEvents 'Ieef052279: Update WikimediaEvents for module storage exp.'
  • 23:06 logmsgbot: manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch 'update cirrus to master to hopefully reduce load on elasticsearch'
  • 23:04 logmsgbot: manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch 'update cirrus to master to hopefully reduce load on elasticsearch'
  • 22:27 hashar: killed nrpe on gallium
  • 22:17 logmsgbot: reedy synchronized wmf-config/CommonSettings.php
  • 22:16 awight: crm updated from 8e264d4b2baa7f6d7e7cfef09280f923ee400c58 to 41e624c124dce0bd6e0d2a0f72e0cd91ee34f3fa
  • 22:08 logmsgbot: reedy updated /a/common to Ie0884149b: reduce db1006 LB during reindexing
  • 21:37 awight: updated crm from 92a58a634bcc78bc53f7eae0af45f75693a37187 to 8e264d4b2baa7f6d7e7cfef09280f923ee400c58
  • 21:23 awight: crm updated from d32118b16a45f2d80388c615df262a0854974478 to 92a58a634bcc78bc53f7eae0af45f75693a37187
  • 20:40 RobH: ganglia cert update complete, it's all working
  • 20:34 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'reduce db1006 LB during reindexing'
  • 20:31 RobH: going to merge cert update on ganglia, so if it dies, i messed up
  • 20:29 hashar: jenkins: blacklisted l10n-bot in Zuul so it should no longer trigger anything 102636
  • 20:21 awight: crm updated from 6964b1e4a337b93a89c10cf119eb81f3512bcbba to d32118b16a45f2d80388c615df262a0854974478
  • 20:19 awight: crm updated from 81d06d3d21c26c38fb80cd027f8b66308ca3fac7 to 6964b1e4a337b93a89c10cf119eb81f3512bcbba
  • 20:04 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'LB changes'
  • 19:52 mutante: disabled notifications for puppet freshness on cp1065; annoyingly, icinga-wm told us about it every second
  • 18:21 logmsgbot: manybubbles synchronized wmf-config/InitialiseSettings.php
  • 18:01 logmsgbot: manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/
  • 17:56 logmsgbot: manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch/
  • 17:38 logmsgbot: manybubbles synchronized cirrus.dblist
  • 17:37 logmsgbot: manybubbles synchronized wmf-config/
  • 17:27 logmsgbot: manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch/
  • 17:18 logmsgbot: manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/
  • 17:17 logmsgbot: manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/
  • 17:12 logmsgbot: manybubbles synchronized php-1.23wmf8/extensions/CirrusSearch/
  • 17:09 logmsgbot: manybubbles synchronized php-1.23wmf9/extensions/CirrusSearch/
  • 17:03 logmsgbot: reedy synchronized php-1.23wmf9/resources/mediawiki/images
  • 16:41 logmsgbot: reedy synchronized wmf-config/
  • 16:28 logmsgbot: reedy finished scap: Rebuild lolcalisation cache after https://gerrit.wikimedia.org/r/#/c/105684 (duration: 39m 45s)
  • 15:50 logmsgbot: reedy started scap: Rebuild lolcalisation cache after https://gerrit.wikimedia.org/r/#/c/105684
  • 15:46 cmjohnson1: rebooting gadolinium
  • 14:40 logmsgbot: reedy updated /a/common to I349589a49: Enable Wikibase Client on testwiki
  • 14:26 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 13:56 hashar: jenkins: migrated mediawiki extension loader from /tools/extensions-loader.php to mediawiki/conf.d/50_mw_ext_loader.php 105680
  • 11:17 hashar: jenkins: uninstalling phpcs on gallium (was installed with pear, now deployed using our deploy system) bug 57064
  • 10:50 hashar: jenkins: made mediawiki-core-phpunit-parser job executable concurrently, might cause race conditions. 102152
  • 10:15 hashar: jenkins: refreshing jslint files for mediawiki extensions, making sure they are in sync
  • 06:38 apergos: rebooting searchidx1001, hanging on various things, first sign of trouble was 'task kswapd0:115 blocked for more than 120 seconds.'
  • 05:36 logmsgbot: aaron started scap: timing test (beta)
  • 03:00 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jan 6 03:00:27 UTC 2014
  • 02:35 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Mon Jan 6 02:35:14 UTC 2014
  • 02:18 logmsgbot: LocalisationUpdate completed (1.23wmf8) at Mon Jan 6 02:18:30 UTC 2014
  • 00:59 logmsgbot: reedy synchronized wmf-config/ '[x] Yes to AJAX'
  • 00:52 logmsgbot: reedy synchronized wmf-config/ '$wgUseAjax = true;'
  • 00:49 logmsgbot: reedy synchronized wmf-config/CommonSettings.php 'Fix notice'
  • 00:48 logmsgbot: reedy updated /a/common to I23469a49a: Disable a few more extensions from VoteWiki and LoginWiki
  • 00:47 logmsgbot: reedy synchronized wmf-config/
  • 00:44 logmsgbot: reedy updated /a/common to I854ea451b: Disable a handful more extensions from loginwiki and votewiki
  • 00:37 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 00:35 logmsgbot: reedy updated /a/common to Ie45a06c38: Disable BetaFeatures on votewiki and loginwiki
  • 00:20 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 00:18 logmsgbot: reedy updated /a/common to Ibdd8671a1: Fixup CategoryTree config, disable where necessary
  • 00:15 logmsgbot: reedy synchronized wmf-config/
  • 00:09 logmsgbot: reedy updated /a/common to I624133239: Disable more stuff on loginwiki and votewiki

January 5

  • 23:59 logmsgbot: reedy synchronized wmf-config/InitialiseSettings.php
  • 20:24 apergos: gadolinium down/unreachable, can't get to mgmt console, no ping even
  • 20:18 logmsgbot: aaron finished scap: timing test (beta) (duration: 02m 49s)
  • 20:15 logmsgbot: aaron started scap: timing test (beta)
  • 20:13 logmsgbot: aaron finished scap: timing test (duration: 06m 51s)
  • 20:11 paravoid: rebooting cp3011, kmem_alloc deadlock
  • 20:06 logmsgbot: aaron started scap: timing test
  • 19:26 bblack: varnish mobile caches updated to -wm27 package (crash fix, http/0.9 patch removal to test effect on logging anomalies)
  • 08:21 mark: Power cycled ms-be1012
  • 02:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jan 5 02:08:47 UTC 2014
  • 02:03 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Sun Jan 5 02:03:44 UTC 2014
  • 02:01 logmsgbot: LocalisationUpdate completed (1.23wmf8) at Sun Jan 5 02:01:57 UTC 2014
  • 00:55 logmsgbot: reedy updated /a/common to I3543a34aa: db1040 back to full steam

January 4

  • 21:06 hashar: jenkins: purged php5-mysqld package from lanthanum and gallium. It was no longer installed but had left some conf files in /etc/php5/conf.d/
  • 14:26 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'db1040 back to full steam'
  • 08:54 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'reduce db1026 load while reindexing wikidatawiki.wb_terms'
  • 07:42 logmsgbot: springle synchronized wmf-config/db-eqiad.php 'reduce db1040 load while adding indexes'
  • 04:13 springle: schema change, ad-hoc, additional indexes on recentchanges & wb_terms for recent slow queries
  • 03:45 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jan 4 03:45:39 UTC 2014
  • 03:13 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Sat Jan 4 03:13:45 UTC 2014
  • 02:38 logmsgbot: LocalisationUpdate completed (1.23wmf8) at Sat Jan 4 02:38:29 UTC 2014
  • 01:33 hashar: jenkins: fixed up mwext-VisualEditor-qunit job; its configuration got reverted to some old/incorrect version when I downgraded the git plugin. Retriggered all changes
  • 00:02 logmsgbot: ori synchronized php-1.23wmf9/skins/common 'Fix SVG MIME-type detection by reverting a9b855eea52ba3 (part 2/2)'
  • 00:01 logmsgbot: ori synchronized php-1.23wmf9/skins/vector/images 'Fix SVG MIME-type detection by reverting a9b855eea52ba3'

January 3

  • 22:07 jeremyb: del
  • 22:03 awight: tools updated from b54854fcdd7a272ca2af8475affa033517456d07 to 0df36fd9bc55b24ffad49c0dbd0db67d67a2b1f7
  • 21:53 mwalker: updating fundraising-tools on al from b54854fcdd7a272ca2af8475affa033517456d07 (for statistics generation in json format)
  • 19:33 ori: Some migration pains while moving graphite from professor to tungsten; expect graphite & gdash flakiness
  • 19:01 hashar: apparently jenkins is back up and happy. Had to revert the git plugin to a previous version ...
  • 18:42 logmsgbot: aaron started scap
  • 18:36 logmsgbot: aaron started scap: active Testing timing
  • 18:11 hashar: Jenkins downgrading git plugin client to 1.4.6 and restarting jenkins
  • 17:52 hashar: Jenkins downgrading git plugin from 2.0 to 1.5 , we might be hit by https://issues.jenkins-ci.org/browse/JENKINS-21057
  • 16:15 logmsgbot: reedy synchronized php-1.23wmf8/extensions/
  • 16:14 logmsgbot: reedy synchronized php-1.23wmf9/extensions/
  • 14:53 hashar: jenkins restarted
  • 14:43 akosiaris: zapped /vol/root export from nas-1001-a
  • 14:23 hashar: restarting Jenkins, some git plugins are misbehaving
  • 13:35 springle: killed msnbot spike on s2
  • 12:43 Krinkle: Jenkins still unable to use integration-slave01 (restarted the node in labs, and disconnected/re-launched slave agent afterwards, too; no effect)
  • 12:14 Krinkle: Jenkins jobs executed on integration-slave01 are failing early due to jenkins-slave being unresponsive for filesystem access.
  • 09:50 apergos: restarted parsoid on wtp1010 and 1023, several gigs of logs full of "Error: Can't set headers after they are sent." from ServerResponse.OutgoingMessage.setHeader
  • 07:37 apergos: streber from mgmt console reports eth0 link down (hence it appears down to icinga and ganglia)
  • 07:06 mark: Disabled OSPF3 on csw2-knams:xe-1/1/0.0
  • 04:27 Ryan_Lane: added python-keystone-redis to apt repo
  • 03:24 springle: start schema changes for bug 59236, indexing only, ipblocks ipb_parent_block_id
  • 03:20 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jan 3 03:20:58 UTC 2014
  • 02:47 logmsgbot: LocalisationUpdate completed (1.23wmf9) at Fri Jan 3 02:47:34 UTC 2014
  • 02:17 logmsgbot: LocalisationUpdate completed (1.23wmf8) at Fri Jan 3 02:17:55 UTC 2014
  • 01:57 logmsgbot: reedy finished scap: Fix Disambiguator hewiki magicwords
  • 01:30 logmsgbot: reedy started scap: Fix Disambiguator hewiki magicwords
  • 00:23 logmsgbot: tstarling synchronized wmf-config/CommonSettings.php
  • 00:02 logmsgbot: maxsem synchronized php-1.23wmf9/extensions/MobileFrontend 'https://gerrit.wikimedia.org/r/105108'

January 2

  • 22:43 logmsgbot: aaron finished scap: active Timing test
  • 22:39 logmsgbot: aaron started scap: active Timing test
  • 22:28 logmsgbot: reedy synchronized echowikis.dblist 'Remove loginwiki'
  • 22:26 logmsgbot: reedy updated /a/common to I2c2836d40: Rest of phase1 to 1.23wmf9
  • 22:25 logmsgbot: aaron finished scap: active Timing test
  • 22:13 ori: loaded VCL change from Id54312fd5 (scholarship app) on cp1043 & cp1044 (misc-eqiad)
  • 22:03 logmsgbot: aaron started scap: active Timing test
  • 21:30 logmsgbot: bsitu synchronized php-1.23wmf9/extensions/Flow 'Update Flow to master'
  • 21:09 domas: someone's schema change hit http://bugs.mysql.com/bug.php?id=61548 and broke replication on s7, I dropped triggers and renamed target table into… well, you will see.
  • 21:08 Nemo_bis: All? #Wikimedia wikis read only since about 20:40 UTC, s7 database replication halted
  • 19:33 -!- morebots [local-more@208.80.153.164] has quit [Ping timeout: 240 seconds]
  • 19:25 Reedy: stuff has been broken
  • 19:24 ori: restarted morebots
  • 18:43 logmsgbot: reedy started scap: active rebuild localisation cache with updated wikidata config
  • 18:35 logmsgbot: reedy finished scap: 1.23wmf9 testwiki to 1.23wmf9 and build l10n cache
  • Commons, Wikivoyage etc. DOWN
  • 17:46 logmsgbot: reedy started scap: 1.23wmf9 testwiki to 1.23wmf9 and build l10n cache
  • 17:39 logmsgbot: reedy synchronized docroot and w
  • 17:21 logmsgbot: reedy updated /a/common to I49a405d8a: Wikibase: fix extension-list paths
  • 17:11 logmsgbot: reedy synchronized php-1.23wmf9 'staging'
  • 15:05 manybubbles: starting rolling upgrade of Elasticsearch servers. Going from 0.90.7 to 0.90.9.
  • 14:49 logmsgbot: hashar synchronized wmf-config 'Wikibase tweak for beta 976f2e9..7f80acb'
  • 14:40 paravoid: swift: setting weight of ms-be5 sde1 to 0, pending RT 6555
  • 09:07 hashar: jenkins restarted
  • 08:36 hashar: gallium / jenkins upgrading Jenkins from 1.509.4 to 1.532.1
  • 03:02 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jan 2 03:02:15 UTC 2014
  • 02:38 logmsgbot: LocalisationUpdate completed (1.23wmf7) at Thu Jan 2 02:38:41 UTC 2014
  • 02:26 logmsgbot: LocalisationUpdate completed (1.23wmf8) at Thu Jan 2 02:26:41 UTC 2014

January 1

  • 20:09 -!- Netsplit *.net <-> *.split quits: [...] morebots
  • 03:39 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jan 1 03:39:22 UTC 2014
  • 03:10 logmsgbot: LocalisationUpdate completed (1.23wmf7) at Wed Jan 1 03:10:51 UTC 2014
  • 02:13 logmsgbot: LocalisationUpdate completed (1.23wmf8) at Wed Jan 1 02:13:47 UTC 2014


Archives