Server Admin Log

From Wikitech

2016-02-09

2016-02-08

  • 23:30 logmsgbot: ori@mira Synchronized php-1.27.0-wmf.12/includes/interwiki/Interwiki.php: ac6e170fa5: Fix-up for I5a979f047031e (duration: 01m 18s)
  • 23:24 logmsgbot: bd808@mira Synchronized wmf-config/logging.php: logging: Collect mw1017 logs for debugging (9d6d0e0) (duration: 01m 18s)
  • 23:20 logmsgbot: bd808@mira Synchronized wmf-config/logging.php: logging: Send all udp2log eligible messages to $wmgDefaultMonologHandler (cd25586) (duration: 01m 17s)
  • 23:16 logmsgbot: bd808@mira Synchronized wmf-config/logging.php: Monolog: Add mwversion to udp2log log events (9b54967) (duration: 01m 18s)
  • 23:05 logmsgbot: bd808@mira Synchronized wmf-config/logging.php: Monolog: normalize messages before PSR3 expansion (e5ee5d8) (duration: 01m 18s)
  • 22:52 logmsgbot: ori@mira Synchronized wmf-config/missing.php: Ib5407c560: Update missing.php for interwiki.php (duration: 01m 18s)
  • 22:51 logmsgbot: ori@mira Synchronized docroot and w: Ifd7fe8c3c: createTxtFileSymlinks.sh: drop interwiki.cdb; add interwiki.php (duration: 01m 21s)
  • 22:43 logmsgbot: bd808@mira Synchronized php-1.27.0-wmf.12/includes/debug/logger/monolog/WikiProcessor.php: Add $wgVersion to MediaWiki\Logger\Monolog\WikiProcessor (3cea726) (duration: 01m 19s)
  • 21:51 logmsgbot: ori@mira Synchronized wmf-config/CommonSettings.php: Ie9bdd77fb: Use interwiki.php on all wikis; delete unused interwiki.json (duration: 01m 19s)
  • 21:31 logmsgbot: ori@mira Synchronized wmf-config/CommonSettings.php: I39c9ecd4b: Enable static PHP interwiki cache on mediawikiwiki and testwiki (duration: 01m 18s)
  • 21:28 ori: Restarting HHVM on mw1017 to wipe APC cache
  • 21:23 logmsgbot: ori@mira Synchronized wmf-config/CommonSettings.php: Ib599f9984a: Add interwiki.php; use it on mw1017 & on labs (2/2) (duration: 01m 16s)
  • 21:22 logmsgbot: ori@mira Synchronized wmf-config/interwiki.php: Ib599f9984a: Add interwiki.php; use it on mw1017 & on labs (1/2) (duration: 01m 20s)
  • 21:19 subbu: finished deploying parsoid version 4d44fcc7
  • 21:10 subbu: synced code; restarted parsoid on wtp1003 as a canary
  • 21:04 subbu: starting parsoid deploy
  • 21:02 bblack: resuming rolling reboots of cpNNNN caches for kernel updates
  • 20:34 mobrovac: restbase deploy end of c929ceb
  • 20:26 mobrovac: restbase deploy start of c929ceb
  • 19:36 logmsgbot: mattflaschen@mira Synchronized wmf-config/InitialiseSettings-labs.php: Beta Cluster-only change (duration: 01m 20s)
  • 19:04 papaul: sarin - signing puppet certs, salt-key, initial run
  • 19:03 urandom: restart restbase on restbase-test2001.codfw (staging)
  • 18:54 mobrovac: mathoid deployed 4bdb2f18c
  • 18:43 urandom: rolling Cassandra restart in restbase staging complete
  • 18:35 MatmaRex: Reopened 54 Phabricator tasks that someone merged into one, hope I haven't made more of a mess than it was before
  • 18:30 jynus: applying ferm on dbstore1001 and dbstore1002
  • 18:29 urandom: performing rolling restart of Cassandra in staging (to pickup /usr/share/cassandra/lib/cassandra-brotli-1.0.0-a64ce47.jar in classpath)
  • 18:25 elukey: re-enabled puppet on mc1004
  • 18:17 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-common.php: dont reindex wgCirrusSearchNamespaceWeights from 0 (duration: 01m 17s)
  • 18:13 chasemp: cleanup snapshots on labstore1001
  • 17:03 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Set category collation on gd.wikipedia gerrit:267820 (duration: 01m 21s)
  • 16:59 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Deploy Translate extension on ru.wikimedia gerrit:267822 (duration: 01m 17s)
  • 16:52 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable signature button for the Project namespace in ru.wiki gerrit:267997 (duration: 01m 19s)
  • 16:45 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespaces configuration on mai.wikipedia gerrit:268573 (duration: 01m 17s)
  • 16:39 logmsgbot: thcipriani@mira Synchronized wmf-config/mobile.php: SWAT: Use custom generator for mobile search on Wikidata Part II gerrit:254645 (duration: 01m 19s)
  • 16:37 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Use custom generator for mobile search on Wikidata Part I gerrit:254645 (duration: 01m 18s)
  • 16:32 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Do not request pageprops for mobile search/nearby on wikidata gerrit:268208 (duration: 01m 20s)
  • 16:28 logmsgbot: thcipriani@mira Synchronized wmf-config/Wikibase-production.php: SWAT: Add $wgWBRepoSettings[sparqlEndpoint] gerrit:268467 (duration: 01m 18s)
  • 16:27 elukey: restarted nutcracker in G@cluster:appserver and G@site:eqiad due to connect error issues (5 hosts per batch)
  • 16:27 _joe_: reinstalling pybal's new version (reduced) on ulsfo and codfw caches
  • 16:24 jynus: reverting slaves topology back to db1024 master
  • 16:19 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: wgRCWatchCategoryMembership true everywhere except wikisource gerrit:264734 (duration: 01m 26s)
  • 16:01 andrewbogott: restarting pdns on labservices1001 to test loglevels
  • 15:30 elukey: stopping redis and memcached for mc1004.eqiad.wmnet due to Jessie re-image
  • 15:30 chasemp: restarting pdns and pdns-recursor on labservices1001
  • 15:01 jynus: restarting and upgrading dbstore1001 (db backups agent host)
  • 14:41 bblack: mobile LVS service decom complete (IPs now belong to text service)
  • 14:03 bblack: starting mobile LVS service decom (IPs moving to text) - puppet disabled on text caches and high-traffic1 LVSes
  • 13:56 bblack: cpNNNN rolling reboots paused (3038 still coming up)
  • 13:12 bblack: start up more rolling cache reboots for kernels (cpNNNN)
  • 13:09 elukey: updated hhvm on mw2016.codfw.wmnet, mw2161.codfw.wmnet, mw2199.codfw.wmnet, mw1259.eqiad.wmnet, mw1260.eqiad.wmnet
  • 13:05 _joe_: roll back installation of pybal, issues with udp and ipv6
  • 12:56 elukey: updated hhvm on mw1080, mw1084, mw1241
  • 12:32 elukey: restarting hhvm on mw1052, mw1075, mw1080, mw1081, mw1094, mw1095 to rollout the new version
  • 12:32 _joe_: uploaded a new pybal package; installing on codfw and ulsfo backups
  • 12:05 _joe_: restarted cron on tin, to catch up with the uid change for the l10nupdate user
  • 11:53 bblack: rebooting cp1074, cp3047 (for kernels, also to compare bios/drac settings...)
  • 11:26 jynus: stopping mysql at db2012
  • 11:25 jynus: starting mysql at db2012
  • 11:05 moritzm: rebooting db2012 for kernel update
  • 11:00 moritzm: rebooting terbium for kernel update
  • 10:26 moritzm: rebooting es2006,es2008 for kernel update
  • 10:25 moritzm: upgrading jobrunners/imagescalers in eqiad for hhvm float timeout fix
  • 10:20 jynus: changing s2 replication topology in preparation for master failover
  • 09:45 jynus: starting es2004
  • 09:29 moritzm: rebooting es2005,es2007,es2009,es2010 for kernel update
  • 09:15 elukey: hhvm restarted on mw1044.eqiad.wmnet due to hhvm package update
  • 09:15 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Feb 8 09:15:11 UTC 2016 (duration 8m 10s)
  • 09:12 elukey: hhvm restarted on mw1034.eqiad.wmnet due to hhvm package update
  • 09:07 logmsgbot: oblivian@tin sync-l10n completed (1.27.0-wmf.12) (duration: 11m 55s)
  • 08:42 _joe_: trying a manual run of l10nupdate since it failed last night again
  • 08:25 moritzm: rebooting es2001 to es2004 for kernel update
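
Entries on this page follow a regular shape: a time, an author, a message, and for logmsgbot sync entries a trailing duration. A minimal sketch of how such a line could be parsed is shown here. The names `ENTRY_RE`, `DURATION_RE` and `parse_entry` are hypothetical and not part of any Wikimedia tooling, and the patterns are inferred only from the entries above.

```python
import re

# Hypothetical patterns inferred from the entries on this page:
# "• HH:MM author: message", optionally ending in "(duration: MMm SSs)".
ENTRY_RE = re.compile(
    r"^\s*•\s*(?P<time>\d{2}:\d{2})\s+(?P<author>[^:]+):\s+(?P<message>.*)$"
)
# Some entries write "(duration 8m 10s)" without the colon, so it is optional.
DURATION_RE = re.compile(r"\(duration:?\s+(?P<min>\d+)m\s+(?P<sec>\d+)s\)\s*$")


def parse_entry(line):
    """Return a dict for one log bullet, or None if the line is not an entry."""
    m = ENTRY_RE.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    d = DURATION_RE.search(entry["message"])
    # Duration in seconds when the entry reports one, otherwise None.
    entry["duration_s"] = int(d["min"]) * 60 + int(d["sec"]) if d else None
    return entry
```

Date headings and blank lines return `None`, so a caller can skip them while walking the page top to bottom.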

2016-02-07

  • 04:54 andrewbogott: upgraded python-openstackclient python-glanceclient python-novaclient python-keystoneclient on silver

2016-02-06

  • 05:43 bblack: rebooted cp2006 via racadm after crash - no crash data in logs...

2016-02-05

  • 23:54 chasemp: nfs shaping is really writes :)
  • 23:54 chasemp: tc to shape some nfs read traffic in tools for labs (also logged there) can be cancelled with: /sbin/tc qdisc del dev eth0 root
  • 23:51 YuviPanda: dropped old nfs snapshots from labstore1001
  • 23:30 logmsgbot: maxsem@mira Synchronized portals: (no message) (duration: 01m 18s)
  • 23:29 logmsgbot: maxsem@mira Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 01m 19s)
  • 22:56 jynus: reimaging db1018
  • 22:48 jynus: restarting slave on m2/codfw (db2011)
  • 22:41 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/268818/ (duration: 01m 22s)
  • 22:10 bblack: cache rolling reboots stopped for the weekend, can pick up the other half monday
  • 20:36 bblack: resuming rolling cache reboots
  • 20:07 mutante: cygnus - reboot VM
  • 19:28 bblack: halted rolling cache reboots, we seem to be having problems with a batch of them coming back...
  • 18:23 logmsgbot: demon@mira Synchronized wmf-config/InitialiseSettings.php: comment stuff, gerrit 267994 (duration: 01m 19s)
  • 18:15 jynus: stopping mysql@db1018 and starting to clone it for reimaging
  • 18:10 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool db1018 for maintenance (duration: 02m 12s)
  • 17:31 cmjohnson1: troubleshooting elastic1021
  • 17:08 bblack: rolling cpNNNN reboots are 27% complete, only two hosts so far failed to reboot on their own (but came up fine after manual racadm powercycle)
  • 16:20 ottomata: reenabling kafka1012 in analytics-eqiad kafka cluster
  • 16:03 jynus: reimaging db2030 to test jessie installer
  • 15:53 logmsgbot: oblivian@tin sync-l10n completed (1.27.0-wmf.12) (duration: 00m 08s)
  • 15:47 urandom: performing rolling restbase restart in staging env
  • 15:35 andrewbogott: rebooting silver for kernel update - wikitech outage will ensue
  • 15:33 urandom: re-restarting restbase on restbase1002.eqiad.wmnet,restbase1005.eqiad.wmnet,restbase1006.eqiad.wmnet,restbase1009.eqiad.wmnet (prior restarts may have happened before puppet run)
  • 15:29 andrewbogott: rebooting holmium for kernel update
  • 15:27 andrewbogott: rebooting labcontrol1002 for kernel update
  • 15:24 bblack: cp3005 didn't come back online during rolling reboot, investigating (remains depooled)
  • 15:22 _joe_: initializing mediawiki repos on tin
  • 15:22 andrewbogott: rebooting labnet1001 for kernel update
  • 15:15 urandom: restbase rolling restart complete
  • 15:08 urandom: performing rolling restbase restart to apply config change (https://gerrit.wikimedia.org/r/#/c/268611/)
  • 14:56 urandom: forcing puppet run and bouncing restbase on restbase1001.eqiad.wmnet (https://gerrit.wikimedia.org/r/#/c/268611/)
  • 14:41 elukey: confctl mw1228.eqiad.wmnet: weight changed 10 => 20
  • 14:24 moritzm: rebooting db2065 to db2070 for kernel update
  • 14:20 jynus: reimporting nlwiktionary revision into labs (expect some temporary lag on labs-s3)
  • 14:06 moritzm: rebooting db2060 to db2064 for kernel update
  • 13:34 bblack: starting rolling reboots of cp* (traffic cache hosts) for kernel updates
  • 12:51 moritzm: rebooting db2055 to db2059 for kernel update
  • 12:38 elukey: repooled mw1228.eqiad.wmnet
  • 12:34 moritzm: rebooting db2050 to db2054 for kernel update
  • 12:15 moritzm: rebooting db2045 to db2049 for kernel update
  • 12:07 jynus: reimporting nlwiktionary pages into labs
  • 12:05 logmsgbot: l10nupdate@tin LocalisationUpdate failed: git pull of core failed
  • 12:05 logmsgbot: l10nupdate@tin LocalisationUpdate failed: git clone of core failed
  • 11:54 moritzm: rebooting db2041 to db2044 for kernel update
  • 11:37 moritzm: rebooting db2038 to db2040 for kernel update
  • 11:36 godog: start swiftrepl replication pass of common thumbs eqiad -> codfw
  • 10:15 moritzm: rolling reboot of ocg* cluster
  • 02:27 mobrovac: restbase deploy end of caae1f7
  • 02:20 mobrovac: restbase deploy start of caae1f7
  • 01:52 Tim: deploying apache log format change following successful test on deployment-prep
  • 01:47 logmsgbot: aude@mira Finished scap: Re-add user rights messages in Echo (duration: 24m 37s)
  • 01:22 logmsgbot: aude@mira Started scap: Re-add user rights messages in Echo
  • 01:20 logmsgbot: aude@mira Synchronized php-1.27.0-wmf.12/extensions/Echo: Re-add user rights messages in Echo (duration: 01m 20s)
  • 01:06 logmsgbot: aude@mira Synchronized wmf-config/: Sync wikidata config changes for beta (duration: 01m 15s)
  • 01:00 mobrovac: restbase deploy end of 2aef1b67a0
  • 00:58 logmsgbot: aude@mira Synchronized wmf-config/InitialiseSettings.php: Set $wgEnotifMinorEdits to true on huwiki (duration: 01m 16s)
  • 00:53 logmsgbot: aude@mira Synchronized wmf-config/InitialiseSettings.php: Add museumvictoria.com.au to $wgCopyUploadsDomains (duration: 01m 17s)
  • 00:46 logmsgbot: aude@mira Synchronized wmf-config/InitialiseSettings.php: Re-enable ShortUrl on maiwiki, bhwiki and orwikisource, after creating db table (duration: 01m 18s)
  • 00:36 logmsgbot: aude@mira Synchronized wmf-config/InitialiseSettings.php: revert shorturl changes (duration: 01m 17s)
  • 00:29 logmsgbot: aude@mira Synchronized wmf-config/InitialiseSettings.php: Enable ShortUrl on maiwiki, bhwiki and orwikisource (duration: 01m 17s)
  • 00:26 akosiaris: restart apache on mendelevium.eqiad.wmnet. seems there's a memory leak, need to investigate tomorrow
  • 00:23 logmsgbot: aude@mira Synchronized wmf-config/InitialiseSettings.php: Re-enable category watch on wikipedia and commons (duration: 01m 18s)
  • 00:19 mobrovac: restbase deploy start of 2aef1b67a0 on rb1001

2016-02-04

  • 23:54 mobrovac: restbase disabled temporarily puppet in prod to test https://gerrit.wikimedia.org/r/#/c/268597/
  • 23:29 mutante: rcs1001 - rebooting
  • 22:45 bblack: rebooting cp1060 to test traffic-pool stuff
  • 22:30 logmsgbot: demon@mira Synchronized wmf-config/: undo my cleanup grumble grumble (duration: 01m 16s)
  • 22:12 logmsgbot: demon@mira Synchronized wmf-config/: touch (duration: 01m 14s)
  • 22:10 logmsgbot: demon@mira Synchronized private/: touch (duration: 01m 15s)
  • 22:01 logmsgbot: demon@mira Synchronized wmf-config/PrivateSettings.php: touch symlink (duration: 01m 15s)
  • 21:57 logmsgbot: demon@mira Synchronized wmf-config/: gerrit 268471, 268454 (duration: 01m 18s)
  • 21:52 mutante: mira chgrp -R wikidev /srv/mediawiki-staging/.git/objects
  • 21:29 mutante: rcs1001 started redis
  • 21:21 paravoid: setting up OSPF/OSPF3/PIM between ulsfo and codfw (cr2-ulsfo/cr1-codfw)
  • 21:19 mutante: rcs1002 - start redis
  • 21:15 mutante: rcs1002 - reboot for kernel
  • 20:45 mutante_: rcs1001 - schedule downtime, reboot
  • 20:30 paravoid: cr1-ulsfo: deactivating BGP peering with GTT
  • 20:26 mutante: eeden service ntp restart
  • 20:26 hashar: All wikis to 1.27.0-wmf.12 No troubles so far congratulations to everyone involved @wikimedia #wikimedia
  • 20:23 mutante: mw1115 service hhvm restart
  • 20:16 mutante: mw1117 - powercycled
  • 20:15 paravoid: cr1-ulsfo: turning up BGP with Zayo
  • 20:15 mutante: scb1001/scb1002 service mathoid restart
  • 20:03 logmsgbot: hashar@mira rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.27.0-wmf.12
  • 20:03 hashar: all wikis to 1.27.0-wmf.12 (yeah really)
  • 20:00 mobrovac: restbase rolling restart after merging https://gerrit.wikimedia.org/r/#/c/268016/
  • 19:54 logmsgbot: demon@mira Synchronized private/PrivateSettings.php: (no message) (duration: 02m 05s)
  • 19:43 ottomata1: rebooting kafka1012
  • 19:32 logmsgbot: demon@mira Synchronized wmf-config/InitialiseSettings.php: T125850 (duration: 02m 11s)
  • 19:00 logmsgbot: demon@mira Synchronized wmf-config/InitialiseSettings-labs.php: prod no op for completeness (duration: 03m 02s)
  • 18:49 logmsgbot: demon@mira Synchronized wmf-config/: removing old mwblocker.log (duration: 02m 07s)
  • 18:45 logmsgbot: demon@mira Synchronized private/mwblocker.log: (no message) (duration: 02m 10s)
  • 18:40 yurik: deployed and reenabled tilerator & tileratorui
  • 17:30 logmsgbot: hoo@mira Synchronized php-1.27.0-wmf.12/extensions/Wikidata: Fix editing terms in languages other than the interface language via the term box (duration: 02m 18s)
  • 17:29 hoo: (Re)started wdqs-updater on wdqs1001, but seems it doesn't work
  • 16:53 twentyafterfour: restarted phd to synchronize settings with phabricator
  • 16:52 twentyafterfour: restarted apache2 on iridium so that phabricator recognizes sprint.phragile-uri
  • 16:45 logmsgbot: thcipriani@mira Synchronized php-1.27.0-wmf.12/includes/media/Bitmap.php: SWAT: BitmapHandler: Implement validateParam() gerrit:268407 (duration: 02m 08s)
  • 16:07 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable confirmed group at nowiki gerrit:267804 (duration: 02m 15s)
  • 15:53 urandom: rolling restbase restart complete
  • 15:47 urandom: restbase cluster puppet run complete; performing rolling restart of restbase (applying https://gerrit.wikimedia.org/r/#/c/266297/)
  • 15:44 moritzm: rebooting db203[678] for kernel update
  • 15:35 urandom: forcing puppet run on restbase cluster (config deploy)
  • 15:34 urandom: reenabling puppet on restbase cluster (continue config deploy)
  • 15:18 urandom: restarting restbase on restbase1001.eqiad.wmnet (config deploy)
  • 15:15 urandom: re-enabling puppet and forcing run on restbase1001.eqiad.wmnet (canary config deploy)
  • 15:11 _joe_: restarting pybal on lvs200{3,6}
  • 15:08 urandom: disabling puppet on restbase cluster in preparation for configuration deploy (https://gerrit.wikimedia.org/r/#/c/266297/)
  • 15:07 moritzm: rebooting oxygen for kernel update
  • 14:58 ottomata: stopping eventlogging to reboot eventlog1001 for kernel update
  • 14:41 moritzm: rebooting db203[45] for kernel update
  • 14:09 hoo: Restarted blazegraph on wdqs1001
  • 13:57 godog: powercycle ms-be2020
  • 13:39 moritzm: continue rolling reboot of maps cluster for kernel update (2002-2004)
  • 12:21 jynus: starting mysql at db2009
  • 12:08 moritzm: rebooting db2001 to db2019 for kernel update
  • 11:44 jynus: dropping echo_* tables from labs
  • 11:19 dcausse: elastic codfw: resuming writes and setting cluster.routing.allocation.balance.threshold back to default (1%)
  • 10:35 dcausse: elastic codfw: freezing writes and setting cluster.routing.allocation.balance.threshold to 100% (fast recovery test)
  • 10:35 logmsgbot: hashar@mira Synchronized php-1.27.0-wmf.12/.gitmodules: Set branch in .gitmodules for extensions/Wikidata https://gerrit.wikimedia.org/r/#/c/268218/ (duration: 02m 08s)
  • 10:16 moritzm: rolling reboot of maps cluster for kernel update
  • 10:14 jynus: testing new replication filters from production's testwiki
  • 10:13 elukey: running smartctl -t long on kafka1012 (kafka not running, host de-pooled from the broker list)
  • 10:11 moritzm: repooling restbase2006
  • 10:01 jynus: applying live on the 7 sanitarium instance the newly puppet-configured labs replication filters
  • 09:57 moritzm: repooling restbase2005, depooling restbase2006 for kernel reboot/Java update
  • 09:46 dcausse: elastic in codfw: reducing the number of replicas from 0-3 to 0-2 for commonswiki_file
  • 09:46 moritzm: repooling restbase2004, depooling restbase2005 for kernel reboot/Java update
  • 09:39 ema: re-enabling puppet on mw1161
  • 09:34 moritzm: depooling restbase2004 for kernel reboot/Java update
  • 09:11 jynus: converting remaining InnoDB tables (s3) to TokuDB on db1069
  • 08:14 chasemp: iridium puppet agent --enable && puppet agent --disable "DO NO ENABLE AS IT WILL BREAK THINGS CONTACT MUKUNDA"
  • 07:51 twentyafterfour: phabricator repositories checked out to these revisions: http://pastebin.com/JxEaYKiW
  • 07:49 chasemp: git checkout tag release/2015-11-18/1 for phab & libphutil on iridium
  • 07:35 andrewbogott: disabling puppet on iridium to prevent it from smashing phabricator (as it seems to do now and then)
  • 07:00 andrewbogott: on iridium in /srv/deployment/phabricator/deploy/phabricator, naming the currently detached git branch 'andrewfounditlikethis'
  • 06:49 robh: phabricator down with errors during repo updates in phd daemon log
  • 02:12 mutante: OTRS - changed motd message in /opt/otrs/Kernel/Output/HTML/Templates/Standard/Motd.tt - admins can turn it on and off
  • 01:04 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.12/tests: https://gerrit.wikimedia.org/r/#/c/268332/ (duration: 02m 08s)
  • 01:01 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.12/includes/parser: https://gerrit.wikimedia.org/r/#/c/268332/ (duration: 02m 25s)
  • 01:00 moritzm: rebooting iridium (phabricator host) for kernel update
  • 00:42 YuviPanda: yuvipanda@labstore2001:~$ sudo lvremove backup/maps20160121040005
  • 00:41 YuviPanda: yuvipanda@labstore2001:~$ sudo lvremove backup/tools20160121020007
  • 00:04 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.12

2016-02-03

  • 23:53 moritzm: repooling restbase2002 , depooling restbase2003 for kernel/Java update
  • 23:39 moritzm: repooling restbase2001 , depooling restbase2002 for kernel/Java update
  • 23:36 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.27.0-wmf.12
  • 23:29 hashar: passing wmf12 responsibility to thcipriani . Crashing to bed myself.
  • 23:22 moritzm: depooling restbase2001 for kernel/Java update
  • 23:15 moritzm: rebooting wdqs1002 for kernel update
  • 23:08 hashar: Full script of my deployment session is on mira.codfw.wmnet:/home/hashar/wmf12-deploy.script
  • 23:07 logmsgbot: hashar@mira rebuilt wikiversions.php and synchronized wikiversions files: Clarify only testwiki and test2wiki are on php-1.27.0-wmf.12
  • 23:07 moritzm: rebooting wdqs1001 for kernel update
  • 22:51 hashar: test / test2 wikis are incredibly slow. Filed https://phabricator.wikimedia.org/T125727
  • 22:47 subbu: finished deploying parsoid sha 98619f7f
  • 22:43 logmsgbot: hashar@mira rebuilt wikiversions.php and synchronized wikiversions files: test2wiki to php-1.27.0-wmf.12
  • 22:43 hashar: sync-wikiversions "test2wiki to php-1.27.0-wmf.12"
  • 22:41 moritzm: repooling restbase1009
  • 22:38 logmsgbot: hashar@mira Finished scap: to properly sync other master tin due to l10nupdate ui mismatch (duration: 24m 27s)
  • 22:34 moritzm: repooling restbase1006 , depooling restbase1009 for kernel/Java update
  • 22:34 hashar: Still looking at test.wikipedia.org being super "slow" . scap still rebuilding though
  • 22:32 ejegg: updated payments-wiki from 1817327b4b0919ebe26bbd8b9d84fac1bd7ddb03 to fad669c99db8240b26a524aa70c85cfebd13a18c
  • 22:21 moritzm: repooling restbase1005 , depooling restbase1006 for kernel/Java update
  • 22:14 ejegg: rolled payments-wiki back to 1817327b4b0919ebe26bbd8b9d84fac1bd7ddb03
  • 22:14 hashar: https://test.wikipedia.org/ switched to 1.27.0-wmf.12
  • 22:13 logmsgbot: hashar@mira Started scap: to properly sync other master tin due to l10nupdate ui mismatch
  • 22:13 subbu: restarted parsoid on wtp1002 as a canary
  • 22:13 logmsgbot: hashar@mira Finished scap: testwiki to php-1.27.0-wmf.12 and rebuild l10n cache (with proper branches for special_extensions) (duration: 20m 23s)
  • 22:07 moritzm: repooling restbase1004 , depooling restbase1005 for kernel/Java update
  • 22:06 ejegg: updated payments-wiki from 1817327b4b0919ebe26bbd8b9d84fac1bd7ddb03 to 52afbc735ef5d759fd42bef072bed286fe3a5581
  • 22:06 subbu: starting parsoid deploy
  • 22:03 mutante: mira, tin: find /srv/mediawiki-staging/ -uid 1001 -exec chown 10002 {} \;
  • 21:53 hashar: reopened https://phabricator.wikimedia.org/T119165 l10nupdate user uid mismatch between tin and mira
  • 21:52 logmsgbot: hashar@mira Started scap: testwiki to php-1.27.0-wmf.12 and rebuild l10n cache (with proper branches for special_extensions)
  • 21:51 mutante: tin - find / -uid 1001 -exec chown 10002 {} \;
  • 21:49 mutante: tin - fixing UID of l10nupdate user (T119165)
  • 21:45 moritzm: depooling restbase1004 for kernel/Java update
  • 21:45 moritzm: repooling restbase1003
  • 21:35 hashar: mismatching uid for l10nupdate user between mira and tin
  • 21:34 logmsgbot: hashar@mira scap aborted: testwiki to php-1.27.0-wmf.12 and rebuild l10n cache (with proper branches for special_extensions) (duration: 07m 41s)
  • 21:32 moritzm: depooling restbase1003 for kernel/Java update
  • 21:27 moritzm: repooling restbase1008
  • 21:26 logmsgbot: hashar@mira Started scap: testwiki to php-1.27.0-wmf.12 and rebuild l10n cache (with proper branches for special_extensions)
  • 21:25 hashar: mira had to hard reset CentralNotice / SemanticMediaWiki / SemanticResultFormats / Validator after we pointed them from master to their proper branch, submodule attempted a rebase automatically.. That is a no no
  • 21:14 moritzm: depooling restbase1008 for kernel/Java update
  • 21:08 hashar: waiting for the submodule patch https://gerrit.wikimedia.org/r/#/c/268214/ to land and will scap again
  • 20:33 logmsgbot: hashar@mira scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.bTBpxD6CuI" ' returned non-zero exit status 1 (duration: 01m 13s)
  • 20:32 logmsgbot: hashar@mira Started scap: testwiki to php-1.27.0-wmf.12 and rebuild l10n cache (after RandomRootPage had a dummy entry point added)
  • 20:31 logmsgbot: demon@mira Synchronized php-1.27.0-wmf.12/extensions/RandomRootPage/: unbreak (duration: 01m 19s)
  • 20:23 logmsgbot: hashar@mira scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_2188303825" --threads=10 --lang en --quiet' returned non-zero exit status 255 (duration: 01m 49s)
  • 20:21 logmsgbot: hashar@mira Started scap: testwiki to php-1.27.0-wmf.12 and rebuild l10n cache
  • 20:20 hashar: Hacked wikiversions.json to only have testwiki on .12
  • 19:58 logmsgbot: demon@mira Synchronized wmf-config/InitialiseSettings.php: touch (duration: 01m 19s)
  • 19:49 logmsgbot: demon@mira Synchronized wmf-config/: fix wikibase/mobilefrontend config (duration: 01m 19s)
  • 19:48 robh: halting puppet on carbon for a few minutes to livehack a partition recipe change in netboot.cfg
  • 19:45 hashar: https://phabricator.wikimedia.org/T125672 blocking wmf.12 "Notice: Undefined variable: wgMFQueryPropModules in /srv/mediawiki/wmf-config/Wikibase.php on line 120"
  • 19:39 akosiaris: hot patch OTRS installation with https://github.com/OTRS/otrs/commit/c7ea6d64e02518e166fbac02f42f25dacad54342
  • 19:35 hashar: mira: manually fixed /php and /w/static/current symlinks to point back to .10 (wikiversions migrated them to .11 which we skip)
  • 19:30 moritzm: repooling restbase1002
  • 19:29 hashar: Create patches to update wikiversions.json
  • 19:24 hashar: Applying security patches on mira
  • 19:24 hashar: starting train deployment of 1.27.0-wmf.12
  • 19:09 csteipp: deployed patch for T125684
  • 19:08 moritzm: depooling restbase1002 for kernel/Java update
  • 18:38 logmsgbot: bd808@mira Synchronized wmf-config/InitialiseSettings-labs.php: Experiment one: Labs stripping HTML in beta (360e5af) (duration: 01m 19s)
  • 18:34 moritzm: rebooting californium for kernel update
  • 18:16 bblack: restarting pybal on lvs1001
  • 18:04 jynus: previous announcement was for db2011, not db2010
  • 18:02 jynus: starting slave IO thread on db2010
  • 17:32 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Repool db1060 after maintenance (duration: 01m 20s)
  • 17:22 mobrovac: restbase restarting rb1001
  • 17:03 mdholloway: mobileapps deployed 68e38ec
  • 17:02 bblack: restarting pybal on lvs1004 (not 1003!) T125397
  • 17:02 bblack: restarting pybal on lvs1003 T125397
  • 16:57 hashar: mira: updating /srv/mediawiki-staging/php-1.27.0-wmf.12 (prep deployment train)
  • 16:55 logmsgbot: thcipriani@mira Synchronized wmf-config/CirrusSearch-production.php: SWAT: Return more like search queries to codfw gerrit:268097 (duration: 01m 17s)
  • 16:45 logmsgbot: thcipriani@mira Synchronized wmf-config/CommonSettings.php: SWAT: Remove unused/no longer existing item-create oauth grant gerrit:265447 (duration: 01m 18s)
  • 16:39 logmsgbot: thcipriani@mira Synchronized wmf-config: SWAT: Enable math data type on test wikidata + test wikipedias gerrit:268086 (duration: 01m 18s)
  • 16:32 logmsgbot: thcipriani@mira Synchronized wmf-config/mobile.php: SWAT: Remove section collapsing config gerrit:267776 (duration: 01m 18s)
  • 16:28 akosiaris: OTRS migration to 4.0 completed, starting upgrade to 5.0
  • 16:24 logmsgbot: thcipriani@mira Synchronized wmf-config/CommonSettings.php: SWAT: MW parsoid URLs: s/parsoidcache/parsoid/ gerrit:267234 (duration: 01m 18s)
  • 16:18 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Add 2 sites to $wgCopyUploadsDomains gerrit:262893 (duration: 01m 18s)
  • 16:13 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Just use the default MobileFrontend specified page actions. Part II gerrit:267807 (duration: 01m 18s)
  • 16:11 logmsgbot: thcipriani@mira Synchronized wmf-config/mobile.php: SWAT: Just use the default MobileFrontend specified page actions. Part I gerrit:267807 (duration: 02m 14s)
  • 15:41 hashar: mira symlink pointing to current version got changed to wmf.11 by the checkoutMediaWiki script. Manually changed to proper wmf.10 https://phabricator.wikimedia.org/T125475#1994078
  • 15:32 jynus: restart and reconfigure mysql in db1060
  • 15:30 hashar: MediaWiki 1.27.0-wmf.12, from 1.27.0-wmf.12, successfully checked out.
  • 15:23 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool db1060 (duration: 00m 43s)
  • 15:21 hashar: mira: cloning 1.27.0-wmf.12 (no link updates)
  • 15:15 bblack: rebooting cp1060 (depooled/downtimed)
  • 15:11 bblack: depooling cp1060 temporarily from cache_mobile varnish backends
  • 14:56 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Repool db1054 with low weight, repool db1067 with original weight (duration: 01m 22s)
  • 14:50 bblack: rebooting cp1008 for kernel
  • 14:28 godog: investigating uwsgi processes for graphite-web not coming up after reboot
  • 14:10 moritzm: rebooting graphite1001 for kernel update
  • 13:41 godog: powercycle ms-be2015
  • 13:39 jynus: restarting and reconfiguring mysql at db1054
  • 13:27 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Repool db1067 at low weight; depool db1054 (duration: 01m 16s)
  • 11:45 jynus: restarting and reconfiguring mysql at db1067
  • 11:11 moritzm: repooling restbase1001
  • 11:04 akosiaris: OTRS database upgraded to 3.3, moving on with 4.0
  • 11:00 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Repool db1063 at 100% load; depool db1067 for maintenance (duration: 01m 16s)
  • 10:48 moritzm: depooling restbase1001 for kernel/Java update
  • 10:37 _joe_: ending the load test on the eqiad apaches
  • 10:11 moritzm: reboot francium for kernel update
  • 09:53 jynus: m2 backup finished on /srv/backups/2016-02-03_08-51-06, filename 'db1020-bin.000842', position 220103947
  • 09:50 moritzm: restarting neodymium for kernel update
  • 09:49 _joe_: doing some basic load test on appservers in eqiad
  • 08:52 akosiaris: stop otrs-daemon on mendelevium
  • 08:51 jynus: starting mysql backup on db1020 (/srv/backups)
  • 08:44 akosiaris: stop slave on db2011, db1020's (m2-master) slave, for OTRS migration. DO NOT ENABLE
  • 08:40 akosiaris: stop exim4, cron, apache2 on iodine, mendelevium
  • 08:39 akosiaris: disabling puppet on iodine, mendelevium, OTRS migration
  • 08:24 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Repool db1063 with low weight (duration: 01m 20s)
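
The scap entries above come in pairs: a "Started scap: ..." line, then a "Finished scap: ..." or "scap aborted: ..." line with the same description. A rough sketch of matching them up is shown here. `pair_scap_runs` and both patterns are hypothetical and inferred only from these entries. "scap failed" entries carry a different message and are deliberately not paired.

```python
import re

# Hypothetical patterns based on the logmsgbot wording seen on this page.
START_RE = re.compile(r"Started scap: (?P<desc>.*)$")
END_RE = re.compile(
    r"(?P<outcome>Finished scap|scap aborted): (?P<desc>.*?) "
    r"\(duration: \d+m \d+s\)$"
)


def pair_scap_runs(messages):
    """Pair scap start/end entries.

    messages: list of (time, message) tuples, newest first as on the page.
    Returns (start_time, end_time, outcome, description) tuples, oldest first.
    """
    open_runs = {}  # description -> start time of a run not yet closed
    runs = []
    # Walk oldest to newest so each start is seen before its end.
    for time, msg in reversed(messages):
        started = START_RE.search(msg)
        if started:
            open_runs[started["desc"]] = time
            continue
        ended = END_RE.search(msg)
        if ended and ended["desc"] in open_runs:
            start_time = open_runs.pop(ended["desc"])
            runs.append((start_time, time, ended["outcome"], ended["desc"]))
    return runs
```

Matching on the description works here because an aborted run and its retry reuse the same text. Each end entry closes the most recent unmatched start with that description.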

2016-02-02

  • 23:13 logmsgbot: demon@mira Finished scap: everything re-sync one more time for good measure (duration: 17m 04s)
  • 22:56 logmsgbot: demon@mira Started scap: everything re-sync one more time for good measure
  • 22:50 bblack: repooling scap proxies: mw1033, mw1070, mw1097, mw1216
  • 22:45 chasemp: restart hhvm & apache2 on mw1235.eqiad.wmnet
  • 22:44 _joe_: restarted hhvm on mw1231, stat_cache again
  • 22:42 logmsgbot: demon@mira Finished scap: resync final batch with master (duration: 06m 48s)
  • 22:35 logmsgbot: demon@mira Started scap: resync final batch with master
  • 22:31 logmsgbot: demon@mira Finished scap: re-sync batch of mw1136-50, mw1190-1220, mw2150-mw2200 with master (duration: 09m 33s)
  • 22:22 logmsgbot: demon@mira Started scap: re-sync batch of mw1136-50, mw1190-1220, mw2150-mw2200 with master
  • 22:20 ori: restarted HHVM on mw1243. Lock-up. Backtrace in /tmp/hhvm.2897.bt
  • 22:20 logmsgbot: demon@mira Finished scap: re-sync batch of mw1101-1135,1240-1260, 2101-2150 with master (duration: 12m 51s)
  • 22:07 logmsgbot: demon@mira Started scap: re-sync batch of mw1101-1135,1240-1260, 2101-2150 with master
  • 22:00 logmsgbot: demon@mira Finished scap: re-sync batch of mw1151-mw1225, mw2174-mw2214 with master (duration: 11m 24s)
  • 21:49 logmsgbot: demon@mira Started scap: re-sync batch of mw1151-mw1225, mw2174-mw2214 with master
  • 21:45 logmsgbot: demon@mira Finished scap: re-sync batch of mw1051-1100, mw2051-2100 with master (duration: 13m 41s)
  • 21:31 logmsgbot: demon@mira Started scap: re-sync batch of mw1051-1100, mw2051-2100 with master
  • 21:28 logmsgbot: demon@mira Finished scap: re-sync batch of mw1025-1050 and mw2007-mw2050 with master (2nd try) (duration: 14m 33s)
  • 21:27 _joe_: depooling eqiad scap-proxies
  • 21:13 logmsgbot: demon@mira Started scap: re-sync batch of mw1025-1050 and mw2007-mw2050 with master (2nd try)
  • 21:04 logmsgbot: demon@mira scap aborted: re-sync batch of mw1025-1050 and mw2007-mw2050 with master (duration: 10m 11s)
  • 20:54 logmsgbot: demon@mira Started scap: re-sync batch of mw1025-1050 and mw2007-mw2050 with master
  • 20:32 hashar: Finished syncing mw1114-mw1119 (canary api appservers)
  • 20:28 ori: restarted hhvm on mw1116
  • 20:17 hashar: Running sync-common on mw1114-mw1119 (canary api appservers)
  • 20:16 ostriches: mira: removed untracked wmf-config/x.php testing file
  • 20:11 ori: Running sync-common on canary app servers (mw1017-mw1025)
  • 19:46 hashar: Running sync-common on mw1260 (video scaler)
  • 19:40 ori: Running sync-common on all jobscalers
  • 19:35 ori: Running sync-common on mw1259 (video scaler) and mw1153 (image scaler) too
  • 19:29 ori: Running sync-common on mw100[123]
  • 18:59 _joe_: running sync-common on mw1020
  • 18:54 _joe_: repooled mw1119
  • 17:45 hashar: mira /srv/mediawiki-staging git submodule update --init --recursive
  • 17:43 hashar: mw1119 sync-common
  • 17:37 godog: disable unused swift container-sync for wikibooks-ka-local-thumb wikibooks-hr-local-thumb wikibooks-km-local-thumb wikibooks-sk-local-thumb wikibooks-tr-local-thumb wikipedia-it-local-thumb.fc
  • 17:36 hashar: mw1119:/srv/mediawiki/wmf-config/event-schemas is empty
  • 17:31 _joe_: depooled mw1119, partial sync
  • 16:59 hashar: files were /srv/mediawiki/docroot/wikimedia.org/WikipediaMobileFirefoxOS/.git and /srv/mediawiki/docroot/wikimedia.org/WikipediaMobileFirefoxOS/js/lib/MobileFrontend/.git
  • 16:58 ostriches: mw1017: removed stray .git directory from WikipediaFirefoxMobileOS or w/e. It shouldn't be there anyway. sync-common is happy again on it
  • 16:48 hashar: tin /srv/mediawiki-staging  : running git submodule update --init --recursive
  • 16:47 hashar: tin /srv/mediawiki-staging  : running git submodule update --init
  • 16:40 hashar: mw1017 sync-common --verbose
  • 16:35 _joe_: sync-common on mw2030 and mw1161; re-enable puppet, jobrunner, jobchron on mw1161
  • 16:34 _joe_: restarted puppet and rsync on both tin and mira, removed comments on the l10nupdate job on tin
  • 16:23 logmsgbot: thcipriani@mira rebuilt wikiversions.php and synchronized wikiversions files: rebuild wikiversion.php
  • 14:57 godog: disable swift container-sync for wikipedia-it-local-public.a7
  • 14:43 hashar: tin /srv/mediawiki-staging/multiversion/checkoutMediaWiki 1.27.0-wmf.10 php-1.27.0-wmf.10
  • 14:43 hashar: tin /srv/mediawiki-staging/multiversion/checkoutMediaWiki 1.27.0-wmf.9 php-1.27.0-wmf.9
  • 14:43 hashar: tin /srv/mediawiki-staging/multiversion/checkoutMediaWiki 1.27.0-wmf.8 php-1.27.0-wmf.8
  • 14:21 hashar: starting rebuilding /srv/mediawiki-staging from scratch on tin (not mira)
  • 14:20 hashar: starting rebuilding /srv/mediawiki-staging from scratch on mira
  • 14:04 bblack: nevermind, not looking at eeden
  • 14:04 bblack: looking at eeden
  • 13:58 moritzm: rebooting eeden for kernel update
  • 13:09 moritzm: rolling reboot of scb* (for kernel update)
  • 13:02 akosiaris: reboot dubnium for kernel upgrades
  • 13:01 akosiaris: reboot pollux for kernel upgrades
  • 12:45 moritzm: rebooting baham for kernel update
  • 12:20 _joe_: stopping rsync on mira too, to avoid accidental deploys
  • 12:15 _joe_: stopped puppet on mira, added a big warning in the motd
  • 12:15 _joe_: stopped rsync, puppet, l10nupdate cronjob on tin
  • 12:06 _joe_: stopped rsync on tin to avoid problems
  • 11:38 moritzm: rolling reboot of aqs* (for kernel update)
  • 11:24 hashar_: Restarting Zuul. Stuck in a dependency loop :(
  • 11:12 jynus: restarting and reconfiguring mysql at db1063
  • 10:51 _joe_: stopped jobrunner on mw1161 after failed sync-common
  • 10:44 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool db1063, repool db1036 (duration: 00m 21s)
  • 10:00 jynus: reconfigure and upgrade db1036
  • 09:51 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Testing scap-reduce db1018 weight (duration: 00m 21s)
  • 09:42 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool db1036, repool db1021 (duration: 00m 22s)
  • 09:38 hashar: Jenkins is fully up and operational
  • 09:36 jynus: armed keyholder on tin
  • 09:34 dcausse: elastic (codfw and eqiad): unfreezing indices
  • 09:33 moritzm: restarting gerrit on ytterbium for java security update
  • 09:33 _joe_: re-syncing tin homes
  • 09:32 hashar: gallium: apt-get upgrade | Restarting Jenkins
  • 09:12 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool db1036, repool db1021 (duration: 00m 21s)
  • 09:08 dcausse: elastic (codfw and eqiad): freezing indices to stop titlesuggest maint scripts
  • 09:03 godog: repool restbase1007 via confctl
  • 08:13 jynus: restarting and upgrading db1021
  • 08:02 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Pool db1018; Depool db1021 (duration: 00m 20s)
  • 07:46 jynus: https://phabricator.wikimedia.org/rOMWC2ea9167221d11eb1880e4d26eae64a85cb9b2697 and https://phabricator.wikimedia.org/rOMWCa55d2bf8cd3a2853fac35d5b8239b8e8c2fe6a0f merged but not deployed
  • 06:58 _joe_: reimaging tin.eqiad.wmnet
  • 01:30 logmsgbot: ebernhardson@mira Finished scap: Add Cookie statement link to footer of all WMF wikis per legal (duration: 19m 42s)
  • 01:11 logmsgbot: ebernhardson@mira Started scap: Add Cookie statement link to footer of all WMF wikis per legal
  • 01:07 logmsgbot: ebernhardson@mira scap failed: CalledProcessError Command '/srv/deployment/scap/scap/bin/refreshCdbJsonFiles --directory="/srv/mediawiki-staging/php-1.27.0-wmf.10/cache/l10n" --threads=10 ' returned non-zero exit status 255 (duration: 03m 31s)
  • 01:03 logmsgbot: ebernhardson@mira Started scap: Add Cookie statement link to footer of all WMF wikis per legal
  • 00:31 logmsgbot: ebernhardson@mira scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="cawikibooks" --outdir="/tmp/scap_l10n_1684485672" --threads=10 --quiet' returned non-zero exit status 255 (duration: 02m 35s)
  • 00:30 mobrovac: restbase deploy end of c3bd864
  • 00:29 logmsgbot: ebernhardson@mira Started scap: Add Cookie statement link to footer of all WMF wikis per legal
  • 00:26 logmsgbot: ebernhardson@mira Synchronized wmf-config/logging.php: Revert "monolog: Ensure that context data added by WebProcessor is utf-8 safe" (duration: 01m 27s)
  • 00:23 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-production.php: Move morelike query load back to eqiad to allow load testing on codfw (duration: 01m 38s)

2016-02-01

  • 23:51 mobrovac: restbase deploy start of c3bd864 on canary rb1001
  • 19:28 logmsgbot: ori@mira Synchronized docroot/wikipedia.org/speed-tests: I5b48a491390: Speed trials: add preconnect (duration: 01m 27s)
  • 18:54 bblack: banned obj.http.Content-Length == 13817 on all cache_text
  • 18:54 mutante: LDAP - added elukey to "ops" group
  • 18:11 mutante: planet1001 - rebooting for upgrade
  • 17:54 hoo: restarted hhvm on mw1253
  • 17:06 logmsgbot: thcipriani@mira Synchronized wmf-config: SWAT: Use extension registration for Graph gerrit:266433 (duration: 01m 29s)
  • 16:59 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable SandboxLink on or.wikipedia.org gerrit:267194 (duration: 01m 31s)
  • 16:54 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WikidataPageBanner on es.wikivoyage gerrit:267195 (duration: 01m 29s)
  • 16:52 _joe_: restarted pybal on lvs1001
  • 16:47 _joe_: installing the new HHVM package to the api appserver cluster in eqiad
  • 16:38 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Set WikidataPageBanner namespaces on fr.wikivoyage gerrit:266541 (duration: 01m 26s)
  • 16:32 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on cu.wikipedia gerrit:265885 (duration: 01m 26s)
  • 16:26 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Centralise all VisualEditor feedback pages except for a few wikis gerrit:258206 (duration: 01m 30s)
  • 16:22 logmsgbot: thcipriani@mira Synchronized dblists/visualeditor-default.dblist: SWAT: Enable VisualEditor by default for some other wikis gerrit:264765 (duration: 01m 58s)
  • 16:05 ema: hhvm restarted on mw1072
  • 15:54 logmsgbot: krenair@mira Synchronized wmf-config/interwiki.cdb: Updating interwiki cache (duration: 01m 52s)
  • 15:48 bblack: restarted pybal on lvs1004 (lvs1003 above was a bad log message!)
  • 15:42 bblack: restarted pybal on lvs1003
  • 15:13 bblack: cp3042 repooled
  • 15:10 ema: restarting hhvm on mw1057
  • 14:33 chasemp: labstore1002 cfg scheduling
  • 14:04 godog: set ms-be1019 swift weight to 4000
  • 13:33 moritzm: rolling reboot of xenon/cerium/praseodymium for kernel update (and updating to new openjdk-8)
  • 12:40 _joe_: depooling cp3042 from esams uploads
  • 12:15 _joe_: backing up tin homes before reimaging
  • 11:59 moritzm: rolling reboot of ms-be1016 to ms-be1021 for kernel update
  • 11:39 moritzm: uploaded openjdk-8 8u72-b15-1~bpo8+1 for jessie-wikimedia to carbon
  • 11:34 moritzm: uploaded openssl 1.0.2f for jessie-wikimedia to carbon
  • 11:19 godog: repool restbase1007
  • 10:32 godog: reboot ms-be1010, xfs
  • 10:27 jynus: partitioning revision and logging for db2037 and db2044 (s4)
  • 00:04 logmsgbot: tstarling@mira Synchronized php-1.27.0-wmf.11/includes: (no message) (duration: 01m 31s)

2016-01-31

  • 23:58 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.11/extensions/VisualEditor/extension.json: https://gerrit.wikimedia.org/r/#/c/267617/ (duration: 01m 28s)
  • 22:31 ori: restarted parsoid-rt-client.service
  • 22:14 ori: Updated parsoid on ruthenium and restarted parsoid-rt-client on ruthenium, per subbu's request.
  • 22:03 bd808: backfilled missing data in https://tools.wmflabs.org/sal/production from https://wikitech.wikimedia.org/wiki/Server_Admin_Log
  • 21:37 bd808: https://tools.wmflabs.org/sal/production missing data from 2016-01-30 until now
  • 21:33 logmsgbot: ori@mira Synchronized php-1.27.0-wmf.10/includes/jobqueue/jobs/HTMLCacheUpdateJob.php: Live-hacked wfDebugLog() call for T124418 (duration: 01m 31s)
  • 16:01 tgr: changed wikiversions.php on mw1017 to serve wmf.10 for SessionManager-related debugging
  • 05:35 legoktm: restarted extensions/CentralAuth/maintenance/resetGlobalUserTokens.php
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jan 31 02:32:12 UTC 2016 (duration 7m 11s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 10m 14s)

2016-01-30

  • 23:20 logmsgbot: bd808@mira rebuilt wikiversions.php and synchronized wikiversions files: Revert all wikis to 1.27.0-wmf.10 (again)
  • 23:01 logmsgbot: bd808@mira Synchronized wmf-config/InitialiseSettings.php: Revert Enable debug level session logging to fluorine (17bfb06) (duration: 01m 28s)
  • 22:36 logmsgbot: bd808@mira Synchronized wmf-config/InitialiseSettings.php: Enable debug level session logging to fluorine (5ac9412) (duration: 01m 26s)
  • 18:43 _joe_: updated visualdiff, restarted parsoid-vd
  • 13:00 godog: discard preserved cache on ms-be2003, powercycle
  • 03:40 Krenair: Deleted old /srv/mediawiki/php-1.27.0-wmf.[1-5] directories across the cluster to match the deployment tree, T124567
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jan 30 02:31:56 UTC 2016 (duration 7m 2s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 10m 24s)
  • 00:08 logmsgbot: bd808@mira Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: Remove proposed fix for T125267 (duration: 01m 33s)

2016-01-29

  • 23:53 jynus: restarted db1018 replication (and its codfw slaves) after a (somewhat) failed maintenance
  • 23:41 mutante: ruthenium - restart parsoid-rt-client, parsoid-vd-client
  • 23:37 mutante: ruthenium - git pull origin in /srv/visualdiff/
  • 23:22 logmsgbot: bd808@mira Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: Testing proposed fix for T125267 (duration: 01m 26s)
  • 22:52 jynus: powercycling cp3042 to test it is really the broken one
  • 22:37 jynus: powercycle cp3049, not 42
  • 22:37 jynus: powercycle cp3042
  • 22:27 mutante: cp3042 - md0: unknown partition table
  • 22:23 mutante: powercycled cp1049
  • 22:06 mutante: powercycle cp3049
  • 21:13 mutante: bromine - stop and remove rsync service
  • 20:16 logmsgbot: aaron@mira Synchronized wmf-config/CommonSettings.php: Use the logical redis definition for GettingStarted (duration: 01m 26s)
  • 19:36 jynus: reinstall db1018
  • 18:11 jynus: creating special partitioning for db2037 and db2044 (ETA:5 days, lag)
  • 18:01 jynus: creating special partitioning for db2034 and db2042 (ETA:5 days, lag)
  • 17:51 logmsgbot: bd808@mira Synchronized wmf-config/InitialiseSettings.php: Stop the first survey in fawiki and eswiki (f89621d) (duration: 01m 25s)
  • 17:44 logmsgbot: bd808@mira Synchronized php-1.27.0-wmf.11/includes/api/ApiMain.php: Log user-agents that are using HTTP when HTTPS is preferred (55ac0b7) (duration: 01m 26s)
  • 17:41 logmsgbot: bd808@mira Synchronized wmf-config/CommonSettings.php: Grant autocreateaccount to anons on loginwiki (d916008) (duration: 01m 27s)
  • 17:39 logmsgbot: bd808@mira Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: CentralAuth: Take auto-creation into account (f526ef1) (duration: 01m 28s)
  • 17:35 logmsgbot: bd808@mira Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: SessionManager: Save user name to metadata even if the user doesn't exist locally (a39b4ac) (duration: 01m 29s)
  • 17:01 jynus: restarting mysql at db1018
  • 16:50 robh: parsoid-vd restart was due to subbu irc request (i wasnt just randomly restarting things ;)
  • 16:47 robh: restarting parsoid-vd & parsoid-vd-client on ruthenium
  • 16:33 ottomata: uninstalling impala in analytics cluster
  • 15:45 bblack: upgrade packages (incl kernel) on eqiad caches hosts (cp1xxx)
  • 15:37 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool db1018 for maintenance (duration: 01m 49s)
  • 15:32 akosiaris: remove all networking configuration from asw-b-eqiad switch for nas1001-a, nas1001-b. Leave just descriptions
  • 15:21 bblack: upgrading packages (incl kernel) on esams cache hosts (cp3xxx) (codfw, ulsfo already done)
  • 15:11 akosiaris: powering off nas1001-a.eqiad.wmnet. https://phabricator.wikimedia.org/T124156
  • 15:08 akosiaris: powering off nas1001-b.eqiad.wmnet. https://phabricator.wikimedia.org/T124156
  • 15:01 elukey: re-enabled puppet on analytics1027
  • 14:39 elukey: stopped kafka (service) on kafka1012 (the host that caused the outage)
  • 14:24 moritzm: rebooting bohrium for kernel update
  • 14:04 _joe_: installing the new hhvm package on all the codfw appservers
  • 13:43 _joe_: installing the new HHVM package to the canary appservers (main and api)
  • 12:30 paravoid: force-rebooting pollux
  • 11:43 _joe_: uploaded hhvm_3.6.5+dfsg1-1+wm8 to trusty-wikimedia
  • 11:22 moritzm: rolling restart of swift in codfw
  • 11:14 elukey: disabled puppet on analytics1027 due to issues with Camus and HDFS
  • 10:17 moritzm: rolling restart of swift in esams
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jan 29 02:32:56 UTC 2016 (duration 7m 28s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 10m 40s)
  • 01:31 logmsgbot: ori@mira Synchronized wmf-config: I83da57cf: Enable persistent redis connections for job runners (duration: 01m 11s)
  • 01:03 logmsgbot: krenair@mira Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/267186/ (duration: 01m 09s)
  • 01:01 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/265292/ (duration: 01m 14s)
  • 00:57 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/267071/ (duration: 01m 11s)
  • 00:53 logmsgbot: krenair@mira Synchronized wmf-config/CirrusSearch-production.php: https://gerrit.wikimedia.org/r/#/c/266995/ (duration: 01m 11s)
  • 00:50 yurik: synced latest graphoid
  • 00:49 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.11/extensions/MobileFrontend/resources/skins.minerva.editor/init.js: https://gerrit.wikimedia.org/r/#/c/267168/ (duration: 01m 12s)
  • 00:45 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/267053/ (duration: 01m 10s)
  • 00:43 logmsgbot: krenair@mira Synchronized wmf-config/CirrusSearch-common.php: https://gerrit.wikimedia.org/r/#/c/267053/ (duration: 01m 10s)
  • 00:42 logmsgbot: krenair@mira Synchronized tests/cirrusTest.php: https://gerrit.wikimedia.org/r/#/c/267053/ (duration: 01m 11s)
  • 00:35 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/267025/ (duration: 01m 12s)
  • 00:25 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.11/extensions/Graph/modules/graph2.js: https://gerrit.wikimedia.org/r/#/c/267065/ (duration: 01m 11s)
  • 00:17 logmsgbot: krenair@mira Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/267060/ (duration: 01m 12s)
  • 00:02 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/267189/2 (duration: 01m 11s)

2016-01-28

  • 23:51 mutante: caesium - stop puppet, shutdown server, remove from icinga, clean puppet cert ...
  • 23:46 Tim: on ruthenium installing build dependencies and compiling uprightdiff for test
  • 23:20 logmsgbot: ori@mira Synchronized php-1.27.0-wmf.11/includes/api/ApiStashEdit.php: Ia4196eba9: Add ParserOutputStashForEdit hook for extension cache warming (duration: 01m 10s)
  • 23:17 logmsgbot: tgr@mira Synchronized php-1.27.0-wmf.11/includes/session/SessionManager.php: T125161 (duration: 01m 11s)
  • 22:58 ottomata: restoring MobileWebSectionUsage_14321266 from db1047 to dbstore1002 using mysqlimport
  • 22:23 bblack: starting cache_mobile->cache_text conversion in eqiad - https://phabricator.wikimedia.org/T109286
  • 22:09 bblack: eqiad pybal->etcd conversion done
  • 22:01 logmsgbot: dduvall@mira Synchronized php-1.27.0-wmf.11/extensions/WikimediaEvents/WikimediaEventsHooks.php: deploying fix for T125151 (duration: 01m 15s)
  • 21:59 mutante: releases.wm.org - switched backend to bromine
  • 21:58 bblack: converting active eqiad LVS/pybal to etcd
  • 21:56 mutante: caesium - stopped apache
  • 21:31 logmsgbot: ori@mira Synchronized php-1.27.0-wmf.11/extensions/AbuseFilter: I13fcc3ce4: Updated mediawiki/core Project: mediawiki/extensions/AbuseFilter 19baa3b6e51b8fe6baf6e3ce7e590060e8e6eec9 (duration: 01m 11s)
  • 21:27 bblack: converting backup/inactive eqiad LVS/pybal to etcd
  • 21:16 logmsgbot: dduvall@mira rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.27.0-wmf.11
  • 20:54 mutante: sca1001 - stop mathoid,graphoid,citoid
  • 20:52 mutante: sca1002 - stop mathoid,graphoid,citoid
  • 20:50 logmsgbot: dduvall@mira Synchronized php-1.27.0-wmf.11: syncing 1.27.0-wmf.11 for T125114 and https://gerrit.wikimedia.org/r/#/c/267128/ (duration: 03m 30s)
  • 20:25 bblack: depool -> reboot cp4008 (ulsfo text, trying new kernel with live traffic)
  • 20:00 bblack: depool -> reboot cp4011 (ulsfo mobile, currently unused for traffic - testing local conftool-scripts depool + new kernel)
  • 19:55 logmsgbot: ori@mira Synchronized wmf-config: Iea2573ccfbe: Revert "Autopromotion: remove deprecated onView event, fix INGROUPS" (duration: 02m 13s)
  • 19:43 ori: added tgr and marxarelli to security group on phab
  • 19:26 ottomata: kafka preferred-replica-election to rebalance analytics-eqiad brokers
  • 18:22 elukey: rebooting analytics1001 for new kernel upgrade
  • 18:21 yurik: deployed graphoid
  • 17:43 elukey: rebooting analytics1002.eqiad.wmnet (Hadoop master's slave) for kernel upgrade
  • 17:39 urandom: finished deploying configuration change (https://gerrit.wikimedia.org/r/266299) to restbase staging
  • 17:38 robh: neglected to log: I finished icinga/neon updates and it's back to normal service (never interrupted)
  • 17:38 urandom: restarting restbase on restbase200[1-3].codfw.wmnet (restbase staging)
  • 17:34 urandom: forcing puppet run on restbase200[1-3].codfw.wmnet (restbase staging)
  • 17:30 urandom: forcing puppet run on praseodymium.eqiad.wmnet, and restarting restbase (staging env)
  • 17:27 urandom: restarting restbase on xenon.eqiad.wmnet (restbase staging)
  • 17:25 urandom: forcing puppet run on xenon.eqiad.wmnet (restbase staging)
  • 17:21 urandom: restarting restbase on cerium.eqiad.wmnet
  • 17:18 urandom: forcing puppet run on cerium.eqiad.wmnet (restbase staging)
  • 17:18 robh: pushing icinga updates (shouldn't affect service, but others shouldn't try to update neon right now either)
  • 17:17 logmsgbot: krenair@mira Synchronized README: testing (duration: 02m 08s)
  • 17:15 urandom: disabling puppet on restbase staging hosts
  • 17:01 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/266957/ (duration: 02m 15s)
  • 16:52 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/267040/ (duration: 02m 13s)
  • 16:48 cmjohnson1: mw1172, mw1178, mw1217, mw1257 powering off, task T124642
  • 16:45 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264219/ (duration: 02m 12s)
  • 16:42 logmsgbot: krenair@mira Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/264219/ (duration: 02m 12s)
  • 16:37 Krenair: Downloaded and `chmod +x`'d mira:/srv/mediawiki-staging/.git/hooks/commit-msg
  • 16:29 mdholloway: mobileapps deployed 7583148, reverting in part 869ec35
  • 16:25 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: rv (duration: 02m 10s)
  • 16:25 bblack: upgrading packages (incl kernel) on all codfw caches
  • 16:19 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/266955/ (duration: 02m 14s)
  • 16:13 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/266564/ (duration: 02m 12s)
  • 16:05 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264733/ (duration: 02m 11s)
  • 15:39 bblack: kafka1012 booted up normally
  • 15:39 mdholloway: mobileapps deployed 869ec35
  • 15:37 bblack: rebooting kafka1012
  • 15:36 bblack: kafka1012: manually edited fstab, s/sdb1/sdb3/, s/sdc3/sdc1/, and now the filesystems mount and data looks right
  • 15:23 bblack: powering up kafka1012
  • 14:09 moritzm: rebooting serpens/seaborgium for kernel update
  • 13:58 logmsgbot: faidon@mira Synchronized wmf-config/InitialiseSettings.php: depool kafka1012 (duration: 02m 10s)
  • 13:31 bblack: citoid and cxserver public hostnames moving to cache_text
  • 12:59 moritzm: rebooting rutherfordium (peopleweb) for kernel update
  • 12:53 elukey: stopping kafka on kafka1012 + host reboot for kernel upgrade
  • 12:23 jynus: generating empty schema for new codfw parsercaches
  • 12:14 logmsgbot: jynus@mira Synchronized wmf-config/db-codfw.php: New parsercache servers for codfw datacenter (duration: 03m 10s)
  • 12:11 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: New parsercache servers for codfw datacenter (duration: 02m 15s)
  • 12:07 jynus: pooling new parsercaches for codfw datacenter
  • 12:01 moritzm: powercycled mw1163, was unreachable after reboot of the jobrunners (but now up again after powercycle via mgmt)
  • 11:31 elukey: disabled puppet on analytics1027 due to some issues with camus and hdfs
  • 10:42 moritzm: rebooted parsoid systems in codfw for kernel update, rolling reboot for eqiad
  • 10:39 _joe_: rolling reboot of jobrunners in eqiad
  • 02:46 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 06m 16s)
  • 02:41 logmsgbot: tgr@mira Synchronized php-1.27.0-wmf.11/includes/: deploy SessionManager patch for T124971: gerrit 266944, 266946 (duration: 03m 20s)
  • 02:27 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 10m 21s)
  • 01:03 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264460/ (duration: 02m 30s)
  • 00:58 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264066/ (duration: 02m 26s)
  • 00:46 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.11/extensions/Gather/resources: https://gerrit.wikimedia.org/r/#/c/266793/ and https://gerrit.wikimedia.org/r/#/c/266792/ (duration: 02m 23s)
  • 00:41 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.11/extensions/Flow/: https://gerrit.wikimedia.org/r/#/c/266939/ (duration: 02m 27s)
  • 00:27 logmsgbot: krenair@mira Synchronized php-1.27.0-wmf.10/extensions/Flow/includes: https://gerrit.wikimedia.org/r/#/c/266938/ (duration: 02m 29s)
  • 00:09 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/266945/ (duration: 02m 36s)

2016-01-27

  • 22:36 robh: restarting parsoid-rt-client service on ruthenium
  • 22:29 ottomata: starting mysqldump of MobileWebSectionUsage_14321266 from db1047 into m4-master
  • 21:45 yurik: updated graphoid on scb*
  • 21:29 mdholloway: mobileapps deployed 6f35859
  • 21:26 cscott: updated OCG to version 64050af0456a43344b32e3e93561a79207565eaf
  • 21:26 logmsgbot: ori@mira Synchronized docroot and w: (no message) (duration: 02m 26s)
  • 19:48 YuviPanda: started nfs-exports daemon on labstore1001, had been dead for a few days
  • 19:32 mutante: stat1002 - redis.exceptions.ConnectionError: Error connecting to mira.codfw.wmnet:6379. timed out.
  • 19:31 mutante: stat1002 - running puppet, was reported as last run about 4 hours ago but not deactivated
  • 19:14 logmsgbot: dduvall@mira rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.11
  • 19:07 ejegg: set donation queue consumer time limit back to 90 sec
  • 18:49 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Repool pc1006 after cloning (duration: 02m 25s)
  • 18:48 bd808: HHVM on mw1019 still dying on a regular basis with "Lost parent, LightProcess exiting"
  • 18:00 csteipp: deploy patch for T103239
  • 17:50 csteipp: deploy patch for T97157
  • 17:47 jynus: migrating ruthenium parsoid-test database to m5-master
  • 17:27 elukey: rebooting analytics105* hosts to upgrade their kernel
  • 17:16 elukey: rebooting analytics1035.eqiad.wmnet for kernel upgrade
  • 16:23 ejegg: updated SmashPig from 072c7ec6ed94e7074ba35b7986d5dde94866fe2f to 97629339994bffe8831a9067f5e9c21fa423586b
  • 16:22 logmsgbot: thcipriani@mira Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/CentralAuthUtils.php: SWAT: Preserve certain keys when updating central session gerrit:266672 (duration: 02m 28s)
  • 16:11 logmsgbot: thcipriani@mira Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: SWAT: Avoid forceHTTPS cookie flapping if core and CA are setting the same cookie gerrit:266671 (duration: 02m 26s)
  • 16:03 elukey: rebooting analytics 1043 -> 1050 for kernel upgrade.
  • 15:47 elukey: rebooting analytics 1026, 1040 -> 1042 due to kernel upgrade.
  • 14:58 jynus: cloning parsercache contents from pc1003 to pc1006
  • 14:45 elukey: rebooting analytics 1036 to 1039 for kernel upgrade
  • 14:35 elukey: analytics 1035 hasn't been rebooted because it is a Hadoop Journal Node (will be restarted in the end)
  • 14:04 elukey: rebooting analytics 1032 to 1035 for kernel upgrades
  • 14:03 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool pc1003 for cloning to pc1006 (duration: 02m 30s)
  • 13:59 jynus: about to move to new hardware/OS/mariadb-only for parsercache service
  • 13:32 elukey: rebooting analytics1030/1031 for kernel upgrade
  • 13:15 akosiaris: rebooting fermium for kernel upgrades
  • 13:10 elukey: rebooting analytics1029 for kernel upgrade
  • 12:29 moritzm: rebooting analytics1028 for kernel update
  • 10:25 ema: restarting apache2 and hhvm on mw1119
  • 03:19 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-production.php: Correct invalid cirrus shard configuration (duration: 02m 59s)
  • 02:55 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jan 27 02:55:21 UTC 2016 (duration 7m 13s)
  • 02:48 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 10m 25s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 51s)
  • 01:59 logmsgbot: ori@mira Synchronized docroot and w: Icc4f6134b0: Add a speed experiment which inlines the top stylesheet (duration: 02m 28s)
  • 01:29 MaxSem: on terbium: ran mwscript namespaceDupes.php --wiki=wuuwiki --source-pseudo-namespace= --add-suffix=/renamed --fix
  • 01:26 MaxSem: Fail, trying something else...
  • 01:21 MaxSem: running mwscript namespaceDupes.php --wiki=wuuwiki --move-talk --fix
  • 00:52 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/266497/ (duration: 02m 26s)
  • 00:48 logmsgbot: krenair@mira Synchronized w/static/images/project-logos/ukwikinews.png: https://gerrit.wikimedia.org/r/#/c/266497/ (duration: 02m 29s)
  • 00:44 logmsgbot: krenair@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/266161/ (duration: 02m 27s)
  • 00:15 logmsgbot: ebernhardson@mira Synchronized php-1.27.0-wmf.11/extensions/CirrusSearch/: Allow pointing morelike queries at a specific datacenter (duration: 03m 04s)
  • 00:10 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-production.php: point morelike queries back at the eqiad cluster (duration: 05m 41s)
  • 00:02 chasemp: enable puppet and codify the 192 thread count for nfsd

2016-01-26

  • 22:25 logmsgbot: dduvall@mira rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.27.0-wmf.11, for real this time
  • 22:17 logmsgbot: dduvall@mira rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.27.0-wmf.11
  • 22:15 logmsgbot: dduvall@mira Synchronized php-1.27.0-wmf.11: syncing wmf.11 backports of session fixes (duration: 03m 55s)
  • 21:55 logmsgbot: ori@mira Synchronized docroot and w: I9b054d847a: New set of speed experiments (duration: 01m 29s)
  • 21:41 marxarelli: filed https://phabricator.wikimedia.org/T124828 for fatal in extensions/Echo
  • 21:22 marxarelli: Fatal error: Cannot redeclare class CallbackFilterIterator in /srv/mediawiki-staging/php-1.27.0-wmf.11/extensions/Echo/includes/iterator/CallbackFilterIterator.php on line 24
  • 21:21 marxarelli: lint error found when running sync-dir 'Errors parsing /srv/mediawiki-staging/php-1.27.0-wmf.11/extensions/Echo/includes/iterator/CallbackFilterIterator.php'
  • 21:11 marxarelli: sync-dir php linting failed
  • 21:02 marxarelli: resuming sync-dir and ignoring error as a known issue
  • 20:59 marxarelli: getting 'Lost parent, LightProcess exiting' when running sync-dir
  • 20:57 chasemp: drop labstore1001 nfs threads down to 192
  • 20:46 chasemp: stopping nfs on labstore1001
  • 20:46 marxarelli: modified wikiversions.php locally on mw1017 to promote all wikis to wmf.11 for initial testing
  • 20:18 marxarelli: locally modified wikiversions.php and wikiversions.json on mw1017 for testing
  • 20:14 marxarelli: running 'sync-common --verbose deployment.eqiad.wmnet' on mw1017 to sync wmf.11 for initial testing
  • 20:02 marxarelli: proceeding with train deploy. wmf.11 to mw1017, then group0
  • 19:46 akosiaris: issuing a varnish ban on all esams mobile frontend varnish for req.http.host .*wikimedia.org
  • 19:45 akosiaris: issuing a varnish ban on all esams mobile backend varnish for req.http.host .*wikimedia.org
  • 19:44 akosiaris: issuing a varnish ban on all ulsfo mobile frontend varnish for req.http.host .*wikimedia.org
  • 19:44 akosiaris: issuing a varnish ban on all ulsfo mobile backend varnish for req.http.host .*wikimedia.org
  • 19:43 akosiaris: issuing a varnish ban on all codfw mobile frontend varnish for req.http.host .*wikimedia.org
  • 19:36 akosiaris: issuing a varnish ban on all codfw mobile backend varnish for req.http.host .*wikimedia.org
  • 19:36 akosiaris: issuing a varnish ban on all eqiad mobile frontend varnish for req.http.host .*wikimedia.org
  • 19:36 akosiaris: issuing a varnish ban on all eqiad mobile backend varnish for req.http.host .*wikimedia.org
  • 19:36 akosiaris: all of the above referred to cache_text
  • 19:29 akosiaris: all of the above already done, back logging
  • 19:29 akosiaris: issuing a varnish ban on all esams frontend varnish for req.http.host .*wikimedia.org
  • 19:29 akosiaris: issuing a varnish ban on all esams backend varnish for req.http.host .*wikimedia.org
  • 19:29 akosiaris: issuing a varnish ban on all ulsfo backend varnish for req.http.host .*wikimedia.org
  • 19:29 akosiaris: issuing a varnish ban on all ulsfo frontend varnish for req.http.host .*wikimedia.org
  • 19:28 akosiaris: issuing a varnish ban on all ulsfo backend varnish for req.http.host .*wikimedia.org
  • 19:28 akosiaris: issuing a varnish ban on all codfw frontend varnish for req.http.host .*wikimedia.org
  • 19:28 akosiaris: issuing a varnish ban on all codfw backend varnish for req.http.host .*wikimedia.org
  • 19:28 akosiaris: issuing a varnish ban on all eqiad frontend varnish for req.http.host .*wikimedia.org
  • 19:14 akosiaris: issuing a varnish ban on all eqiad backend varnish for req.http.host .*wikimedia.org
  • 19:02 marxarelli: backports to wmf.11 ready on mira but delaying train due to wikimedia.org outage
  • 18:44 _joe_: running salt --batch-size=20 -C 'G@cluster:appserver and G@site:eqiad' cmd.run 'puppet agent -t --tags mw-apache-config'
  • 18:27 robh: i broke icinga, but then i fixed it, icinga back to normal.
  • 18:21 robh: icinga is broken, it seems it was from a change before mine, but my forced reload broke it
  • 18:18 legoktm: running mwscript updateArticleCount.php --wiki=jawiki --update=1
  • 18:14 cmjohnson1: starting puppet on mw cluster
  • 18:14 robh: i broke icinga, fixing
  • 18:08 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Pool new parsercache pc1005 after cloning it from pc1002 (duration: 01m 28s)
  • 17:43 thcipriani: ltwiki collation updated 503623 rows processed
  • 17:35 mutante: mw1258 - restart hhvm
  • 17:20 cmjohnson: disabling puppet on mw cluster
  • 17:02 thcipriani: running updateCollation on ltwiki
  • 17:01 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Set category collation to uca-lt on lt.wikipedia gerrit:266427 (duration: 01m 33s)
  • 16:55 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on ur.wikipedia gerrit:265888 (duration: 07m 10s)
  • 16:36 logmsgbot: thcipriani@mira Synchronized w/static/images/project-logos/etwikiquote.png: SWAT: Update et.wikiquote logo gerrit:265623 (duration: 01m 27s)
  • 16:31 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable SandboxLink on nl.wikiquote gerrit:265666 (duration: 01m 26s)
  • 16:26 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespaces configuration on sk.wikipedia gerrit:265896 (duration: 01m 27s)
  • 16:19 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove Transwiki namespace on wuu.wikipedia gerrit:265892 and Add Portal namespace on wuu.wikipedia gerrit:265893 (duration: 01m 27s)
  • 16:12 logmsgbot: thcipriani@mira Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration for wuu.wikipedia gerrit:265891 (duration: 01m 29s)
  • 14:57 ema: Finished migration of mobile traffic to text cluster in esams https://phabricator.wikimedia.org/T109286
  • 14:48 chasemp: RPS on eth0 on labstores
  • 14:39 bblack: upgrading packages (incl kernel) on all ulsfo caches (cp4xxx)
  • 14:21 akosiaris: migrating alsafi,mx2001 back to 2004 for testing
  • 14:14 akosiaris: migrate alsafi,mx2001 back from ganeti2004 to fix a network misconfiguration
  • 13:32 moritzm: rebooted nescio/maerlant for kernel update
  • 13:14 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool pc1002 for maintenance (clone to pc1005) (duration: 01m 39s)
  • 12:39 akosiaris: rolling reboot of ganeti200{1,2,3,4,5,6}.codfw.wmnet for kernel upgrade
  • 12:10 moritzm: rebooting mx2001/mx1001 (with a delay in between) for kernel update
  • 11:50 moritzm: rebooting etherpad1001 for kernel update
  • 11:46 moritzm: rebooting bromine for kernel update
  • 10:50 ema: Starting migration of mobile traffic to text cluster in esams https://phabricator.wikimedia.org/T109286
  • 09:30 hashar: restarting Jenkins to upgrade the gearman plugin with https://review.openstack.org/#/c/271543/
  • 09:28 _joe_: finishing reboots of appservers in eqiad
  • 04:27 legoktm: restarted resetGlobalUserTokens.php after it lost mysql connection again
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jan 26 02:30:58 UTC 2016 (duration 7m 0s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 36s)
  • 01:45 logmsgbot: krenair@mira Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/#/c/266453/ (duration: 01m 27s)
  • 00:45 mobrovac: mobileapps deploying c2318b6
  • 00:40 logmsgbot: ebernhardson@mira Synchronized wmf-config/CommonSettings.php: (no message) (duration: 01m 25s)
  • 00:37 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT bd808 (duration: 01m 34s)
  • 00:32 logmsgbot: ebernhardson@mira Synchronized portals/: SWAT jgirault (duration: 01m 28s)
  • 00:29 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT ebernhardson (duration: 01m 26s)
  • 00:27 logmsgbot: ebernhardson@mira Synchronized wmf-config/CirrusSearch-common.php: SWAT ebernhardson (duration: 01m 26s)
  • 00:25 logmsgbot: ebernhardson@mira Synchronized wmf-config/CommonSettings.php: SWAT ebernhardson (duration: 01m 27s)
  • 00:15 logmsgbot: ebernhardson@mira Synchronized wmf-config/CommonSettings.php: SWAT AaronSchulz (duration: 01m 26s)
  • 00:13 logmsgbot: ebernhardson@mira Synchronized wmf-config/filebackend-production.php: SWAT AaronSchulz (duration: 01m 26s)
  • 00:10 logmsgbot: ebernhardson@mira Synchronized wmf-config/CommonSettings.php: SWAT James_F (duration: 01m 26s)
  • 00:08 logmsgbot: ebernhardson@mira Synchronized wmf-config/InitialiseSettings.php: SWAT James_F (duration: 01m 35s)

2016-01-25

  • 23:14 logmsgbot: legoktm@mira Synchronized php-1.27.0-wmf.10/includes/parser/: live hacks, now committed (duration: 01m 27s)
  • 23:07 logmsgbot: legoktm@mira Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/266410/ (duration: 01m 35s)
  • 22:52 logmsgbot: ori@mira Synchronized php-1.27.0-wmf.10/includes/parser/ParserOutput.php: Fix-up for ParserOutput.php@263 debug logging (duration: 01m 27s)
  • 22:30 logmsgbot: legoktm@mira Synchronized php-1.27.0-wmf.10/includes/parser/: https://gerrit.wikimedia.org/r/#/c/266401/ + https://gerrit.wikimedia.org/r/#/c/266406/ + live hacks (duration: 01m 28s)
  • 22:28 logmsgbot: legoktm@mira Synchronized php-1.27.0-wmf.10/includes/content/WikitextContent.php: https://gerrit.wikimedia.org/r/#/c/266401/ (duration: 01m 29s)
  • 21:53 logmsgbot: hoo@mira Synchronized wmf-config/Wikibase-production.php: Disable (not yet deployed) commons category sidebar link overwrite in production (duration: 01m 28s)
  • 21:47 mutante: nitrogen - shutdown -h now ....
  • 21:45 mutante: alsafi - was reported down in icinga, is ganeti VM - fixed by just logging in as if it went to hibernate
  • 21:37 mdholloway: mobileapps deployed 9252a22
  • 21:30 mutante: nitrogen - stop puppet, stop salt, remove from stored configs / icinga
  • 20:19 logmsgbot: hoo@mira Synchronized wmf-config/Wikibase-labs.php: (no message) (duration: 01m 28s)
  • 20:14 chasemp: bump labstore nfs threads to 288 from 244
  • 19:32 paravoid: eqiad: removing static routes for 6to4/Teredo to nitrogen (decommissioning our own relays)
  • 19:10 bd808: Live hacking on mw1017 to debug 1.27.0-wmf.11 issues. All wikis there currently set to use 1.27.0-wmf.11.
  • 19:05 chasemp: labstore1001 temp change to CFQ scheduler on 01/22/2015
  • 19:04 chasemp: the nfsd thread change is on labstore1001
  • 19:04 chasemp: nfsd has 224 threads atm and was bumped up over the weekend
  • 18:58 ori: removed unused wikiversions.cdb on mira and tin
  • 18:28 jynus: retroactively logging the depool of mw1217, mw1178 and mw1257 3 hours ago (Jan 25 15:45:26)
  • 16:49 ema: Finished migration of mobile traffic to text cluster in ulsfo https://phabricator.wikimedia.org/T109286
  • 16:38 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Preparing ips for new parsercache deployments (third try) (duration: 01m 35s)
  • 16:26 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Preparing ips for new parsercache deployments (second try after running puppet) (duration: 03m 23s)
  • 16:25 _joe_: restarting salt-minion on all deployment targets
  • 16:24 _joe_: running salt deploy.fixurl on all deployment targets
  • 16:09 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Preparing ips for new parsercache deployments (duration: 03m 32s)
  • 15:51 ejegg: updated DjangoBannerStats from a64fe0e373a978d3df0b7f1dd74ac4cc5c78d34e to 71df14d4d8b11f3ca0ef1eeb6c6e2db9be79103a
  • 15:35 ema: Starting migration of mobile traffic to text cluster in ulsfo https://phabricator.wikimedia.org/T109286
  • 15:14 chasemp: restart of pdns and pdns-recursor on labservices1001
  • 14:56 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: deploy new parsercache hardware (pc1004) substituting pc1001 (duration: 03m 25s)
  • 13:16 elukey: ran kafka preferred-replica-election on kafka1022 to balance the leaders
  • 13:07 elukey: restarting kafka on kafka1022
  • 12:57 elukey: restarting kafka on kafka1013
  • 12:38 elukey: restarting kafka on kafka1014
  • 12:20 jynus: compressed and truncated iridium's phab daemons.log - it was taking 20% of disk space
  • 12:04 ema: restarting kafka on kafka1018
  • 11:26 jynus: stopping mysql at pc1001 and cloning to pc1004
  • 10:55 logmsgbot: jynus@mira Synchronized wmf-config/db-eqiad.php: Depool pc1001 for maintenance (clone to pc1004) (duration: 01m 41s)
  • 10:11 _joe_: switching the active deployment host to mira
  • 09:56 ema: limiting GCLogFileSize and restarting kafka on kafka1012
  • 09:31 _joe_: rolling reboot of the eqiad appserver cluster
  • 09:27 moritzm: installed fuse security update on labnodepool1001 (the other fuse installations are on Ubuntu, which doesn't ship the udev rule, but uses mountall instead)
  • 07:47 paravoid: stat1002: umount -f /mnt/hdfs
  • 07:34 _joe_: rebooting alsafi, unresponsive to ssh
  • 07:24 _joe_: restarting hhvm on mw1148, stuck in HPHP::Treadmill::startRequest (__lll_lock_wait)
  • 07:23 _joe_: restarting hhvm on mw1143, stuck in HPHP::SynchronizableMulti::waitImpl (__pthread_cond_wait)
  • 03:10 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.10/includes/parser/ParserCache.php: (no message) (duration: 00m 25s)
  • 03:03 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.10/includes/parser/ParserCache.php: (no message) (duration: 00m 25s)
  • 03:02 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.10/includes/parser/ParserOutput.php: (no message) (duration: 00m 27s)
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jan 25 02:30:13 UTC 2016 (duration 6m 52s)
  • 02:23 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 09s)

2016-01-24

  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jan 24 02:31:21 UTC 2016 (duration 6m 58s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 11s)

2016-01-23

  • 19:03 logmsgbot: ebernhardson@tin Synchronized wmf-config/CirrusSearch-production.php: config change to repoint morelike search from eqiad to codfw (duration: 00m 26s)
  • 19:02 logmsgbot: ebernhardson@tin Synchronized php-1.27.0-wmf.10/extensions/CirrusSearch/: Support code for repointing morelike queries from eqiad to codfw (duration: 00m 30s)
  • 19:00 ebernhardson: repoint most expensive search queries (morelike) at codfw cluster to reduce load. 1/2 of eqiad cluster maxed on cpu
  • 16:47 Krinkle: mwscript deleteEqualMessages.php --wiki wowiki
  • 13:25 jynus: upgrading and restarting db1046
  • 13:13 jynus: db1046 maintenance finished - restarting mysql to apply latest configuration
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jan 23 02:32:15 UTC 2016 (duration 7m 3s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 09s)
  • 01:33 logmsgbot: bd808@tin rebuilt wikiversions.php and synchronized wikiversions files: Back to 1.27.0-wmf.10 again after fixing l10n cache problems
  • 01:28 logmsgbot: bd808@tin rebuilt wikiversions.php and synchronized wikiversions files: Temporarily back to 1.27.0-wmf.11; need to rebuild l10n cache
  • 01:16 logmsgbot: bd808@tin rebuilt wikiversions.php and synchronized wikiversions files: Revert all wikis to 1.27.0-wmf.10
  • 00:08 logmsgbot: bd808@tin Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: https://gerrit.wikimedia.org/r/#/c/265872/ (duration: 00m 25s)
  • 00:07 logmsgbot: bd808@tin Synchronized php-1.27.0-wmf.11/includes/session/CookieSessionProvider.php: https://gerrit.wikimedia.org/r/#/c/265871/ (duration: 00m 25s)

2016-01-22

  • 23:43 logmsgbot: legoktm@tin Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: https://gerrit.wikimedia.org/r/#/c/265870/ (duration: 00m 26s)
  • 23:42 logmsgbot: legoktm@tin Synchronized php-1.27.0-wmf.11/includes/session/CookieSessionProvider.php: https://gerrit.wikimedia.org/r/#/c/265869/ (duration: 00m 26s)
  • 23:22 mobrovac: restbase cassandra truncating local_group_wiktionary_T_term_definition.data
  • 22:33 mdholloway: mobileapps deployed 2900faa
  • 22:23 logmsgbot: twentyafterfour@tin Finished scap: deploy https://gerrit.wikimedia.org/r/#/c/263415/ and clean up old branches (duration: 07m 02s)
  • 22:16 logmsgbot: twentyafterfour@tin Started scap: deploy https://gerrit.wikimedia.org/r/#/c/263415/ and clean up old branches
  • 22:06 bblack: upgrading vhtcpd on all caches
  • 22:05 eileen: upgrade Civicrm from b9ebf3d31aeab8120143cfbf6bc2df0f617341cf to c009af16944a6478bd0292422f5bb0151f7a22c1
  • 21:49 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/includes/: Fix T124468, for real this time (duration: 00m 36s)
  • 21:48 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/includes/: Fix T124468 (duration: 00m 38s)
  • 21:17 legoktm: running migrateAccount.php --attachbroken over list of all unattached users (T74791)
  • 20:04 mutante: ruthenium - rebooting for reinstall
  • 19:42 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Revert "Bump $wgJobBackoffThrottling to lower the htmlcacheupdate backlog" (duration: 00m 32s)
  • 18:51 jynus: "repairing" enwiki.oldtable on dbstore1001
  • 18:40 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Aborting pc1001 maintenance (duration: 00m 31s)
  • 18:15 legoktm: running CentralAuth's resetGlobalUserTokens.php to force session resets for all users T124440
  • 18:02 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/includes/user/User.php: Fix T124414 (duration: 00m 33s)
  • 17:53 legoktm: manually attaching User:Mower Genetics and User:Themeetingplace because they made edits somehow (T74791)
  • 17:46 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: Stop logging the CirrusSearchRequests channel (duration: 00m 32s)
  • 17:44 legoktm: running migrateAccount.php --attachbroken over lists on T74791
  • 17:39 _joe_: removed an archived CirrusSearchRequests.log on fluorine, now we have enough room for the weekend
  • 17:29 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes: Fix T124406 (duration: 00m 35s)
  • 17:05 mobrovac: mobileapps deploying bba45456
  • 17:00 logmsgbot: reedy@tin Synchronized docroot and w: Extra noc symlinks (duration: 00m 32s)
  • 16:58 logmsgbot: jynus@tin Synchronized wmf-config/InitialiseSettings.php: monolog: reduce on-disk logging of DBPerformance to warning (duration: 00m 32s)
  • 16:47 jynus: truncating 100GB DBPerformance.log on fluorine, compressed backup available
  • 16:46 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: Fix T124409, part 2 (duration: 00m 32s)
  • 16:46 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: Fix T124409, part 1 (duration: 00m 33s)
  • 16:41 cmjohnson1: Troubleshooting mw1228
  • 16:36 _joe_: all api appservers in eqiad have been restarted
  • 16:21 ori: restarted statsv on hafnium
  • 15:53 ema: Finished migrating mobile traffic to text cluster in codfw (Mexico + green US states on this map https://phabricator.wikimedia.org/T114659)
  • 15:39 gwicke: aqs: increased compression block size on per-article table from 128k to 256k; expectation is to further increase compression ratio & reduce seeks on rotating disks
  • 15:22 Reedy: created translate tables on ruwikimedia T121766
  • 14:18 paravoid: cr1-eqord: turning up BGP with Zayo
  • 13:08 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.10/extensions/MobileFrontend: I08cdf37a1: Use TitleSquidURLs hook to purge mobile URLs directly (Bug: T124165) (duration: 00m 33s)
  • 13:05 logmsgbot: ori@tin Synchronized wmf-config/InitialiseSettings.php: If443f3c80: monolog: explicitly declare logstash as debug for sessions (duration: 00m 34s)
  • 12:31 ema: Starting migration of mobile traffic to text cluster https://phabricator.wikimedia.org/T109286
  • 11:35 logmsgbot: oblivian@tin Synchronized wmf-config/InitialiseSettings.php: Re-synching (duration: 00m 31s)
  • 11:25 logmsgbot: oblivian@tin Synchronized wmf-config/InitialiseSettings.php: Stop writing session logs to fluorine (duration: 01m 25s)
  • 11:17 bblack: codfw LVS under etcd/conftool control now, like ulsfo
  • 10:57 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool pc1001 for maintenance (duration: 02m 48s)
  • 10:45 _joe_: rolling restarting the API cluster in eqiad
  • 10:34 _joe_: rolling restart of all api appservers in eqiad
  • 10:07 _joe_: dropping api logs from 2015 on fluorine
  • 09:10 _joe_: rolling restart of imagescalers in eqiad
  • 08:48 _joe_: powercycling ms-be1002, blank console, down
  • 08:46 _joe_: rebooting mw1001 with a new kernel
  • 08:07 _joe_: upgrading kernel on all mw hosts in eqiad
  • 05:07 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.11/includes/parser/ParserCache.php: (no message) (duration: 01m 28s)
  • 02:42 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.11/includes/parser/ParserCache.php: (no message) (duration: 01m 28s)
  • 02:40 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.11/includes/OutputPage.php: (no message) (duration: 01m 32s)
  • 02:30 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 09m 31s)
  • 01:44 logmsgbot: catrope@tin Finished scap: Deploying OATHAuth and WikimediaMessages i18n changes (duration: 30m 52s)
  • 01:37 gwicke: restbase cassandra: increased compression chunk size from 256 to 512k on wikimedia and wikipedia html and data-parsoid
  • 01:13 logmsgbot: catrope@tin Started scap: Deploying OATHAuth and WikimediaMessages i18n changes
  • 01:08 eileen: Updating CiviCRM from cb5e20c29d7376920c45eb5c343e6ee464217833 to b9ebf3d31aeab8120143cfbf6bc2df0f617341cf
  • 00:19 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: Add ability for OfficeWiki sysops to add and remove flood group rights from themselves. (duration: 01m 27s)
  • 00:14 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: enable EventBus extension on mediawikiwiki (duration: 01m 27s)
  • 00:10 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: enable sandboxlink on ladwiki and don't send messages to autocreated accounts on metawiki (duration: 01m 27s)
  • 00:08 logmsgbot: ebernhardson@tin Synchronized wmf-config/throttle.php: Santiago Editatón throttle rule (duration: 01m 27s)
  • 00:02 logmsgbot: ebernhardson@tin Synchronized wmf-config/CirrusSearch-production.php: configure cirrus completion suggester recycling (duration: 01m 29s)
  • 00:00 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: configure cirrus completion suggester recycling (duration: 01m 28s)

2016-01-21

  • 22:46 legoktm: started running migratePass0.php (CentralAuth) on group1 wikis
  • 22:24 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.27.0-wmf.11
  • 22:23 legoktm: started running migratePass0.php (CentralAuth) on group0 wikis
  • 21:35 ejegg: re-enabled low-level fundraising banner campaigns
  • 21:30 ejegg: reverted donatewiki maintenance message
  • 21:19 ejegg: updated paymentswiki from a7785baa7b40b442ecf0b60d47572502d0759780 to 1817327b4b0919ebe26bbd8b9d84fac1bd7ddb03
  • 21:13 andrewbogott: all reachable labs instances are now running security-patched kernels.
  • 21:12 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: cswiktionary to 1.27.0-wmf.11
  • 21:12 ejegg: disabled low-level fundraising banner campaigns
  • 21:12 andrewbogott: all labvirt10xx hosts are now running the latest utopic kernel
  • 21:09 ejegg: replaced form on donatewiki with maintenance notice
  • 21:08 logmsgbot: thcipriani@tin Synchronized php-1.27.0-wmf.11/includes/session/SessionManager.php: SessionManager: Notify AuthPlugin when auto-creating accounts gerrit:265578 (duration: 01m 26s)
  • 21:01 andrewbogott: rebooting labvirt1010
  • 20:51 andrewbogott: rebooting labvirt1009
  • 20:33 andrewbogott: rebooting labvirt1007
  • 20:33 logmsgbot: dduvall@tin Synchronized php-1.27.0-wmf.11/includes/user/BotPassword.php: deploy fix for T124335 (duration: 01m 29s)
  • 20:27 mobrovac: restbase deploy end of 79a4d27
  • 20:20 mobrovac: restbase deploy start of 79a4d27
  • 20:16 andrewbogott: rebooting labvirt1006
  • 19:58 mobrovac: mobileapps deploying 68c09e
  • 19:54 logmsgbot: dduvall@tin rebuilt wikiversions.php and synchronized wikiversions files: rollback cswiktionary to 1.27.0-wmf.10
  • 19:54 andrewbogott: rebooting labvirt1005
  • 19:32 andrewbogott: rebooting labvirt1004
  • 19:31 logmsgbot: dduvall@tin Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthTokenSessionProvider.php: deploy https://gerrit.wikimedia.org/r/#/c/265545/ for 1.27.0-wmf.11 (duration: 01m 28s)
  • 19:24 mobrovac: restbase rolling-restart after firejail inclusion
  • 19:22 mobrovac: restbase re-enabling puppet in prod
  • 19:14 andrewbogott: rebooting labvirt1003
  • 18:57 logmsgbot: dduvall@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.11
  • 18:53 marxarelli: starting train promotion of group1 to 1.27.0-wmf.11
  • 18:52 marxarelli: sync to mw2020 failed due to failed host key verification, mw2087/mw2039/mw2098 due to connection failed
  • 18:47 marxarelli: 4 apache sync failures during sync-file, appear to be known issues
  • 18:46 andrewbogott: rebooting labvirt1002
  • 18:43 logmsgbot: dduvall@tin Synchronized php-1.27.0-wmf.11/includes/session/PHPSessionHandler.php: deploy follow-up warning fix for T124126 (duration: 01m 28s)
  • 18:43 mobrovac: restbase disabling puppet in prod for testing firejail in staging
  • 18:41 akosiaris: enable puppet and salt-minion on sca100{1,2}.eqiad.wmnet
  • 18:39 akosiaris: depool sca1001, sca1002 for citoid
  • 18:34 akosiaris: pool scb1001, scb1002 for citoid
  • 18:07 andrewbogott: rebooting labvirt1001
  • 17:57 akosiaris: depool sca1001,sca1002 for graphoid pybal config
  • 17:49 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Really enable ContentTranslationCorpora gerrit:265514 (duration: 01m 29s)
  • 17:48 akosiaris: add scb1001, scb1002 in pybal graphoid config
  • 17:30 akosiaris: disabled puppet and salt-minion on sca1001, sca1002 for graphoid upgrade
  • 17:24 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Enable ContentTranslationCorpora Part II gerrit:265459 (duration: 01m 28s)
  • 17:22 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable ContentTranslationCorpora Part I gerrit:265459 (duration: 01m 28s)
  • 17:12 _joe_: restarting pybal on the main balancers in ulsfo to consume from etcd
  • 17:02 andrewbogott: rebooting labvirt1008
  • 16:42 jynus: batch-converting m4-master (log) tables from innodb to tokudb
  • 16:42 logmsgbot: thcipriani@tin Synchronized php-1.27.0-wmf.11/extensions/MobileFrontend/MobileFrontend.php: SWAT: Use TitleSquidURLs hook to purge mobile URLs directly Part II gerrit:265486 (duration: 01m 28s)
  • 16:40 logmsgbot: thcipriani@tin Synchronized php-1.27.0-wmf.11/extensions/MobileFrontend/includes/MobileFrontend.hooks.php: SWAT: Use TitleSquidURLs hook to purge mobile URLs directly Part I gerrit:265486 (duration: 01m 28s)
  • 16:35 ottomata: stopped eventlogging mysql consumers for long downtime: https://phabricator.wikimedia.org/T120187
  • 16:28 logmsgbot: thcipriani@tin Synchronized php-1.27.0-wmf.10/extensions/MobileApp/config/config.json: SWAT: Roll out RESTBase usage to Android Beta app: 100% gerrit:265117 (duration: 01m 27s)
  • 16:22 logmsgbot: thcipriani@tin Synchronized php-1.27.0-wmf.11/extensions/MobileApp/config/config.json: SWAT: Roll out RESTBase usage to Android Beta app: 100% gerrit:265118 (duration: 01m 28s)
  • 16:20 ottomata: started eventlogging mysql consumers
  • 16:19 paravoid: deactivating GTT BGP peering on cr2-eqiad
  • 16:05 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: wgRCWatchCategoryMembership true on dewiki gerrit:264732 (duration: 01m 28s)
  • 15:59 ottomata: stopping eventlogging mysql consumers for https://phabricator.wikimedia.org/T123546
  • 14:37 paravoid: upgraded cr2-codfw to JunOS 13.3R8.7
  • 13:20 _joe_: rolling reboot of imagescalers, jobrunners in codfw
  • 12:10 paravoid: upgrading cr1-codfw to JunOS 13.3R8.7
  • 11:27 _joe_: restarting pybal on lvs4003, switching to etcd
  • 11:25 _joe_: restarting pybal on lvs4004, switching to etcd
  • 11:09 jynus: adding new version of mariadb to carbon for jessie (10.0.23-1)
  • 10:19 _joe_: mw2098 doesn't reboot, console unreachable
  • 10:10 jynus: mw2098.codfw.wmnet failed to sync
  • 10:10 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Restore s5 DB configuration (duration: 01m 57s)
  • 09:53 _joe_: rolling reboot of the codfw appserver layer
  • 09:27 _joe_: powercycled mw1162, memory exhaustion
  • 08:01 _joe_: upgrading all codfw appserver layer's kernel to linux-image-3.13.0-76-generic
  • 02:56 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jan 21 02:56:44 UTC 2016 (duration 7m 9s)
  • 02:49 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 09m 39s)
  • 02:27 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 33s)
  • 02:24 mobrovac: citoid deploying 3a1b6c8648
  • 02:16 ori: Restarting jobrunner service on job runners to ensure I180856917 gets picked up
  • 01:47 mutante: nitrogen - install package upgrades
  • 01:15 bd808: Restarted logstash on logstash1003
  • 01:14 bd808: Restarted logstash on logstash1002
  • 01:04 logmsgbot: maxsem@tin Synchronized wmf-config/: https://gerrit.wikimedia.org/r/#/c/265395/ (duration: 00m 32s)
  • 00:56 logmsgbot: maxsem@tin Synchronized php-1.27.0-wmf.11/extensions/GeoData/: https://gerrit.wikimedia.org/r/#/c/265409/ (duration: 00m 33s)
  • 00:50 logmsgbot: maxsem@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/265142/ (duration: 00m 32s)

2016-01-20

  • 23:56 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.10/extensions/SemanticForms/: fix wikitech again (duration: 00m 34s)
  • 23:06 bd808: Restarted logstash on logstash1001
  • 23:04 bd808: Logstash1001 went nuts and decided that instead of 2016 it would go back to the start of 2015 after 2015-12-31T23:59
  • 22:54 bd808: no HHVM log events in logstash since 2015-12-31T23:59:44.000Z
  • 22:48 bd808: HHVM log messages not being recorded in Logstash; bd808 to investigate
  • 22:38 logmsgbot: tgr@tin Synchronized php-1.27.0-wmf.11/includes/: T124143,T124126 (duration: 00m 36s)
  • 22:06 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.11/extensions/OAuth: Deploy fix for T124224 (duration: 00m 32s)
  • 22:04 logmsgbot: anomie@tin Synchronized php-1.27.0-wmf.2/extensions/OAuth: Deploy fix for T124224 (duration: 00m 34s)
  • 21:51 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/extensions/SemanticResultFormats: Fix wikitech log noise (duration: 00m 31s)
  • 21:50 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/extensions/SemanticMediaWiki: Fix wikitech log noise (duration: 00m 34s)
  • 21:48 subbu: finished deploying parsoid sha f1ddfb88
  • 21:41 subbu: synced new parsoid code; restarted parsoid on wtp1001 as a canary
  • 21:35 subbu: starting parsoid deploy
  • 21:32 thcipriani: reverted group1 wikis to 1.27.0-wmf.10 due to session errors.
  • 21:30 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.10
  • 21:14 andrewbogott: rebooting labvirt1011
  • 21:08 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/extensions/SemanticForms/: Fix fatal on wikitech (duration: 00m 36s)
  • 20:37 akosiaris: s#/dev/md1#/dev/mapper/tank-data# on labvirt1010, reverted by puppet with Notice: /Stage[main]/Role::Labs::Openstack::Nova::Compute/Mount[/var/lib/nova/instances]/device: device changed '/dev/mapper/tank-data' to '/dev/md1'
  • 20:37 akosiaris: s#/dev/md1#/dev/mapper/tank-data#
  • 19:32 logmsgbot: dduvall@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.11
  • 19:14 marxarelli: including labswiki and labtestwiki in group1 promotion after all
  • 19:09 marxarelli: starting promotion of group1, but holding back labswiki and labtestwiki until Jan 21 'all' promotion
  • 18:54 paravoid: manually triggering an ubuntu mirror update ("sudo -u mirror /usr/local/sbin/update-ubuntu-mirror" on carbon)
  • 18:41 jynus: schema change on wikidatawiki (wb_terms) finished; slaves already catching up
  • 18:34 mutante: restart hhvm on mw1206
  • 18:32 godog: bounce stuck hhvm on mw1205
  • 18:06 paravoid: turning up BGP with Zayo in codfw
  • 17:48 jynus: restarting replication on db1026 after schema change
  • 17:09 gwicke: restbase cassandra: set DTCS max_window_size_seconds to 70736000, large enough to accommodate a two-year window
  • 16:56 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Set default graph vega version back to 1 gerrit:265289 (duration: 00m 32s)
  • 16:46 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add davidabian.com to wgCopyUploadsDomains gerrit:265286 (duration: 00m 32s)
  • 16:42 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Change default graph version param. Part II gerrit:265282 (duration: 00m 32s)
  • 16:42 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Change default graph version param. Part I gerrit:265282 (duration: 00m 36s)
  • 16:33 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add davidabian.com to wgCopyUploadsDomains gerrit:259003 (duration: 00m 32s)
  • 16:21 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add *.bodleian.ox.ac.uk to wgCopyUploadsDomains gerrit:265165 (duration: 00m 33s)
  • 16:19 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add *.archives.gov to wgCopyUploadsDomains gerrit:265163 (duration: 00m 32s)
  • 16:13 godog: bounce hhvm on mw1191 and syntaxlight runaway processes
  • 16:05 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable active gadget user stats on enwiki since it takes too long gerrit:265185 (duration: 00m 32s)
  • 14:52 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/vendor/: Fix ?PHP properly from commit (duration: 00m 36s)
  • 14:50 godog: powercycle mw1123, hhvm oom
  • 14:47 ema: Finished reverting migration of mobile traffic to text cluster in codfw https://phabricator.wikimedia.org/T109286
  • 14:24 logmsgbot: hoo@tin Synchronized wmf-config/db-eqiad.php: Set db1045 load to 0 (duration: 00m 32s)
  • 14:23 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/: consistency (duration: 02m 38s)
  • 14:15 logmsgbot: hoo@tin Synchronized wmf-config/db-eqiad.php: Re-Pool lagged db1045 (duration: 00m 35s)
  • 14:14 _joe_: synchronizing /srv/deployment manually between the two deployment servers for the first time
  • 14:11 logmsgbot: hoo@tin Synchronized wmf-config/db-eqiad.php: Has not been synced before (duration: 00m 32s)
  • 14:07 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.10/: consistency (duration: 02m 38s)
  • 13:58 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/extensions/Validator/: noop for wikitech deploy (duration: 00m 32s)
  • 13:58 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/extensions/SemanticMediaWiki/: noop for wikitech deploy (duration: 00m 34s)
  • 13:57 logmsgbot: reedy@tin Synchronized php-1.27.0-wmf.11/extensions/SemanticResultFormats/: noop for wikitech deploy (duration: 00m 33s)
  • 13:41 ema: Revert migration of mobile traffic to text cluster in codfw https://phabricator.wikimedia.org/T109286
  • 12:55 akosiaris: restart hhvm on mw1130
  • 12:43 jynus: performing alter table on db1026 (ETA: 5 hours)
  • 12:20 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Setting s5 master as recentchanges role (duration: 00m 32s)
  • 12:04 jynus: trying schema change on wikidata (wb_terms)
  • 09:36 akosiaris: gnt-instance modify -H disk_aio=native cygnus.codfw.wmnet
  • 09:18 akosiaris: offline fr_archive volume on nas1001-a
  • 09:15 akosiaris: unexport /vol/fr_archive on nas1001-a
  • 07:56 _joe_: powercycling mw1162, unable to login from console, memory exhaustion
  • 07:24 logmsgbot: ebernhardson@tin Synchronized php-1.27.0-wmf.10/extensions/CirrusSearch/includes/DataSender.php: stop checking for frozen indices while codfw elasticsearch recovers (duration: 01m 42s)
  • 06:24 ebernhardson: codfw elasticsearch cluster stopped responding during load test, idling test to see if it recovers
  • 03:44 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jan 20 03:44:48 UTC 2016 (duration 7m 29s)
  • 03:37 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.11) (duration: 16m 21s)
  • 03:02 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 10m 06s)
  • 02:35 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 11m 20s)
  • 01:27 logmsgbot: aaron@tin Synchronized wmf-config: Configure $wgCdnReboundPurgeDelay (duration: 00m 32s)
  • 01:01 mobrovac: restbase deploy end of d621b76
  • 00:57 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264917/ (duration: 00m 32s)
  • 00:56 legoktm: delete from localuser where lu_name ="Αντώνης Μανιός" and lu_wiki ="mediawikiwiki" limit 1 on centralauth db for T119736
  • 00:53 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264920/ (duration: 00m 33s)
  • 00:49 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/MobileFrontend/includes/api/ApiMobileView.php: https://gerrit.wikimedia.org/r/#/c/264973/ (duration: 00m 32s)
  • 00:49 mobrovac: restbase deploy start of d621b76
  • 00:38 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/264961/ (duration: 00m 31s)
  • 00:37 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264961/ (duration: 00m 33s)
  • 00:22 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264260/ (duration: 00m 32s)
  • 00:21 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/264260/ (duration: 00m 32s)
  • 00:17 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/CirrusSearch: https://gerrit.wikimedia.org/r/#/c/265146/ (duration: 00m 33s)
  • 00:10 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/CirrusSearch/includes/ElasticsearchIntermediary.php: https://gerrit.wikimedia.org/r/#/c/264989/ (duration: 00m 32s)
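
    Note on the 17:09 gwicke entry above: a quick arithmetic check of whether max_window_size_seconds = 70736000 covers a two-year window. This is only a sketch; the variable names are illustrative, and a 365-day year is an assumption, not something stated in the log.

```python
# Sanity check for the DTCS window size quoted in the 17:09 entry.
# Assumption: a "year" is 365 days; names below are illustrative only.
SECONDS_PER_DAY = 24 * 60 * 60

max_window_size_seconds = 70_736_000  # value from the log entry
two_years_seconds = 2 * 365 * SECONDS_PER_DAY

window_days = max_window_size_seconds / SECONDS_PER_DAY

print(f"window: {window_days:.1f} days")
print(f"two years: {two_years_seconds} seconds")
print(f"covers two years: {max_window_size_seconds >= two_years_seconds}")
```

    Under that assumption the configured value comes to roughly 818.7 days, which leaves some headroom over 730 days.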

2016-01-19

  • 23:33 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Bump $wgJobBackoffThrottling to lower the htmlcacheupdate backlog (duration: 00m 32s)
  • 23:22 logmsgbot: krenair@tin Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/265145 (duration: 02m 24s)
  • 23:19 logmsgbot: dduvall@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.27.0-wmf.11
  • 23:13 logmsgbot: dduvall@tin Finished scap: testwiki to php-1.27.0-wmf.11 and rebuild l10n cache (duration: 72m 03s)
  • 22:01 logmsgbot: dduvall@tin Started scap: testwiki to php-1.27.0-wmf.11 and rebuild l10n cache
  • 21:35 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/265135 (duration: 00m 32s)
  • 21:33 logmsgbot: krenair@tin Synchronized dblists/nonglobal.dblist: https://gerrit.wikimedia.org/r/265135 (duration: 03m 21s)
  • 21:33 ema: Finished migrating mobile traffic to text cluster in codfw (Mexico + green US states on this map https://phabricator.wikimedia.org/T114659)
  • 21:15 logmsgbot: dduvall@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="testwiki" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.qyk48j8kem" ' returned non-zero exit status 1 (duration: 16m 11s)
  • 20:59 Krenair: sync-common on labtestweb2001
  • 20:58 logmsgbot: dduvall@tin Started scap: testwiki to php-1.27.0-wmf.11 and rebuild l10n cache
  • 20:48 mutante: tin: deleted unused things from /srv/deployment (T120157)
  • 20:46 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Disable global AbuseFilters on non-global wikis (duration: 02m 04s)
  • 20:25 logmsgbot: dduvall@tin scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki="labtestwiki" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.jRNpeW67FO" ' returned non-zero exit status 1 (duration: 01m 31s)
  • 20:23 logmsgbot: dduvall@tin Started scap: testwiki to php-1.27.0-wmf.11 and rebuild l10n cache
  • 20:13 mutante: ruthenium: disable puppet, copy data over to osmium (screen)
  • 20:12 mutante: ruthenium: service mysql stop
  • 19:15 logmsgbot: catrope@tin Synchronized wmf-config/CommonSettings.php: EventBus plumbing (duration: 00m 30s)
  • 19:14 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Disable Flow on wikitech; add EventBus plumbing (duration: 00m 31s)
  • 19:13 logmsgbot: catrope@tin Synchronized wmf-config/extension-list: Add EventBus (duration: 00m 31s)
  • 19:00 marxarelli: starting branch cut for 1.27.0-wmf.11
  • 18:42 ema: Starting migration of mobile traffic to text cluster https://phabricator.wikimedia.org/T109286
  • 17:54 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/UploadWizard/UploadWizard.config.php: https://gerrit.wikimedia.org/r/#/c/264969/ (duration: 00m 31s)
  • 16:51 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/264964/ (duration: 00m 31s)
  • 16:47 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/Graph/modules/graph-loader.js: https://gerrit.wikimedia.org/r/#/c/264715/ (duration: 00m 31s)
  • 16:45 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/264469/ (duration: 00m 31s)
  • 16:41 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/264437/ (duration: 00m 32s)
  • 14:58 cmjohnson1: reseating asw-c-eqiad uplink module (xe-1/1/0 and xe-1/1/2)
  • 14:29 jynus: reimporting some fawiki tables from production into labsdb hosts
  • 13:52 godog: powercycle ms-be1001
  • 13:51 paravoid: powercycling alsafi
  • 02:53 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jan 19 02:53:40 UTC 2016 (duration 7m 0s)
  • 02:46 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 09m 21s)
  • 02:26 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 40s)

2016-01-18

  • 23:26 logmsgbot: krenair@tin Synchronized multiversion/MWMultiVersion.php: https://gerrit.wikimedia.org/r/264895 (duration: 00m 31s)
  • 23:08 logmsgbot: krenair@tin Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/264786/ (duration: 00m 32s)
  • 22:55 logmsgbot: krenair@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 22:55 logmsgbot: krenair@tin Synchronized dblists: (no message) (duration: 00m 31s)
  • 22:53 logmsgbot: krenair@tin Synchronized w/static/images/project-logos/wikitech.png: https://gerrit.wikimedia.org/r/#/c/264786/ (duration: 00m 31s)
  • 17:30 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/264758 - labs-only change (duration: 00m 36s)
  • 14:24 godog: powercycle praseodymium
  • 10:42 godog: powercycle ms-be2016, high load avg
  • 10:16 godog: dist-upgrade ms-be3002 to trusty
  • 02:57 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jan 18 02:57:41 UTC 2016 (duration 7m 8s)
  • 02:50 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 08m 39s)
  • 02:49 YuviPanda: updated annualreport for foks
  • 02:30 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 11m 38s)

2016-01-17

  • 04:58 YuviPanda: started restbase on restbase1002
  • 02:53 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jan 17 02:53:19 UTC 2016 (duration 6m 59s)
  • 02:46 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 08m 53s)
  • 02:26 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 41s)
  • 01:47 paravoid: restarting HHVM on mw1120, mw1125, mw1127, mw1132, mw1148; OOM

2016-01-16

  • 19:52 andrewbogott: renaming and reimaging labcontrol2001 -> labtestweb2001
  • 15:57 milimetric: piwik is taking events on bohrium but the interface can't complete the queries to load because there's too much data. Mysql is maxing the CPU but it seems ok for now, will check again Monday.
  • 15:22 milimetric: restarted mysql on bohrium because it had stopped working (probably due to piwik performance problems)
  • 03:02 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jan 16 03:02:21 UTC 2016 (duration 6m 57s)
  • 02:55 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 08m 35s)
  • 02:35 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 18m 55s)

2016-01-15

  • 22:43 logmsgbot: aaron@tin Synchronized wmf-config/CommonSettings.php: Set $wgCentralAuthUseSlaves for testwiki (duration: 00m 33s)
  • 22:38 mutante: gadolinium - shutdown -h now
  • 22:35 mutante: erbium - killing from puppet/icinga/salt
  • 21:54 mutante: mira - starting salt
  • 21:29 mutante: protactinium - shut down, unused system with outdated software
  • 21:09 mutante: (ganglia for ulsfo will be affected, brb)
  • 21:07 mutante: bast4001 - reinstalling with jessie
  • 18:55 ori: disabled gzip in apache for javascript mime types and did an apache config reload
  • 18:04 logmsgbot: ori@tin Synchronized docroot and w: Ie60638b0: Mirror homepage.js from 15.wikipedia.org (duration: 00m 42s)
  • 16:01 godog: bounce hhvm on mw1129 / mw1204
  • 15:41 godog: reimage ms-be3001 with trusty
  • 14:54 godog: reimage ms-fe3002 with trusty
  • 14:13 mark: Temporarily paused md126 RAID check on labstore1001 (sync_action idle)
  • 14:09 chasemp: phab restart phd (reports as not running in phab itself) seems ok now
  • 14:03 mark: set sync_speed_min to 5000 for md126 on labstore1001
  • 13:28 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: w:he as import source for commonswiki (duration: 00m 49s)
  • 12:17 hashar: restarting Jenkins for plugins updates
  • 11:07 _joe_: re-enabled puppet on mw1013, restarted HHVM to make it pick up our latest changes
  • 10:01 moritzm: installed ganeti security updates
  • 09:18 moritzm: installed git security updates on all jessie systems
  • 03:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jan 15 03:10:09 UTC 2016 (duration 6m 48s)
  • 03:03 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 16m 02s)
  • 02:30 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/includes/api/ApiQueryRecentChanges.php: https://gerrit.wikimedia.org/r/264231 (duration: 00m 42s)
  • 02:29 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 14m 00s)
  • 02:23 YuviPanda: pull annualreport git repo on bromine for Krenair
  • 01:00 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/includes/api/ApiQueryWatchlist.php: https://gerrit.wikimedia.org/r/#/c/264224/ (duration: 00m 31s)
  • 00:27 logmsgbot: krenair@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/263905/ (duration: 00m 32s)
  • 00:24 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: touch (duration: 00m 31s)
  • 00:22 logmsgbot: krenair@tin Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/264091/ (duration: 00m 32s)
  • 00:06 mobrovac: restbase started a dump of enwiki to populate storage with mobileapps renders

2016-01-14

  • 23:56 mobrovac: restbase end deploy of dac31a8c
  • 23:49 mobrovac: restbase start deploy of dac31a8c
  • 22:17 csteipp: deployed patch for T122807
  • 19:55 ottomata: restarted eventlogging_sync script to insert batches of 1000
  • 19:31 logmsgbot: dduvall@tin rebuilt wikiversions.php and synchronized wikiversions files: rollback labswiki to wmf.9
  • 19:02 logmsgbot: dduvall@tin rebuilt wikiversions.php and synchronized wikiversions files: all wikis to 1.27.0-wmf.10
  • 18:40 bblack: removing old eqiad misc-web IP (DNS switched 50h ago (not 26 like above), TTLs are max 1h)
  • 18:39 bblack: removing old eqiad misc-web IP (DNS switched 26h ago, TTLs are max 1h)
  • 18:01 paravoid: turning up BGP with Zayo in eqiad
  • 16:25 logmsgbot: demon@tin Synchronized wmf-config/throttle.php: (no message) (duration: 00m 49s)
  • 15:48 moritzm: installed DHCP security updates across the fleet
  • 14:44 _joe_: powercycling mw1013, console stuck
  • 11:28 godog: bounce uwsgi on labmon1001
  • 11:18 godog: upgrade graphite-carbon / graphite-web on labmon1001
  • 10:38 _joe_: restarting hhvm on odd-numbered jobrunners
  • 10:29 moritzm: installed DHCP security updates on carbon
  • 04:28 paravoid: powercycling mw1005/mw1011
  • 04:24 paravoid: restart hhvm on odd-numbered appservers
  • 02:30 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 12m 21s)
  • 01:32 Krenair: Wikitech rolled back to wmf.9 due to T123583
  • 01:27 logmsgbot: krenair@tin rebuilt wikiversions.php and synchronized wikiversions files: (no message)
  • 01:06 mutante: mw1009 - restarted hhvm
  • 01:00 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/VisualEditor/extension.json: https://gerrit.wikimedia.org/r/#/c/264031/ (duration: 01m 35s)
  • 00:30 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/CirrusSearch/includes: https://gerrit.wikimedia.org/r/#q,263991,n,z (duration: 06m 08s)
  • 00:11 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/263804/ (duration: 00m 31s)
  • 00:10 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/263804/ (duration: 00m 31s)
  • 00:08 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/Echo/modules/echo.variables.less: https://gerrit.wikimedia.org/r/#/c/263767/ (duration: 00m 45s)

2016-01-13

  • 23:46 tgr: T123451: running mwscript sql.php --wiki=metawiki patch-bot_passwords.sql
  • 23:09 mobrovac: restbase end deploy of 536e15b6
  • 22:58 andrewbogott: /etc/init.d/nfs-kernel-server restart on labstore1001
  • 22:54 mobrovac: restbase start deploy of 536e15b6
  • 22:20 logmsgbot: catrope@tin Synchronized wmf-config/: sync labs-only config changes (duration: 00m 32s)
  • 21:54 mobrovac: restbase end deploy of 559a13a
  • 21:44 mobrovac: restbase start deploy of 559a13a
  • 21:40 mdholloway: mobileapps deployed c9e7e28
  • 21:27 aude: Updated cirrus search mappings for testwikidata and wikidata to add new fields
  • 21:02 ori: Disabling Puppet on mw1013 (eqiad jobrunner) to hack in some debug logging into GWT jobs.
  • 20:01 ottomata: dropped MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 from analytics-store eventlogging slave db
  • 19:55 ostriches: *wikimania2017wiki_content
  • 19:55 ostriches: elasticsearch: wikimania2017_content was reporting as missing in logstash, ran updateSearchIndexConfig. messy aliases? Seems to be working again.
  • 19:27 ottomata: dropping eventlogging tables MobileWebSectionUsage_14321266 and MobileWebSectionUsage_15038458 from the m4-master log database. These are too large and have been blacklisted from mysql. No more events will be inserted into mysql for these. We are attempting to help replication catch up on the analytics-store slave.
  • 19:11 logmsgbot: thcipriani@tin rebuilt wikiversions.php and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.10
  • 18:33 RobH: restarted zotero/mobileapps on sca1*/scb1* respectively for marko's code deploy
  • 18:33 RobH: restarted zotero/mobileapps on sca1*/scb1* respectively
  • 18:27 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: OfficeIT namespace on wikitech (duration: 00m 31s)
  • 18:03 mobrovac: zotero deploying translators 0476aa0
  • 17:12 gwicke: restarted mathoid on scb1001 and scb1002
  • 17:06 gwicke: restarted mathoid on sca1001 and sca1002
  • 17:00 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/Wikidata: https://gerrit.wikimedia.org/r/#/c/263865/ (duration: 00m 41s)
  • 16:31 logmsgbot: krenair@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/263625/ (duration: 00m 31s)
  • 16:28 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/263341/ (duration: 00m 31s)
  • 16:22 logmsgbot: krenair@tin Synchronized portals: https://gerrit.wikimedia.org/r/#/c/263796/ (duration: 00m 31s)
  • 16:20 logmsgbot: krenair@tin Synchronized wmf-config/Wikibase-production.php: https://gerrit.wikimedia.org/r/#/c/263838/ (duration: 00m 31s)
  • 16:14 logmsgbot: krenair@tin Synchronized wmf-config/Wikibase.php: https://gerrit.wikimedia.org/r/#/c/263354/ (duration: 00m 31s)
  • 16:03 logmsgbot: krenair@tin Synchronized docroot/noc: https://gerrit.wikimedia.org/r/#/c/263370/3 (duration: 00m 31s)
  • 14:11 godog: bounce hhvm on mw1007
  • 14:03 godog: bounce hhvm on mw1005, powercycle mw1011
  • 13:46 godog: bounce hhvm on mw1009, powercycle mw1003
  • 13:39 godog: bounce hhvm on mw1013
  • 10:31 paravoid: upgrading grafana 2.6.0-beta1 -> 2.6.0
  • 06:45 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.9/extensions/GWToolset: Ib9375b: Make sure XMLReader::close() is always called (T122069) (duration: 00m 32s)
  • 06:43 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.10/extensions/GWToolset: Ib9375b: Make sure XMLReader::close() is always called (T122069) (duration: 01m 07s)
  • 03:15 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jan 13 03:15:57 UTC 2016 (duration 7m 13s)
  • 03:08 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.10) (duration: 16m 09s)
  • 02:57 Krinkle: Manually killed uwsgi graphite-web child processes on graphite1001. Service recovered itself from there.
  • 02:44 Krinkle: Graphite is down. Consistently returns HTTP 502 Bad Gateway for any/all requests
  • 02:34 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 11m 13s)
  • 01:33 yurik: deployed tilerator maps service
  • 01:19 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/Echo/Resources.php: https://gerrit.wikimedia.org/r/#/c/263645/ (duration: 00m 32s)
  • 01:18 logmsgbot: krenair@tin Synchronized php-1.27.0-wmf.10/extensions/Flow/modules/editor/editors/visualeditor/mw.flow.ve.Target.js: https://gerrit.wikimedia.org/r/#/c/263644/ (duration: 00m 31s)
  • 01:03 logmsgbot: krenair@tin Synchronized portals: https://gerrit.wikimedia.org/r/#/c/263770/ - after having done the submodule update this time (duration: 00m 31s)
  • 00:37 logmsgbot: krenair@tin Synchronized portals: https://gerrit.wikimedia.org/r/#/c/263770/ (duration: 00m 33s)
  • 00:31 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/261994/ (duration: 00m 31s)
  • 00:28 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/262895/ (duration: 00m 32s)
  • 00:25 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/262894/ (duration: 00m 30s)
  • 00:17 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/263237/ (duration: 00m 31s)
  • 00:15 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/262999/ (duration: 00m 31s)
  • 00:10 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/263201/ (duration: 00m 30s)
  • 00:08 yurik: switched all maps kartotherian servers to v5, restarted
  • 00:06 logmsgbot: krenair@tin Synchronized images/mobile/wikivoyage.png: https://gerrit.wikimedia.org/r/#/c/263201/ (duration: 00m 31s)
  • 00:06 logmsgbot: krenair@tin Synchronized images/mobile/wikidata.png: https://gerrit.wikimedia.org/r/#/c/263201/ (duration: 00m 32s)

2016-01-12

  • 21:58 ori: Restarting jobchron / jobrunner / HHVM on all job runners for I44990808
  • 21:07 logmsgbot: hoo@tin Synchronized php-1.27.0-wmf.10/extensions/Math/: Introduce a "MathEnableWikibaseDataType" config (duration: 00m 32s)
  • 20:52 logmsgbot: hoo@tin Synchronized wmf-config/: Set $wgMathEnableWikibaseDataType to false (duration: 01m 29s)
  • 20:44 logmsgbot: twentyafterfour@tin rebuilt wikiversions.php and synchronized wikiversions files: group0 to 1.27.0-wmf.10
  • 20:34 logmsgbot: thcipriani@tin Finished scap: testwiki to php-1.27.0-wmf.10 and rebuild l10n cache (duration: 54m 42s)
  • 20:14 mobrovac: restbase switching restbase200x to node 4.2
  • 20:13 mobrovac: restbase switch of restbase100[1-4] to node 4.2 completed
  • 20:10 mobrovac: restbase switching restbase100[1-4] to node 4.2
  • 19:39 logmsgbot: thcipriani@tin Started scap: testwiki to php-1.27.0-wmf.10 and rebuild l10n cache
  • 19:31 logmsgbot: dduvall@tin scap failed: CalledProcessError Command 'sudo -u www-data -n -- /bin/mktemp' returned non-zero exit status 1 (duration: 00m 42s)
  • 19:30 logmsgbot: dduvall@tin Started scap: testwiki to php-1.27.0-wmf.10 and rebuild l10n cache
  • 19:26 YuviPanda: import new r-base package into carbon
  • 18:15 marxarelli: cutting MW branch 1.27.0-wmf.10
  • 17:37 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/263632/ (duration: 00m 31s)
  • 16:53 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Import sources on gu.wikipedia gerrit:258441 (duration: 00m 29s)
  • 16:48 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Get rid of old unused $wgAllowed* variables gerrit:256853 (duration: 00m 29s)
  • 16:47 _joe_: restarted salt-minion on tin
  • 16:44 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add portal namespace to ps.wikipedia.org gerrit:255519 (duration: 00m 30s)
  • 16:42 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove proxyunbannable gerrit:254842 (duration: 00m 30s)
  • 16:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Allow sysop to grant and revoke transwiki on gu.wikipedia gerrit:258474 (duration: 00m 29s)
  • 16:33 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on pa.wikipedia gerrit:258436 (duration: 00m 29s)
  • 16:22 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Namespace configuration on my.wikipedia gerrit:258442 (duration: 00m 30s)
  • 15:56 godog: reprovision ms-fe3001 with jessie
  • 14:55 ema: added myself to ops and wmf ldap groups
  • 11:57 _joe_: enabling auth on the production etcd cluster
  • 08:37 paravoid: ms-be1002: echo b > /proc/sysrq-trigger, kernel misbehaving and unrecoverable (out of kernel memory/XFS issues)
  • 07:38 paravoid: cr2-eqiad: reenable BGP peerings with GTT
  • 05:31 paravoid: rm CirrusSearchRequests.log-201510*.gz on fluorine (saving ~200G)
  • 04:07 paravoid: cleaning up elastic1006's /var/log from old logs
  • 03:59 paravoid: reenabling puppet on sca1001/2; no reason was left
  • 02:33 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jan 12 02:33:00 UTC 2016 (duration 6m 55s)
  • 02:26 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 47s)
  • 00:46 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: rv 443026e3ad18934dd0017a258673d88104cf6b5e (duration: 00m 29s)
  • 00:32 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258670/ (duration: 00m 30s)
  • 00:29 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258672/ (duration: 00m 30s)
  • 00:25 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258453/ (duration: 00m 30s)
  • 00:18 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/258444/ (duration: 00m 30s)
  • 00:14 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/255361/ (duration: 00m 30s)
  • 00:10 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/244140/ (duration: 00m 30s)
  • 00:09 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/244140/ (duration: 00m 30s)
  • 00:06 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/260242/ (duration: 00m 30s)

2016-01-11

  • 22:52 logmsgbot: jzerebecki@tin Synchronized wmf-config/throttle.php: deploying https://gerrit.wikimedia.org/r/#/c/263427/ (duration: 00m 30s)
  • 22:48 YuviPanda: restart eventlogging_synch on dbstore1002
  • 22:47 logmsgbot: jzerebecki@tin Synchronized php-1.27.0-wmf.9/extensions/Wikidata/extensions/Wikibase/repo/maintenance/dispatchChanges.php: restoring truncated Wikidata dispatchChanges.php to let dispatchers run again (duration: 00m 30s)
  • 22:46 mutante: restbase1004, restbase2002, restbase2005 - manually install nodejs
  • 22:45 logmsgbot: jzerebecki@tin Synchronized php-1.27.0-wmf.9/extensions/Wikidata/extensions/Wikibase/repo: deploying https://gerrit.wikimedia.org/r/#/c/253898/ with dispatchChanges.php still truncated (duration: 00m 33s)
  • 22:40 mutante: restbase1001 - apt-get install nodejs
  • 22:40 jzerebecki: dispatchChanges.php killed on terbium
  • 22:38 logmsgbot: jzerebecki@tin Synchronized php-1.27.0-wmf.9/extensions/Wikidata/extensions/Wikibase/repo/maintenance/dispatchChanges.php: truncating Wikidata dispatchChanges.php to stop dispatchers as preparation for https://gerrit.wikimedia.org/r/#/c/253898/ (duration: 00m 31s)
  • 21:19 papaul: pc200[4-6] - signing puppet certs, salt-key, initial run
  • 21:13 subbu: finished deploying parsoid sha 07494cf2
  • 21:06 papaul: installing OS on pc200[4-6]
  • 21:06 subbu: synced new code; restarted parsoid on wtp1003 as a canary
  • 21:02 subbu: starting parsoid deploy
  • 18:52 RobH: rt.w.o cert expired and its replacement will be later today (rt is internal ops only tool)
  • 18:36 RobH: tendril cert updated and neon returned to normal service
  • 18:30 ori: Restarting HHVM on all job runners, to vacate memory now that the cause of the leak appears to have subsided. (T122069)
  • 18:24 RobH: tendril updating ssl cert on neon, https may flap for a second (this is on neon, so icinga https portal may also flap)
  • 17:29 hoo: Updated Wikidata's property suggester with data from today's json dump
  • 17:16 papaul: db2033 - signing puppet certs, salt-key, initial run
  • 16:58 papaul: installing OS on db2033
  • 16:49 logmsgbot: thcipriani@tin Synchronized robots.txt: SWAT: Remove overeager unrequested /wiki/User: robots.txt rule gerrit:263360 (duration: 00m 30s)
  • 16:41 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable new user groups on gu.wikipedia.org gerrit:255810 (duration: 00m 30s)
  • 16:34 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: dewikibooks: Set $wgRestrictDisplayTitle to false gerrit:260964 (duration: 00m 30s)
  • 16:30 godog: halt ms-be1013, required to reset idrac
  • 16:27 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable global AbuseFilter at French Wikipedia gerrit:257868 (duration: 00m 29s)
  • 16:23 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Changed user group rights at trwikiquote gerrit:261869 (duration: 00m 30s)
  • 16:16 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Added noindex rule for uawikimedia user namespace gerrit:261902 (duration: 00m 30s)
  • 16:09 logmsgbot: thcipriani@tin Synchronized robots.txt: SWAT: Tidy robots.txt gerrit:240065 (duration: 00m 30s)
  • 16:08 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgLocaltimezone for orwiki gerrit:260745 (duration: 00m 29s)
  • 16:03 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Add enwiki as transwiki import source for ta.wikipedia gerrit:262352 (duration: 00m 33s)
  • 15:05 godog: repool restbase1004 in pybal, fully bootstrapped and running latest code
  • 11:14 _joe_: upgrading etcd to 2.2.1 in production
  • 10:36 _joe_: updating nodejs on restbase-test2002
  • 07:17 _joe_: restarting HHVM on a few jobrunners
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jan 11 02:32:37 UTC 2016 (duration 6m 55s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 39s)
  • 01:11 paravoid: deactivating eqiad<->GTT BGP peering, reported network issues (P2469)

2016-01-10

  • 22:00 gwicke: restbase: 1005-1009 now on node 4.2
  • 19:44 paravoid: powercycling mw1004, mw1008, mw1012
  • 19:38 paravoid: restarting hhvm on jobrunners again
  • 12:40 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 626m 20s)
  • 10:13 ori: disabled categoryMembershipChange on mw1165 too, then restart jobrunner / jobchron / hhvm on mw1165 and mw1164
  • 08:55 ori: mw1166 -- disabled puppet; disabled categoryMembershipChange jobs
  • 08:48 ori: mw1167 -- disabled puppet; disabled deleteLinks and refreshLinks* jobs
  • 08:45 ori: mw1168 -- disabled puppet; disabled restbase jobs
  • 08:41 ori: mw1169 -- disabled cirrus jobs
  • 08:33 ori: Attempting to isolate cause of T122069 by toggling job types on mw1169. Disabling Puppet to prevent it from clobbering config changes.
  • 08:29 paravoid: restarting hhvm on jobrunners again
  • 04:58 paravoid: powercycling mw1005, mw1008, mw1009 -- unresponsive due to OOM
  • 04:56 paravoid: restarting HHVM on eqiad jobrunners, OOM, memleak faster than the 24h restarts

2016-01-09

  • 02:33 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jan 9 02:33:40 UTC 2016 (duration 6m 57s)
  • 02:26 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 11m 19s)

2016-01-08

  • 23:49 RobH: stalled puppet on carbon for now, messing with partman files
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jan 8 02:31:46 UTC 2016 (duration 7m 0s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 15s)

2016-01-07

  • 23:24 akosiaris: repooled scb1002 for mobileapps
  • 23:24 akosiaris: enabled puppet,salt on scb1001
  • 23:23 mobrovac: mobileapps deploying 58b371a on scb1001
  • 23:09 mobrovac: mobileapps deploying 58b371a on scb1002
  • 23:01 akosiaris: apt-mark hold nodejs on scb1001, etherpad1001 and maps-test200{1,2,3,4}
  • 22:58 akosiaris: disable puppet and salt on scb1001 for nodejs 4.2 transition
  • 22:57 akosiaris: depool scb1002 for mobileapps. Transition to nodejs 4.2 ongoing
  • 19:21 YuviPanda: started tools / maps backup on labstore1001
  • 19:13 YuviPanda: remove snapshots others20150815030010, others20150815030010, maps20151216040005 and maps20151028040004 that were all stale and should've been removed anyway (on labstore2001)
  • 19:13 YuviPanda: remove snapshots others20150815030010, others20150815030010, maps20151216040005 and maps20151028040004 that were all stale and should've been removed anyway
  • 19:11 jynus: setting up watchdog process killing long running queries on db1051
  • 19:11 YuviPanda: run sudo lvremove backup/tools20151216020005 on labstore2001 to clean up full snapshot
  • 18:54 _joe_: also resetting the drac
  • 18:53 _joe_: powercycling ms-be1013
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Jan 7 02:32:04 UTC 2016 (duration 6m 54s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 33s)

2016-01-06

  • 23:03 gwicke: switched restbase1009 to node 4.2 for testing, and restarted restbase; see https://phabricator.wikimedia.org/T107762
  • 02:34 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Jan 6 02:34:38 UTC 2016 (duration 6m 53s)
  • 02:27 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 30s)

2016-01-05

  • 22:38 logmsgbot: aaron@tin Synchronized rpc: 830e1ed8d80295710dc02f18102b4fadae7fca86 (duration: 00m 55s)
  • 18:34 logmsgbot: jzerebecki@tin scap aborted: deploy-log (duration: 00m 04s)
  • 18:34 logmsgbot: jzerebecki@tin Started scap: deploy-log
  • 15:47 ottomata: transitioned analytics1001 to active namenode
  • 03:51 logmsgbot: krinkle@tin Synchronized php-1.27.0-wmf.9/includes/specials/SpecialJavaScriptTest.php: Idaacf71870 (duration: 00m 30s)
  • 03:50 logmsgbot: krinkle@tin Synchronized php-1.27.0-wmf.9/resources/src/mediawiki.special/: Idaacf71870 (duration: 00m 30s)
  • 03:49 logmsgbot: krinkle@tin Synchronized php-1.27.0-wmf.9/resources/Resources.php: Idaacf71870 (duration: 00m 36s)
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Jan 5 02:31:46 UTC 2016 (duration 6m 54s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 13s)

2016-01-04

  • 20:50 mutante: ms-be1011 - powercycled, was frozen
  • 20:43 mutante: ms-be2007 - System halted! Error: Integrated RAID
  • 20:42 mutante: ms-be2007 - powercycle (was status: on but all frozen) (i assume xfs like be2006 appears in SAL recently)
  • 20:36 mutante: mw2019 - puppet run (icinga claimed it failed but just here)
  • 20:19 mutante: rutherfordium - attempt to restart with gnt-instance
  • 20:12 mutante: rutherfordium (people.wm) was down for days per icinga - then magically fixes itself when i connect to console but before even logging in (ganeti VM)
  • 20:00 mutante: mw1123 - start HHVM (was 503 and service stopped)
  • 19:28 mutante: elastic1006 - out of disk - gzip eqiad_index_search_slowlog.log files
  • 17:37 logmsgbot: yurik@tin Synchronized php-1.27.0-wmf.9/extensions/Graph/: Deployed Graph ext - gerrit 262357 (duration: 00m 33s)
  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Jan 4 02:32:10 UTC 2016 (duration 6m 53s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 05s)

2016-01-03

  • 02:32 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Jan 3 02:31:58 UTC 2016 (duration 6m 52s)
  • 02:25 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 22s)

2016-01-02

  • 03:34 twentyafterfour: deploying https://gerrit.wikimedia.org/r/261725, restarted apache2 on iridium
  • 02:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Jan 2 02:31:28 UTC 2016 (duration 6m 58s)
  • 02:24 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 10m 09s)
  • 01:04 YuviPanda: imported vagrant 1.8.1 for jessie per bd808
  • 00:04 ori: (at 23:46 UTC) restarted nova-compute on labvirt1002

2016-01-01

  • 23:50 legoktm: restarted nodepool on labnodepool1001
  • 23:37 ori: restarting nodepool on labnodepool1001.eqiad.wmnet (T122731)
  • 19:41 bd808: Updated scholarships.wikimedia.org with latest translation data from translatewiki
  • 02:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Jan 1 02:30:27 UTC 2016 (duration 6m 47s)
  • 02:23 logmsgbot: mwdeploy@tin sync-l10n completed (1.27.0-wmf.9) (duration: 09m 58s)


Archives