Release Engineering/SAL/Archive 3
2017-12-27
- 19:42 legoktm: legoktm@integration-slave-jessie-1003:/srv/jenkins-workspace/workspace$ sudo rm -rf *
2017-12-25
- 04:54 legoktm: deployed https://gerrit.wikimedia.org/r/400153 https://gerrit.wikimedia.org/r/400103
2017-12-24
- 16:07 Amir1: ladsgroup@deployment-tin$ mwscript extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=wikidatawiki --load-from https://en.wikipedia.beta.wmflabs.org/w/api.php (T183633)
2017-12-23
- 14:10 addshore: made ladsgroup owner of the github org
2017-12-22
- 22:39 thcipriani: integration-slave-jessie-1004 removed mediawiki-core-jsduck, mwgate-php55lint, mediawikicore-php55lint as /srv mount was full T179963
- 16:37 hashar: cancelled update of npm and npm-test images. npm is broken when used with a proxy | https://gerrit.wikimedia.org/r/#/c/399837/
- 16:12 hashar: rebuilding npm and npm-test docker images https://gerrit.wikimedia.org/r/#/c/388450/
- 09:17 hashar: Fixed mediawiki-core-php70-phan-docker mwext-php70-phan-docker jobs that used a wrong Docker image name | https://gerrit.wikimedia.org/r/399789
- 09:08 hashar: updating mediawiki-core-php70-phan-docker mwext-php70-phan-docker jobs to the new ci-src-setup docker image https://gerrit.wikimedia.org/r/399754
2017-12-21
- 16:54 awight: Update ORES to eb0f776bb
- 15:14 hashar: fab deploy_docker for https://gerrit.wikimedia.org/r/#/c/399612/ "fix hhvm docker-pkg definitions"
- 14:50 hashar: fab deploy_docker for https://gerrit.wikimedia.org/r/#/c/399611/ "fix php55 definitions"
- 14:20 hashar: fab deploy_docker for https://gerrit.wikimedia.org/r/#/c/399609/ "fix zuul-cloner docker-pkg definition"
- 13:51 hashar: wikitech: change email of PortalsBuilder user from releng@lists.wikimedia.org to portals@lists.wikimedia.org | Credentials come from https://phabricator.wikimedia.org/D872 | T179694
2017-12-20
- 20:51 hashar: Rebuilding hhvm Docker containers https://gerrit.wikimedia.org/r/399406 | T183324
- 18:00 RoanKattouw: Importing dump from deployment-db03 on deployment-db04
- 15:30 RoanKattouw: Restarting dump again, failed due to lack of disk space
- 15:07 RoanKattouw: Dropped invalid view labswiki.updates, restarting dump
- 14:59 RoanKattouw: Dumping all databases on deployment-db03 so I can restore replication on deployment-db04. This may cause MediaWiki writes to fail while the dump runs
2017-12-19
- 20:10 RoanKattouw: (Earlier today) Depooled deployment-db04, it needs fixing after replication broke badly. It's out of sync with deployment-db03, where I manually fixed inconsistencies
- 18:11 awight: Update beta ORES service to f109792
- 17:00 awight: Disable ORES UI for beta wikidatawiki, T183266
- 15:58 zeljkof: Reloading Zuul to deploy 2f514e4
- 15:37 zeljkof: Reloading Zuul to deploy fb9327e
- 15:34 hashar: Switched tox jobs from wmfreleng/tox to docker-registry.wikimedia.org/releng/tox | https://gerrit.wikimedia.org/r/388449
- 14:50 zeljkof: Reloading Zuul to deploy 2fbfc1d
- 12:12 hashar: CI: switching mwgate-composer-php70 job from Nodepool to Docker | https://gerrit.wikimedia.org/r/#/c/398921/
- 11:59 hashar: CI: switching composer-php55 / composer-package-php55 jobs from Nodepool to Docker | https://gerrit.wikimedia.org/r/#/c/398920/
- 11:30 hashar: building php55 docker images on contint1001 | https://gerrit.wikimedia.org/r/#/c/397634/
2017-12-18
- 19:43 thcipriani: checkout master on deployment-tin:/srv/mediawiki-staging/php-master to fix beta-code-update-eqiad
- 17:36 addshore: paused beta-code-update-eqiad for a while while I test something
- 09:58 hashar: gerrit: deleted unused user-metrics-2 repo. It was created 4 years and 7 months ago but never used
- 09:53 hashar: gerrit: deleted wikibase/data-model and wikibase/data-model-services . They are on Github https://github.com/wmde/WikibaseDataModel and https://github.com/wmde/WikibaseDataModelServices
2017-12-13
- 22:08 mdholloway: deployed mobileapps@ddddebb to BC
- 18:25 thcipriani: failed Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/394551/ permissions errors with fabfile.py
- 18:18 thcipriani: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/394551/
- 17:24 awight: Install aspell-is for ORES
- 17:06 awight: Deploy ORES service b67bba7
- 00:38 mdholloway: deployed mobileapps@bfc3588 to BC
2017-12-12
- 18:22 mdholloway: deployed mobileapps@5b8796d to BC
- 15:34 addshore: deploy zuul for parameter_functions update
- 15:27 addshore: unblocked beta scaps and files syncs on jenkins
2017-12-11
- 23:18 mdholloway: deployed mobileapps@e290b17 to BC
- 21:06 mdholloway: deployed mobileapps@61ca333 to BC
- 19:03 andrewbogott: upgraded deployment-puppetmaster02 to puppet v4
- 17:24 hashar: Update node-6-docker jobs so the entry point recognizes setting NPM_RUN_SCRIPT=doc | https://gerrit.wikimedia.org/r/#/c/397576/
- 10:00 hashar: github: deleted https://github.com/wikimedia/mediawiki-extensions-GitHub | T182231
- 09:59 hashar: github: archiving https://github.com/wikimedia/mediawiki-extensions-SwiftCloudFiles - T182384
2017-12-09
- 03:13 legoktm: deployed https://gerrit.wikimedia.org/r/396486
2017-12-08
- 21:20 legoktm: deployed https://gerrit.wikimedia.org/r/396480
- 19:47 legoktm: deployed https://gerrit.wikimedia.org/r/396453
- 18:09 Hauskatze: maurelio@deployment-tin:~$ mwscript namespaceDupes.php --wiki=enwiki --fix - T182356
- 13:01 hashar: github: disable vulnerability alerts and archived https://github.com/wikimedia/wikimedia-lobbypop/ | T180878
- 13:01 hashar: github: disable vulnerability alerts and archived https://github.com/wikimedia/labs-tools-Wikimedia-Emoji-Bot/ | T180878
- 13:00 hashar: github: disable vulnerability alerts for the archived repo https://github.com/wikimedia/labs-tools-Wikimedia-Emoji-Bot/
- 11:06 zeljkof: Reloading Zuul to deploy 1faf444
2017-12-07
- 20:53 Hauskatze: maurelio@deployment-tin:~$ mwscript namespaceDupes.php --wiki=dewiki --fix
- 20:25 Hauskatze: maurelio@deployment-tin:~$ mwscript namespaceDupes.php --wiki=deploymentwiki --fix --add-prefix=Broken/
- 20:18 Hauskatze: deployment-prep maurelio@deployment-tin:~$ mwscript cleanupSpam.php --wiki=deploymentwiki --delete *.loginidol.org
- 17:56 mdholloway: deployed mobileapps@71f581c to beta cluster
- 10:09 hashar: integration: sudo cumin --force 'name:integration-slave-jessie-100*' /usr/local/sbin/run-puppet-agent | https://gerrit.wikimedia.org/r/395961
- 10:06 hashar: integration: unbroke puppet on some permanent slaves. It had been broken since Nov 29th ~ 19:50 UTC | https://gerrit.wikimedia.org/r/#/c/395961/
- 09:48 hashar: CI: removed Wikidata from configuration, replaced by Wikibase. wmf/* and REL branches are going to be broken though | https://gerrit.wikimedia.org/r/395704 | T181838
2017-12-06
- 21:43 awight: Update ORES to 42cf532
- 17:54 gehel: logstash upgrade on deployment-logstash2 completed, 5 minutes of logs lost during upgrade - T178412
- 17:26 gehel: upgrading ELK on deployment-logstash2 - T178412
- 16:48 Hauskatze: Ran cleanupSpam.php on deploymentwiki
- 10:03 hashar: docker push wmfreleng/npm:v2017.12.06.09.55 wmfreleng/npm-stretch:v2017.12.06.09.55 wmfreleng/npm-test:v2017.12.06.09.55 wmfreleng/npm-test-stretch:v2017.12.06.09.55 !!! wmfreleng/npm-browser-test:v2017.12.06.09.55 | https://gerrit.wikimedia.org/r/#/c/395555/
2017-12-05
- 12:18 hashar: deployment-videoscaler01: rm /var/log/hhvm/* /var/log/apache2/* . Restarted apache2/hhvm/syslog
- 12:16 hashar: integration: sudo cumin --force '*' 'apt-get clean'
- 12:16 hashar: deployment-prep: sudo cumin --force '*' 'apt-get clean'
- 12:15 hashar: deployment-videoscaler01: apt-get clean to free up disk space
- 08:51 hashar: jenkins: adding global property FORCE_COLOR=1 to https://integration.wikimedia.org/ci/configure . That forces webdriver.io to output color in the Jenkins console when not using a TTY
- 06:37 kart_: Updated cxserver to 1693bcf
2017-12-04
2017-12-03
- 21:27 legoktm: legoktm@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ sudo rm -rf * # to clear out full /srv
2017-12-01
- 21:15 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/394655
- 13:46 godog: deployment-prep bounce elasticsearch on logstash2 to test jmx_exporter
- 11:55 hashar: updating *npm-browser-node-6-docker jobs to use a new container based on Stretch with Chromium/Firefox | https://gerrit.wikimedia.org/r/#/c/394340/ | T179360
- 10:08 hashar: docker push wmfreleng/npm-browser-test-stretch:v2017.11.30.21.30 && docker push wmfreleng/npm-browser-test-stretch:latest | https://gerrit.wikimedia.org/r/#/c/394340/ | T179360
- 08:40 hashar: rebased operations/puppet on deployment-prep and integration puppetmasters
- 08:40 hashar: deployment-prep: removed a hack to puppetmaster environments/future/environment.conf containing: parser = future \n manifest = $confdir/manifests\n
- 08:38 hashar: integration: removed a hack to puppetmaster environments/future/environment.conf containing: parser = future \n manifest = $confdir/manifests\n
2017-11-30
- 23:08 addshore: turned beta-scap-eqiad back on
- 23:03 addshore: reload zuul to deploy Revert "Use gate-and-submit-swat for mediawiki-config" [integration/config] - https://gerrit.wikimedia.org/r/394484
- 22:58 addshore: also reloaded with hashar Switch ArticlePlaceholder to npm-browser-test & Remove mwgate-npm-node-6-jessie
- 22:57 addshore: reloaded zuul for Use gate-and-submit-swat for mediawiki-config [integration/config] - https://gerrit.wikimedia.org/r/394464
- 21:05 hashar: docker push wmfreleng/npm-stretch:v2017.11.30.21.03 && docker push wmfreleng/npm-stretch:latest && docker push wmfreleng/npm-test-stretch:v2017.11.30.21.03 && docker push wmfreleng/npm-test-stretch:latest | https://gerrit.wikimedia.org/r/#/c/394338/ | T179360
- 20:50 addshore: temporarily disabled beta-scap-eqiad so that it doesn't block me doing my own scaps
- 18:59 bd808: Testing stashbot fix for double phab logging (T181731)
- 17:49 anomie: Finished running cleanupUsersWithNoId.php on Beta Cluster for T181731
- 16:58 anomie: Running cleanupUsersWithNoId.php on Beta Cluster, see T181731
2017-11-29
- 21:27 awight: Update ores submodule, for RevIdScorer statistics
- 21:17 awight: deployment-prep Verbose logging for ORES Celery
- 14:32 chasemp: git pull on /var/lib/git/labs/private and resolve one merge conflict. (the root key file is too old here)
- 09:18 hashar: gerrit: forcing replication: ssh -p 29418 hashar@gerrit.wikimedia.org replication start operations/software/druid_exporter # T181219
- 09:14 hashar: github: created wikimedia/operations-debs-contenttranslation-apertium-crh-tur and wikimedia/operations-debs-prometheus-openldap-exporter
- 09:08 hashar: github: created repo operations-software-druid_exporter | T181219
- 03:56 legoktm: deleted all workspaces on integration-slave-jessie-1003 /srv ran out of space
- 03:23 Krinkle: Jenkins jobs for mediawiki-core-php55lint consistently failing on integration-slave-jessie ("git: stderr: error: failed to write..")
- 00:02 halfak: deploy-prep awight enabled ORES service
- 00:01 halfak: deploy-prep awight disabled ORES service
2017-11-28
- 17:42 awight: Remove stale ORES customizations for the beta cluster.
- 17:31 awight: Remove beta cluster customizations for ORES
2017-11-27
- 19:06 awight: Update beta ORES to latest, e58bfbf
- 08:38 hashar: reactivating https://phabricator.wikimedia.org/source/iegreview/ , it is still developed https://phabricator.wikimedia.org/D894
2017-11-24
- 08:16 hashar: pooling integration-slave-docker-1003 again | T179378
- 08:14 hashar: nodepool: Image snapshot-ci-jessie-1511510623 in wmflabs-eqiad is ready
- 08:13 hashar: upgrading blubber on contint2001
- 08:03 hashar: nodepool: manually rebuilding snapshot-ci-jessie
2017-11-23
- 19:34 hashar: migrating pywikibot/core jobs to Docker https://gerrit.wikimedia.org/r/#/c/393091/
- 19:34 hasharAway: migrating pywikibot/core jobs to Docker https://gerrit.wikimedia.org/r/#/c/393091/
2017-11-22
- 18:55 greg-g: beta update jobs are back
- 18:48 greg-g: hung beta updates, doing the monthly dance: https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code/db_update
- 16:58 halfak: deploying ores-prod-deploy:5084251 T181168
2017-11-21
- 23:29 mdholloway: deployed mobileapps@52d6a83 on the beta cluster
- 21:28 TabbyCat: deployment-prep Ran cleanupSpam.php on deploymentwiki.
- 18:38 mdholloway: deployed mobileapps@9d1602d on the beta cluster
- 17:06 hasharAway: docker push wmfreleng/tox-cergen:v2017.11.21.16.52 | https://gerrit.wikimedia.org/r/392678 | For https://integration.wikimedia.org/ci/job/cergen-tox-docker/ which passes!
- 13:24 hashar: gerrit: adding Jdrewniak to wmf-deployment group https://gerrit.wikimedia.org/r/#/admin/groups/21,members | T180639
- 13:08 hashar: gerrit: created wikimedia/portals/deploy https://gerrit.wikimedia.org/r/#/admin/projects/wikimedia/portals/deploy for jan_drewniak | T180777
- 13:02 hashar: docker push wmfreleng/ci-src-setup:v2017.11.21.12.57 && docker push wmfreleng/ci-src-setup:latest | https://gerrit.wikimedia.org/r/392632 | T177684
- 02:42 Krinkle: Adding relative time to Deployments calendar (Common.js), e.g. "4 hours from now" or "soon"
2017-11-20
- 15:57 hashar: gerrit: deleted operations/network-diagrams mostly empty and no changes. Created back in 2012.
- 15:03 hashar: integration: pass all environment variables to the docker run commands | https://gerrit.wikimedia.org/r/#/c/390432/ | T177684
- 10:06 hashar: nodepool: manually deleted left over instances ci-jessie-wikimedia-894187 and ci-jessie-wikimedia-894188 . Jenkins fails to ssh to it and they were left ready for 72 hours.
- 10:05 hashar: deployment-phab : set hiera 'phabricator_cluster_search: []' trying to unblock puppet and soft rebooted the instance | T180935
- 09:39 hashar: deployment-prep added missing key between_bytes_timeout to cache::app_def_be_opts for deployment-cache-text04 and deployment-cache-upload04 | T180935
- 09:29 hashar: deployment-tin: apt-mark hold scap | the apt-repo on deployment-tin is out of date | T180935
2017-11-16
- 23:16 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/391902
- 00:10 thcipriani: removed old differential-docker-test images on integration-slave-docker-1001
2017-11-15
- 21:05 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/391079/6
- 17:29 thcipriani: updating docker-pkg dockerfiles on contint1001 for https://gerrit.wikimedia.org/r/#/c/388448/
- 09:55 addshore: created wmf/1.31.0-wmf.8 branch of Wikidata extension repo T180539
- 07:30 Krinkle: Aborting jobs in 'test' pipeline for backport REL commits that are already merged meanwhile in 'submit' pipeline
2017-11-13
- 23:14 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/391136/2
2017-11-10
- 22:17 hashar: docker push wmfreleng/npm:v2017.11.10.22.15 && docker push wmfreleng/npm:latest && docker push wmfreleng/npm-test:v2017.11.10.22.15 && docker push wmfreleng/npm-test:latest | BABEL_CACHE_PATH | T179425
- 21:39 hashar: docker push wmfreleng/tox-pyspark:v2017.11.10.21.37 && docker push wmfreleng/tox-pyspark:latest | https://gerrit.wikimedia.org/r/389937 docker: handle signals in tox entrypoint | T176747
- 21:36 hasharAway: docker push wmfreleng/tox:v2017.11.10.21.35 && docker push wmfreleng/tox:latest | https://gerrit.wikimedia.org/r/389937 docker: handle signals in tox entrypoint | T176747
- 16:18 hashar: docker push wmfreleng/tox:v2017.11.10.16.17 && docker push wmfreleng/tox:latest for https://gerrit.wikimedia.org/r/#/c/388084/
- 12:39 hashar: Updated Jenkins tox jobs | https://gerrit.wikimedia.org/r/#/c/389924/ | Does not quite fix T176747 yet though
- 11:56 hashar: docker push wmfreleng/tox:v2017.11.10.11.49 && docker push wmfreleng/tox:latest && docker push wmfreleng/tox-pyspark:v2017.11.10.11.49 && docker push wmfreleng/tox-pyspark:latest | https://gerrit.wikimedia.org/r/#/c/389924/
2017-11-09
- 21:45 legoktm: deployed https://gerrit.wikimedia.org/r/390325
- 16:54 zeljkof: Reloading Zuul to deploy baceaeb
- 15:20 hashar: docker push wmfreleng/npm-test:v2017.11.09.15.15 | https://gerrit.wikimedia.org/r/390261 | T176747
- 12:52 hashar: Jenkins: fixed/changed the global git user.name / user.email (now: "Wikimedia CI" and "releng@lists.wikimedia.org" )
- 10:59 hashar: Drop jsduck doc/ri from npm/npm-test images | https://gerrit.wikimedia.org/r/#/c/389943 | docker push wmfreleng/npm:v2017.11.09.10.57 && docker push wmfreleng/npm-test:v2017.11.09.10.57
- 07:53 hasharAway: Killed stuck containers wmfreleng/npm-test on integration-slave-docker-1001 - T176747
2017-11-08
- 13:43 Reedy: ran apt-get clean|autoclean on deployment-mediawiki04 to free up some space
2017-11-07
- 18:45 twentyafterfour: cowboy-committed and pushed rMSCAc1f2ac2 to hopefully unbreak `scap deploy` in beta
- 17:56 legoktm: integration-slave-jessie-1003 /srv full, legoktm@integration-slave-jessie-1003:/srv/jenkins-workspace/workspace$ sudo rm -rf mwgate-* mediawiki-*
- 17:27 hashar: Image snapshot-ci-jessie-1510074928 in wmflabs-eqiad is ready - T179772
- 17:15 hashar: Updating Nodepool snapshot to get php5.5-zip - T179772
- 16:15 hashar: Created portalsbuilder in Gerrit, generated a ssh key pair for it and stored in Jenkins credentials store - T179694
- 15:15 hashar: Created VPS account "PortalsBuilder" - T179694
2017-11-06
- 23:49 thcipriani: ssh-keyscan deployment-videoscaler01.deployment-prep.eqiad.wmflabs >> /etc/ssh/ssh_known_hosts
- 22:29 hashar: killed stuck npm Docker containers on integration-slave-docker-1002 (due to T176747 ). Pooled the instance back, the slowness it experienced is probably not related to labvirt CPU usage ( T179378 )
- 20:35 Amir1: deploy ores:93e8846 in beta cluster
- 16:02 thcipriani: Reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/388546/ and https://gerrit.wikimedia.org/r/#/c/389463/
2017-11-03
- 13:51 hashar: pooled integration-slave-docker-1004 and integration-slave-docker-1007
- 13:30 hashar: Unpool integration-slave-docker-1002 and integration-slave-docker-1003 . They are slow CPU wise, most probably due to the underlying labvirt being CPU starved. - T179378
- 12:38 hashar: T179593 generate doc for cumin@v1.2.2 : contint1001$ zuul enqueue-ref --trigger gerrit --pipeline publish --project operations/software/cumin --ref refs/tags/v1.2.2 --newrev f745387
- 11:20 hashar: generate doc for cumin@v1.2.2 : contint1001$ zuul enqueue-ref --trigger gerrit --pipeline publish --project operations/software/cumin --ref refs/tags/v1.2.2
- 11:17 addshore: zuul reload for zuul: add noop jobs for new analytics/wmde/WDCM-* repos [integration/config] - https://gerrit.wikimedia.org/r/388423
- 11:17 hashar: generate doc for cumin ( T179593 ) : contint1001$ zuul enqueue --trigger gerrit --pipeline postmerge --project operations/software/cumin --change 388261,2
- 02:04 legoktm: integration-slave-jessie-1004 deleted mwgate-php55lint (5.2GB) and mediawiki-core-php55lint (2.5GB) workspaces due to low disk space in /srv
2017-11-02
- 22:30 halfak: deploying ores-deploy 82a13ae
- 16:58 addshore: reloaded zuul to deploy https://gerrit.wikimedia.org/r/387960
- 13:02 hashar: gerrit: marked mediawiki/extensions/WikibaseJavaScriptApi.git read-only - T178226
- 12:17 hashar: gerrit: created wikibase/javascript-api inheriting from wikibase.git - T178226
- 07:05 legoktm: mwext-VisualEditor-publish got stuck for 15 hours, deleted a job in jenkins to kick it
2017-11-01
- 16:31 hashar: docker push wmfreleng/tox:v2017.11.01.16.28 | add libmysqlclient-dev | T179392
- 15:46 hashar: docker push wmfreleng/tox:v2017.11.01.15.29 | https://gerrit.wikimedia.org/r/#/c/387723/
- 00:12 legoktm: deployed https://gerrit.wikimedia.org/r/387650
2017-10-31
- 22:53 hashar: docker push wmfreleng/tox:v2017.10.31.22.51 ( tox 2.6.0 https://gerrit.wikimedia.org/r/#/c/387682/ )
- 22:14 hashar: docker push wmfreleng/tox:v2017.10.31.21.03 | for ebernhardson / https://gerrit.wikimedia.org/r/#/c/387682/
- 21:34 hashar: T144961 : sudo cumin --force 'name:docker' 'rm -fR /srv/jenkins-workspace/workspace/composer-*php70*'
- 21:32 hashar: T144961 : sudo cumin --force 'name:docker' 'rm -fR /srv/jenkins-workspace/workspace/composer-package-php70-docker/*'
- 18:06 legoktm: deployed https://gerrit.wikimedia.org/r/387624
- 16:52 hashar: Migrated some tox jobs to Docker via https://gerrit.wikimedia.org/r/387582
- 16:08 hashar: integration: sudo cumin --force 'name:docker' 'rm -fR /srv/jenkins-workspace/workspace/*tox-docker*'
- 02:24 legoktm: moved mwgate-npm jobs over to docker - https://lists.wikimedia.org/pipermail/wikitech-l/2017-October/089046.html
- 01:19 legoktm: deployed https://gerrit.wikimedia.org/r/387500
2017-10-30
- 10:56 hashar: deployment-logstash2 removed puppet class role::labs::lvm::mnt, replacing with role::labs::lvm::srv . /srv is already mounted. Unmounting /mnt and restarting elasticsearch - T178722
- 10:53 hashar: deployment-logstash2 removed puppet class role::labs::lvm::mnt, replacing with role::labs::lvm::srv . /srv is already mounted. Unmounting /mnt and restarting elasticsearch - T178722
- 10:52 hashar: deployment-logstash2 removed puppet class role::labs::lvm::mnt, replacing with role::labs::lvm::srv . /srv is already mounted. Unmounting /mnt and restarting elasticsearch
- 09:55 hashar: gerrit: deleted graphs/shared.git unused / empty repo
- 09:27 hashar: gerrit: deleted /nfsd.git (unused / no changes, created on October 4th 2016)
- 09:22 hashar: gerrit: prefix mediawiki/extensions/AutomaticBoardWelcome description with '[ARCHIVED] ' - T179196
- 09:21 hashar: gerrit: prefix mediawiki/extensions/AWS description with '[ARCHIVED] ' - T174864
2017-10-28
- 09:12 Krenair: fixed puppet on deployment-kafka01 by installing ldap-utils
2017-10-27
- 13:11 godog: provision deployment-redis{03,04} with stretch - T148637
- 13:06 hashar: zuul enqueue --trigger gerrit --pipeline postmerge --project wikidata/query/rdf --change 383791,15
2017-10-26
- 16:27 greg-g: fixed it. Had to offline/reonline deployment-tin repeatedly to get through the mediawiki-config-update post-merge backlog one by one. Now jobs are running on deployment-tin again
- 15:42 greg-g: tried the gearman disable/enable, got one beta-scap run, but that's it... in a 1:1 now
- 15:30 greg-g: Jenkins slave agent won't start: https://phabricator.wikimedia.org/P6195
- 15:27 greg-g: doing the dance: https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code.2Fdb_update
2017-10-24
- 17:59 madhuvishy: Ran `sudo cumin -b 5 --backend openstack "project:deployment-prep" "apt-get install git --yes"`
- 11:19 elukey: removed several roles mistakenly applied to puppet prefix deployment-aqs in Horizon (causing puppet failures for AQS nodes)
- 08:35 hashar: beta: cherry pick https://gerrit.wikimedia.org/r/#/c/386077/4 "hieradata for varnish caches" - T178841
2017-10-23
- 20:29 Krinkle: Puppet still failing, now with: "Error 400 on SERVER: Could not find data item cache::fe_transient_gb in any Hiera data file and no default supplied at /etc/puppet/modules/profile/manifests/cache/text.pp:12 on node deployment-cache-text04.deployment-prep.eqiad.wmflabs"
- 20:29 Krinkle: Previous edit failed. Horizon saved the field as blank. Presumably because the class is unknown in the current version of puppet manifests it has. Strange that it normalises in this way.
- 20:28 Krinkle: Edit horizon "Other classes" config for deployment-prep/deployment-cache-text04. Rename role::prometheus::varnish_exporter to profile::prometheus::varnish_exporter
- 20:13 Krinkle: Puppet run still failing on Beta cluster varnish: "Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::prometheus::varnish_exporter"
- 09:29 hashar: fab docker_pull_image:wmfreleng/tox
- 09:26 hashar: docker push wmfreleng/tox:v2017.10.23.09.05 && docker push wmfreleng/tox:latest - https://gerrit.wikimedia.org/r/385950
2017-10-20
- 10:00 elukey: cherry pick https://gerrit.wikimedia.org/r/#/c/385339 to the operations/puppet git repo on puppetmaster02
- 03:34 Krinkle: Beta Cluster varnish (text04) has not had a Puppet run for over 10 days (15165 minutes ago). Error: " puppet-agent: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class role::prometheus::varnish_exporter for deployment-cache-text04 .. Not using cache on failed catalog .. Could not retrieve catalog; skipping run"
2017-10-19
- 11:21 zeljkof: Reloading Zuul to deploy 26f4ff5
2017-10-18
- 18:32 greg-g: MaxSem ran `foreachwiki extensions/LoginNotify/maintenance/migratePreferences.php` on deployment-prep
- 09:14 dcausse: deployment-prep: upgrading elasticsearch to 5.5.2
- 08:41 hashar: deployment-mediawiki07: install --owner=nutcracker -d /var/run/nutcracker && systemctl start nutcracker # T178457
- 08:38 hashar: deployment-videoscaler01: install --owner=nutcracker -d /var/run/nutcracker && systemctl start nutcracker # T178457
2017-10-17
- 22:08 addshore: replaced integration-slave-docker-c2-m4-d40-1005 with integration-slave-docker-1005 T178409
- 21:48 addshore: added slave integration-slave-docker-1006 (1x 4GB ram executor)
- 21:47 addshore: delete wmfreleng/mediawiki-extensions-phan from docker hub
- 14:05 addshore: deleted slave integration-slave-docker-1004
- 13:35 addshore: swapped integration-slave-docker-1004 for integration-slave-docker-c2-m4-d40-1004 (So we have more 4GB executors)
- 09:45 addshore: reload zuul for https://gerrit.wikimedia.org/r/384673
- 08:55 addshore: delete unused mwext-php70-phan-jessie-docker 'project' in jenkins UI
- 08:54 addshore: reload zuul for https://gerrit.wikimedia.org/r/384614
2017-10-16
- 20:48 halfak: deploying ores fb55ab8 T175180 (fixes eswiki)
- 20:36 halfak: deploying ores 42c5663 T175180 (rolling back)
- 20:11 halfak: deploying ores 0f3fe9f T175180 (second attempt)
- 20:10 no_justification: deployment-prep Both repos date from July
- 20:10 no_justification: deployment-prep Dropped 2 deploy-cache entries for ORES from deployment-sca03
- 19:57 halfak: deploying ores 0f3fe9f T175180
- 17:29 addshore: reloaded zuul for https://gerrit.wikimedia.org/r/384565
- 17:09 addshore: addshore@integration-slave-docker-c2-m4-d40-1005:/srv/git/mediawiki$ sudo git clone --bare https://gerrit.wikimedia.org/r/p/mediawiki/core.git
2017-10-14
- 00:48 MaxSem: reverting
- 00:47 MaxSem: Trying PHP7 mode on deployment-prep with https://wikitech.wikimedia.org/w/index.php?diff=1772791 (ping T173786)
2017-10-13
- 16:34 Amir1: ladsgroup@deployment-tin:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildPropertyInfo.php --wiki=wikidatawiki (T177857)
- 13:41 zeljkof: Reloading Zuul to deploy b5b1dc2
- 10:43 zeljkof: Reloading Zuul to deploy 320f065
2017-10-11
- 19:59 hashar: deployment-prep: deploying jobrunner to catch up with changes.
- 18:19 hashar: beta: rebased puppet master due to a conflict with b3c6968b3c
- 15:32 _joe_: removing deployment-pdf01, T177931
- 08:33 hashar: Image snapshot-ci-jessie-1507710117 in wmflabs-eqiad is ready
- 08:22 hashar: nodepool: refreshing Jessie snapshot after some puppet patches got merged
2017-10-10
- 17:51 Amir1: add "Ladsgroup" to oversight members in enwiki in beta cluster to test T177705
- 16:29 Amir1: adding "Ladsgroup" to admins in wikidatawiki in beta cluster
2017-10-09
- 13:26 hashar: Upgraded Jenkins to 2.73.1 earlier today
- 08:53 hashar: hard restart integration-slave-docker-1001 via horizon. It is deadlocked somehow. - T177749
2017-10-06
- 13:22 hashar: Jenkins: adding Maven-3.0.5 to the tool configuration https://integration.wikimedia.org/ci/configureTools/
- 11:58 hashar: Jenkins: installed Warnings plugin
- 11:54 hashar: Jenkins: removing the Violations plugin. It is not used.
- 09:22 hashar: integration: purged bunch of old containers: sudo cumin 'name:slave-docker' 'yes | docker container prune'
2017-10-05
- 19:15 hasharAway: rebooting integration-slave-docker-1002 to catch up with the kernel upgrade and pooling it back in Jenkins - T177039
- 19:11 hasharAway: rebooting integration-slave-jessie-1002 to catch up with the kernel upgrade and pooling it back in Jenkins - T177039
- 13:16 hashar: Image snapshot-ci-jessie-1507208677 in wmflabs-eqiad is ready
- 11:47 hashar: Refreshing Nodepool Jessie snapshot to get java 8 by default - T162828
- 10:56 hashar: integration: unbreak the puppet master. Was stuck due to a cherry pick that needed a rebase
- 05:56 legoktm: deploying https://gerrit.wikimedia.org/r/382361
- 04:16 legoktm: deploying https://gerrit.wikimedia.org/r/382354
2017-10-04
- 13:19 andrewbogott: migrating 'deployment-kafka-jumbo-1' to labvirt1017
2017-10-03
- 22:38 thcipriani: git stash /srv/mediawiki-staging/php-master/extensions/Echo to fix beta-code-update-eqiad
- 14:26 hashar: Created https://github.com/wikimedia/analytics-wikistats2 - T177288
- 14:23 hashar: Gerrit: created analytics/wikistats2.git for fdans - T177288
- 07:15 legoktm: deploying https://gerrit.wikimedia.org/r/381937
- 05:33 legoktm: deploying https://gerrit.wikimedia.org/r/381812 https://gerrit.wikimedia.org/r/381378
2017-10-02
- 15:52 zeljkof: Reloading Zuul to deploy 895f33d
- 14:57 addshore: docker push npm image from https://gerrit.wikimedia.org/r/#/c/381384/2
- 14:47 zeljkof: Reloading Zuul to deploy 9ac7821
- 14:22 zeljkof: Reloading Zuul to deploy 00890a5
- 12:32 addshore: docker push composer, mediawiki-phan & mediawiki-phpcs latest tags built from https://gerrit.wikimedia.org/r/381392
- 12:23 addshore: docker push ci-jessie, php & php-mediawiki latest tags built from https://gerrit.wikimedia.org/r/381392
- 12:16 addshore: marking integration-slave-docker-1002 as offline T177039
- 11:08 zeljkof: Reloading Zuul to deploy c1d7b5f
2017-09-30
- 14:28 zeljkof: Reloading Zuul to deploy c08a3ad
2017-09-29
- 19:45 hashar: Deleting integration-slave-jessie-php55
- 17:34 zeljkof: Reloading Zuul to deploy 0e26c86
- 16:42 zeljkof: Reloading Zuul to deploy 09445b8
- 15:10 zeljkof: Reloading Zuul to deploy 7f66813
- 14:15 tabbycat: maurelio@deployment-tin:~$ mwscript cleanupSpam.php --wiki=deploymentwiki *.logininput.org ( testing w/o delete T176206 / 7f842058602c )
- 14:10 tabbycat: maurelio@deployment-tin:~$ mwscript cleanupSpam.php --wiki=deploymentwiki *.loginpartner.org --delete ( testing T176206 / 7f842058602c )
- 13:00 hashar: github: created https://github.com/wikimedia/integration-quibble for gerrit replication
- 12:53 hashar: gerrit: marked labs/tools/grrrit archived
- 09:53 addshore: addshore@integration-slave-docker-1001:~$ sudo docker ps --filter "status=exited" | grep 'weeks ago' | awk '{print $1}' | xargs --no-run-if-empty sudo docker rm
- 09:53 addshore: addshore@integration-slave-docker-1001:~$ sudo docker ps --filter "status=exited" | grep 'months ago' | awk '{print $1}' | xargs --no-run-if-empty sudo docker rm
- 09:40 addshore: marking integration-slave-docker-1001 as online - T177039
- 09:33 addshore: rebooting integration-slave-docker-1001
- 09:10 addshore: wm-ci-docker-push mediawiki-phpcs:v2017.09.29.09.08 & latest https://gerrit.wikimedia.org/r/381413
- 05:59 legoktm: marking integration-slave-docker-1001 as offline - T177039
- 00:19 mutante: releases1001 - created user for "no_justification", dropped pass in home dir
- 00:12 mutante: jenkins now configured and running at https://releases.wikimedia.org/ci/ (T164030) - but needs additional admin users and puppet is still disabled for temp hack fix
2017-09-28
- 16:58 addshore: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/381242
- 15:37 addshore: docker push docker.io/wmfreleng/mediawiki-phpcs:v2017.09.28.15.28
- 11:35 hashar: docker push wmfreleng/tox:v2017.09.28.11.34 Adds XDG_CACHE_HOME=/cache https://gerrit.wikimedia.org/r/#/c/380961/
- 11:34 hashar: docker push wmfreleng/ci-jessie:v2017.09.28.11.33 . Adds XDG_CACHE_HOME=/cache https://gerrit.wikimedia.org/r/#/c/380961/
- 09:20 moritzm: upgraded mediawiki04-mediawiki06 in deployment-prep to HHVM 3.18.5
- 08:39 hashar: Deleted integration-saltmaster and deployment-salt02 . Replaced by integration-cumin and deployment-cumin - T176314
- 08:32 hashar: Migrated Hiera config from https://wikitech.wikimedia.org/wiki/Hiera:Integration to Horizon
- 08:31 hashar: Removing salt configuration from integration and deployment-prep projects. Replaced by cumin. - T176314
2017-09-27
- 21:25 hashar: salt is being replaced by cumin; the instances are deployment-cumin and integration-cumin . Check this out: https://wikitech.wikimedia.org/wiki/Cumin !
- 20:12 hashar: Deleted aptly.integration.eqiad.wmflabs and the https://integration-aptly.wmflabs.org/repo/ webproxy. They were for php5.5 packages on jessie, now available on apt.wm.o - T174972
- 19:39 hashar: deployment-prep: purging "ferm" on hosts that no longer have it applied via puppet. There were some old iptables rules left around blocking access
- 16:38 addshore: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/381024
- 12:38 addshore: Reloading Zuul to deploy - Add lintr-docker-non-voting [integration/config] - https://gerrit.wikimedia.org/r/380746
- 12:27 hasharAway: docker push wmfreleng/tox:v2017.09.27.12.26 https://gerrit.wikimedia.org/r/#/c/380926/
- 11:01 addshore: docker push docker.io/wmfreleng/mediawiki-phan:v2017.09.27.10.53 & latest (From https://gerrit.wikimedia.org/r/#/c/380940)
- 11:01 addshore: docker push docker.io/wmfreleng/php-mediawiki:v2017.09.27.10.51 & latest (From https://gerrit.wikimedia.org/r/#/c/380940)
- 11:01 addshore: docker push docker.io/wmfreleng/composer:v2017.09.27.10.49 & latest (From https://gerrit.wikimedia.org/r/#/c/380940)
- 11:01 addshore: docker push docker.io/wmfreleng/lintr:v2017.09.27.10.45 & latest (From https://gerrit.wikimedia.org/r/#/c/380940)
- 11:01 addshore: docker push docker.io/wmfreleng/php:v2017.09.27.10.21 & latest (From https://gerrit.wikimedia.org/r/#/c/380940)
- 11:01 addshore: docker push docker.io/wmfreleng/tox:v2017.09.27.10.21 & latest (From https://gerrit.wikimedia.org/r/#/c/380940)
- 10:08 addshore: docker push docker.io/wmfreleng/ci-jessie:v2017.09.27.09.59 & latest (from https://gerrit.wikimedia.org/r/#/c/378033/5)
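The image tags in the entries above appear to follow a `vYYYY.MM.DD.HH.MM` pattern. A minimal sketch of building such a tag, assuming that pattern holds; the helper name `image_tag` is hypothetical and the log does not say how the real tooling generates tags:

```python
from datetime import datetime


def image_tag(name, when):
    """Format an image reference as <name>:vYYYY.MM.DD.HH.MM (inferred pattern)."""
    return f"{name}:v{when.strftime('%Y.%m.%d.%H.%M')}"


# Reproduces the tag seen in the 12:27 entry, given that build time.
print(image_tag("wmfreleng/tox", datetime(2017, 9, 27, 12, 26)))
```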
2017-09-26
- 16:15 addshore: docker push docker.io/wmfreleng/zuul-cloner:v2017.09.26.16.09 & latest (from PS11 of https://gerrit.wikimedia.org/r/379479)
- 16:01 addshore: docker push docker.io/wmfreleng/mediawiki-phpcs:v2017.09.26.15.45 & latest (From PS10 of https://gerrit.wikimedia.org/r/379479)
- 13:58 addshore: added hashar to https://hub.docker.com/u/wmfreleng
- 13:07 moritzm: adding deployment-videoscaler01 to deployment-prep (stretch-based video scaler)
- 12:45 addshore: Reloading Zuul to deploy - Low prio queue for libraryupdater [integration/config] - https://gerrit.wikimedia.org/r/380307
- 12:28 addshore: fab docker_pull_image:wmfreleng/lintr:v2017.09.26.12.04
- 12:26 addshore: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/379818
- 12:06 addshore: docker push docker.io/wmfreleng/ & latest (PS6 of https://gerrit.wikimedia.org/r/378831 )
2017-09-25
- 17:07 mutante: Greg is now a contint-admin
- 12:36 addshore: addshore@integration-saltmaster:~$ sudo salt -v '*slave-docker*' cmd.run 'sudo docker rmi wmfreleng/operations-puppet:0.0.1 wmfreleng/operations-puppet:0.1.0'
- 12:30 addshore: Reloading Zuul to deploy Refactor 'operations-puppet-tests-docker' into macros for easy reuse [integration/config] - https://gerrit.wikimedia.org/r/379959
- 09:12 moritzm: added deployment-mediawiki07 to deployment-prep (stretch-based app server, WIP)
2017-09-24
- 10:49 addshore: addshore@integration-saltmaster:~$ sudo salt -v '*slave-docker*' cmd.run "sudo docker images --no-trunc --format '{{.ID}} {{.CreatedSince}}' | grep ' months' | awk '{ print $1 }' | xargs --no-run-if-empty docker rmi"
- 10:37 addshore: docker push docker.io/wmfreleng/lintr:v2017.09.24.10.33 & latest (https://gerrit.wikimedia.org/r/#/c/378831/2 (actually this time))
- 09:58 addshore: docker push docker.io/wmfreleng/lintr:v2017.09.22.18.51 & latest (https://gerrit.wikimedia.org/r/#/c/378831/2)
2017-09-22
- 21:55 tabbycat: Granted Greg G. 'staff' global rights on the beta cluster per request
- 20:37 hashar: Image snapshot-ci-jessie-1506112074 in wmflabs-eqiad is ready
- 20:28 hashar: updating nodepool image for jessie [2/x]
- 20:03 hasharAway: Updating nodepool image for jessie
- 17:22 addshore: docker push docker.io/wmfreleng/tox:v2017.09.22.17.16 & latest # (From current master)
- 15:24 hashar: Restarted Jenkins (out of memory)
- 10:06 hashar: deployment-salt02 migrated hiera config from wikitech to horizon. Removed the class role::deployment::salt_masters
- 08:44 hashar: Upgraded docker on integration-slave-docker-1001 and integration-slave-docker-1002 - T176267
- 07:13 greg-g: some jsduck jobs are running now, serially, for the backlogged queue. Unsure of starved jobs (integration-config-qa, pywikibot-beta-cluster, etc)
- 07:04 greg-g: deleting stuck mediawiki-core-jsduck-publish jobs in Jenkins UI
- 06:57 greg-g: pinged an opsen, hopefully they'll restart zuul shortly
- 06:45 greg-g: Zuul is stuck, no jobs are processing
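The numeric suffix on Nodepool snapshot names such as `snapshot-ci-jessie-1506112074` looks like a Unix timestamp. That is an assumption and the log does not state it. A small sketch for decoding it under that assumption, with the hypothetical helper name `snapshot_time`:

```python
from datetime import datetime, timezone


def snapshot_time(image_name):
    """Interpret the trailing number of a snapshot name as Unix epoch seconds (UTC)."""
    suffix = image_name.rsplit("-", 1)[-1]
    return datetime.fromtimestamp(int(suffix), tz=timezone.utc)


print(snapshot_time("snapshot-ci-jessie-1506112074").isoformat())
```

If the assumption holds, the decoded time should fall close to the logged image update that produced the snapshot.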
2017-09-21
- 10:23 elukey: removed 6fdf6ee653 from deployment-prep's puppet master cherry picks (seemed an old version of https://gerrit.wikimedia.org/r/#/c/357985)
2017-09-20
- 15:46 addshore: reloading zuul for https://gerrit.wikimedia.org/r/#/c/379250/
- 13:59 addshore: docker push docker.io/wmfreleng/mediawiki-phan:v2017.09.20.13.49 & latest # built from master
- 13:59 addshore: docker push docker.io/wmfreleng/composer:v2017.09.20.13.44 & latest # built from master
- 13:59 addshore: docker push docker.io/wmfreleng/zuul-cloner:v2017.09.20.13.44 & latest # built from master
- 13:59 addshore: docker push docker.io/wmfreleng/php-mediawiki:v2017.09.20.13.43 & latest # built from master
- 13:59 addshore: docker push docker.io/wmfreleng/php:v2017.09.20.13.40 & latest # built from master
- 13:07 tabbycat: deployment-prep Ran cleanupSpam.php on deploymentwiki. Further testing with regards to ongoing development and updating of the script.
- 11:53 addshore: Reloading Zuul (Testing)
2017-09-19
- 17:26 legoktm: removed rights from User:Sau226 on beta cluster due to block of account used for browser tests
- 09:13 tabbycat: Re-ran the previous script and it worked this time, see https://deployment.wikimedia.beta.wmflabs.org/wiki/Template_talk:Rotate/en
- 09:11 tabbycat: Ran mwscript cleanupSpam.php on the beta cluster, but it didn't work (looks like it is not fetching the domains properly)
2017-09-18
- 20:12 addshore: deleted unused images that were *months old* on docker slaves
- 19:01 addshore: addshore@contint1001:~$ sudo service zuul reload
- 18:45 thcipriani: reloading zuul to test https://gerrit.wikimedia.org/r/#/c/378665/2
- 16:53 elukey: removed https://gerrit.wikimedia.org/r/#/c/377753/ from the git cherry-picks in operations/puppet on puppetmaster02
2017-09-17
- 18:59 addshore: Reloading Zuul to deploy archiving of 2 extensions
2017-09-14
- 19:37 tgr: updated PrivateSettings.php for T175868
- 10:38 elukey: cherry-pick https://gerrit.wikimedia.org/r/#/c/377753/7 on deployment-prep's puppetmaster02 to test it on the new kafka jumbo instances
- 10:35 hashar: CI puppet master: added class geoip::data::package and parameters: puppetmaster::geoip::fetch_private: false puppetmaster::geoip::use_proxy: false - T175864
2017-09-13
- 10:13 addshore: docker push docker.io/wmfreleng/operations-puppet:v2017.09.13.09.23 (#d693f74c9b3404220a2ad2934f526d4f4455914b)
- 09:25 hashar: Deleting integration-slave-trusty-1003 and integration-slave-trusty-1001 - T175696
- 09:14 hashar: nodepool: openstack image delete image-ci-trusty - T175696
- 07:49 hashar: Jenkins: removing the Ubuntu JDK from https://integration.wikimedia.org/ci/configureTools/
- 07:40 hashar: jenkins: on nodes, removing the labels phpflavor-* they are no more needed - T161882
2017-09-12
- 20:35 hashar: pooling integration-slave-jessie-1003 and integration-slave-jessie-1004
- 19:40 hashar: hacked integration-slave-jessie hosts to ship them php5.5
- 18:49 hasharAway: nodepool: deleted image image-ci-trusty_old_20170804 Keeping image-ci-trusty just in case
- 14:57 hashar: Deleted all left over jenkins jobs having ci-trusty-wikimedia label. - T161882
- 14:46 hashar: provisioning integration-slave-jessie-1003 and integration-slave-jessie-1004 to move php55lint to them. NOT READY YET - T161882
- 14:05 hashar: Deleting integration-slave-trusty-1004 - T161882
- 13:09 hashar: nodepool: deleting alien instance: openstack server delete ci-jessie-wikimedia-815477
- 11:09 hashar: Image snapshot-ci-jessie-1505213295 in wmflabs-eqiad is ready
- 10:48 hashar: nodepool: force updating jessie image to grab php5.5-luasandbox - T161882 T174972
2017-09-11
- 23:27 thcipriani: restarting jenkins
- 22:38 legoktm: deploying https://gerrit.wikimedia.org/r/377361
- 12:47 hashar: Nodepool: refreshing jessie snapshot to get php5.5-luasandbox installed
2017-09-10
- 01:44 bd808: nodepool running steadily again, but has been heavily throttled to hopefully prevent another weekend thundering herd of doom failure for the OpenStack backend
2017-09-09
- 22:15 bd808: `sudo journalctl -u nodepool --since today --no-pager` shows many LaunchStatusException failures.
2017-09-07
- 13:02 hashar: nodepool: Image snapshot-ci-jessie-1504788047 in wmflabs-eqiad is ready | T174972
- 11:58 hashar: nodepool: updating snapshot-ci-jessie to add php5.5-redis | T161882 T174972
- 11:10 addshore: Reloading Zuul to deploy "Add gate-submit jobs for analytics/wmde/* repos"
- 02:44 legoktm: deploying https://gerrit.wikimedia.org/r/376460
2017-09-06
- 21:32 bearND: Update mobileapps to 2cb6281 (T168848 T169277 T169274 T162179 T164033 T167921 T174698 T168848 T174808)
2017-09-05
- 23:03 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/376034
- 19:34 gilles: deployed PrivateSettings.php change to add Thumbor username to Swift configuration
2017-09-04
- 15:59 zeljkof: Reloading Zuul to deploy ca1c6ec
- 12:21 hashar: Image snapshot-ci-jessie-1504527142 in wmflabs-eqiad is ready
- 11:37 hashar: nodepool: refreshing jessie snapshot
- 10:03 addshore: Reloading Zuul to deploy mwext-php70-phan-jessie-docker experimental job
- 00:42 legoktm: legoktm@contint1001:/srv/zuul/git/mediawiki/libs$ sudo -u zuul rm -rf XMPReader
2017-09-02
- 08:32 legoktm: rm -rf /var/logs/kafka on deployment-kafka01 to free up disk space
2017-08-31
- 23:32 Krenair: fixed deployment-imagescaler0[12] puppet by installing a package and file manually, some puppetisation still needed - https://phabricator.wikimedia.org/T174746
- 23:04 Krenair: that also did deployment-cache-(upload|text)04
- 22:50 Krenair: fixed deployment-ms-be0[34] puppet by removing cherry-pick of https://gerrit.wikimedia.org/r/#/c/371582/1 - details in a comment there
- 15:49 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/374970/
- 14:52 zeljkof: Reloading Zuul to deploy dd50483
- 10:06 zeljkof: Reloading Zuul to deploy e91b30f
- 06:25 legoktm: deploying https://gerrit.wikimedia.org/r/374937
2017-08-30
- 12:49 hashar: gerrit: marked wikimedia/communications/WMBlog as read-only - T172372
2017-08-29
- 15:39 hashar: Created integration-slave-jessie-php55 to try out a php5.5 package on Jessie - T161882
- 15:06 hashar: nodepool: deleting alien instance: openstack server delete ci-jessie-wikimedia-793795
- 08:45 hashar: Restarting Jenkins for openjdk update
- 08:11 hashar: refreshing all Jenkins jobs with a newer version of JJB
2017-08-28
- 14:54 hashar: integration: rebase integration puppet master. Got a conflict due to r -> r_lang renaming ( https://gerrit.wikimedia.org/r/#/c/363337/ )
- 08:52 hashar: gerrit: added ldap/ciadmin to the 'integration' group. T169557 T173233
2017-08-25
- 15:11 zeljkof: Reloading Zuul to deploy b6704e2
2017-08-24
- 21:55 mdholloway: disk was full on integration-slave-jessie-android; deleted ~8gb of old screenshots from /tmp to clear some space
- 16:54 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/372780/7
- 15:37 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/371138/
- 11:56 zeljkof: Reloading Zuul to deploy c20a740
2017-08-22
- 06:49 legoktm: deploying https://gerrit.wikimedia.org/r/370695
- 06:44 legoktm: deploying https://gerrit.wikimedia.org/r/372229
2017-08-21
- 18:18 mutante: addshore is now a contint-admin
2017-08-18
- 20:21 legoktm: deploying https://gerrit.wikimedia.org/r/372222
2017-08-16
- 23:46 legoktm: reloading Zuul to deploy https://gerrit.wikimedia.org/r/372208 https://gerrit.wikimedia.org/r/371653 https://gerrit.wikimedia.org/r/371757 https://gerrit.wikimedia.org/r/371640
2017-08-15
2017-08-14
- 09:46 TabbyCat: maurelio@deployment-tin:/srv/mediawiki/dblists$ expanddblist flow-computed > /home/maurelio/flow-test.dblist (to test expanddblist for a patch I am working on)
2017-08-11
- 20:25 addshore: added mediawiki::maintenance::wikidata to deployment-tin
2017-08-07
- 15:11 thcipriani: restarting jenkins for plugin update
2017-08-06
- 13:28 TabbyCat: Ran mwscript extensions/WikimediaMaintenance/dumpInterwiki.php deploymentwiki on the beta cluster
2017-08-04
- 15:21 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/370222/2
2017-08-03
- 12:02 hashar: Added integration-slave-docker-1004 to the pool of jenkins slaves - T150502
- 10:12 hashar: gerrit: marked wikimedia/communications/WP-Victor read-only and [ARCHIVED] - T107430
- 04:50 SMalyshev: update cherry-pick for https://gerrit.wikimedia.org/r/#/c/299825/8 on deployment-puppetmaster02.deployment-prep.eqiad.wmflabs
2017-08-02
- 22:08 MaxSem: Running rebuildall.php on beta ruwiki
- 20:17 bearND: Update mobileapps to 2d8e8f6
- 11:31 hashar: Image snapshot-ci-jessie-1501673225 in wmflabs-eqiad is ready T169602
- 10:51 hashar: Image snapshot-ci-jessie-1501670727 in wmflabs-eqiad is ready - T169602
- 09:02 hashar: Regenerating Nodepool Jessie image from scratch to get rid of tox 1.9.2 installed under /usr/local - T169602
- 08:44 hashar: Image snapshot-ci-jessie-1501662758 in wmflabs-eqiad is ready - T169602
- 08:42 hashar: - T169602
- 08:32 hashar: Regenerating Nodepool jessie image to upgrade tox from 1.9.2 to 2.5.0 - T169602
2017-08-01
- 15:45 hashar: Image snapshot-ci-jessie-1501601670 in wmflabs-eqiad is ready && purging old instances T161861
- 15:44 hashar: Debug: Executing '/usr/bin/npm install -g npm@3.8.3' - T161861
- 15:34 hashar: Refreshing nodepool Jessie image to bump npm from 2.x to 3.8.x T161861
- 10:12 hashar: Stopped Zuul / CI for mass mediawiki extension changes
2017-07-28
- 21:11 MaxSem: Dropped table wikigrok_questions from beta enwiki
- 12:19 zeljkof: Reloading Zuul to deploy 47a07e0
- 00:17 Krinkle: Testing job insertion on beta cluster from deployment-tin triggers PHP Notice: Undefined index: uuid in EventBus/JobQueueEventBus.php:102, PHP Notice: Undefined index: sha1 in EventBus/JobQueueEventBus.php:99
2017-07-26
- 21:35 Reedy: kill two long running update.php jobs on deployment-tin
- 13:39 zeljkof: Reloading Zuul to deploy 8787b4b
- 12:04 zeljkof: Reloading Zuul to deploy 79781d8
- 11:39 zeljkof: Reloading Zuul to deploy 723ab49
- 11:31 hashar: realign installed debian packages on integration-slave-jessie-1001 and integration-slave-jessie-1002 - T171724
- 09:25 hashar: deployment-tin deleting temporary l10n cache from July 19th 20:09 at /tmp/scap_l10n_3608512748 1.5G
- 09:24 hashar: deployment-cache-upload04 deployment-cache-text04 upgraded logster 0.0.10-1~jessie1 -> 0.0.10-2~jessie1 - T171318
2017-07-25
2017-07-24
- 21:56 bearND: Update mobileapps to b608ec8
- 15:03 hashar: Added webperformance Jenkins slave https://integration.wikimedia.org/ci/computer/webperformance/ with a single executor - T166756
- 14:57 hashar: recreating integration-webperf instance as simply "webperformance" Same 2CPU / 2GB RAM / 40G disk - T166756
- 14:40 hashar: Booting integration-webperf instance 2CPU / 2GB RAM / 40G disk. Intended to host webperformance long running jobs . T166756
- 11:02 hashar: Removing profile::swift::storage::labs class from deployment-ms-be03 and deployment-ms-be04 to let puppet run. Reapplying it after. - T171174 T171454
- 10:59 hashar: Removing class from deployment-trending01 to let puppet run. Reapplying it after. - T171174
- 10:54 hashar: Removing classes from deployment-sca02 and deployment-sca03 to let puppet run. Reapplying it after. - T171174
- 10:32 hashar: Removing profile::etcd from deployment-conf03 to let puppet run. Reapplying it after. - T171174
- 10:12 hashar: Removing role::mathoid from deployment-mathoid to let puppet run. Reapplying it after. - T171174
- 10:09 hashar: Removing role::changeprop from deployment-changeprop to let puppet run. Reapplying it after. - T171174
- 10:06 hashar: Removing role::ocg from deployment-mcs01 to let puppet run. Reapplying it after. - T171174
- 10:02 hashar: Removing role::mobileapps from deployment-mcs01 to let puppet run. Reapplying it after. - T171174
2017-07-21
- 14:55 hashar: Jenkins: upgraded Android Emulator plugin with https://gerrit.wikimedia.org/r/#/c/366253/ && https://gerrit.wikimedia.org/r/#/c/366484/ - T150623
- 14:12 hashar: added novaadmin to deployment-prep as a regular user. That lets MediaWiki OpenStack API list the instances T171280
- 13:56 hashar: Created github mirror repo https://github.com/wikimedia/wikibase-wikiba.se T171160
- 10:46 hashar: Gerrit: created wikibase/wikibase.se repo for Amir1 / T171160
2017-07-20
- 16:42 hashar: How to fix ssh access on beta cluster instances: https://phabricator.wikimedia.org/T171174#3456966
- 15:30 hashar: deployment-prep : removing project wide puppet classes from https://horizon.wikimedia.org/project/puppet/ All are role::eventlogging::analytics::*
- 15:08 hashar: removed profile::recommendation_api from deployment-sca01 to try to fix the ssh access for mobrovac T171173 T171174
- 14:57 zeljkof: reloading Zuul to deploy 80b9d85
- 14:31 hashar: deployment-prep: manually cleaned out the puppet master configuration. It was all screwed up. Notably I removed bits about the puppetdb
- 10:20 zeljkof: Reloading Zuul to deploy 80b9d85
- 09:17 hashar: Spawning and pooling integration-slave-docker-1003 as replacement to integration-slave-docker-1000 (broken) - T150502
- 09:03 hashar: Restoring castor by updating all jobs to point to castor02 ( https://gerrit.wikimedia.org/r/366524 ) Starts with a cold cache :( - T171148
- 08:53 hashar: Created castor02.integration.eqiad.wmflabs with puppet role role::ci::castor::server and adding it to Jenkins. Will then update the Jenkins jobs to point to it - T171148
- 08:00 hashar: Disabled castor entirely via https://gerrit.wikimedia.org/r/366520 . The instance is broken - T171148
- 07:55 hashar: Refreshing all Jenkins jobs defined in JJB in order to then disable castor entirely for T171148
- 07:09 _joe_: rebooting castor, jobs are failing, and no one seems able to login
- 07:05 _joe_: adding myself to projectadmins for integration, trying to troubleshoot castor
- 01:38 thcipriani: scap on beta was failing because during the ldap downtime puppet created a shadow mwdeploy user, fixed using vipw and vigr
2017-07-19
- 14:43 hashar: Jenkins: uploaded a patched android-emulator plugin for T150623 and restarting Jenkins
- 13:55 hashar: Jenkins: added JDK "Debian - OpenJdk 7" with JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
- 12:54 hashar: Gerrit: created repo integration/jenkinsci/android-emulator-plugin.git owned by access group integration-jenkinsci-android-emulator-plugin which has Mholloway - T170904
2017-07-18
- 16:26 halfak: manually restarted uwsgi-ores and celery-ores-worker on deployment-sca03
- 16:19 halfak: manually installed "aspell-el" on deployment-sca03 (work around for ongoing puppet issues)
- 09:04 hashar: deleted integration-slave-trusty-1006
- 03:57 twentyafterfour: Fixed deployment-imagescaler01 by cherry-picking https://gerrit.wikimedia.org/r/#/c/365891/ on deployment-puppetmaster02
2017-07-17
- 18:20 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/365198
2017-07-14
- 20:16 Amir1: cpan[1]> install LWP::UserAgent on tin
2017-07-13
- 17:04 thcipriani: restarting jenkins for updates
2017-07-12
- 20:07 bearND: Update mobileapps to d30dae2
- 18:19 greg-g: where "things" == nodepool instance delete/creation
- 18:18 greg-g: things are back to a bad state, chase etc investigating
- 17:52 greg-g: nodepool is back to making instances and running jobs, thanks Cloud team
- 17:22 greg-g: CI is backed up, only one nodepool instance running for the last long while, many in building
- 00:35 legoktm: deploying https://gerrit.wikimedia.org/r/364628
2017-07-11
- 21:30 legoktm: deploying https://gerrit.wikimedia.org/r/364601
2017-07-09
- 01:15 Amir1: ladsgroup@deployment-tin:~$ mwscript extensions/ORES/maintenance/CheckModelVersions.php --wiki=enwiki (T170026, T165716)
2017-07-07
- 14:53 hashar: deployment-prep: change webproxy http://recommendation-api-beta.wmflabs.org/ to deployment-sca02 (has the proper security rule) - T148129
- 14:53 hashar: deployment-prep: add port 9632 to security group "sca" https://horizon.wikimedia.org/project/access_and_security/security_groups/593/ - T148129
- 14:03 hashar: Image snapshot-ci-trusty-1499435837 in wmflabs-eqiad is ready
- 13:57 hashar: Nodepool: updating snapshot-ci-trusty
- 13:56 hashar: Nodepool: uploaded new Ubuntu Trusty image
2017-07-06
- 17:28 thcipriani: committed changes to modules/kafkatee on deployment-puppetmaster02 since having them uncommitted broke git-sync-upstream
- 16:20 hashar: Deleting Nodepool snapshot snapshot-ci-jessie-1499350442 - faulty php7.0-sqlite package that breaks phan jobs - T169904
- 15:29 hashar: deployment-cache-upload04 manually ran apt-get upgrade to downgrade ldap-utils and libldap-2.4-2 (caused puppet failure)
- 14:14 hashar: regenerating mediawiki-core-qunit-selenium-jessie jenkins job
- 12:05 hashar: deployment-prep created Web proxy for recommendation-api-beta.wmflabs.org -> http://10.68.20.183:9632 (deployment-sca01) for schana
- 02:38 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/363519
2017-07-04
- 14:10 hashar: manually upgraded apache2 on deployment-puppetmaster02 see T159254
- 13:33 hashar: beta cluster puppet is broken: Error: Could not send report: Connection refused - connect(2) for "deployment-puppetmaster02.deployment-prep.eqiad.wmflabs" port 8140
- 09:28 hashar: gerrit: marking read-only mediawiki/extensions/Nonlinear - T169519
2017-07-03
- 11:34 hashar: jenkins: refreshing all jobs and updating the castor-save bit ( https://gerrit.wikimedia.org/r/#/c/361843/ )
2017-06-30
- 08:16 hashar: Gerrit: changing repos to read-only: analytics/kraken analytics/kraken/deploy analytics/vagrant/kraken - T169303
2017-06-29
- 23:17 legoktm: deploying https://gerrit.wikimedia.org/r/362314
2017-06-28
- 15:55 hashar: beta: git gc mediawiki repos in /srv/mediawiki-staging
- 15:47 hashar: beta: git -C /srv/deployment/ores/deploy/submodules/editquality gc (saving 380MBytes)
- 15:33 hashar: running git gc under /srv/mediawiki-staging
- 14:43 hashar: pypi.python.org is back again - T169091
- 14:33 elukey: running alter tables on the EL database in deployment-eventlogging03.deployment-prep.eqiad.wmflabs
- 14:06 hashar: pypi.python.org has an issue with its CDN . That would affect any CI jobs relying on tox/python - See https://status.python.org for updates and T169091
- 14:04 hashar: pypi.python.org has an issue with its CDN . That would affect any CI jobs relying on tox/python - See https://status.python.org for updates
- 10:06 hashar: Unblocked beta cluster jenkins job. It had been stalled for a while
2017-06-27
- 22:58 Amir1: cherry-picking gerrit:360891/3
- 22:42 Amir1: cherry-picking gerrit:360891/2
- 21:58 Amir1: mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type 'string' --property-id P34
- 18:31 hashar: Image snapshot-ci-jessie-1498587497 in wmflabs-eqiad is ready - T169004
- 18:18 hashar: Regenerating Jessie nodepool image to hopefully bring back hhvm-tidy package - T169004
- 17:39 Amir1: running mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type 'external-id' --property-id P34
2017-06-26
- 22:24 halfak: deploying ores-prod-deploy:82dfd56 to beta (note: T168099)
- 22:20 halfak: deploying ores-prod-deploy:82dfd56 to beta
- 20:33 bearND: Update mobileapps to 0b05026
- 18:44 hashar: nodepool image-delete 1636 # Deletes snapshot-ci-trusty-1498491445 which lacks nodejs when we still need it.
- 18:23 twentyafterfour: renamed previously active image to 'image-ci-trusty_bad_20170626'
- 18:22 twentyafterfour: reverted nodepool image-ci-trusty to previous version 'image-ci-trusty-old_20170626'
- 15:41 hashar: Image snapshot-ci-trusty-1498491445 in wmflabs-eqiad is ready
- 15:34 hashar: Rebuilding nodepool image for trusty and regenerating snapshots
- 09:19 hashar: gerrit: marked wikimedia/bugzilla/* repos read-only
2017-06-24
- 06:02 legoktm: deployment-flourine02 /srv partition is alerting on low disk space but once logs get automatically gzip'd it should be fine
2017-06-23
- 20:59 hasharAway: deployment-db03 reinstall ldap-utils, libldap-2.4-2 2.4.44+dfsg-4~bpo8+1 > 2.4.41+dfsg-1+wmf1
- 20:54 hasharAway: apt-get upgrade deployment-elastic06
2017-06-22
- 19:02 Amir1: cherry-picking gerrit:360891/1 (T163922)
- 13:35 hashar: Gerrit: adding Bearloga (Mikhail Popov) to the 'search' group . That also makes him an owner to wikimedia/discovery/* - T168588
- 08:18 hashar: deployment-prep: removed /etc/apt/preferences.d/puppet.pref which was pinning puppet packages to jessie-backports and hence 4.8.x! - T168511
- 08:16 hashar: deployment-prep: removed /etc/apt/preferences.d/puppet.pref which was pinning puppet packages to jessie-backports and hence 4.8.x!
- 08:12 hashar: deployment-prep: upgraded puppet to 3.8.5 on all instances
2017-06-21
- 20:03 bearND: Update mobileapps to 21f771d
- 19:54 hashar: deployment-tin stopped keyholder and armed it
- 19:25 hashar: hard rebooting deployment-db04
- 19:20 hashar: hard rebooting deployment-db03
- 18:52 hashar: Removing /etc/apt/sources.list.d/wikimedia_mariadb.list (content: deb http://apt.wikimedia.org/wikimedia precise-wikimedia mariadb )
- 18:51 hashar: fixing up apt config on deployment-db03 and deployment-db04 / upgrade packages and kernel / reboot
- 17:02 hashar: upgrading kernel and puppet on deployment-mcs01 deployment-restbase01 and deployment-restbase02 - T168541
- 17:00 hashar: upgrading kernel and puppet on deployment-changeprop and deployment-conf03 - T168541
- 16:56 hashar: upgrading kernel and puppet on deployment-aqs01 deployment-aqs02 and deployment-aqs03 - T168541
- 16:38 hashar: rebooting deployment-cache-upload04 and deployment-cache-text-04 - T168541
- 16:29 hashar: upgrading deployment-apertium02 and deployment-eventlogging04 - T168541
- 16:23 hashar: upgrade and reboot deployment-prometheus01
- 16:11 hashar: rebooting deployment-ms-fe02
- 16:11 hashar: rebooting deployment-ms-be04
- 16:09 hashar: rebooting deployment-ms-be03
- 16:03 hashar: upgrading deployment-ms-fe02 deployment-ms-be03 and deployment-ms-be04
- 15:57 hashar: apt-get upgrade and reboot of deployment-memc04 and deployment-memc05
- 15:52 hashar: rebooting deployment-etcd-01
- 15:48 hashar: apt-get upgrade deployment-etcd-01
- 15:35 hashar: deployment-prep changing Varnish director for citoid from citoid.wmflabs.org to citoid-beta.wmflabs.org ( via https://horizon.wikimedia.org/project/prefixpuppet/ ) - T168519
- 14:41 hashar: deployment-tmh01 is down for some reason
- 14:21 hashar: deployment-prep: force running puppet on all instances
- 14:17 hashar: finally fixed puppet on deployment-prep !
- 14:02 hashar: deployment-puppetmaster (cd /etc/puppet && ln -s /var/lib/git/operations/puppet/manifests && ln -s /var/lib/git/operations/puppet/modules)
- 13:26 hashar: deployment-prep: puppet master got erroneously upgraded to puppet* 4.8. Rolled it back to 3.8, which failed, and then back to 3.7!
- 12:47 hashar: broke deployment-prep puppet master while upgrading it :(
- 12:28 hashar: deployment-imagescaler01 removed puppetmaster and puppetmaster-common packages
- 12:04 hashar: apt-get dist-upgrade on deployment-mediawiki hosts
- 11:59 hashar: armed keyholder on deployment-tin and deployment-mira
- 11:15 hashar: deployment-cache-text04 : apt-get dist-upgrade
- 11:12 hashar: varnish fails on deployment-cache-text04
- 11:08 hashar: deployment-prep : rebooting deployment-tin deployment-mira deployment-cache-text04 deployment-cache-upload04
- 11:00 hashar: deployment-prep apt-get upgrade and reboot all hosts
- 10:21 hashar: deployment-zotero01 apt-get upgrade and rebooted
- 09:59 hashar: integration: removing swift / python-swift from integration-puppetmaster01
- 09:57 hashar: Upgrading puppet 3.7.2 .. 3.8.5 on integration-slave-docker-1001 and integration-slave-docker-1002
- 09:39 hashar: integration: deleting swift and swift-storage-01 unused
- 09:38 hashar: upgrading/Rebooting all instances from integration project to catch up with Linux kernel upgrades
2017-06-20
- 19:25 hashar: Nodepool rate being bumped from 1 query per 6 seconds to 1 query per 5 seconds ( https://gerrit.wikimedia.org/r/#/c/358601/ )
- afk: deployment-tin stuck on post-merge queue for the past 13 hours, unstuck now
2017-06-19
- afk: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/360091/
- 08:29 hashar: Gerrit: added Ladsgroup to 'mediawiki' group - T165860
2017-06-18
- 19:26 Reedy: Re-enabled beta-update-databases-eqiad as wikidatawiki takes < 10 minutes T168036 T167981
- 19:25 Reedy: A lot of items on beta wikidatawiki deleted T168036 T167981
2017-06-16
- 23:41 Reedy_: also deleting a lot of Property:P* pages on beta wikidatawiki T168106
- 22:55 Reedy: deleting Q100000-Q200000 on beta wikidatawiki T168106
- 19:04 Reedy: disabled beta-update-databases-eqiad because it's not doing much useful atm
- 14:56 zeljkof: Reloading Zuul to deploy 18a50a7
- 14:40 hashar: integration-slave-jessie-1001 apt-get upgrade to downgrade python-pbr to 0.8.2 as pinned since T153877. /usr/bin/unattended-upgrade magically upgraded it for some reason
- 06:49 Reedy: script up to `Processed up to page 336425 (Q235372)`... hopefully it's finished by morning
- 03:13 Reedy: running `mwscript extensions/Wikibase/repo/maintenance/rebuildTermSqlIndex.php --wiki=wikidatawiki` in screen as root on deployment-tin for T168036
- 03:10 Reedy: running `mwscript extensions/Wikibase/repo/maintenance/rebuildEntityPerPage.php --wiki=wikidatawiki` in screen as root on deployment-tin for T168036
- 02:23 Reedy: cherry-picked https://gerrit.wikimedia.org/r/#/c/354932/ onto beta puppetmaster
2017-06-15
- 16:34 RainbowSprinkles: deployment-prep: Disabled database updates for awhile, running it by hand
- 10:39 hashar: apt-get upgrade on deployment-tin
- 00:52 thcipriani: deployment-tin jenkins agent borked for 4 hours, should be fixed now
2017-06-14
- 12:24 hashar: gerrit: marked mediawiki/skins/Donate as read-only ( https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/skins/Donate ) - T124519
2017-06-13
- 22:05 hashar: Zuul restarted manually from a terminal on contint1001. It does not have any statsd configuration so we will miss metrics for a bit till it is restarted properly.
- 21:13 hashar: Gracefully restarting Zuul
- 20:37 hashar: Restarting Nodepool. Apparently confused in pool tracking and spawning too many Trusty nodes (7 instead of 4)
- 20:31 hashar: Nodepool: deleted a bunch of Trusty instances. It scheduled a lot of them and they are taking slots in the pool. Better to have jessie nodes spawned instead since there is high demand for them
- 20:19 hashar: deployment-prep: added Polishdeveloper to the "importer" global group. https://deployment.wikimedia.beta.wmflabs.org/wiki/Special:GlobalUserRights/Polishdeveloper - T167823
- 18:47 andrewbogott: root@deployment-salt02:~# salt "*" cmd.run "apt-get -y install facter"
- 18:46 andrewbogott: using salt to "apt-get -y install facter" on all deployment-prep instances
- 18:38 andrewbogott: restarting apache2 on deployment-puppetmaster02
- 18:37 andrewbogott: doing a git fetch and rebase for deployment-puppetmaster02
- 17:00 elukey: hacking apache on mediawiki05 to test rewrite rules
- 16:04 Amir1: cherry-picked 357985/4 on puppetmaster
- 15:59 halfak: deployed ores-prod-deploy:862aea9
- 13:47 hashar: nodepool force running puppet for: lower min-ready for trusty [puppet] - https://gerrit.wikimedia.org/r/356466
- 10:53 elukey: rolling restart of all kafka brokers to pick up the new zookeeper change (only deployment-zookeeper02 available)
- 10:36 elukey: delete deployment-zookeeper01 (old trusty instance, replaced with a jessie one)
- 09:50 elukey: big refactoring for zookeeper merged in operations/puppet - https://gerrit.wikimedia.org/r/#/c/354449 - ping the Analytics team for any issue
2017-06-12
- 14:22 hashar: Image snapshot-ci-trusty-1497276913 in wmflabs-eqiad is ready
- 14:15 hashar: Nodepool: regenerating Trusty images to confirm that removal of keystone admin_token is a noop for nodepool - T165211
- 12:44 hashar: Image snapshot-ci-jessie-1497270581 in wmflabs-eqiad is ready
- 12:30 hashar: nodepool: refreshing Jessie snapshot to upgrade HHVM from 3.12 to 3.18 - T167493 T165074
- 08:47 hashar: deployment-prep : salt -v '*' cmd.run 'apt-get clean'
2017-06-09
- 20:30 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/358092/1
- 18:50 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/358067/3
2017-06-07
- 17:49 elukey: forced /usr/local/bin/git-sync-upstream manually on puppetmaster02
- 17:30 elukey: manually fixed rebase issue for operations/puppet on puppetmaster02 (empty commit due to the change for scap3 and jobrunners)
- 09:33 elukey: restart kafka brokers to pick up the new zookeeper settings
- 09:00 elukey: adding deployment-zookeeper02.eqiad.wmflabs to Hiera:deployment-prep
- 08:43 gehel: upgrading kibana to v5.3.3 on deployment-logstash2
- 08:35 gehel: rolling back to kibana 5.3.2, incompatible elasticsearch version
- 08:28 gehel: upgrading kibana to v5.4.1 on deployment-logstash2
2017-06-06
- 14:34 hashar: deleting buildlog.integration.eqiad.wmflabs, which was meant to receive Jenkins logs in ElasticSearch. We are experimenting with relforge1001.eqiad.wmnet now - T78705
- 12:37 hashar: Removing HHVM from permanent Trusty slaves
- 10:44 elukey: running eventlogging_cleaner.py (https://gerrit.wikimedia.org/r/#/c/356383/) on eventlogging to test the cleaning of old events
- 09:24 hashar: Deleting deployment-phab02 instance. Has been shut off since April 23rd - T167090
- 07:51 hashar_: Fixed puppet on deployment-aqs instances
2017-06-05
- 15:38 elukey: manually hacking deployment-jobrunner02.deployment-prep.eqiad.wmflabs to test a new config
2017-06-02
- 19:51 hashar: integration: granted ebernhardson sudo
- 12:12 hashar: jenkins: rebuild logstash plugin from HEAD of master for jenkins 2 back compat. logstash-1.2.0-4-gbcbc19e - T78705
2017-06-01
- 20:14 bearND: Update mobileapps to c4dc72d
- 20:12 mdholloway: killed the running emulator processes on integration-slave-jessie-android to get it booting again following yesterday's gerrit outage
- 13:39 hashar: Gerrit: change integration.git project to "Rebase if Necessary" with "Allow content merges" - T131008
- 13:10 hashar: Gerrit allow content merge for integration/config ( https://gerrit.wikimedia.org/r/#/admin/projects/integration/config ) - T131008
- 08:03 hashar: Purged all mysql bin files from deployment-db03 ( rm -fR /srv/sqldata/T166060 ) - T166060
2017-05-31
- 20:21 hashar: Jenkins: upgrading git-client-plugin 2.4.5..2.4.6 T166557
- 07:50 hashar: deployment-db04: mysql> set global expire_logs_days = 7 - to expire bin logs faster (instead of 30 days) - T166060
- 07:49 hashar: deployment-db03: mysql> set global expire_logs_days = 7 - to expire bin logs faster (instead of 30 days) - T166060
2017-05-30
- 22:08 hasharAway: Changed integration/config.git submit type from "Fast forward only" to "Rebase if Necessary" T131008
2017-05-29
- 14:44 elukey: reverted previous config on redis01
- 14:36 elukey: set redis-cli -a "$(sudo grep -Po '(?<=masterauth ).*' /etc/redis/tcp_6379.conf)" -p 6381 config set tcp-keepalive 300 on redis01 as test (rollback: redis-cli -a "$(sudo grep -Po '(?<=masterauth ).*' /etc/redis/tcp_6379.conf)" -p 6381 config set tcp-keepalive 0)
- 10:22 hashar: force refreshed Nodepool Trusty images. Was stuck somehow
- 10:06 hashar: deployment-tin rm -fR /usr/src/hhvm T166492
- 09:51 hashar: deployment-tin: rm /var/lib/l10nupdate/caches/cache-master/*.json T166492
2017-05-26
- 09:20 elukey: installing hhvm_3.18.2+dfsg-1+wmf4+exp1_amd64.deb on jobrunner02
- 07:20 elukey: hacking on jobrunner02 in deployment-prep
- 01:28 bearND: Update mobileapps to db6493c
2017-05-25
- 19:46 hashar: deployment-tin manually cleaning disk space
- 16:44 elukey: restored hhvm on jobrunner02
- 16:03 bearND: Update mobileapps to 946fe1f
- 10:33 elukey: manual install of hhvm_3.18.2+dfsg-1+wmf4+exp1_amd64.deb on jobrunner02 to test a fix for the Redis.php lib
- 02:46 RainbowSprinkles: running `mwscript extensions/Flow/maintenance/FlowUpdateUserWiki.php --wiki=enwiki` in a screen on deployment-tin, probably going to take all night
2017-05-24
- 16:04 hashar: rebooting integration-slave-trusty-1003 to catch up with kernel upgrade
- 12:22 hashar: deployment-prep: finished rebase of puppet.git
- 10:19 hashar: deployment-prep rebased puppet repo with: git rebase -X theirs
- 10:10 hashar: deployment-prep : resetting puppet master to last known snapshot snapshot-20170523T0010 . All cherry picks got deleted
- 10:09 hashar: deployment-etcd-01: fixed puppet run
- 08:38 moritzm: updated puppet on deployment-puppetmaster02 to 3.8.5-2~bpo8+2
2017-05-23
- 16:55 RainbowSprinkles: there was no data
- 16:55 RainbowSprinkles: dropped flow_ext_ref from commonswiki on beta. schema migration is busted, going to let it recreate table
- 08:20 hashar: Updating Nodepool snapshot-ci-trusty
- 08:19 hashar: Regenerated Nodepool base image for Trusty. Got rid of hhvm from it
2017-05-22
- 12:11 greg-g: ran git prune and rm'd the gc.log file
- 11:40 greg-g: gjg@deployment-tin:/srv/mediawiki/.git/gc.log has warning: There are too many unreachable loose objects; run 'git prune' to remove them.
2017-05-21
- 12:05 Reedy: deployment-tin is back online
- 10:41 Reedy: disabled Jenkins on deployment-tin again
- 09:10 greg-g: beta-update-database-eqiad has been hitting the timelimit since May 19th
- 09:02 Reedy: brought deployment-tin back online a while ago
2017-05-20
- 09:10 greg-g: executors are running again
- 09:02 greg-g: All executors in Jenkins are "offline" including the permanent ones
2017-05-19
- 19:05 mutante: fixing role class config on deployment-phab* (remove role::phabricator::main, add role::phabricator_server in context prefix "deployment-phab"), remove again from instance level for phab-01
- 18:40 mutante: deployment-phab01 still has puppet error "Could not find class role::phabricator::main" and that should simply be removed from it, but i can NOT find it in Horizon, i checked instance config, project config, the "Other" section, the "All classes" tab. Because it's gone. But how do i fix the instance config then?
- 18:39 mutante: applying role::phabricator_server on instance deployment-phab01 (it had error, could not find role::phabricator::main and the name changed in role/profile conversion)
2017-05-15
- 10:46 addshore: enabled beta-code-update-eqiad for some testing
- 10:38 addshore: temporarily disabled beta-code-update-eqiad for some testing
2017-05-13
- 20:31 bd808: Deleted stuck mediawiki-core-doxygen-publish job. Jenkins had it marked for a particular nodepool instance that was offline.
2017-05-12
- 13:12 hashar: Trying to refresh Nodepool Jessie image. Should get HHVM pinned to 'experimental' component => 3.12.x
2017-05-11
- 20:43 hashar: nodepool: deleted today's Jessie image snapshot. It comes with HHVM 3.18, which segfaults with MediaWiki/PHPUnit. Rolled back to snapshot-ci-jessie-1494425642 from 30 hours ago. T165074
- 12:57 godog: cherry-pick https://gerrit.wikimedia.org/r/#/c/353282/
2017-05-10
- 20:28 bearND: Update mobileapps to 75b135e
- 18:32 mutante: deployment-tin/mira: the change of the role class name was because of https://gerrit.wikimedia.org/r/#/c/344728/ which moved deployment::server to profile/role structure. both instances configured accordingly now. the remaining issue with "id_rsa.bromine" should be all unrelated
- 18:28 mutante: deployment-mira: configure puppet config in horizon, remove "role::deployment::server", use correct new name "role::deployment_server" (moved to profile). (a bit tricky because then in Horizon it seems to disappear from the "others" section, but if you click the "all" tab you get to see the class names)
- 18:12 mutante: deployment-tin: puppet run now ok, except ":Upload/File[/var/lib/releases/.ssh/id_rsa.bromine.eqiad.wmnet]: Could not evaluate:" this should be an unrelated issue
- 18:05 mutante: deployment-tin: configure to use role::deployment_server (instead of deployment::server), for some reason now Horizon shows _nothing_ under "other classes" where this was before
- 17:58 mutante: deployment-tin: deleting puppet lock file (claimed it was running but also didn't run since > 900 min), looking at fixing deployment::server role name change
- 15:26 elukey: refresh cherry pick gerrit/352582 on puppet master (rebase -i to remove, then cherry pick)
- 14:34 elukey: cherry pick gerrit/352582 to puppet master
- 12:35 hashar: deployment-prep: git -C /srv/mediawiki-staging/php-master/extensions rm --cached SemanticFormsInputs
- 08:04 hashar: merging 'composer test' into mwext-testextension-* jobs https://gerrit.wikimedia.org/r/#/c/352160/ - T161895
2017-05-09
- 12:44 hashar: deployment-ircd upgrading puppet 3.7.2 => 3.8.5
- 12:19 hashar: Unbroke puppet on deployment-irc and deployment-urldownloader . Both choked on a ruby one-liner, fixed via https://gerrit.wikimedia.org/r/#/c/336840/
2017-05-08
- 21:42 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/351131/
- 00:57 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/351130
2017-05-06
- 01:16 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/352154
2017-05-05
- 20:11 hasharDisappear: Mass pushing addition of jakub-onderka/php-console-highlighter to all mediawiki extensions having php-parallel-lint ( example: https://gerrit.wikimedia.org/r/#/c/352215/ )
- 09:20 godog: cherry-pick https://gerrit.wikimedia.org/r/#/c/350817 on deployment-puppetmaster02
- 09:17 addshore: temporarily enabled beta-code-update-eqiad
- 09:12 addshore: temporarily disabled beta-code-update-eqiad
- 08:30 godog: cherry-pick https://gerrit.wikimedia.org/r/#/c/350220/ on deployment-puppetmaster02
2017-05-04
- 10:47 hashar: puppet ca destroy deployment-zookeeper01.eqiad.wmflabs
- 10:46 hashar: puppet ca destroy deployment-ores-redis-02.deployment-prep.eqiad.wmflabs (no such instance)
- 10:46 hashar: puppet ca sign deployment-ores-redis-02.deployment-prep.eqiad.wmflabs
- 10:39 hashar: Removing puppetmaster: puppetmaster.thumbor.eqiad.wmflabs from deployment-imagescaler01 - T153319
- 10:37 hashar: deployment-prep: force recompilation of puppet.conf : salt -v '*' cmd.run 'echo >> /etc/puppet/puppet.conf.d/10-main.conf' - T153319
- 10:31 hashar: deployment-phab01 / deployment-imagescaler01 rm /etc/puppet/puppet.conf.d/10-self.conf - T153319
- 10:29 hashar: Unbroke puppet on deployment-imagescaler01 and removing role::puppetmaster::self - T153319
- 10:16 hashar: Unbroke puppet on deployment-phab01 - T153319
- 07:30 hashar: deployment-prep: adding TTO (This, that and the other) as a project member to grant shell access - T163887
2017-05-03
- 17:39 mdholloway: (this concerns integration-slave-jessie-android)
- 17:37 mdholloway: enabled automatic Android component installation for the Android Gradle plugin, rebuilt the SDK, and deleted the old one
- 15:54 hashar: Granted sudo right for Niedzielski accounts on Android CI slave. Already has it with the other labs account Sniedzielski - T164388
- 15:38 hashar: Granted mdholloway (mobile team) full sudo access on integration labs project so he can reach integration-slave-jessie-android - T164388
2017-05-02
- 21:14 hashar: Manually cancelled a few mediawiki-core-jsduck-publish and mediawiki-core-doxygen-publish jobs in Jenkins build queue. They seem to deadlock Jenkins somehow :(
- 19:59 hashar: Regenerate jobs selenium-GettingStarted from JJB - T164296
- 19:51 hashar: Jenkins: rolling back Performance plugin from 2.2 to 2.0 due to an exception / failure to find a junit xml file. T164296
- 19:02 hashar: Added multichill ( https://github.com/multichill ) to the Wikimedia Github organization
- 10:21 godog: bounce varnish and varnish-frontend on deployment-cache-upload04
- 10:16 godog: upgrade scap on deployment-tin to overcome AttributeError: Lock instance has no attribute 'get_lock_excuse'
- 09:41 godog: flip deployment-cache-upload04 to deployment-ms-fe02 - T162247
- 08:17 hashar: Reconfigured all Jenkins jobs via jjb
2017-05-01
- 20:39 hashar: Updated REL1_29 branch of ImportArticles / OAuth / Quiz and Wikispeech so they get phpcs ( https://gerrit.wikimedia.org/r/#/c/350984/ )
- 20:26 hashar: nodepool: deleting alien instance ci-trusty-wikimedia-631443 4e66ad7e-b9d3-4af1-b559-3f54968d376e
- 02:49 TimStarling: on puppetmaster02 manually updating /etc/conftool/data-local
- 02:37 TimStarling: on puppetmaster02 updated cherry pick for https://gerrit.wikimedia.org/r/#/c/347360
2017-04-27
- 18:18 urandom: deployment-prep: restarting cassandra-metrics-collector on deployment-restbase0[1-2]
- 07:26 Amir1: cherry-picking 348184/4 (T161563)
2017-04-26
- 23:36 urandom: removing r/350485 from deployment-prep
- 21:53 urandom: cherry-picking r/350485 to deployment-prep
- 20:20 bearND: Update mobileapps to 14bd4a5
- 15:24 godog: add new deployment-ms-be0[34] backends to swift in deployment-prep - T162247
2017-04-25
- 21:57 halfak: deployed ores cc12103
- 06:46 Amir1: uncherry-pick f6ce64e99a and 225b8d4e82 (T161563)
2017-04-22
- 20:17 hashar: Added FlorianSW to Github organization "wikimedia" (no team though)
2017-04-21
- 12:25 hashar: T104048 zuul enqueue --trigger gerrit --pipeline postmerge --project AhoCorasick --change 345433,1
- 09:32 hashar: Zuul: deploying "Decouple repos from mediawiki gate queue" 7a79f752363a / T107529
- 09:30 elukey: hack reverted on tin and scap pull performed on jobrunner02
2017-04-20
- 17:09 elukey: reverted hack on deployment-tin (apparently no effects on the jobrunner)
- 16:41 elukey: temporarily disable puppet on deployment-tin to remove jobrunner02 from scap dsh; manually enable persistent connection between it and rdb redis hosts
2017-04-19
- 16:34 hashar: deleted nodepool alien ci-jessie-wikimedia-613597
- 09:20 hashar: apt-get upgrade deployment-tin deployment-mira
- 09:16 hashar: apt-get upgrade on deployment-mx deployment-redis01 deployment-redis02 deployment-cache-text04
- 02:58 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/348896
2017-04-18
- 14:29 hashar: unbreaking integration puppetmaster. Broke it when upgrading the puppet package :(
- 14:09 hashar: integration: upgrade puppet on Jessie permanent slaves 3.7.2 -> 3.8.5 (and add ruby-rgen). Done via: salt -v '*' pkg.upgrade
- 13:17 elukey: upgrade deployment-jobrunner02 to hhvm 3.18.2+wmf2 - T162354
- 10:07 godog: upgrade swift to 2.2.0 on deployment-ms*
2017-04-14
- 12:29 hashar: Delete integration-c1 instance (32GB RAM) on labvirt1004. It was used as a workaround for T161006
- 08:17 hashar: beta: cherry picking again 348184/4 'service: use gzip for logging in uwsgi' for T161563
- 08:03 hashar: beta: resetting puppetmaster to last good tag snapshot-20170414T0030. A cherry pick for T161563 ended up dropping three patches, which broke other parts of the infrastructure
- 07:52 hashar_: Puppet failing on deployment-tin and deployment-mira . Some patches have been dropped from the puppet master :-((
- 00:59 Amir1: three cherry-picks failed to merge, skipped them 93dad5b 92c7d0b 21d60a4
- 00:45 Amir1: cherry-picking 348184/1 (T161563)
2017-04-13
- 15:37 hashar: deployment-mediawiki04 clearing /var/cache/hhvm/fcgi.hhbc.sq3
- 15:15 hashar: Deployed mediawiki-core-qunit-selenium-jessie job (runs qunit + selenium with webdriverio) https://gerrit.wikimedia.org/r/#/c/347587/ - T139740
2017-04-12
- 15:14 hashar: rm -fR /mnt/home/jenkins-deploy/.android/build-cache/* # T162635
- 14:56 hashar: integration-slave-jessie-1001 : mv /mnt/home/jenkins-deploy/.android-sdk /mnt/home/jenkins-deploy/.android-sdk.T162635.back for T162635
- 14:54 hashar: integration-slave-jessie-1002 : mv /mnt/home/jenkins-deploy/.android-sdk /mnt/home/jenkins-deploy/.android-sdk.T162635.back for T162635
- 10:37 hashar: Jenkins email-ext plugin got upgraded. Some groovy templating might be prevented and would have to be reviewed/approved via https://integration.wikimedia.org/ci/scriptApproval/
- 08:52 hashar: Cancelled a bunch of mediawiki-core-doxygen-publish jobs that were keeping the queue busy/deadlocked builds. Should be moved to poll scm instead ( T115755 )
2017-04-11
- 15:59 hashar: integration-config-tox-jessie job is broken due to the JJB upgrade
- 15:40 hashar: Upgraded JJB to latest master 4f77324f with a couple cherrypicks on top of that. 022738f8...edebce7f T162674
- 15:36 hashar: Updating selenium-* jobs configuration for the performance plugin due to JJB upgrade T162674
- 15:24 hashar: Adding parameter ZUUL_VOTING to all Jenkins jobs due to JJB upgrade T162674
- 15:13 hashar: Force updated jenkins-job-builder 86478421...022738f8 - T162674
- 13:44 hashar: Force updated jenkins-job-builder 1639a86e...86478421 - T162674
- 13:44 hashar: Updating all Jenkins jobs using the git plugin due to JJB change cdfeb7b - T162674
- 12:35 hashar: Force updated jenkins-job-builder from 1.5.0 to 1.6.0 and bumped python-jenkins to 0.4.14. 6fcaf39b...1639a86e - T162674
- 10:41 hashar: Enable webdriver.io browser tests for MediaWiki core - https://gerrit.wikimedia.org/r/#/c/324719/ - T139740
- 09:50 hashar: Regenerating MediaWiki doxygen documentation for all 1.23.x releases.
- 08:55 hashar: Retriggering MediaWiki doxygen publishing job for 1.26.0 - T162506 : zuul enqueue-ref --trigger gerrit --pipeline publish --project mediawiki/core --ref refs/tags/1.26.0 --newrev 981ec62
2017-04-10
- 21:17 hashar: marked a nodepool node online manually. The instance was up but Jenkins failed to reach it due to some SEVERE: I/O error in channel
- 20:52 hashar: integration-slave-jessie-1001 : cleaning up /tmp: sudo find /tmp -path '/tmp/android-tmp-robo*' -delete # T162635
- 20:49 hashar: integration-slave-jessie-1002 : cleaning up /tmp: sudo find /tmp -path '/tmp/android-tmp-robo*' -delete # T162635
- 20:08 bearND: Update mobileapps to 1695900
2017-04-06
- 16:36 halfak: staging ores:554ea12
- 12:23 hashar: Image snapshot-ci-trusty-1491480759 in wmflabs-eqiad is ready
- 12:13 hashar: Updating Nodepool Trusty image to let Linux overcommit memory ( https://gerrit.wikimedia.org/r/#/c/346634/ )
2017-04-05
- 13:34 ema: testing possible fix for T162035 on deployment-ms-fe01
2017-04-04
- 21:29 hashar: contint1001 : rm -fR /srv/zuul/git/mediawiki/services/graphoid/deploy due to T157818
- 21:26 hashar: contint2001 : rm -fR /srv/zuul/git/mediawiki/services/graphoid/deploy due to T157818
- 20:58 hashar: integration: purging precise cow images from integration-slave-jessie-1001 and integration-slave-jessie-1002 ( https://gerrit.wikimedia.org/r/#/c/345836/ )
- 20:58 hashar: rebased integration puppet master
- 20:02 legoktm: deploying https://gerrit.wikimedia.org/r/346348
2017-04-03
- 20:43 bearND: Update mobileapps to fdd4e31
- 20:39 hashar: Nodepool: holding instance ci-trusty-wikimedia-597386 in an attempt to debug Wikibase/Scribunto memory usage exploding T125050
- 20:37 hashar: jenkins: disabled/reenabled gearman plugin to unlock the beta cluster related jobs
- 09:17 hashar: deployment-jobrunner02 : cherry picked a monkey patch for Redis::close() to prevent it from sending QUIT command ( https://gerrit.wikimedia.org/r/#/c/346117/ ) - T125735
2017-04-01
- 09:48 Sagan: puppet on deployment-tin looks like it is not running properly
2017-03-29
- 23:51 Krinkle: Free up space on integration-slave-jessie-1001 by removing old /srv/jenkins-workspace and /srv/pbuilder dirs
- 19:57 thcipriani: added --force flag for scap in beta-scap-eqiad temporarily
- 18:41 ebernhardson: upgrading elasticsearch and kibana to 5.1.2 on deployment-logstash2 to test puppet+integration prior to prod deployment
- 15:18 hashar: Delete a 32GB instance integration-ci - T161006
2017-03-28
- 19:53 hashar: Populating package manager cache of oojs-ui-npm-run-jenkins-node-6-jessie by manually triggering a build with ZUUL_PIPELINE=postmerge T155483
- 19:34 hashar: Migrate oojs/ui to just run 'npm jenkins' https://gerrit.wikimedia.org/r/345203 / T155483
- 16:05 halfak: deployed ores:18beebf (T160638)
- 13:22 gehel: restarting elasticsearch on deployment-elastic05 to reload log4j configuration
- 10:28 hashar: Jenkins: installing Android Lint plugin 2.4 - T161305
- 07:42 hashar: nodepool cleared a couple alien instances
2017-03-27
- 17:02 ebernhardson: cherry pick https://gerrit.wikimedia.org/r/344964 to puppetmaster to test upgrade to logstash 5.x
- 11:10 hashar: Image snapshot-ci-jessie-1490612363 in wmflabs-eqiad is ready
- 10:59 hashar: Updating Nodepool Jessie image to include PhantomJS (take two) - T137112
- 10:58 hashar: Image snapshot-ci-jessie-1490611594 in wmflabs-eqiad is ready
- 10:47 hashar: Updating Nodepool Jessie image to include PhantomJS - T137112
- 10:20 hashar: Restarting Jenkins to drop the Throttle Concurrent Builds plugin - T158596
2017-03-25
- 10:46 Amir1: deleting deployment-ores-redis (T160762)
- 10:39 Amir1: changing ores redis address to deployment-ores-redis-01 (T160762)
- 10:02 Amir1: deleted deployment-ores-redis-02
2017-03-24
- 21:34 Amir1: launching deployment-ores-redis-02 (T160762)
2017-03-23
- 16:07 mobrovac: restbase deploying 752ca4b7
- 15:52 hashar: Deleting integration-slave-trusty-1011 m1.large. One less perm slave to take care of
- 14:02 hashar: deployment-ms-be01 and deployment-ms-be02 : Lower Swift replicator on, upgrade package, reboot hosts. T160990
2017-03-22
- 09:45 hashar: beta: purging all Linux kernel from Swift instances
- 08:48 hashar: deployment-ms-be01: swift-init reload all - T160990
- 08:45 hashar: deployment-ms-be01: swift-init reload container - T160990
- 08:43 hashar: deployment-ms-be01: swift-init reload object - T160990
2017-03-21
- 16:47 halfak: halfak@deployment-ores-redis:~$ redis-cli -h deployment-ores-redis.deployment-prep.eqiad.wmflabs -p 6380 -a areallysecretpassword flushall (T160762)
- 16:07 Amir1: ladsgroup@deployment-ores-redis:~$ redis-cli -h deployment-ores-redis.deployment-prep.eqiad.wmflabs -p 6380 -a areallysecretpassword flushall (T160762)
- 11:27 hashar: integration: purging old packages on permanent slaves, mostly old kernels: apt-get autoremove --purge
- 09:06 hashar: CI deploying config hack "High priority test pipeline" : https://gerrit.wikimedia.org/r/343318 - T160667
2017-03-20
- 20:51 andrewbogott: migrating deployment-urldownloader to labvirt1013
- 20:45 andrewbogott: migrating deployment-pdf01 to labvirt1011
- 20:14 andrewbogott: migrating deployment-puppetmaster02 to a different labvirt
- 20:09 bearND: Update mobileapps to c0ab01d
- 08:51 hashar: Jenkins: depooling / deleting Precise instances.
2017-03-17
- 14:08 hashar: salt -v '*precise*' cmd.run 'puppet agent --disable "Pending shutdown on March 20th - T158652"'
2017-03-16
- 21:48 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/343113/
- 15:03 hashar: deployment-prep setting role::logging::mediawiki::udp2log::rotate: 15 in project wide hiera configuration
2017-03-15
- 20:29 bearND: Update mobileapps to bb8fcf2
- 19:02 niedzielski: Reloading Zuul to deploy f1c9073
- 15:55 Reedy: Removed hhvm statcache cherrypick from beta puppetmaster
- 11:09 elukey: Restore prod version of memcached on deployment-memc04 after experiment (I installed a new version a while ago)
- 10:22 elukey: created instances deployment-aqs0[23] to have better testing for the AQS beta environment
- 09:10 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki=hewiktionary
- 09:10 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki=dewiktionary
- 09:08 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki=enwiktionary
- 08:56 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognatePages.php --wiki=enwiktionary // (ParameterTypeException, T160503)
- 08:50 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki=enwiktionary --site-group=wiktionary // (3 sites added)
- 08:49 addshore: addshore@deployment-tin mwscript extensions/Wikidata/extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=enwiktionary --force-protocol=https --load-from=https://deployment.wikimedia.beta.wmflabs.org/w/api.php
- 08:49 addshore: addshore@deployment-tin mwscript sql.php --wiki=enwiktionary "TRUNCATE sites; TRUNCATE site_identifiers;"
- 08:44 addshore: addshore@deployment-tin mwscript extensions/Wikidata/extensions/Wikibase/lib/maintenance/populateSitesTable.php --wiki=enwiktionary --force-protocol=https
- 08:43 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki=dewiktionary --site-group=wiktionary // (0 sites added)
- 08:43 addshore: addshore@deployment-tin mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki=enwiktionary --site-group=wiktionary // (1 site added)
2017-03-14
- 19:22 thcipriani: removed alien nodepool instance via: openstack server delete ci-jessie-wikimedia-566503
- 10:15 hashar: Added Niedzielski to integration.
- 09:54 hashar: Jenkins: dropping Sniedzielski's more specific permissions. Account is already in wmf ldap group
2017-03-13
- 13:19 hashar: Depooled Precise instances from Jenkins T158652 leaving the instances up for now.
- 11:38 hashar: Deleting php53lint jobs. Replacing them with php55 equivalents
- 09:39 hashar: upgrading puppet on deployment-pdf01
- 09:30 hashar: Removing old kernel packages from deployment-pdf01 to free up disk space
- 08:55 hashar: Deleting deployment-copper. Fails puppet due to broken OpenStack metadata http://169.254.169.254/openstack/2015-10-15/meta_data.json (fails) and is no longer needed (per elukey)
2017-03-10
2017-03-09
- 16:20 gehel: upgrading elasticsearch on deployment-prep to v5.1.2
- 09:39 hashar: deployment-prep: rebasing puppet master. Got stalled due to a submodule update apparently
2017-03-08
- 22:45 Reedy: https://gerrit.wikimedia.org/r/#/c/341916/ cherry picked onto deployment-puppetmaster02
2017-03-07
- 22:39 hashar: upgrading jenkins02.ci-staging to jenkins 2.x
- 15:26 hashar: ci-staging, enabling puppet master auto signing ( puppetmaster::autosigner: true )
- 08:25 hashar: Image snapshot-ci-jessie-1488874660 in wmflabs-eqiad is ready (Chromium 55->56 among others) - T153038
- 08:16 hashar: Pushing new Jessie image: image-jessie-20170306T224719Z.qcow2
2017-03-06
- 19:03 addshore: mwscript sql.php --wiki=aawiki "CREATE DATABASE cognate_wiktionary"
- 16:03 hashar: Jenkins upgrading "Git client plugin" 1.19.6 to 2.3.0
2017-03-02
- 20:47 hashar: deployment-prep: restarted apache/puppet master. Maybe that will fix ssh_known_hosts being emptied from time to time T159332
- 19:32 thcipriani: snapshot-ci-jessie updated for nodepool
- 19:15 thcipriani: running: nodepool image-update wmflabs-eqiad snapshot-ci-jessie to manually update the ci-jessie snapshot for nodepool
- 18:26 godog: integration update composer on '*slave*'
- 11:52 hashar: gerrit: killed a stalled connection: dd511e52 Feb-27 07:11 git-receive-pack '/mediawiki/services/zotero/translators'
- 09:53 hashar: Image snapshot-ci-jessie-1488447340 in wmflabs-eqiad is ready
- 09:29 hashar: Image snapshot-ci-trusty-1488446586 in wmflabs-eqiad is ready
- 09:18 hashar: upgrading composer on permanent slaves for T125343 : salt -v '*slave*' cmd.run 'cd /srv/deployment/integration/composer && git pull'
- 09:16 hashar: upgrade composer to 1.1.0 https://gerrit.wikimedia.org/r/#/c/339645/
- 08:40 elukey: upgrading apache2 on deployment-mediawiki* - latest debian DSA, introduces https://httpd.apache.org/docs/2.4/mod/core.html#httpprotocoloptions (risk of HTTP 400 responses regression, contact elukey or moritzm if you see any issue)
2017-03-01
- 19:09 addshore: "mwscript extensions/WikimediaMaintenance/addWiki.php --wiki=aawiki he wiktionary hewiktionary he.wiktionary.beta.wmflabs.org" T158628
- 17:11 hashar: cleaned out Jenkins security matrix to drop users that are no longer used/nonexistent -- T69027
- 14:13 hashar: deployment-prep : on deployment-tin removed empty dir /etc/ssh/userkeys/root.d . Causes puppet noise
- 12:21 hashar: deployment-prep cleaning out git repos on deployment-tin
- 10:00 legoktm: deployed https://gerrit.wikimedia.org/r/340280 to slaves
- 04:28 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/340465
- 01:03 Reedy: beta-scap-eqiad giving Host key verification failed
2017-02-28
- 19:43 thcipriani: deployment-puppetmaster02 puppetmaster running again, apache2 was refusing to start with: Invalid command 'SSLOpenSSLConfCmd' -- installed apache from wmf repo instead of debian fixed it
- 08:36 hashar: nodepool deleted alien instances 541585 541586 and 541587
2017-02-27
- 21:36 bearND: Update mobileapps to c924126
2017-02-25
- 03:50 MaxSem: deployment-prep Deleted January logs from deployment-fluorine02, was running out of space
2017-02-24
- 13:56 hashar: Log refresh Nodepool instances to deploy slave script update to be able to merge mediawiki/composer.json into vendor/composer.json 6527f49..a7728a5 https://gerrit.wikimedia.org/r/#/c/339202/ T158674
- 13:52 hashar: deployed slave script update to be able to merge mediawiki/composer.json into vendor/composer.json 6527f49..a7728a5 https://gerrit.wikimedia.org/r/#/c/339202/ T158674
2017-02-23
- 18:35 greg-g: 18:29 < chasemp> !log labnodepool1001:~# service nodepool restart
- 09:27 hashar: Clearing skins from testextension jobs T117710 salt -v '*slave*' cmd.run 'rm -fR /srv/jenkins-workspace/workspace/mwext-testextension*/src/skins/*'
2017-02-22
- 20:58 hashar: Deleted jenkins job pplint-HEAD. Fully replaced by rake / puppet-syntax gem - T154894
- 20:54 hashar: Deleted jenkins job erblint-HEAD. Fully replaced by rake / puppet-syntax gem - T154894
2017-02-20
- 14:53 hashar: integration: applying role::ci::slave::saucelabs to saucelabs-01
- 12:50 hashar: integration-slave-jessie-1001 downgraded cowbuilder to 0.73 from jessie to match integration-slave-jessie-1002
2017-02-17
- 14:07 hashar: integration: deleting "repository" instance. No time to figure out how to ship Sonatype Nexus to it. T147635
2017-02-16
- 18:34 greg-g: chase restarted nodepool, the daemon crashed
- 18:32 greg-g: no active nodepool instances listed in the Jenkins view: https://integration.wikimedia.org/ci/ but zuul has plenty to do https://integration.wikimedia.org/zuul/
- 16:56 hashar: integration: provisioned browsertests-1001 with role::ci::slaves::browsertests . Added it to Jenkins with label BrowserTests
- 16:33 halfak: deploying ores:e9bbda3
- 16:30 hashar: integration: created browsertests-1001 intended to run the daily browser tests later on
2017-02-15
- 15:47 hashar: Zuul reducing gate-and-submit minimum amount of changes to process from the wrong 12 down to 2. In case of repeating failures it would end up running jobs for only two changes, which would prevent cancelling jobs for up to 11 changes!
2017-02-14
- 14:38 hashar: Updating castor-save publish job to properly capture composer cache on Jessie (it is in ~/.composer/cache for some reason) T156359
2017-02-13
- 21:25 bearND: Update mobileapps to 3af473f
- 20:15 hashar: Image snapshot-ci-jessie-1487016035 in wmflabs-eqiad is ready
- 20:01 hashar: Updating Nodepool Jessie snapshot to update the Parsoid zuul-cloner map ( https://gerrit.wikimedia.org/r/#/c/337430/ )
- 09:25 hashar: Changing Jenkins slave contint1001 working dir from /srv/ssd/jenkins-slave to /srv/jenkins-slave ( https://gerrit.wikimedia.org/r/#/c/337286/ )
2017-02-10
- 22:11 halfak: deployed ores:a15ec90
- 21:25 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/337079
- 16:18 thcipriani: deployment-puppetmaster02:/var/lib/git/operations/puppet removed untracked file "how", updated submodules
- 14:49 hashar: rebase beta puppet master. Fixed conflicts with https://gerrit.wikimedia.org/r/#/c/321096/ and https://gerrit.wikimedia.org/r/#/c/312523/
- 11:36 hashar: Pruning some old caches from castor.integration.eqiad.wmflabs (eg node-4 jobs are gone)
2017-02-09
- 22:22 greg-g: manually kicked off a bunch of selenium tests after tgr and Reedy fixed T157636
- 18:36 Reedy: someone should kill the ee_prototypewiki db from beta
- 16:40 Amir1: deploying 030c269 ores to sca03
- 11:11 legoktm[NE]: deploying https://gerrit.wikimedia.org/r/336615 and https://gerrit.wikimedia.org/r/336779
2017-02-08
- 22:26 mdholloway: mobileapps deployed 0efa7b8 in the beta cluster
- 14:14 hashar: integration-slave-jessie-1001 upgrading cowbuilder
- 09:20 hashar: deployment-fluorine02 upgraded packages, deleted old files from /srv/mw-log/archive
2017-02-07
- 17:49 halfak: deploying ores 7c80636
- 09:02 hashar: Hard rebooting integration-slave-jessie-1001 . I messed up with the DHCP client :(
2017-02-06
- 21:31 bearND: Update mobileapps to 034a391
2017-02-04
2017-02-03
- 11:09 hashar: beta: removed old kernels from deployment-redis02 to free up disk space
- 10:42 hashar: Image ci-jessie-wikimedia-1486115643 in wmflabs-eqiad is ready T156923
- 10:12 hashar: Image ci-jessie-wikimedia-1486115643 in wmflabs-eqiad is ready T156923
- 09:54 hashar: Regenerate Nodepool Jessie snapshot. Would get a new HHVM version T156923
2017-02-02
- 21:56 hashar: integration-slave-jessie-1001 wiping /srv/pbuilder/base-trusty-amd64.cow as it was not properly provisioned, causing builds to fail (eg lack of /etc/hosts). Running puppet to reprovision it (poke T156651)
- 16:26 Amir1: deploying 9fd75a1 ores in beta
- 16:17 hashar: integration-slave-jessie-1001 wiping /srv/pbuilder/base-trusty-i386.cow/ as it was not properly provisioned, causing builds to fail (eg lack of /etc/hosts). Running puppet to reprovision it (poke T156651)
- 14:15 hashar: Nodepool: delete the image building of Jessie (image id 1322) to prevent a faulty HHVM version from being added. T156923
- 00:52 tgr: added mhurd as member
2017-02-01
- 21:43 bearND: Update mobileapps to e48a88c
- 18:51 thcipriani: nodepool delete-image 1320 per T156923
- 14:53 gehel: deployment-elastic* fully migrated to Jessie and /srv as data partition - T151326
- 14:52 gehel: killing test node deployment-elastic08 - T151326
- 14:32 gehel: shutting down and reimaging deployment-elastic07 - T151326
- 14:06 gehel: shutting down and reimaging deployment-elastic06 - T151326
- 13:34 gehel: shutting down and reimaging deployment-elastic05 - T151326
- 13:29 gehel: starting deployment-elastic* migration to jessie and moving data partition to /srv (T151326 / T151328)
- 13:18 moritzm: upgraded deployment-prep to hhvm 3.12.12
2017-01-31
- 22:12 thcipriani: started mysql on all integration precise instances via salt -- was stopped for some reason
- 01:59 bd808: nodepool is full of instances stuck in "delete"
- 01:53 bd808: https://integration.wikimedia.org/zuul/ showing huge backlogs but https://integration.wikimedia.org/ci/ looks mostly idle
2017-01-26
- 14:25 hashar: Created Github repo for Gerrit replication https://github.com/wikimedia/mediawiki-libs-phpstorm-stubs T153252
- 13:49 hashar: Gerrit creating mediawiki/libs/phpstorm-stubs to fork https://github.com/JetBrains/phpstorm-stubs for T153252
2017-01-24
- 11:04 hashar: Deleting integration-publisher (Precise) replaced by integration-publishing (Jessie). T156064 T143349
2017-01-23
- 23:41 bearND: Update mobileapps to 66ef3c2
- 21:05 hashar: Created integration-publishing Jessie instance 10.68.23.254 with puppet class role::ci::publisher::labs . Meant to replace Precise instance integration-publisher T156064
- 12:45 hashar: Image ci-jessie-wikimedia-1485174573 in wmflabs-eqiad is ready | should no longer spawn varnish on boot
- 09:02 hashar: Archiving Gerrit project wikidata/gremlin marking it read-only T155829
- 07:15 _joe_: cherry-picking the move of base to profile::base
2017-01-21
- 21:20 hashar: integration: updating slave scripts for https://gerrit.wikimedia.org/r/#/c/333389/
- 21:08 bd808: Puppet failures on deployment-restbase0[12] seem to be some sort of hang of the Puppet process itself. Run prints "Finished catalog run in 2n.nn seconds" but Puppet doesn't terminate for about a minute longer. The only state change logged is cassandra-metrics-collector service start.
2017-01-20
- 10:14 hashar: puppet fails on "integration" labs instances due to an attempt to unmount the nonexistent NFS /home. Filed T155820
- 09:18 hashar: beta: reset workspace of /srv/mediawiki-staging/php-master/extensions/reCaptcha it had a .gitignore local hack for some reason
- 09:05 hashar: integration restarted mysql on trusty permanent slaves T141450 T155815 salt -v '*trusty*' cmd.run 'service mysql start'
2017-01-19
- 22:11 Krenair: added bunch of others to the same group per request. we should figure out how to make this process sane somehow
- 22:06 Krenair: added nuria to deploy-service group on deployment-tin
- 16:56 hashar: rebased puppet master on integration and deployment-prep Trivial conflict between https://gerrit.wikimedia.org/r/#/c/312523/ and a lint change
- 09:36 hashar: Nuking workspaces of all mwext-testextension-hhvm-composer* jobs. Lame attempt for T155600. salt -v '*slave*' cmd.run 'rm -fR /srv/jenkins-workspace/workspace/mwext-testextension-hhvm-composer*'
2017-01-18
- 10:49 hashar: Disconnected/connected Jenkins Gearman client. The beta cluster builds had a deadlock.
- 10:39 hashar: Image ci-jessie-wikimedia-1484735445 in wmflabs-eqiad is ready (add python-conftool to hopefully have puppet rspec pass on https://gerrit.wikimedia.org/r/#/c/332475/ )
2017-01-17
- 21:47 urandom: deployment-prep restarting Cassandra on deployment-restbase02
- 21:46 urandom: deployment-prep restarting Cassandra on deployment-restbase01
- 19:02 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/332534/
- 18:25 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/332521/
- 18:07 urandom: deployment-prep restarting Cassandra on deployment-restbase01
- 17:50 urandom: re-enabling puppet on deployment-restbase02
- 17:47 urandom: re-enabling puppet on deployment-restbase01
- 10:32 hashar: Refreshing all jobs in Jenkins 'jenkins-jobs --conf jenkins_jobs.ini update config/jjb'
2017-01-16
- 09:33 hashar: integration nuked the Zuul merger path for SelectTag mw extension ( on scandium /srv/ssd/zuul/git/mediawiki/extensions/SelectTag ) Failed to merge https://gerrit.wikimedia.org/r/#/c/331974/
2017-01-12
- 00:33 legoktm: deploying https://gerrit.wikimedia.org/r/331796 and https://gerrit.wikimedia.org/r/331795
2017-01-11
- 18:07 urandom: restarting restbase cassandra nodes
- 18:01 urandom: disabling puppet on restbase cassandra nodes to experiment with prometheus exporter
2017-01-10
- 23:07 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/331099
2017-01-08
- 05:20 Krenair: deployment-stream: live hacked /usr/lib/python2.7/dist-packages/socketio/handler.py a bit (added apostrophes) to try to make rcstream work
2017-01-07
- 10:17 Amir1: ladsgroup@deployment-tin:~$ mwscript updateCollation.php --wiki=fawiki (T139110)
2017-01-06
- 16:31 hashar: Nodepool Image ci-jessie-wikimedia-1483719758 in wmflabs-eqiad is ready
- 16:24 hashar: Nodepool Image ci-trusty-wikimedia-1483719370 in wmflabs-eqiad is ready
- 04:56 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/330843
2017-01-05
- 17:20 hashar: Dropping puppet source from https://doc.wikimedia.org/puppetsource/ . contint1001: sudo rm -fR /srv/org/wikimedia/doc/puppetsource (T143233)
2017-01-04
- 21:29 mutante: deployment-cache-text-04 - running acme-setup command to debug .. Creating CSR /etc/acme/csr/beta_wmflabs_org.pem
- 21:26 Krenair: trying to troubleshoot puppet by stopping nginx then letting puppet start it
- 21:05 mutante: deployment-cache-text04 stopping nginx service, running puppet to debug dependency issue
- 09:41 hashar: integration: pruning /srv/pbuilder/aptcache/ on Jessie perm slaves
2017-01-02
- 11:22 hashar: Nodepool Image ci-jessie-wikimedia-1483355768 in wmflabs-eqiad is ready
- 11:17 hashar: Jessie images have the wrong python-pbr version ( T153877 ) causing zuul-cloner to fail. Refreshing image
- 10:02 hashar: Nodepool Image ci-jessie-wikimedia-1483350885 in wmflabs-eqiad is ready
- 09:57 hashar: Nodepool Image ci-trusty-wikimedia-1483350368 in wmflabs-eqiad is ready