Nova Resource:Quarry/SAL

From Wikitech

2024-04-01

  • 12:48 Rook: quarry moved to k8s T349032

2024-03-11

  • 13:10 andrewbogott: deleting long-shutdown quarry-puppet-master-02

2024-02-02

  • 20:00 andrewbogott: rebuilding trove instances with new antelope guest image

2023-12-04

  • 15:16 dcaro: re-enable puppet that has been too long disabled (T348748)

2023-10-24

  • 12:56 Rook: minikube helm chart T301469

2023-05-29

2023-05-27

  • 21:50 framawiki: shutdown potentially unused servers: quarry-nfs-1, quarry-puppet-master-02, quarry-dev-03 (this last one is started when there is need to test patches)

2023-05-19

  • 09:59 wm-bot2: added user isaacj to the project as reader (T337019) - cookbook ran by arturo@endurance

2023-04-18

  • 19:24 Rook: remove db entries ending in semicolon 91d66e5

2023-02-28

2023-02-27

2023-01-04

  • 16:41 Rook: Fix various outdated URLs in Quarry website footer

2022-12-04

  • 16:23 dcaro: restarted uwsgi on quarry-web-02 as it was getting out of memory errors (and failing puppet)

2022-09-20

2022-09-09

2022-08-29

2022-08-10

2022-08-08

2022-07-06

  • 08:40 dcaro: rebooting worker-04 due to being unable to ssh to it (things started segfaulting, then too much work for irq)

2022-06-23

2022-06-22

2022-06-21

2022-06-16

2022-06-06

2022-05-17

2022-05-16

2022-04-18

2022-04-04

  • 13:00 taavi: delete quarry-db-01

2022-03-25

  • 12:04 dcaro: rebooting quarry-worker-04.quarry.eqiad1.wikimedia.cloud due to stuck nfs (T304681)

2022-03-21

  • 17:30 Rook: updating home page link to profile T85175

2022-02-20

  • 19:49 andrewbogott: moving nfs service from quarry-nfs-1 (bullseye) to quarry-nfs-2 (buster), testing to see if T302154 is a kernel or nfs-version issue
  • 19:23 taavi: hard rebooted quarry-nfs-1 again T302154

2022-02-19

  • 14:04 taavi: reboot quarry-nfs-1 T302154

2022-02-11

  • 21:11 andrewbogott: switching shared nfs project dir (again) to internal nfs server quarry-nfs-1

2022-02-10

  • 19:13 andrewbogott: rebooting all VMs to switch to new NFS server

2022-01-29

2021-09-27

  • 17:23 mdipietro: added stopped status T289349

2021-09-07

  • 13:10 mdipietro: tab will close autocomplete window T289872

2021-09-03

  • 17:39 andrewbogott: restarting celery workers and reloading web UI to pick up timeout changes
  • 16:45 bstorm: set live wait_timeout variable to 28800 (the default) on the trove instance T290291

2021-09-01

  • 21:56 andrewbogott: switched /srv/quarry to branch 'master' on quarry-worker-03, quarry-worker-04, quarry-web-02
  • 18:26 bstorm: started instance quarry-dbbackup-01 T289568
  • 13:07 mdipietro: Updated to Debian Buster/python 3.7 T288528

2021-08-16

  • 10:25 dcaro: Reverting deploy of T287471 - saved queries fail to show the DB field, will open a task
  • 10:21 dcaro: Deploying T287471 2/2 - updating code on the web and worker servers
  • 10:19 dcaro: Deploying T287471 1/2 - creating DB index

2021-05-06

  • 17:57 bstorm: restarting web service to remove banner for wikireplicas upgrade
  • 17:49 bstorm: cleared out tmp files created by quarry web service that had filled the disk with find T282171

2021-04-23

  • 18:51 Framawiki: ran apt updates without issues on all 4 servers. T266386 looks fixed.

2021-04-07

  • 21:06 bstorm: deploying regex fixes T278715

2021-04-02

  • 09:05 Framawiki: shutdown quarry-dev server, normally unused now

2021-03-26

  • 19:27 bstorm: deploying changes to the replica class and restarting things T278544

2021-03-25

  • 22:15 bstorm: removing the querykiller role T264254
  • 22:03 bstorm: restarting celery worker processes to fix connection cleanup T264254
  • 22:01 bstorm: restarting web interface for a small fix for the database field display T264254

2021-03-23

  • 21:45 bstorm: restarting quarry services for the meta_p and centralauth issue T264254
  • 19:17 bstorm: finished updating quarry for multiinstance replicas T264254
  • 18:51 bstorm: running the multiinstance migration script T264254
  • 18:45 bstorm: stopping the quarry web service for the upgrade process
  • 18:43 bstorm: `git stash`ing the in-place test of sentinel in the code checkout
  • 15:08 dcaro: systemctl restart mariadb on quarry-db-01 brought it back to life (T278230)
  • 14:53 andrewbogott: service mariadb start on quarry-db-01

2021-02-20

  • 13:18 wm-bot: framawiki: Deployed f51f9a9 on -web-01 T275277 (not yet merged)
  • 11:09 wm-bot: framawiki: Deployed 15a315a on -web-01 T254847 (`Update document title on title change`)

2021-02-15

  • 19:24 Reedy: stash'd patch saved to quarry-web-01.quarry.eqiad1.wikimedia.cloud:/root/T274815.patch T274815
  • 19:22 Reedy: T274815 filed with the login failure traceback
  • 19:21 Reedy: re-enabled puppet on quarry-web-01.quarry.eqiad1.wikimedia.cloud as it had been disabled for a week
  • 19:20 Reedy: `git stash` framawiki changes as it was breaking login

2021-02-12

  • 17:12 bstorm: started quarry-dev-01 T264254

2020-12-15

  • 17:43 Reedy: quarry-worker-02 `systemctl restart uwsgi-quarry-web.service` again, after pulling patch for T270195
  • 17:43 Reedy: quarry-worker-01 `systemctl restart uwsgi-quarry-web.service` again, after pulling patch for T270195
  • 17:40 Reedy: quarry-web-01 `systemctl restart uwsgi-quarry-web.service` again, after pulling patch for T270195
  • 17:31 Reedy: quarry-web-01 `systemctl restart uwsgi-quarry-web.service`
  • 17:26 Reedy: `find /tmp -type f -mtime +30 -delete;` on quarry-web-01 T270198
  • 17:23 Reedy: `apt-get clean && apt-get autoclean` on quarry-web-01 T270198

2020-10-20

  • 16:11 bstorm: restarted mariadb on quarry-db-01 so it pointed to the right data directory
  • 16:00 andrewbogott: rebooting quarry-web-01; lots of cruft in /tmp
  • 15:56 andrewbogott: restarting nginx on quarry-web-01

2020-09-03

  • 17:14 Framawiki: `framawiki@quarry-web-01:/tmp$ find /tmp/* -mtime +360 -user www-data -exec sudo rm -v {} \;` 775 files deleted for 10G. Again. T261909
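
The recurring `/tmp` cleanup logged above can be approximated with a dry-run-first script. This is a minimal sketch, not the command that was actually run: the function names `find_stale_files` and `remove_stale_files` are hypothetical, the age check is a plain "older than N x 24 hours" comparison (GNU `find -mtime +N` rounds ages down to whole days, so the boundary differs slightly), and only regular files are considered.

```python
import os
import stat
import time
from pathlib import Path


def find_stale_files(root, min_age_days, owner_uid=None, now=None):
    """Return regular files under root last modified more than min_age_days ago.

    owner_uid is an optional filter (hypothetical stand-in for `-user www-data`).
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.lstat()
            except FileNotFoundError:
                continue  # file vanished while walking the tree
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets and other special files
            if owner_uid is not None and st.st_uid != owner_uid:
                continue
            if st.st_mtime < cutoff:
                stale.append(path)
    return sorted(stale)


def remove_stale_files(root, min_age_days, owner_uid=None, dry_run=True):
    """Return (file_count, total_bytes) of stale files; delete them unless dry_run."""
    count = 0
    total_bytes = 0
    for path in find_stale_files(root, min_age_days, owner_uid):
        total_bytes += path.lstat().st_size
        count += 1
        if not dry_run:
            path.unlink()
    return count, total_bytes
```

Calling `remove_stale_files("/tmp", 360)` first with the default `dry_run=True` reports how many files and bytes would be removed, which gives a figure comparable to the "files deleted for 10G" note before anything is actually deleted.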

2020-07-08

  • 19:23 Framawiki: `framawiki@quarry-web-01:/tmp$ find /tmp/* -mtime +360 -user www-data -exec sudo rm -v {} \;` 778 files deleted for 10G.

2020-03-06

  • 19:32 zhuyifei1999_: changed to analytics replica for database queries and restarted celery workers T246970

2020-02-26

  • 20:12 jeh: restart quarry-web-01 and quarry-worker-01

2020-01-14

2020-01-01

2019-11-14

  • 21:59 zhuyifei1999_: `zhuyifei1999@quarry-web-01:/tmp$ find /tmp/* -mtime +360 -user www-data -exec sudo rm -v {} \;` T238375

2019-10-19

  • 15:39 wm-bot: framawiki: Deployed 90a1bef on -web-01 (`query-status.html: fix compiled.js`)
  • 15:18 wm-bot: framawiki: Deployed 1f297c9 on -web-01 T205214 (`query-status.html: hide Explain button until bug is solved`)

2019-10-02

  • 17:20 mutante: - mariadb::packages should now work on buster too, fyi

2019-10-01

  • 11:51 zhuyifei1999_: restart celery-quarry-worker.service

2019-06-28

  • 18:45 zhuyifei1999_: Deployed 2f7ee60 to quarry-web-01
  • 14:37 bstorm_: changed to web replica for database queries and restarted celery workers

2019-06-21

  • 21:34 wm-bot: framawiki: Deployed 5d6844e on -web-01

2019-06-14

  • 20:23 wm-bot: framawiki: Deployed b303ce8 on -web-01

2019-05-25

  • 12:58 framawiki: block spammer `INSERT INTO user_group (user_id, group_name) VALUES (3765, "blocked");`
  • 12:22 wm-bot: framawiki: Deployed cc0c0a7 on -web-01 T224300

2019-05-24

  • 21:00 zhuyifei1999_: masked uwsgi service on quarry-web-01 to prevent future mess-ups
  • 20:59 zhuyifei1999_: reenabled puppet on quarry-web-01, should use uwsgi-quarry-web service not uwsgi service
  • 20:51 zhuyifei1999_: disabled puppet on quarry-web-01 because it wants uwsgi dead

2019-05-12

2019-05-11

  • 22:10 framawiki: re-enable puppet T223018
  • 22:00 framawiki: temporarily disabling puppet on -web-01 to test nginx conf T223018
  • 19:25 framawiki: block spammer `INSERT INTO user_group (user_id, group_name) VALUES (3927, "blocked"), (3958, "blocked"), (3984, "blocked"), (3985, "blocked"), (3986, "blocked");`

2019-05-10

  • 14:00 andrewbogott: restarting uwsgi-quarry-web and nginx on quarry-web-01
  • 07:09 zhuyifei1999_: restarted uwsgi. nginx reports 502

2019-04-05

  • 18:48 zhuyifei1999_: checked out FETCH_HEAD on quarry-web-01 T209226
  • 18:43 zhuyifei1999_: applied 0001-SECURITY-escape-CSV-injections.patch on quarry-web-01 and restarted uwsgi T209226

2019-03-16

  • 08:31 framawiki: restarted uwsgi to deal with 502 nginx errors `sudo systemctl restart uwsgi-quarry-web`

2019-03-02

2019-02-21

  • 09:29 gtirloni: applied CSP change T214637
  • 09:22 gtirloni: updated and rebooted all servers (debian 9.8)

2019-02-20

  • 20:59 wm-bot: framawiki: Deployed 8f72587 on -web-01 T216581
  • 20:38 framawiki: re-activating puppet on -web-01, csp conf looks good T214637
  • 20:15 framawiki: temporarily disabling puppet on -web-01 to test csp conf T214637

2019-02-18

2018-12-24

2018-12-02

2018-11-27

  • 18:54 zhuyifei1999_: triggered OOM killer on quarry-worker-02 9 times T188564

2018-11-16

2018-11-12

2018-11-05

  • 18:49 zhuyifei1999_: `UPDATE query join query_revision on query.latest_rev_id = query_revision.id join query_run on latest_run_id = query_run.id SET status=1 where (status = 2 or status = 0) and query_run.timestamp <= DATE_ADD(NOW(), INTERVAL -30 MINUTE);` 286 rows affected.
  • 18:43 framawiki: migration is over T207677
  • 18:33 zhuyifei1999_: flushed redis with flushall
  • 18:22 zhuyifei1999_: unset db read-only `SET GLOBAL read_only = 0; UNLOCK TABLES;` T207677
  • 18:01 andrewbogott: moving instances from eqiad to eqiad1-r
  • 17:40 zhuyifei1999_: set db read-only `FLUSH TABLES WITH READ LOCK; SET GLOBAL read_only = 1;` T207677
  • 17:24 zhuyifei1999_: shutting down all workers `sudo kill -TERM $(systemctl show -p MainPID celery-quarry-worker.service | cut -d= -f2)` T207677

2018-11-04

  • 22:24 zhuyifei1999_: checked out FETCH_HEAD 8c065d0, previous head was 71643b6 on quarry-web-01
  • 17:37 framawiki: deployed c10fc32 and 71643b6 on quarry-web-01

2018-10-21

2018-10-15

  • 18:52 framawiki: deployed c1dfde7 on quarry-web-01, quarry-worker-0{1,2} T126888

2018-10-06

  • 16:30 framawiki: deployed 8550956 on quarry-web-01

2018-09-26

  • 13:24 zhuyifei1999_: restarted mariadb on -db-01 after max_allowed_packet fix

2018-09-25

  • 12:09 arturo: make myself projectadmin

2018-09-24

  • 22:33 framawiki: manually clear queries and resultsets where userid=3214 T205286
  • 22:31 framawiki: `update query_run set status=3 where id=290865;` on quarry-db-01 to mark a ghost query as killed on the ui
  • 21:50 framawiki: `select * from query_run where id=290865;` on quarry-db-01 to mark a ghost query as killed on the ui
  • 17:54 framawiki: quarry-db-01: `INSERT INTO user_group (user_id, group_name) VALUES (3214, "blocked");` T205286
  • 17:53 framawiki: deployed 028a292 on quarry-web-01 T205286 T104322

2018-09-23

2018-09-21

2018-09-19

2018-09-17

2018-09-16

2018-09-13

  • 22:10 zhuyifei1999_: purging stuff created by labs_debrepo T153615
  • 21:40 zhuyifei1999_: deployed 8b4bde0 to quarry-web-01 T204277
  • 19:19 framawiki: deleted legacy instances quarry-main-01 and quarry-runner-0{1,2}, migration is over T202588
  • 19:10 framawiki: copy /var/log/nginx from legacy main-01 to /data/project/nginx-logs-legacy-20180913-framawiki for further analysis T202588 T197256

2018-09-12

  • 21:21 zhuyifei1999_: unset read-only again on new database
  • 21:15 zhuyifei1999_: `sudo chown quarry:quarry /data/project/quarry/ -Rv` T202588
  • 21:13 zhuyifei1999_: set read-only again on new database because new quarry's UID is 498 T202588
  • 21:10 zhuyifei1999_: unset read-only again on new database
  • 21:09 zhuyifei1999_: rm'ed /var/lib/mysql on -db-01, we are using /srv/sqldata/ now, and if something goes really wrong, better to have a loud failure
  • 21:03 zhuyifei1999_: deployed 8b4bde0 to -web-01 T204161
  • 21:01 zhuyifei1999_: deployed 461e56c
  • 20:45 zhuyifei1999_: set read-only again on new database due to a bug in worker code
  • 20:42 zhuyifei1999_: unset read-only on new database T202588
  • 20:41 framawiki: switched quarry.wmflabs.org proxy to new quarry-web-01.quarry.eqiad.wmflabs T202588
  • 20:27 zhuyifei1999_: backed up old db to /data/project/dump-2018-09-12.sql and restoring to new server T202588
  • 20:03 zhuyifei1999_: set quarry-main-01 mariadb read-only T202588
  • 20:02 zhuyifei1999_: stopped celery-quarry-worker on quarry-runner-0{1,2} T202588
  • 19:45 zhuyifei1999_: created new quarry database and user in quarry-db-01.quarry.eqiad.wmflabs T202588

2018-09-11

  • 17:22 zhuyifei1999_: doing another backup of main db: `sudo mysqldump quarry | sudo tee /data/project/dump-$(date '+%Y-%m-%d').sql > /dev/null` T202588
  • 17:14 zhuyifei1999_: disabling puppet on quarry-main-01, quarry-runner-0{1,2} T202588

2018-09-07

  • 21:06 zhuyifei1999_: reverted hotpatch, deployed till 3375dc3
  • 20:47 zhuyifei1999_: hotpatch /etc/uwsgi/apps-enabled/quarry-web.ini processes 8 -> 1 for some gdb-ing
  • 19:56 framawiki: deployed 501695f to quarry-main-01 (T202588)
  • 18:11 framawiki: deployed 769cace to quarry-main-01 (T202588)

2018-08-24

2018-05-31

  • 15:52 zhuyifei1999_: live-patch `/srv/quarry/quarry/web/connections.py` on `quarry-main-01` and restart uwsgi

2018-05-07

  • 22:40 framawiki: deployed 24038e3 to quarry-main-01

2018-05-04

  • 02:05 zhuyifei1999_: Deployed 6069904

2018-05-02

  • 18:23 framawiki: deployed af2f7e6 to quarry-main-01
  • 18:04 zhuyifei1999_: Deployed f4e86f1 and restarted everything

2018-04-24

  • 21:36 framawiki: removing old /srv/venv on quarry-main-01 T192731

2018-04-23

  • 22:50 zhuyifei1999_: Does quarry only have an effective concurrency limit of 3, despite having a few dozen celery worker processes?!
  • 22:46 zhuyifei1999_: behaving abnormally. https://quarry.wmflabs.org/query/26629 has been queued for 16 mins... (hopefully) have some time to investigate

2018-04-18

  • 23:21 framawiki: deployed 02049d9 to quarry-main-01
  • 22:21 zhuyifei1999_: +Framawiki project admin & Gerrit +2
  • 19:35 zhuyifei1999_: deployed c6cd55e to quarry-main-01
  • 17:36 zhuyifei1999_: deployed 8eeeff8 to quarry-main-01

2018-04-17

  • 23:00 zhuyifei1999_: forgot to restart uwsgi on last deployment. restarted it now
  • 00:34 zhuyifei1999_: Deploy b5fd6b0 on quarry-main-01

2018-03-24

2018-03-16

2018-03-15

  • 19:27 zhuyifei1999_: switch back to d9cc1c8 on both hosts
  • 16:29 zhuyifei1999_: quarry-runner-02 is on d9cc1c8
  • 16:21 zhuyifei1999_: installed python-dbg on quarry-runner-02 because it's so good
  • 16:18 zhuyifei1999_: depool quarry-runner-01
  • 15:56 zhuyifei1999_: deploying d653400 to quarry-runner-0{1,2} T188564

2018-03-01

  • 18:41 zhuyifei1999_: deploying d5e2845 to quarry-runner-01 & 02
  • 00:37 zhuyifei1999_: `UPDATE query join query_revision on query.latest_rev_id = query_revision.id join query_run on latest_run_id = query_run.id SET status=1 where (status = 2 or status = 1) and query_run.timestamp <= DATE_ADD(NOW(), INTERVAL -1 HOUR);` 251 rows affected (1.81 sec) T139162 T172086 T188564

2018-02-28

  • 22:57 zhuyifei1999_: killed two IO-intensive query saves

2018-02-09

  • 01:06 bd808: Removed TestingAccount2 at user request (T186289)
  • 01:06 bd808: Removed Yuvipanda at user request (T186289)

2018-01-02

  • 10:52 zhuyifei1999_: deploying d9cc1c8 to quarry-runner-01 & 02 T172143

2017-12-13

  • 19:19 zhuyifei1999_: Deployed 62676f2 to quarry-main-01 and restarted uwsgi

2017-12-10

  • 05:52 zhuyifei1999_: deployed e835a46 to quarry-main-01 and restarted uwsgi T165169
  • 05:49 zhuyifei1999_: quarry-main-01: `ALTER IGNORE TABLE star ADD UNIQUE INDEX star_user_query_index (user_id, query_id);` Records: 728 Duplicates: 17 Warnings: 0 T165169

2017-12-05

  • 18:56 zhuyifei1999_: quarry-main-01: `MariaDB [quarry]> UPDATE user SET username = '-revi' WHERE username = 'Hym411';` T182064

2017-10-02

  • 05:15 zhuyifei1999_: Deployed 644b293 to quarry-main-01 and restarted uwsgi

2017-09-26

  • 04:08 zhuyifei1999_: Restarting service 'uwsgi-quarry-web' on quarry-main-01, 'celery-quarry-worker' on quarry-runner-01 & quarry-runner-02 T176694
  • 03:59 zhuyifei1999_: Switching REPLICA_HOST from 'enwiki.labsdb' to 'enwiki.analytics.db.svc.eqiad.wmflabs' T176694 (Executing `sudo -- sudo -u quarry sed -i 's/enwiki.labsdb/enwiki.analytics.db.svc.eqiad.wmflabs/' /srv/quarry/quarry/config.yaml` on all hosts)

2017-09-10

2017-09-08

2017-09-06

  • 00:05 zhuyifei1999_: backup quarry main database to /data/project/dump-2017-09-05.sql because I fear it will die :(. Executing `sudo mysqldump quarry | sudo tee /data/project/dump-2017-09-05.sql > /dev/null`

2017-08-11

  • 11:12 zhuyifei1999_: deployed 2834160 on quarry-main-01

2017-08-01

2017-07-31

  • 16:31 zhuyifei1999_: Repeated for quarry-main-01, but restarted uwsgi T146483
  • 16:30 zhuyifei1999_: Repeated for quarry-runner-02
  • 16:28 zhuyifei1999_: Restarted celery-quarry-worker on quarry-runner-01 T146483
  • 16:22 zhuyifei1999_: `zhuyifei1999@quarry-runner-01:/srv/quarry$ sudo git fetch; sudo git checkout 6447943` T146483
  • 01:01 zhuyifei1999_: `zhuyifei1999@quarry-main-01:/srv/quarry$ sudo git fetch; sudo git checkout 7dd8c60; sudo service uwsgi restart` T101424

2017-07-30

  • 21:17 zhuyifei1999_: `sudo service uwsgi restart` T76126
  • 21:14 zhuyifei1999_: `sudo git fetch; sudo git checkout 172eb7e` on /srv/quarry T76126
  • 20:10 zhuyifei1999_: Yuvi gave me access after I asked about T76126

2017-06-30

  • 22:34 bd808: Added BryanDavis (self) as project admin
  • 18:09 bd808: Ran service uwsgi-quarry-web restart on quarry-main-01. People seeing intermittent 502s

2017-06-26

  • 18:33 milimetric: Restarted celery workers on quarry-runner-01 and quarry-runner-02 (systemctl restart celery-quarry-worker.service)
  • 17:20 madhuvishy: Add milimetric as project admin

2017-01-19

  • 12:07 yuvipanda: run chown -R 998:998 quarry/ on labstore1004

2016-10-05

  • 19:48 mutante: quarry-runner-01 has a problem starting exim4
  • 19:47 mutante: merged gerrit 308313 - should definitely be no-op, but noticed that puppet is disabled on quarry-main-01

2016-10-04

  • 19:33 valhallasw`cloud: removed myself as admin

2016-09-23

  • 20:41 yuvipanda: add halfak as projectadmin

2016-05-08

  • 21:35 Krenair: restarted quarry-runner-01 to attempt to get the queue working again - new queries are going through but some old ones (from the last few hours) seem stuck

2016-04-02

  • 14:06 valhallasw`cloud: systemctl restart celery-quarry-worker.service hangs. Will now reboot quarry-runner-02.
  • 14:02 valhallasw`cloud: killed 100% CPU using process on quarry-runner-02 (ptrace suggested some sort of idle loop). Let's see if that has any effect.

July 4

  • 15:45 YuviPanda: deploying to latest master and hoping!

April 30

  • 21:57 andrewbogott: moving quarry-main-01 to labvirt1003
  • 20:53 andrewbogott: moving quarry-runner-01 to labvirt1004
  • 19:41 andrewbogott: cold-migrating quarry-runner-test to labvirt1003

December 8

  • 19:36 YuviPanda: deploying to get in valhallasw`cloud’s patches

September 30

  • 15:44 andrewbogott: enabled puppet on quarry-runner-test, updated, installed a bunch of maria stuff, rebooted

September 16

  • 21:53 YuviPanda: applying db, web and redis roles to quarry-runner-test, will act as db and web host until labs issues clear up

August 21

  • 19:47 YuviPanda: upgraded all text and varchar columns to utf8

August 17

  • 02:03 YuviPanda: increased mysql connection limit manually to 1024, re-running all old query-runs to produce output in new sqlite format

August 14

  • 20:21 YuviPanda: upgrade to MariaDB 10.1, because fuck offline ALTER TABLEs