Nova Resource:Ores/SAL

From Wikitech

2022-05-11

  • 10:30 taavi: deleted calbon-test.ores VM T308102

2020-11-16

  • 21:09 Amir1: deleted six ores.wmflabs.org VMs for downsizing it (T242819)

2020-02-14

  • 20:18 andrewbogott: adding profile::rsyslog::kafka_shipper::kafka_brokers: [] to project-wide puppet setting
  • 20:16 andrewbogott: adding profile::ores::logstash_host: localhost to project-wide puppet config

2020-02-12

  • 15:28 halfak: staging ores-wmflabs-deploy:4f0a14e

2019-10-10

  • 19:20 halfak: quiet icinga-wm in #wikimedia-ai
  • 16:25 halfak: unquiet icinga-wm in #wikimedia-ai

2019-10-08

  • 14:07 halfak: Restarting nginx on ores-lb-03
  • 13:49 halfak: Temporarily stopped nginx on ores-lb-03
  • 13:25 halfak: running FLUSHDB on ores-redis-02
  • 13:13 halfak: restarting celery workers on ores-worker-01,02,03,04

2019-09-25

  • 18:13 halfak: restarted redis services on ores-redis-02 (again)
  • 18:12 halfak: deleting AOF file on ores-redis-02
  • 18:03 halfak: restarted redis services on ores-redis-02

2019-09-13

  • 21:52 halfak: switched ores-lb-03 to point to ores-web-04/05/06
  • 21:17 andrewbogott: moving ores-redis-02 to cloudvirt1030
  • 21:06 halfak: restarting redis services on ores-redis-02
  • 20:53 halfak: ran 'sudo service celery-ores-worker restart' on celery-worker-02

2019-09-12

  • 21:20 bd808: Forcing puppet run on ores-web-04.ores.eqiad.wmflabs to see if I can fix ssh access for halfak

2019-09-10

  • 15:27 halfak: rebuilt ores-worker-03 with role::labs::ores::worker
  • 14:26 halfak: restarting ores-uwsgi service on ores-web-01

2019-09-09

  • 22:08 halfak: deleting ores-worker-03 because it didn't work
  • 21:50 halfak: configured ores-worker-03 with role::labs::ores::worker

2019-09-06

2019-09-04

  • 20:20 halfak: ran "flushall" on both redis instances (ports 6379 and 6380)
  • 20:09 halfak: restarting all labs VPN nodes.
  • 15:33 halfak: ran "sudo service nginx restart" on ores-lb-03

2019-06-14

  • 22:06 halfak: ran "sudo service nagios-nrpe-server restart" on ores-web-01

2019-06-11

  • 18:50 halfak: deploying ores-wmflabs-deploy:079f18c
  • 18:48 halfak: staging ores-wmflabs-deploy:079f18c
  • 17:19 halfak: deploying ores-wmflabs-deploy:989c436
  • 17:16 halfak: staging ores-wmflabs-deploy:989c436

2019-06-06

  • 21:42 halfak: staging ores-wmflabs-deploy:989c436

2019-04-15

2019-03-18

  • 14:05 halfak: deploying ores-wmflabs-deploy:9a0e589
  • 14:02 halfak: staging ores-wmflabs-deploy:9a0e589

2019-02-21

  • 15:29 halfak: staging ores-wmflabs-deploy:ea153f6

2019-02-08

  • 16:02 halfak: staging ores-wmflabs-deploy:5bd6b61

2019-01-31

  • 12:05 arturo: VM instances ores-puppetmaster-01 and ores-sentinel-01 were stopped briefly due to issue in hypervisor (T215012)

2019-01-07

  • 21:13 halfak: deploying ores-wmflabs-deploy:5634d1
  • 21:08 halfak: staging ores-wmflabs-deploy:5634d1

2019-01-04

  • 18:17 halfak: deploying ores-wmflabs-deploy:f3dbd2
  • 18:13 halfak: staging ores-wmflabs-deploy:f3dbd2
  • 16:06 halfak: ran "sudo service uwsgi-ores restart" on ores-staging-01
  • 15:39 halfak: staging ores-wmflabs-deploy:7a3bdf

2018-11-07

  • 19:05 Amir1: ores:f2f085d is going prod

2018-11-03

  • 13:53 Amir1: restarting ores-redis-02 under pressure

2018-10-11

  • 18:10 Amir1: 3667aac is going prod (labs)
  • 18:06 Amir1: 3667aac is going staging

2018-09-26

  • 10:47 Amir1: 1b5efba is being deployed
  • 10:37 Amir1: 1b5efba is going to staging

2018-09-19

  • 18:46 Amir1: 2bd8de3 is going to staging and prod

2018-09-05

  • 12:26 Amir1: ceb27b2 is going to prod
  • 12:23 Amir1: ceb27b2 is going to staging

2018-08-31

  • 17:06 Amir1: deploying ec9548f into prod
  • 13:33 Amir1: spinning up ores-lb-03 to replace ores-lb-02
  • 13:07 Amir1: c3396e3 is going to staging

2018-08-29

  • 17:52 Amir1: deleting ores-worker-03 and ores-worker-04

2018-08-27

  • 15:13 Amir1: ac18173 is going to prod
  • 15:06 Amir1: ac18173 is going to staging

2018-08-23

  • 19:59 Amir1: shutting down ores-worker-05
  • 19:52 Amir1: shutting down ores-worker-06

2018-05-10

  • 21:03 halfak: deploying ores-wmflabs-deploy:346e198
  • 21:01 halfak: staging ores-wmflabs-deploy:346e198
  • 19:52 halfak: deploying ores-wmflabs-deploy:04103d2
  • 19:15 halfak: staging ores-wmflabs-deploy:d30df05

2018-01-17

  • 21:44 halfak: created ores-misc-01 as Debian Stretch instance
  • 21:42 halfak: deleted ores-misc-01
  • 21:30 halfak: deploying ores-wmflabs-deploy:96d7f12
  • 21:27 halfak: staging ores-wmflabs-deploy:96d7f12

2018-01-16

  • 23:36 halfak: created ores-worker-05/06 as Debian Stretch
  • 23:18 halfak: deleting ores-worker-08/09/10
  • 23:14 halfak: shutting down ores-worker-08/09/10
  • 22:33 halfak: creating ores-worker-01/02/03/04 as stretch instances.
  • 22:27 halfak: deleting ores-worker-05/06/07
  • 22:25 halfak: shutting down ores-worker-05/06/07
  • 22:23 halfak: deleting ores-web-03/05
  • 22:15 halfak: pooling ores-web-01/02
  • 22:06 halfak: created ores-web-02 as a debian stretch instance.
  • 22:01 halfak: deleted ores-redis-01 (seemingly unused)

2018-01-11

  • 20:06 halfak: created ores-web-01 as Stretch instance

2018-01-05

  • 18:26 halfak: deleting ores-staging-02 to recreate as a stretch instance.

2017-12-02

  • 16:54 halfak: deploying ores-wmflabs-deploy:8cebc43
  • 16:50 halfak: staging ores-wmflabs-deploy:8cebc43

2017-11-27

  • 21:03 halfak: deployed ores-wmflabs-deploy:02aebc5
  • 20:59 halfak: staged ores-wmflabs-deploy:02aebc5

2017-10-07

  • 17:16 halfak: deploying ores-wmflabs-deploy:3e854cf
  • 17:10 halfak: staging ores-wmflabs-deploy:3e854cf

2017-10-05

  • 19:13 halfak: deployed ores-wmflabs-deploy:5b61eec (rollback)
  • 18:56 halfak: deployed ores-wmflabs-deploy:b7c5337
  • 18:55 halfak: staged ores-wmflabs-deploy:b7c5337

2017-07-25

  • 16:48 halfak: deploying ores-wmflabs-deploy:68326ba
  • 16:12 halfak: staging ores-wmflabs-deploy:68326ba

2017-07-17

  • 15:42 halfak: deleting ores-web-04 (intermittent puppet issues and wasn't pooled anyway)

2017-07-06

  • 00:07 bd808: bd808's root key is rejected by ores-compute-01.ores.eqiad.wmflabs. Puppet busted there?

2017-06-28

  • 17:55 halfak: deleting mediawiki-ores and hashing-vector instances from ores-staging (unused)

2017-06-22

  • 00:03 awight: labs ores was down 17:13-17:22 due to maintenance.

2017-06-16

  • 21:37 halfak: deploying ores-wmflabs-deploy:e8f6932
  • 21:34 halfak: staging ores-wmflabs-deploy:e8f6932
  • 21:06 halfak: deploying ores-wmflabs-deploy:c952f57 (reverting due to issues)
  • 20:52 halfak: deploying ores-wmflabs-deploy:7529113
  • 20:26 halfak: staging ores-wmflabs-deploy:7529113

2017-05-20

  • 14:47 halfak: staging ores-wmflabs-deploy:9df7b5a
  • 14:44 halfak: staging ores-wmflabs-deploy:ed1991c

2017-04-25

  • 21:33 halfak: deployed ores-wmflabs-deploy:dc934e8
  • 21:30 halfak: staged ores-wmflabs-deploy:dc934e8

2017-04-06

  • 16:00 halfak: deploying ores-wmflabs-deploy:7061f0a
  • 15:56 halfak: staging ores-wmflabs-deploy:7061f0a

2017-03-28

  • 15:58 halfak: deployed ores-wmflabs-deploy:18beebf (T160638)
  • 15:57 halfak: deployed ores-wmflabs-deploy:18beebf
  • 15:48 halfak: staged ores-wmflabs-deploy:18beebf

2017-03-26

  • 07:39 Amir1: restart ores-web-03

2017-03-15

  • 19:26 halfak: deployed ores-wmflabs-deploy:0990b1d
  • 19:24 halfak: staged ores-wmflabs-deploy:0990b1d
  • 19:03 halfak: deployed ores-wmflabs-deploy:080418d
  • 17:36 halfak: deployed ores-wmflabs-deploy:080418d to staging

2017-03-06

  • 20:26 halfak: restarted precached and uwsgi-ores on ores-web-03

2017-02-23

  • 18:55 halfak: restarted uwsgi-ores and precached on ores-web-03 (memory issues)

2017-02-21

  • 03:40 halfak: restarted uwsgi and precached on ores-web-03

2017-02-19

  • 15:54 Amir1: (except redis ones)
  • 15:54 Amir1: rebooting all instances

2017-02-11

  • 18:12 halfak: deployed ores-wmflabs-deploy:f58d3fd
  • 18:09 halfak: staged ores-wmflabs-deploy:f58d3fd

2017-02-01

  • 23:39 halfak: deployed ores-wmflabs-deploy:ce8861d

2017-01-25

  • 21:20 halfak: deployed ores-wmflabs-deploy:0f90516

2017-01-18

  • 19:47 halfak: restarted precached and uwsgi-ores on ores-web-03. Memory usage issues.

2017-01-15

  • 03:03 Amir1: restarting ores-web-03

2017-01-04

  • 12:51 halfak: ran 'sudo service uwsgi-ores restart' on ores-web-03
  • 11:41 Amir1: stopping precaching in ores in labs to reduce load
  • 11:40 Amir1: reboot ores-lb-02

2016-11-22

  • 16:20 mutante: ores-web-03 has an issue (Socket timeout, CRIT)

2016-11-07

  • 20:45 halfak: restarted uwsgi-ores and precached on ores-web-03

2016-11-03

  • 17:36 halfak: manually ran "sudo mkdir /srv/log; sudo mkdir /srv/log/ores; sudo service celery-ores-worker restart" on ores-worker-(05|06|07|08|09|10)
  • 17:23 halfak: deployed ores-wmflabs-deploy:ffb5dbb
  • 17:16 halfak: deployed ores-wmflabs-deploy:ffb5dbb to staging (fixes logging bug)
  • 17:04 halfak: deployed ores-wmflabs-deploy:74699db to staging

2016-09-25

  • 16:03 Amir1: deploying c2ba205 (T146581)
  • 15:59 Amir1: c2ba205 going to staging
  • 15:26 halfak: restarted precached and uwsgi-ores on ores-web-03

2016-09-23

  • 20:49 halfak: restarted ores-uwsgi and precached services on ores-web-03

2016-09-13

  • 03:03 halfak: deployed ores-wmflabs-deploy:022fb7d

2016-09-12

  • 13:47 halfak: restarted precached service on ores-web-03
  • 04:31 halfak: manually killed since "python" process consuming 80% of memory and restarted uwsgi-ores service on ores-web-03
  • 04:29 halfak: restarted uwsgi-ores service on ores-web-03

2016-09-02

  • 20:25 halfak: deployed ores-wmflabs-deploy:751860f

2016-08-29

  • 14:03 chasemp: reboot 94886e74-5be4-4669-a1c1-840ce7c65de9 ores-compute-01

2016-08-24

  • 22:00 halfak: restarted uwsgi-ores on ores-web-05

2016-08-19

  • 21:10 halfak: deployed ores-wmflabs-deploy:f0fc59b

2016-08-12

  • 18:25 halfak: ran FLUSHALL on ores-redis-02:6380
  • 18:24 halfak: deployed ores-wmflabs-deploy:b015348

2016-08-09

  • 22:08 halfak: restarted precached service on ores-web-03
  • 22:08 halfak: deployed ores-wmflabs-config:1b1c56d

2016-08-08

  • 20:44 Amir1: depooling ores-web-04

2016-08-05

  • 17:44 halfak: repooled ores-web-04 in hiera and ran puppet on ores-lb-02
  • 16:15 halfak: creates ores-web-04 debian-8.5-jessie
  • 16:11 halfak: terminated ores-web-04 (re-creating instance with medium image size)

2016-08-04

  • 18:02 halfak: rebooting ores-web-05 to address OOM
  • 17:56 halfak: reduced web workers per core from 28 to 16

2016-07-29

  • 00:20 halfak: deployed ores-wmflabs-deploy:770d131

2016-07-13

  • 18:20 Amir1: ladsgroup@ores-lb-02:~$ sudo service nginx restart
  • 17:21 Amir1: deploying to web again
  • 17:14 Amir1: ladsgroup@ores-web-05:~$ sudo puppet agent -tv
  • 16:32 halfak: terminated ores-web-05
  • 15:58 halfak: repooled ores-web-04
  • 15:20 halfak: deployed ores-wmflabs-deploy:e638f1b

2016-07-12

  • 11:25 Amir1: deploying aad92ac to web and worker nodes
  • 11:18 Amir1: aad92ac goes to staging

2016-07-08

  • 20:20 halfak: deployed ores-wmflabs-deploy:3f116cd

2016-06-10

  • 20:25 halfak: deleted 99-redis.yaml and restarted celery-ores-worker on all worker nodes
  • 07:36 Amir1: deploying 4efc5b7 into prod

2016-06-03

  • 23:57 Amir1: removing role role::labs::ores::flower from ores-web-03

2016-05-27

  • 13:31 Amir1: adding role::deployment::server to ores project

2016-05-26

  • 10:05 Amir1: rebooting ores-web-04
  • 10:02 Amir1: deploying 4dc6050 into web and worker nodes, updating to revscoring 1.2.6

2016-05-25

  • 19:33 Amir1: running puppet agent manually in ores-web-03
  • 19:29 Amir1: deploying 7992fd1 into web and worker nodes

2016-05-23

  • 22:38 Amir1: precaching brought back online
  • 22:33 Amir1: stopping precaching in ores-web-03 manually. Testing something

2016-05-18

  • 16:48 Amir1: running puppet agent on ores-lb-02 manually

2016-05-16

  • 13:00 Amir1: deploying e51c4f3 to celery and web nodes

2016-05-09

  • 07:52 Amir1: manually restarted nginx load balancer (ores-lb-02). It wasn't needed since puppet did it already.

2016-04-30

  • 18:05 Amir1: deployed d4f63a3 to web and worker nodes
  • 17:49 Amir1: deploy d4f63a3 to the staging
  • 17:33 Amir1: deploying 30ba552 to the staging
  • 14:54 Amir1: running puppet agent manually in ores-web-03
  • 14:52 Amir1: added precaching role to ores-web-03

2016-04-27

  • 20:48 halfak: deployed ores-wikimedia-config:6453fe5

2016-04-23

  • 15:25 Amir1: sudden disruption due to an unexpected deploy

2016-04-07

  • 14:32 halfak: deployed ores-wikimedia-config:34ba0fa

2016-03-30

  • 16:13 halfak: deployed ores-wikimedia-config:58905c5

2016-03-23

  • 17:39 halfak: deployed ores-wikimedia-config:39b622e
  • 01:24 halfak: deployed ores-wikimedia-config:8af4377

2016-03-19

  • 20:04 halfak: ores-web-01/02 and ores-worker-01/02/03/04 deleted. ores-web-03/04/05 and ores-worker-05/06/07/08/09/10 started and configured as replacements.
  • 16:43 halfak: Manually ran `sudo apt-get install aspell-ar aspell-pl` across web and worker nodes
  • 15:48 halfak: Ran puppet and restarted uwsgi on web-01 with 28 forks rather than 32
  • 15:32 halfak: workers_per_core 32 --> 28

2016-03-18

  • 01:17 mutante: while doing that i noticed that ores-web-02 has a "Cannot allocate memory" problem
  • 01:17 mutante: but no class names have changed and confirmed no-op on every single instance
  • 01:17 mutante: merged https://gerrit.wikimedia.org/r/#/c/270102/ - role classes have been moved to modules/role and split into one file per class
  • 01:07 mutante: testing puppet on all instances before merging change that moves role classes to module

2016-03-05

  • 17:58 halfak: Shut down the uwsgi service and started up the uwsgi-ores-web service on web-01 and -02.

2016-03-04

  • 22:54 andrewbogott: migrating ores-worker-02 to labvirt1010

2016-02-15

  • 16:26 halfak: restarted workers on ores-worker -02 and -03
  • 04:01 halfak: Amending last message -- Restarted the uwsgi-ores-web service on *ores-web-01*. The service on ores-web-02 was left alone.
  • 03:59 halfak: Restarted the uwsgi-ores-web service on ores-web-02

2016-02-07

  • 19:21 halfak: Deployed ores-wikimedia-config:0be5afc

2015-12-30

  • 20:23 halfak: Deployed with wb-vandalism:d940cea
  • 20:22 YuviPanda: create ores-cache-01

2015-12-21

  • 20:28 YuviPanda: depool ores-web-01 from lb

2015-12-16

  • 14:24 YuviPanda: updated quota for instances to arbitrary number (40)

2015-12-08

  • 17:34 halfak: Deployed 1ad37c5 with ores:b745570, revscoring:, editquality:b41b7c1, wikiclass:bbfa9ce, and wb-vandalism:1075596

2015-11-12

  • 20:50 halfak: Deployed 334b4b2 with ores:b745570, revscoring:, editquality:b41b7c1, wikiclass:bbfa9ce, and wb-vandalism:a09d026

2015-10-28

  • 21:05 halfak: Deployed ores-wikimedia-config:c27c8f7, ores==0.5.4, wb-vandalism==0.1.5, revscoring==0.6.7

2015-09-12

  • 02:30 halfak: Deployed ores-wikimedia-config:c852404, ores==0.5.0 and revscoring==0.6.1

2015-09-11

  • 14:51 halfak: removed ores-web-02 from ores-lb-02 pool
  • 03:31 YuviPanda: restart redis server on ores-redis-02 to apply tcp-keepalive

2015-09-10

  • 18:28 YuviPanda: disabled restart crontab on ores-worker-01 under halfak's user, have set it to keep insane levels of logs to debug

2015-09-08

  • 16:35 halfak: deployed ores-wikimedia-config:ca10888, ores==0.4.0 and revscoring==0.5.0
  • 16:34 valhallasw`cloud: should work now

