Nova Resource:Paws/SAL
2024-06-20
- 11:47 taavi@cloudcumin1001: END (PASS) - Cookbook wmcs.openstack.migrate_project_to_ovs (exit_code=0)
- 11:44 taavi@cloudcumin1001: START - Cookbook wmcs.openstack.migrate_project_to_ovs
2024-06-18
- 20:46 andrewbogott: rebuilt magnum cluster using g4/ovs flavors
2024-05-06
- 17:15 Rook: upgrade pywikibot T364188
2024-04-29
- 17:54 Rook: upgrade pywikibot T363131
- 17:54 Rook: jupyterlab to 4.1.8 T363596
- 16:50 Rook: k8s to 1.26 T326985
2024-03-25
- 15:59 Rook: upgrade jupyter chart. New cluster T360643
- 13:01 Rook: use upstream jupyter-rsession-proxy T360800
2024-03-15
- 16:16 Rook: upgrade jupyterlab T360193
2024-03-09
- 16:06 Rook: increase number of worker nodes T359747
2024-03-08
2024-03-07
- 21:01 Rook: increase worker count to manage outreachy load T359591
2024-03-06
- 12:07 Rook: increase capacity for outreachy T359316
2024-03-05
- 13:25 Rook: remove jupyter-dash T358621
2024-03-04
- 13:57 Rook: add wikibase-cli T358649
2024-02-26
- 16:09 Rook: remove nbextension moving away from notebook interface T312234
2024-02-21
- 16:07 Rook: increase prometheus retention T357786
2024-02-20
- 16:17 Rook: upgrade jupyterlab T357990
2024-02-15
2024-02-12
- 15:56 Rook: upgrade OpenRefine T356448
2024-02-09
- 17:48 Rook: jupyterlab upgraded to 4.1.0 T357027
2024-02-01
- 20:17 Rook: prometheus and kube-state-metrics internal to cluster T355179
2024-01-29
- 12:46 Rook: update jupyterlab T355890
2024-01-26
- 15:25 Rook: update to allow for s3 tofu state storage in codfw1dev T355543
2024-01-24
- 12:56 Rook: Remove 123-11 cluster T355785
2024-01-18
- 08:05 Rook: upgrade rstudio-server T355288
2024-01-12
- 14:00 Rook: removed paws-123-10 cluster T354946
2023-12-08
- 14:41 Rook: upgrade OpenRefine T353021
2023-12-06
- 14:28 Rook: pywikibot to 8.6 T352794
2023-11-27
- 12:33 Rook: jupyterlab to 4.0.9 T351726
2023-11-15
- 19:46 Rook: move to opentofu T351249
2023-11-13
2023-11-09
- 14:34 Rook: updated ingress-nginx T347506
2023-11-06
- 17:10 Rook: pywikibot to 8.5.0 T350552
2023-11-03
- 09:27 Rook: bump jupyterlab version T350459
2023-10-24
- 13:03 Rook: removed old cluster T349551
2023-10-23
- 18:35 Rook: deploy new cluster/jupyterhub chart T349545
2023-10-19
- 15:37 Rook: Bump jupyterlab version T349203
- 14:53 Rook: Bump urllib3 from 1.26.17 to 1.26.18
- 12:59 Rook: Bump urllib3 from 1.26.16 to 1.26.17
2023-10-18
2023-10-09
- 15:50 Rook: bump pwb version T348372
2023-09-29
- 09:58 wm-bot2: dcaro@urcuchillay END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0)
- 09:56 wm-bot2: dcaro@urcuchillay START - Cookbook wmcs.openstack.cloudvirt.vm_console
- 09:56 wm-bot2: dcaro@urcuchillay END (ERROR) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=255)
- 09:56 wm-bot2: dcaro@urcuchillay START - Cookbook wmcs.openstack.cloudvirt.vm_console
2023-09-21
- 14:24 Rook: bump pwb version T346912
2023-09-05
- 12:19 Rook: Reduce memory request for single user container T345467
2023-09-01
- 18:45 Rook: increased worker node count as all memory was requested T345462
2023-08-31
- 16:51 taavi: restart uwsgi-toolsdb-replica-cnf-web.service on paws-nfs-2, a required dependency was installed only after last restart which made the service fail
- 13:52 Rook: pwb to version 8.3.2 T345192
2023-08-30
- 12:39 Rook: Upgrade notebook allowing i18n support T121478
2023-08-28
- 13:37 Rook: upgrade node T341716
2023-08-22
- 18:06 Rook: bump jupyterlab version T344265
2023-08-21
- 20:03 Rook: pywikibot version bump T344493
2023-08-18
- 14:07 Rook: unrar added
2023-08-15
- 14:08 Rook: T343116 update helm chart, jupyterlab, jupyterhub, notebook. Dropping sparql to allow for update.
2023-07-28
- 12:26 Rook: pwb version bump T342852
2023-07-17
- 18:33 Rook: update OpenRefine version T341985
2023-07-13
2023-07-12
- 15:22 Rook: paws deploying from tf in codfw1dev T340657
2023-07-10
- 11:18 Rook: shut down systems associated with T341457; only the redirect page should remain.
- 11:00 Rook: removed paws-db-backup-1 T340054
2023-06-30
- 18:51 Rook: https://wikitech.wikimedia.org/wiki/PAWS/Admin updated to describe blue green deployment method using terraform
- 18:48 Rook: terraform working in eqiad1 T340836
- 18:18 Rook: bounced control node as it was getting disk pressure
2023-06-29
- 13:52 Rook: T340698 OpenRefine version bump
2023-06-13
- 18:16 Rook: Rolling jupyterlab back to last known working version T338981
2023-06-12
- 16:02 Rook: updated jupyterlab T338337
2023-06-07
- 10:30 Rook: upgrade jupyterlab T324002
2023-05-15
- 12:00 Rook: implemented limits to storage. Cleared about half the storage used T327936
2023-05-03
2023-04-24
- 13:28 Rook: pywikibot version bump 5466962 T335237
- 11:37 andrewbogott: rebooting paws-k8s-haproxy-2, paws-acme-chief-01, paws-k8s-haproxy-1 -- their load numbers are through the roof for no obvious reason
2023-04-17
- 11:13 Rook: robots.txt to nbserve 1fbc786
2023-04-13
- 21:19 Rook: revert jupyterlab upgrade, causing notebooks to not load T324002
2023-04-12
- 14:42 Rook: upgrade jupyterhub and jupyterlab c345066 T324002
- 10:51 Rook: remove focal from R 7a88b67 T331363
2023-04-10
2023-04-03
2023-03-27
- 19:41 Rook: Upgrade Wikimedia Commons extension for OpenRefine on PAWS to version 0.1.1 4b10d3f T332721
- 17:50 Rook: bump pwb version 7c49518 T333067
2023-03-20
2023-03-16
- 13:13 Rook: pywikibot version bump 6de17db
2023-03-15
- 16:46 Rook: Install Wikimedia Commons extension for OpenRefine aaae2fd
- 16:08 Rook: openrefine version bump 9047213 T331747
2023-03-14
2023-03-13
2023-02-20
2023-02-18
2023-02-16
- 12:58 Rook: moving to new cluster. Old one was restarting hub and couldn't find all of its nodes
2023-02-14
- 13:27 Rook: Bump oauthlib 97f241b
- 08:15 taavi: empty profile::wmcs::paws::control_nodes hiera key to bring PAWS back up (T329581), it contained the hostnames of the old kubeadm backed cluster which should be cleaned up properly in T327674
2023-02-13
2023-01-31
2023-01-30
- 13:51 Rook: Set 1 to = 1 for nfs mounts and variables 09b036e T326675
- 11:53 Rook: updated ingress-nginx to allow larger file (more than 800K) uploads T328168
2023-01-25
2023-01-24
- 18:52 Rook: remove unused lighttpd.conf 0c92944 T322064
- 12:44 Rook: removing unneeded lines 897e64e T322742
- 10:54 Rook: bump certifi version ee13069 T327764
2023-01-23
- 16:21 Rook: remove automatic update for pywikibot 029126a T327668
- 13:09 Rook: remove missed dbProxy entries 7c52b75 T327414
2023-01-19
2023-01-17
2023-01-12
2023-01-11
2023-01-10
2023-01-04
2023-01-03
2022-12-16
- 16:08 dcaro: removing coe cluster rook3 (T325373)
2022-12-13
2022-12-05
2022-12-01
- 08:30 taavi: root@paws-k8s-control-1:~# for cert in etcd-server etcd-peer etcd-healthcheck-client; do kubeadm certs renew $cert; done # T324178
2022-11-28
2022-11-02
2022-10-31
2022-10-27
2022-10-24
2022-10-20
2022-10-17
- 10:04 Rook: update favicon to paws logo T198012 4019bd8
- 05:09 Rook: readme for paws-hub 1d125c2 T308949
- 04:40 Rook: update favicon on hub image T198012 146effd
2022-10-10
- 17:02 Rook: Bump pywikibot to 7.7.1 24d0e22 T320432
- 12:13 Rook: T318279 Tidy R config files c220d16
- 05:58 Rook: Upgrade IRkernel fd62fef T318275
2022-10-05
- 10:40 dcaro: extended the /srv volumes for both prometheus nodes to 15G to give some space for prometheus to shuffle data (the tsdb size is set to 10G, but it needs a bit more space than that)
- 10:18 arturo: aborrero@paws-k8s-control-1:~$ sudo -i kubectl -n prod rollout restart deployment/proxy (T319366)
- 10:02 dcaro: checking the prometheus instance for mounts from labstore
- 10:01 dcaro: checking db-backup for labstore mounts (it should have according to hiera)
- 10:00 dcaro: note that ingress were not mounting labstore already
- 10:00 dcaro: removing labstore mounts on ingress-4
- 09:58 dcaro: removing labstore mounts on ingress-3
- 09:57 dcaro: removing labstore mounts on acme-chief
- 09:56 dcaro: removing labstore mounts on control-3
- 09:54 dcaro: removing labstore mounts on control-2
2022-10-03
2022-09-27
2022-09-26
- 13:56 taavi: restart the 6 singleuser pods that don't have the new dumps mount points attached yet T317144
- 12:42 Rook: Upgrade julia to 1.8.1 #210 51307a7 T318276
- 11:15 Rook: bump pywikibot version #211 T318519 b540c81
- 08:42 Rook: Remove unused file #209 T318277 9f5034e
2022-09-20
- 18:43 Rook: Bump oauthlib from 3.2.0 to 3.2.1 in /images/minesweeper #207 dcfcfaa
2022-08-31
- 14:10 Rook: updating openrefine install method to unblock nbclassic T312251 7aa340c
- 11:44 Rook: jupyterhub version bump T316309 5308be4
2022-08-29
2022-08-25
- 15:05 Rook: OpenRefine version bump T316218 55271d2
- 09:54 Rook: resource requests for renderer 3598be3 T315682
2022-08-24
- 13:44 Rook: Upgrade ingress-nginx to 1.3.0 8e6b577
2022-08-23
2022-08-22
- 12:40 Rook: Upgrade pywikibot 7.5.0 -> 7.6.0 #193 456d3f2fe0 T315745
- 11:44 Rook: updating key method #190 5631062a8a T312096
2022-08-04
2022-08-02
- 12:22 Rook: deleting worker-1 from cluster and shutting down T313287
2022-07-25
- 20:25 Rook: T313728 #188 revert nbclassic to restore openrefine
- 11:18 Rook: 9336306 update pywikibot
2022-07-22
- 15:43 Rook: updating to give hub pod a cpu request so that it cannot be resource starved
2022-07-19
- 08:12 taavi: drain paws-k8s-worker-1 T313287
2022-07-14
2022-07-13
- 18:25 Rook: 002d1a8 update dockerfile to give julia and openrefine sections
- 11:59 Rook: cd8ec39 node from its own repo #182
2022-07-12
2022-07-11
- 14:09 Rook: refactor: create R install section #180 bac49ef T312570
- 11:15 Rook: 08f1e33 reorder minimal install #179 T312593
2022-07-07
- 20:56 Rook: SPARQL to own section of build d3827e3 T312569
- 14:26 Rook: update pwb version 6fcb349 T311958
- 11:17 Rook: unblock singleuser container build T312088 977c8d4
2022-06-27
2022-06-23
2022-06-21
2022-06-15
- 18:03 Rook: removing unnecessary yaml #167 f473302
- 17:04 Rook: Move renderer to ubuntu container #168 0788089
2022-06-14
- 13:11 Rook: Removing leftover db-proxy bits #170 8795bf7
2022-06-07
2022-06-06
- 13:54 Rook: T308975 Move away from git to pip for nbconvert #162 54f93d0
- 12:21 Rook: T308926 Move nbserve to upstream container #165 1767870
2022-06-02
- 17:02 Rook: scaling db-proxy to zero T309794
2022-05-22
2022-05-18
- 12:55 Rook: bump jupyterhub version #149 T308568 41f03a5
- 11:10 Rook: updating ingress from v1beta1 #134 T294342 5107562
2022-05-16
- 09:36 dcaro: restarted reload-acme-chief-backend.service to ensure certs are refreshed
2022-05-14
- 16:16 andrewbogott: restarting acme-chief.service on paws-acme-chief-01 for T308383
2022-05-11
- 13:44 Rook: update pwb version and pin jupyterlab version ef3e38c
2022-05-10
- 13:56 Rook: upgrade pywikibot on container start 437f46a
2022-04-27
- 17:02 Rook: pywikibot version bump 3c42a62
2022-04-18
- 17:29 Rook: updating links to phab with prefilled ticket links aef7c67
- 12:30 Rook: update pywikibot 6db74b6
2022-04-16
- 18:36 wm-bot: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/kubernetes-metrics (2c485e9) - cookbook ran by taavi@runko
2022-04-04
- 12:31 taavi: moving all VMs from paws-puppetmaster-01 -> paws-puppetmaster-2
2022-03-29
- 10:36 Rook: upgrading pywikibot 702f21d
2022-03-21
- 11:11 Rook: deploying jupyterlab cd6ee19
2022-03-10
- 12:23 Rook: updating banner to note ui will update soon 462ab18
2022-03-08
- 13:26 Rook: upgrading open refine c116d64
2022-03-07
- 11:12 Rook: deploying paws realtime collaboration 246e2af
2022-03-02
- 14:20 Rook: deploying fixed version of jupyter-rsession-proxy abe89f6
2022-03-01
- 13:38 Rook: deploying pyaudio fix 978fb64
2022-02-23
- 13:13 Rook: deploying e6eedbc cleanup
2022-02-15
- 18:20 chicocvenancio: added psp for minesweeper
- 16:04 mdipietro: updating pywikibot 2fc27c9
- 14:21 chicocvenancio: Deploying minesweeper
2022-01-25
- 14:30 mdipietro: deployed 93d33c4 PR122
2021-12-28
- 14:14 taavi: deploying https://github.com/toolforge/paws/pull/121
2021-12-20
- 18:20 majavah: deploying calico v3.21.0 (T292698)
2021-12-16
2021-12-02
2021-12-01
- 11:36 mdipietro: deploying lsof pr-76 a378845
2021-11-29
2021-11-25
- 21:37 chicocvenancio: rollback singleuser to PR #96 T295257
- 21:15 chicocvenancio: deploy PR #110 changing singleuser to bump openrefine version T295257
2021-11-23
- 14:19 mdipietro: increased cull timeout with deploy of 3e57264
2021-11-22
2021-11-11
- 08:35 majavah: disabling pod preset controller in preparation for T291913
2021-11-09
- 16:24 mdipietro: deployed PR97 (85c085f) Update Pywikibot to 6.6.2
2021-11-03
- 12:41 mdipietro: deployed https://github.com/toolforge/paws/pull/92 (4961101) T150098
2021-11-01
- 12:31 majavah: upgrade ingress-nginx T292771
2021-10-28
- 14:35 chicocvenancio: set team toolforge/wmcsadmins as maintainers for github repo
2021-10-26
- 15:06 chicocvenancio: delete orphan pods for 2 users
2021-10-22
- 16:38 majavah: deploy https://github.com/toolforge/paws/pull/91
2021-10-21
- 12:58 mdipietro: upgraded to 923250f which was really not an upgrade as the diff gave nothing. Though now it is clear what is deployed.
2021-09-07
- 22:14 bstorm: upgraded k8s to 1.19.13 T287399
2021-08-18
- 19:09 bstorm: redeployed hub with trove database backend instead of toolsdb
2021-07-29
- 14:09 majavah: add mdipietro as projectadmin T287287
2021-07-25
- 16:09 majavah: deleting ingress pod running on worker-6 to get it to re-appear in ingress-4
2021-07-21
- 19:53 bstorm: deployed new maintain-kubeusers T285011
- 19:53 bstorm: deployed new rbac for maintain-kubeusers changes T285011
- 16:59 majavah: deploying calico v3.18.4 T280342
- 15:52 majavah: add my key to passwords::root::extra_keys
- 15:00 majavah: starting kubernetes upgrades T280302
2021-07-14
2021-07-13
- 07:51 majavah: renewing tools-prometheus certificates
2021-07-12
- 13:18 majavah: ingress upgrade completed
- 13:05 majavah: moving user traffic to updated ingress-nginx T264221
2021-07-01
- 12:04 majavah: deploy ingress-nginx 0.46 via the helm chart to paws T264221
2021-06-30
- 20:05 bstorm: tried force delete on the ingress-nginx-gen2 namespace, which doesn't appear to be working either until metrics-server is fixed T285905
- 20:00 bstorm: renewed k8s metrics-server certs and the deployment
- 18:04 majavah: renew kubernetes metrics-server certificate
- 17:26 majavah: creating paws-k8s-ingress-[3-4] and joining them to the k8s cluster T264221
- 17:16 bstorm: temporarily increased quota to 60 cores to enable T264221
2021-06-03
- 20:43 chicocvenancio: tagged new singleuser image, fixes T283969
2021-05-27
- 21:53 bstorm: added paws-k8s-control-2.paws.eqiad.wmflabs back to the list of control nodes at the proxy
- 21:50 bstorm: renewed the certs for paws-k8s-control-2
- 20:37 bstorm: removed paws-k8s-control-2.paws.eqiad.wmflabs from the proxy because it is somewhat broken (certs expired)
- 19:41 bstorm: forced removal of openrefine in paws for now and deleted all current user server pods to force use of the new image
2021-05-23
- 01:45 bstorm: manually pulling updated singleuser images because I hate waiting to test things
- 01:36 bstorm: update latest tag for singleuser image to b894471 for https://github.com/toolforge/paws/pull/69
- 00:56 bstorm: deploying changes for https://github.com/toolforge/paws/pull/67
2021-05-21
- 00:06 bstorm: creating trove mysql instance pawsdb-1 T267683
2021-05-12
- 19:33 bstorm: added taavi to paws.admin
2021-05-11
- 09:17 Majavah: set `profile::wmcs::kubeadm::docker_vol: false` on ingress nodes T282087
- 09:15 arturo: added user `taavi` (Majavah) as projectadmin
2021-04-20
- 23:18 bstorm: deploying updated paws-hub image for https://github.com/toolforge/paws/pull/63
2021-04-02
- 21:50 bstorm: deploying latest PRs to add a note on the wikireplicas changes
2020-12-21
- 20:27 bstorm: applied tuning for timeouts and elections on the k8s etcd pods of 300 for heartbeat and 3000 for elections T267966
2020-12-17
- 02:22 bstorm: Set PAWS hub back to using mariadb T266587
2020-12-16
- 18:21 chicocvenancio: move paws to sqlite while toolsdb is down.
2020-12-10
- 17:00 arturo: fixing /etc/kubernetes/kubelet.conf and restarting kubelet in paws-k8s-control-1 (T269865)
2020-12-05
- 00:42 bd808: `kubectl delete po renderer-794886b9cd-9nc6c -n prod` after seeing lots of listen queue full errors in the pod logs.
2020-11-30
- 18:22 bstorm: 1.17 upgrade for kubernetes complete T268669
- 17:25 bstorm: upgrading the worker nodes (this will likely kill services briefly when some pods are rescheduled) T268669
- 17:14 bstorm: updated the calico-kube-controllers deployment to use our internal registry to deal with docker-hub rate-limiting T268669 T269016
- 17:09 chicocvenancio: delete orphaned jupyter server pod `kubectl -n prod delete pod jupyter--45volutionoftheuniverse`. Respective server not running in jupyter admin UI.
- 16:31 bstorm: upgrading pods on paws-k8s-control-3 T268669
- 16:17 bstorm: starting upgrade on paws-k8s-control-2 T268669 (first kubectl drain paws-k8s-control-2 --ignore-daemonsets)
- 15:53 bstorm: proceeding with upgrade to 1.17 on paws-k8s-control-1 T268669
- 15:49 bstorm: draining paws-k8s-control-1 for upgrade T268669
- 12:49 arturo: disable puppet in all k8s nodes to prepare for the upgrade (T268669)
- 12:49 arturo: set hiera `profile::wmcs::kubeadm::component: 'thirdparty/kubeadm-k8s-1-17'` at project level (T268669)
2020-11-16
- 22:13 bstorm: deploying new paws changes for multiinstance readiness
2020-11-10
- 20:16 chicocvenancio: restart hub to apply move to sqlite. T267667
- 16:41 arturo: set paws in sqlite mode because T266587 (kubectl --namespace prod edit configmap hub-config)
2020-10-15
- 19:12 andrewbogott: uncordoned paws-k8s-worker-1 and -2
- 18:48 andrewbogott: draining paws-k8s-worker-2 for move to ceph
- 18:36 andrewbogott: draining paws-k8s-worker-1 for move to ceph
2020-09-29
- 10:59 arturo: last 2 commands should help puppet agent in the paws project; previously it had issues fetching acme-chief certs because of an API update
- 10:58 arturo: aborrero@paws-acme-chief-01:~$ sudo systemctl restart uwsgi-acme-chief.service
- 10:56 arturo: aborrero@paws-acme-chief-01:~$ sudo systemctl restart acme-chief.service
2020-08-14
- 17:09 bstorm: backing up the old proxy config to NFS and deleting paws-proxy-02 T211096
2020-08-07
- 22:30 bstorm: removing downtime for paws and front page monitor T211096
- 18:01 bstorm: shutting down paws-proxy-02 T211096
- 17:05 bstorm: running the final rsync to the new cluster's nfs T211096
- 16:08 bstorm: changing paws.wmflabs.org to point at the new cluster ip 185.15.56.57 T211096
- 16:02 bstorm: LAST MESSAGE WRONG: switching NEW cluster to toolsdb T211096
- 16:02 bstorm: switching old cluster to toolsdb T211096
- 15:58 bstorm: switching old cluster to sqlite T211096
- 15:53 bstorm: downtiming alerts in case they need changes (seems likely) T211096
2020-07-30
- 20:40 bstorm: upgrading the singleuser image to test shuffling around some of the pip installs
- 16:38 bstorm: removing the *.paws.wmflabs.org SNI name because it won't be used and it might trigger a re-issue of certs T255249
- 15:39 bstorm: upgrading acme-chief to 0.27-1
2020-07-29
- 18:03 bstorm: powering on paws-k8s-haproxy-1 because that worked fine
- 18:00 bstorm: powering off paws-k8s-haproxy-1 to test failover
2020-07-24
- 17:25 bstorm: to force repulling of every image everywhere, uninstalling paws in the new cluster and reinstalling it T258812
- 09:39 arturo: dropped the DNS wildcard record `*.paws.wmcloud.org IN A 185.15.56.57` and created concrete CNAME records for the FQDNs we actually use (T211096)
2020-07-23
- 22:51 bstorm: deploying via the default 'latest' tag in the new cluster T211096
- 22:48 bstorm: tagged the newbuild tags with "latest" to set sane defaults for all images in the helm chart T211096
- 21:14 bstorm: pushing quay.io/wikimedia-paws-prod/nbserve:newbuild to main repo T211096
- 21:11 bstorm: pushing quay.io/wikimedia-paws-prod/deploy-hook:newbuild to main repo T211096
- 21:09 bstorm: pushing quay.io/wikimedia-paws-prod/singleuser:newbuild to the main repo T211096
- 21:08 bstorm: pushing quay.io/wikimedia-paws-prod/paws-hub:newbuild to the main repo T211096
- 21:06 bstorm: pushing dbproxy docker image for new cluster into main quay.io repo T211096
2020-07-22
- 23:32 bstorm: setting the default NFS version to 4.2 while excepting the two stretch servers T257945
2020-07-21
- 15:13 chicocvenancio: merge pr #50 to fix T258142
2020-07-06
- 21:41 bstorm: deployed ingress to redirect paws.wmcloud.org to the wikitech doc page T195217
2020-06-30
- 23:00 bstorm: added paws-public.wmflabs.org to the alt-names for acme-chief, which broke it until we hand off the zone to the paws project <sorry!> T195217 T255997
2020-06-26
- 21:57 bstorm: applied the metrics manifests to kubernetes to enable metrics-server, cadvisor, etc. T256361
2020-06-25
- 22:52 bstorm: created paws-k8s-worker-5/6/7 as x-large nodes to bring the cluster up to roughly the same capacity as the existing one using soft anti-affinity T211096 T253267
- 22:43 bstorm: bumped quota up to 24 instances, 128 GB RAM and 56 cores T211096
- 16:39 bstorm: deleted the deployhook from the in-progress new cluster for now just in case T211096
- 15:44 bstorm: deployed a proof-of-concept paws-public setup in the new cluster T255997
2020-06-24
- 23:18 bstorm: added A record for *.paws.wmcloud.org to public and hub to use T211096 T255997 T195217
- 21:45 bstorm: doing an initial rsync of the paws userhomes to the new project T160113
2020-06-19
- 10:01 arturo: enabled `paws.wmflabs.org` and `*.paws.wmflabs.org` as valid ingress domains (acme-chief TLS cert, haproxy, etc) (T195217)
2020-06-17
- 21:51 bstorm_: upgraded chart in the new cluster to include resource limits T251298
2020-06-16
- 15:48 arturo: change DNS record k8s.svc.paws.eqiad1.wikimedia.cloud to point to the haproxy VIP port address 172.16.1.171 (T195217)
- 15:47 arturo: associate floating IP 185.15.56.57 with haproxy VIP port (T195217)
- 15:43 arturo: allow traffic to haproxy VM ports from the VIP port: `sudo wmcs-openstack port set --allowed-address ip-address=172.16.1.171 1b40be58-7182-41aa-95ce-797f94f83d66` (T195217)
- 15:43 arturo: allow traffic to haproxy VM ports from the VIP port: `sudo wmcs-openstack port set --allowed-address ip-address=172.16.1.171 9ccc43d9-1a8a-4287-afda-67e8bab27a9f` (T195217)
- 15:37 arturo: `aborrero@cloudcontrol1004:~ 1 $ sudo wmcs-openstack --os-project-id=paws port create --network 7425e328-560c-4f00-8e99-706f3fb90bb4 paws-haproxy-vip` (T195217)
- 15:23 arturo: live-hacking paws-puppetmaster-01 with https://gerrit.wikimedia.org/r/c/operations/puppet/+/605944 for T195217
2020-06-15
- 15:59 arturo: created DNS record `deploy-hook.paws.wmcloud.org IN CNAME paws.wmcloud.org` (T195217)
- 12:28 arturo: manually created an Ingress object to test routing to the hub (T195217)
- 12:20 arturo: created DNS record `paws.wmcloud.org IN A 185.15.56.57` (T195217)
- 12:19 arturo: associate floating IP 185.15.56.57 with VM paws-k8s-haproxy-1 (T195217)
- 12:18 arturo: release floating IP not in use: 185.15.56.42
- 12:18 arturo: release floating IP not in use: 185.15.56.43
- 11:45 arturo: reset wikitech user password for the service account `paws-dns-manager` to what is in labs/private.git/hieradata/common.yaml `profile::acme_chief::cloud::designate_sync_password` (T195217)
2020-06-12
- 18:49 bstorm_: deployed a test of paws chart in the new cluster T211096
- 13:23 arturo: assigned the DNS zone `paws.wmcloud.org` (T195217)
- 13:13 arturo: live-hacking session in the puppetmaster ended
- 13:05 arturo: live-hacking puppet tree in paws-puppetmaster-01 for T195217
- 11:55 arturo: `aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role add --user paws-dns-manager --project paws observer` (T255252)
- 11:55 arturo: `aborrero@cloudcontrol1004:~ $ sudo wmcs-openstack role add --user paws-dns-manager --project paws designateadmin` (T255252)
- 11:51 arturo: created service account `paws-dns-manager` in wikitech (T255252)
- 11:31 arturo: introduced acme-chief private data into labs/private in paws-puppetmaster-01 (T255252)
- 11:02 arturo: created puppet prefix 'paws-acme-chief' (T255252)
- 11:01 arturo: created VM paws-acme-chief-01 (T255252)
2020-06-11
- 11:11 arturo: deployed nginx-ingress for some early testing (not definitive) with code https://github.com/crookedstorm/paws/commit/bee62b3fd57f9804aa27e7b8b41fde50bd93df94 (T195217)
- 10:15 arturo: added role (just a label) for ingress nodes: `kubectl label node paws-k8s-ingress-1 kubernetes.io/role=ingress` (T195217)
2020-06-04
- 14:16 arturo: added node taints to ingress nodes: `kubectl taint nodes paws-k8s-ingress-1 ingress=true:NoSchedule` (T195217)
- 12:18 arturo: bootstrapped paws-k8s-ingress nodes, added them to the k8s cluster (T195217)
- 12:04 arturo: created `paws-k8s-ingress` puppet prefix and add the `role::wmcs::paws::k8s::worker` role (T195217)
- 12:02 arturo: created 2 medium VM instances: paws-k8s-ingress-1 and paws-k8s-ingress-2 with haproxy anti-affinity (T195217)
2020-05-26
- 22:34 bstorm_: restored the deployment for maintain-kubeusers so anyone added to the paws.admin group will have admin on the cluster now that the bug is fixed T211096 T246059
- 22:05 bstorm_: temporarily deleted the deployment for maintain-kubeusers pending patch to fix context creation for new admin accounts T211096 T246059
- 22:04 bstorm_: created paws-focused PodSecurityPolicies and the prod namespace in the new cluster T211096
- 22:03 bstorm_: created paws.admin group and kubernetes admin accounts on the new k8s cluster T211096 T246059
- 18:29 bstorm_: bootstrapped the new control plane nodes T211096
- 15:27 bstorm_: updated profile::wmcs::kubeadm::kubernetes_version to 1.16.10 for cluster init T211096
2020-05-21
- 23:04 bstorm_: added profile::wmcs::kubeadm::k8s::encryption_key and profile::wmcs::kubeadm::k8s::node_token to labs/private T211096
- 14:53 bstorm_: adding the hiera values to horizon for bootstrapping k8s T211096
- 14:39 arturo: point record `k8s.svc.paws.eqiad1.wikimedia.cloud` to `172.16.1.186` (which is paws-k8s-control-1, for the initial bootstrap) (T211096)
- 12:48 arturo: created record `k8s.svc.paws.eqiad1.wikimedia.cloud` pointing to `172.16.0.191` (which is paws-k8s-haproxy-1) (T211096)
- 12:34 arturo: created and transferred DNS zone `svc.paws.eqiad1.wikimedia.cloud` (T211096)
2020-05-20
- 22:35 bstorm_: created paws-k8s-worker-1/2/3/4 T211096
- 22:12 bstorm_: created paws-k8s-haproxy-1/2 with antiaffinity group T211096
- 21:36 bstorm_: created paws-k8s-control-1/2/3 with appropriate sec group and server group T211096
- 18:59 bstorm_: created anti-affinity group "controlplane" T211096
- 16:38 bstorm_: deleting the old shut-down VMs from the last effort to rebuild paws T211096
- 16:36 bstorm_: cleaned up the old DNS entries for the external LBs that have been off for a year
2020-03-20
- 14:03 jeh: upgrade paws-puppetmaster-01 to v5 T241719
2020-02-14
- 21:31 andrewbogott: restarting paws-puppetmaster-01 so its clients can connect
2020-01-09
- 18:06 bstorm_: rebooting tools-paws-master-01 T242353
- 14:28 chicocvenancio: shutdown unused instances
2019-12-13
- 00:27 bstorm_: rebooting the paws master since it is in a bad state after the openstack maintenance as well.
2019-11-01
- 21:15 Krenair: Updated paws-apiserver.wmflabs.org A record list to remove 172.16.2.151 which is not allocated to any instance. The other two A records point to valid instances in the paws project.
2019-10-23
- 09:03 arturo: paws-master-01/03 and a couple of other servers are down because hypervisor is rebooting
2019-10-14
- 22:32 bd808: Removed project member "Afrodric". Looks like someone added them accidentally when trying to add aborrero as a project member
- 22:31 bd808: Added Krenair as project member
2019-05-18
- 11:13 chicocvenancio: point paws-proxy-02 to tools-paws-worker-1006 on paws-deploy-hook hostname (T218380)
2019-04-26
- 15:27 chicocvenancio: paws-proxy-02 redirecting main page to https://www.mediawiki.org/wiki/PAWS T221886
2019-04-16
- 17:15 chicocvenancio: move paws-proxy-02 reload nginx
- 17:07 chicocvenancio: move paws-proxy-02 to point to tools-paws-worker-1006 for upcoming master move
2019-03-27
- 23:46 chicocvenancio: moving paws host in `paws-proxy-02` back to `tools-paws-master-01` T219460
- 22:10 chicocvenancio: moving paws host in `paws-proxy-02` to `tools-paws-worker-1005` T219460
2019-03-25
- 14:12 gtirloni: created `paws.wmflabs.org` subdomain under `paws` project (T211096)
- 14:07 gtirloni: created `paws.wmflabs.org` subdomain under `paws` project T211096
- 13:54 gtirloni: created `paws.wmflabs.org` subdomain under `paws` project (T211096)
2019-03-15
- 02:25 gtirloni: activated TLS termination using Let's Encrypt on paws-proxy-02
- 02:25 gtirloni: removed webproxies and created new A records pointing directly to paws-proxy-02
2019-02-21
- 09:22 gtirloni: upgraded and rebooted paws-proxy-02
2019-02-20
- 15:00 andrewbogott: deleting the long-shut-down paws-proxy-01
2019-02-15
- 01:28 bd808: Re-enabled PAWS vhost on paws-proxy-02
2019-02-14
- 22:25 gtirloni: downtimed PAWS in Icinga
- 22:16 gtirloni: Activated maintenance page on paws-proxy-02 nginx config
2019-02-13
- 08:32 arturo: switch paws-proxy-02 puppetmaster to labs-puppetmaster.wikimedia.org
2019-01-24
- 19:20 andrewbogott: shutting down paws-proxy-01
- 19:11 chicocvenancio: moved config, ready to receive traffic on paws-proxy-02 T214613
- 18:34 chicocvenancio: firing up paws-proxy-02 for T214613
2019-01-23
- 20:46 chicocvenancio: moving paws_public proxy_pass to https://172.16.6.39 in paws-proxy-01
2018-10-25
- 23:58 gtirloni: Started tools-paws-worker-1010 (T208006)
2018-08-03
- 20:19 andrewbogott: deleting paws-master-01 and paws-node-1002; unused
2018-07-03
- 22:49 bstorm_: added stricter image space reclaiming arguments to kubelet
2018-06-20
- 17:39 chicocvenancio: edited paws-proxy-01 to pass http_x_forwarded_proto as it receives T197248
2018-05-04
- 02:48 chicocvenancio: killed 25 pods with more than one hour inactivity through admin interface
2018-03-14
- 21:49 chicocvenancio: updated k8s control plane, updating nodes to v1.9.4 for T189680
2018-02-23
- 18:33 chicocvenancio: redirected tools.wmflabs.org/paws to paws.wmflabs.org and deleted old k8s ReplicationControllers (T188068)
2018-02-22
- 22:11 chicocvenancio: (T175202) culler is running and killing pods as designed!
- 21:13 chicocvenancio: jupyterhub updated to fix culler (T175202) culler already ran without 404
- 17:43 chicocvenancio: manually ran culler inside hub container
2018-02-21
- 17:03 chicocvenancio: deleted query-killer k8s deployment T187818
2018-02-16
- 20:18 chicocvenancio: changed userhomes group for T185434 workaround
2018-02-15
- 01:10 chicocvenancio: changed group of all userhome folders to tools.paws
2018-02-04
- 12:21 chicocvenancio: changed group of all userhome folders to tools.paws
2017-12-19
- 22:11 bd808: Killed tiller pod that was in crashloopbackoff
2017-09-28
- 21:25 andrewbogott: service docker restart on paws-node-1002; disk is full and docker is holding open a lot of deleted files
2017-03-20
- 21:25 andrewbogott: migrating paws-base-01 to labvirt1013
2016-05-10
- 20:36 bd808: Visit https://foo-pawsbase.wmflabs.org/ to see list of wikis in wikifarm
- 20:32 bd808: Fixed hiera config so https://wikidata-pawsbase.wmflabs.org routes correctly
- 15:47 YuviPanda: added bd808 to the project and made them admin