Nova Resource:Admin/SAL

From Wikitech

2019-03-21

  • 19:32 andrewbogott: restarting keystone on cloudcontrol1003

2019-03-15

  • 16:00 gtirloni: increased nscd cache size (T217280)

2019-03-14

  • 19:04 gtirloni: bstorm started nfsd on labstore1006 (T218341)
  • 16:42 gtirloni: published new debian-9.8 image (T218314)

2019-03-04

  • 19:37 bstorm_: unmounted /mnt/nfs/dumps-labstore1006.wikimedia.org across all VPS projects for T217473

2019-02-26

  • 12:46 gtirloni: shutdown toolsbeta-sgegrid-master (cronspam)

2019-02-25

  • 10:32 gtirloni: restarted nfsd on labstore1004

2019-02-21

  • 09:09 gtirloni: restarted uwsgi-labspuppetbackend.service on labpuppetmaster1001
  • 07:42 gtirloni: created project cloudstore
  • 07:36 gtirloni: deleted wmcs-nfs project

2019-02-20

  • 21:58 andrewbogott: silencing shinken and disabling puppet on shinken-02 for now

2019-02-19

  • 12:00 gtirloni: added nagios@icinga2001.wikimedia.org to cloud-admin-feed@ allowed senders

2019-02-18

  • 20:21 gtirloni: downtimed cloudvirt1020
  • 20:12 gtirloni: ran `labs-ip-alias-dump.py` on cloudservices/labservices servers

2019-02-15

  • 13:10 arturo: T216239 labvirt1019 has been drained
  • 12:22 arturo: T216239 draining labvirt1009 with a command like this: `root@cloudcontrol1004:~# wmcs-cold-migrate --region eqiad --nova-db nova 2c0cf363-c7c3-42ad-94bd-e586f2492321 labvirt1001`
  • 12:02 arturo: more nova service cleanups in the database (labvirts that were reallocated to eqiad1)
  • 11:34 arturo: T216190 cleanup from nova database `nova service-delete 35`
  • 03:50 andrewbogott: updated VPS base images for Jessie and Stretch, now featuring Stretch 9.7

2019-02-11

  • 18:13 gtirloni: cleaned old metrics data in labmon1001 T215417
  • 15:28 gtirloni: running `maintain-views --all-databases --replace-all` on labsdb1011
  • 14:18 gtirloni: running `maintain-views --all-databases --replace-all` on labsdb1010

2019-02-08

  • 14:56 gtirloni: running `maintain-views --all-databases --replace-all` on labsdb1009

2019-02-06

  • 11:47 gtirloni: downtimed labmon100{1,2} T215399
  • 00:17 bstorm_: T214106 deleted bstorm-test2 project to clean up

2019-02-05

  • 10:48 arturo: labmon1001 is now part of the 'eqiad1-r' region

2019-02-01

  • 09:54 arturo: moving canary1015-01 VM instance from cloudvirt1024 back to cloudvirt1015

2019-01-31

  • 12:44 arturo: T215012 depooling cloudvirt1015 and migrating all VMs to cloudvirt1024

2019-01-25

  • 20:11 gtirloni: deleted project yandex-proxy T212306
  • 20:11 gtirloni: deleted project T212306

2019-01-24

  • 11:50 arturo: T213925 modify subnet cloud-instances-transport1-b-eqiad1 to avoid floating IP allocations from here
  • 11:07 arturo: T214299 failover cloudnet1003 to cloudnet1004
  • 10:03 arturo: T214299 reimage cloudnet1004 to debian stretch
  • 09:51 arturo: T214299 failover cloudnet1004 to cloudnet1003

2019-01-22

  • 19:19 arturo: T214299 stretch cloudnet1003 is apparently all set
  • 18:40 arturo: T214299 manually delete the cloudnet1003 agents from neutron (must be added again after reimage, with new uuids)
  • 18:37 arturo: T214299 reimaging cloudnet1003 as debian stretch
  • 17:35 jbond42: starting roll out of apt package updates to
  • 14:41 gtirloni: T214369 deployed new jessie and stretch VM images

2019-01-21

  • 18:29 gtirloni: installed libguestfs-tools on cloudvirt1021

2019-01-16

  • 14:21 andrewbogott: stopping old VPS proxies in eqiad — T213540

2019-01-15

  • 14:20 andrewbogott: changing tools.wmflabs.org to point to tools-proxy-03 in eqiad1

2019-01-13

  • 20:00 andrewbogott: VPS proxies are now running in eqiad1 on proxy-01. Old VMs will wait a bit for deletion. T213540
  • 19:12 andrewbogott: moving the VPS proxy API backend to proxy-01.project-proxy.eqiad.wmflabs, as per T213540
  • 17:11 andrewbogott: moving all VPS dynamic proxies to proxy-eqiad1.wmflabs.org aka proxy-01.project-proxy.eqiad.wmflabs, as per T213540

2019-01-09

  • 22:21 bd808: neutron quota-update --tenant-id tools --port 256

2019-01-08

  • 18:59 bd808: Definitely did NOT delete uid=novaadmin,ou=people,dc=wikimedia,dc=org
  • 18:59 bd808: Deleted LDAP user uid=neutron,ou=people,dc=wikimedia,dc=org
  • 18:58 bd808: Deleted LDAP user uid=novaadmin,ou=people,dc=wikimedia,dc=org

2019-01-06

  • 22:03 bd808: Set floatingip quota of 60 for tools project in eqiad1-r region (T212360)

2018-12-20

  • 17:10 arturo: T207663 renumbered transport network in eqiad1

2018-12-05

  • 17:59 arturo: T207663 changed labtestn transport network addressing from private to public

2018-12-03

  • 13:25 arturo: T202886 create again PTR records after dnsleak.py fix

2018-11-30

  • 14:08 arturo: running dns leaks cleanup `root@cloudcontrol1003:~# /root/novastats/dnsleaks.py --delete`

2018-11-28

  • 17:33 gtirloni: deleted contintcloud project (T209644)

2018-11-27

  • 13:32 gtirloni: enabled DRBD stats collection on labstore100[4-5] T208446

2018-11-22

  • 07:12 gtirloni: deployed new debian-9.6-stretch image

2018-11-21

  • 10:48 arturo: re-created compat-net as not shared in labtestn to test stuff related to T209954

2018-11-16

  • 12:43 gtirloni: armed keyholder on labpuppetmaster1001/1002 after reboots
  • 12:08 gtirloni: rebooted labpuppetmaster1001 (T207377)
  • 11:57 gtirloni: rebooted labpuppetmaster1002 (T207377)

2018-11-14

  • 17:19 gtirloni: added cloudvirt1016 to scheduler pool (T209426)
  • 15:41 gtirloni: reimaging labvirt1016 as cloudvirt1016
  • 15:14 gtirloni: reset-failed systemd unit nova-scheduler on cloudcontrol1004
  • 13:52 gtirloni: rebooted labservices1002 after package upgrades (T207377)
  • 13:23 gtirloni: rebooted labstore2004 after package upgrades (T207377)
  • 13:20 gtirloni: rebooted labstore2003 after package upgrades (T207377)
  • 13:20 gtirloni: rebooted labstore2001/labstore2003 after package upgrades (T207377)
  • 12:08 gtirloni: rebooted labnet1002 after package upgrades
  • 12:01 gtirloni: rebooted labmon1002 after package upgrades
  • 11:41 gtirloni: rebooted labcontrol1002 after package upgrades
  • 11:15 gtirloni: rebooted cloudcontrol1004 after package upgrades

2018-11-09

  • 18:17 gtirloni: restarted neutron-linuxbridge-agent on cloudvirt1018/1023

2018-11-08

  • 11:00 gtirloni: Added novaproxy-02 to $CACHES
  • 10:50 gtirloni: Added cloudvirt1017 to eqiad1 region

2018-11-07

  • 13:49 arturo: T208733 moving labvirt1017 from main deployment to eqiad1 and renaming it to cloudvirt1017

2018-10-22

  • 16:24 arturo: T206261 another update to dmz_cidr in eqiad1
  • 10:26 arturo: change again in dmz_cidr in eqiad1: VMs will connect between them without NAT even when using floating IPs (T206261)

2018-10-19

  • 12:02 arturo: revert change in dmz_cidr in eqiad1 for now (T206261)
  • 11:16 arturo: change in dmz_cidr in eqiad1: VMs will connect between them without NAT even when using floating IPs (T206261)
  • 10:14 arturo: new virt servers were added to the eqiad1 deployment last week and this week: cloudvirt1018, cloudvirt1023, cloudvirt1024

2018-09-26

  • 10:40 arturo: T205524 all sorts of restarts in all neutron daemons
  • 10:20 arturo: T205524 stop/start all neutron agents in cloudnet1003.eqiad.wmnet
  • 10:13 arturo: T205524 restart all agents in cloudnet1004.eqiad.wmnet
  • 10:10 arturo: restart neutron-server in cloudcontrol1003, investigating T205524

2018-09-24

  • 10:57 arturo: try to increase floating ip allocation pool in eqiad1. Of 185.15.56.0/25 we are using only 185.15.56.10-185.15.56.31, I don't know why. Let's use 185.15.56.2-185.15.56.126
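
The arithmetic behind the pool change above can be checked with a short sketch. It uses only the address ranges quoted in the entry; the helper name `pool_size` is hypothetical, and nothing here talks to neutron.

```python
import ipaddress


def pool_size(start: str, end: str) -> int:
    """Count the addresses in an inclusive start-end range (hypothetical helper)."""
    return int(ipaddress.IPv4Address(end)) - int(ipaddress.IPv4Address(start)) + 1


# Values copied from the log entry.
subnet = ipaddress.ip_network("185.15.56.0/25")
old_pool = ("185.15.56.10", "185.15.56.31")
new_pool = ("185.15.56.2", "185.15.56.126")

old_count = pool_size(*old_pool)
new_count = pool_size(*new_pool)

# Both ends of the proposed pool must fall inside the subnet.
new_pool_in_subnet = all(ipaddress.ip_address(a) in subnet for a in new_pool)

# Host addresses in the /25, excluding the network and broadcast addresses.
usable_hosts = subnet.num_addresses - 2

print(old_count, new_count, usable_hosts, new_pool_in_subnet)
```

Under these assumptions the old pool covers only a small part of the /25, while the proposed range leaves out just the first host address.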

2018-09-21

  • 17:18 bd808: Running `sudo maintain-meta_p --all-databases --purge` across labsdb10(09|10|11) for T201890

2018-09-17

  • 22:08 bd808: Granted gtirloni project roles of admin, projectadmin, and user

2018-09-12

  • 11:20 arturo: T202636 distributing default routes using classless-static-route for all VMs in main/labtest (dnsmasq/nova-network)

2018-09-11

  • 16:52 arturo: again, restarted nova-network after killing all dnsmasq procs in labnet1001 for T202636
  • 16:08 arturo: restarted nova-network after killing all dnsmasq procs in labnet1001 for T202636
  • 10:53 arturo: T202636 creating all the compat-network configuration in neutron
  • 10:36 arturo: T202636 creating br-compat bridge in eqiad1 for the compat network
  • 10:33 arturo: T202636 manually reserve 10.68.23.253 (in nova-network)

2018-09-10

  • 22:46 andrewbogott: deleting all VMs on labvirt1019 and 1020 as prep for T204003

2018-08-30

  • 15:46 andrewbogott: restarting rabbitmq-server on cloudcontrol1003
  • 13:07 arturo: T202636 internal network routing now exists in labtest/labtestn for VMs to communicate with each other

2018-08-28

  • 11:04 arturo: T202549 eqiad1 databases are all now running in m5-master. Mysql has been cleaned from cloudcontrol100[3,4]

2018-08-23

  • 16:17 arturo: T188589 bstorm_ merged patch to reduce nova DB connection usage
  • 13:15 arturo: T202115 `root@cloudcontrol1003:~# neutron subnet-update --allocation-pool start=10.64.22.4,end=10.64.22.4 e4fb2771-a361-4add-ac4e-280cc300c59f`
  • 13:10 arturo: T202115 (was `{"start": "10.64.22.2", "end": "10.64.22.254"}` )
  • 13:08 arturo: T202115 `root@cloudcontrol1003:~# neutron subnet-update --allocation-pool start=10.64.22.254,end=10.64.22.254 e4fb2771-a361-4add-ac4e-280cc300c59f`

2018-08-22

  • 15:28 arturo: cleanup local glance,keystone databases in cloudcontrol1003.wikimedia.org (already in m5-master)
  • 15:27 arturo: cleanup local keystone database in cloudcontrol1003.wikimedia.org (already in m5-master)

2018-08-21

  • 15:39 andrewbogott: initial test message
  • 10:31 arturo: eqiad1 remove leftover port for HA on labnet1004
  • 10:15 arturo: test

2018-05-07

  • 18:07 bstorm_: stopped the toolhistory job because it is totally broken and fills /tmp.

2018-02-09

  • 00:55 bd808: Added Arturo Borrero Gonzalez and Bstorm as project members
  • 00:54 bd808: Removed Yuvipanda at user request (T186289)