Portal:Cloud VPS/Admin/Procedures and operations

From Wikitech

This page describes some standard admin procedures and operations for our Cloud VPS deployments.

Manual routing failover

In the old nova-network days, a very long procedure was required to manually fail over from a dead or under-maintenance network node (typically cloudnetXXXX).

Nowadays it is much simpler. This procedure assumes you want to move the active service from one node to the other:

Alternatively you can play with other neutron commands to manage agents.

At the time of writing, it is not known which method causes less network downtime.

Remove hypervisor

Follow this procedure to remove a virtualization server (typically cloudvirtXXXX or labvirtXXXX).

  • Remove or shut down the node
  • openstack hypervisor list will still show it
  • nova service-list will show it as down once it's taken away:

| 9 | nova-compute | labtestvirt2003 | nova | disabled | down | 2017-12-18T20:52:59.000000 | AUTO: Connection to libvirt lost: 0 |

  • nova service-delete 9 will remove it, where the number is the id shown by nova service-list
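
Finding the id to delete can be sketched in shell. The helper below is hypothetical (the name down_compute_ids is not part of any tool) and assumes the column order of the sample row above: id, binary, host, zone, status, state. It only reads and prints; it does not delete anything.

```shell
# Hypothetical helper: read `nova service-list` table output on stdin and
# print "<id> <host>" for every nova-compute service whose state is "down".
# Assumes the column order of the sample row: id, binary, host, zone,
# status, state.
down_compute_ids() {
    awk -F'|' '
        {
            # Trim whitespace around the id..state columns (fields 2-7).
            for (i = 2; i <= 7; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
        }
        $3 == "nova-compute" && $7 == "down" { print $2, $4 }
    '
}
```

For example, nova service-list | down_compute_ids would print 9 labtestvirt2003 for the row above. Check the output by hand before passing an id to nova service-delete.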

VM/Hypervisor pinning

To run a specific VM on a specific hypervisor, pass the --availability-zone option at instance creation time, as in the following example:

user@cloudcontrol1005:~$ sudo wmcs-openstack server create --os-project-id testlabs --image debian-10.0-buster --flavor g2.cores1.ram2.disk20 --network lan-flat-cloudinstances2b --property description='test VM' --availability-zone host:cloudvirt1022 mytestvm

Canary VM instance in every hypervisor

Each hypervisor should have a canary VM instance running.

The command to create it should be something like:

user@cloudcontrol1005:~$ sudo wmcs-openstack server create --os-project-id cloudvirt-canary --image debian-10.0-buster --flavor cloudvirt-canary-ceph --network lan-flat-cloudinstances2b --property description='canary VM' --availability-zone host:cloudvirt1022 canary1022-01

NOTE: you could also use a script like this: wmcs-canary-vm-refresh.sh (a custom helper script made by Arturo to refresh canary VMs in every hypervisor).
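
As a rough illustration of doing this for several hypervisors, the sketch below prints (but does not run) one create command per hypervisor name. The function names are hypothetical, and the naming pattern (cloudvirt1022 becomes canary1022-01) is only inferred from the single example above; this is not the wmcs-canary-vm-refresh.sh script.

```shell
# Hypothetical helpers; the canary name pattern (cloudvirt1022 ->
# canary1022-01) is inferred from the single example above.
canary_name() {
    printf 'canary%s-01\n' "${1#cloudvirt}"
}

# Print (not run) the create command for one hypervisor so it can be
# reviewed first. All option values are copied from the example above.
canary_create_cmd() {
    printf '%s %s\n' \
        "sudo wmcs-openstack server create --os-project-id cloudvirt-canary --image debian-10.0-buster --flavor cloudvirt-canary-ceph --network lan-flat-cloudinstances2b --property description='canary VM' --availability-zone host:$1" \
        "$(canary_name "$1")"
}

# Print the commands for every hypervisor name given as an argument.
canary_plan() {
    for hv in "$@"; do
        canary_create_cmd "$hv"
    done
}
```

Calling canary_plan cloudvirt1022 reproduces the example command above; further hypervisor names can be appended as arguments.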

Updating openstack database password

OpenStack uses many databases, and updating a password requires several steps.

nova

We usually have the same password for the different nova databases nova_eqiad1 and nova_api_eqiad1.

  • in the puppet private repo (on puppetmaster1001.eqiad.wmnet), update the profile::openstack::eqiad1::nova::db_pass hiera key in hieradata/eqiad/profile/openstack/eqiad1/nova.yaml.
  • in the puppet private repo (on puppetmaster1001.eqiad.wmnet), update class passwords::openstack::nova in modules/passwords/manifests/init.pp.
  • in the openstack database (galera running on the cloudcontrol nodes), update the grants, with something like:
GRANT ALL PRIVILEGES ON nova_api_eqiad1.* TO 'nova'@'208.80.153.x' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_api_eqiad1.* TO 'nova'@'%' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_eqiad1.* TO 'nova'@'208.80.153.x' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_eqiad1.* TO 'nova'@'%' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_cell0_eqiad1.* TO 'nova'@'208.80.153.x' IDENTIFIED BY '<%= @db_pass %>';
GRANT ALL PRIVILEGES ON nova_cell0_eqiad1.* TO 'nova'@'%' IDENTIFIED BY '<%= @db_pass %>';
  • repeat the grants for every cloudcontrol server IPv4 and IPv6 address.
  • update the cell mapping database connection string (yes, inside the database itself) from any cloudcontrol server:
$ mysql nova_api_eqiad1
[nova_api_eqiad1]> update cell_mappings set database_connection='mysql://nova:<password>@openstack.eqiad1.wikimediacloud.org/nova_eqiad1' where id=4;
[nova_api_eqiad1]> update cell_mappings set database_connection='mysql://nova:<password>@openstack.eqiad1.wikimediacloud.org/nova_cell0_eqiad1' where id=1;
  • run puppet everywhere (in cloudcontrol servers etc) so the new password is added to the config files.
  • if puppet is not restarting the affected services, restart them by hand (systemctl restart nova-api, etc)
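
Because the grants have to be repeated for three databases and several client addresses, it may help to generate the statements rather than type them. The sketch below only prints SQL for review; it does not connect to the database. The function names are hypothetical, the database names and cell ids (4 and 1) are copied from the steps above, and the password is assumed to contain no single quotes or characters that would need escaping in SQL or in the connection URL.

```shell
# Hypothetical helper: print (not run) the GRANT statements for every nova
# database and every client address given as arguments.
# $1 is the password; no SQL escaping is done here.
nova_grants() {
    pass=$1
    shift
    for db in nova_api_eqiad1 nova_eqiad1 nova_cell0_eqiad1; do
        for host in "$@"; do
            printf "GRANT ALL PRIVILEGES ON %s.* TO 'nova'@'%s' IDENTIFIED BY '%s';\n" \
                "$db" "$host" "$pass"
        done
    done
}

# Print the matching cell_mappings updates; ids 4 and 1 are taken from the
# procedure above and should be checked against the table first.
nova_cell_updates() {
    base="mysql://nova:$1@openstack.eqiad1.wikimediacloud.org"
    printf "update cell_mappings set database_connection='%s/nova_eqiad1' where id=4;\n" "$base"
    printf "update cell_mappings set database_connection='%s/nova_cell0_eqiad1' where id=1;\n" "$base"
}
```

For instance, nova_grants '<password>' '208.80.153.x' '%' prints six statements (three databases times two addresses), which can then be reviewed and pasted into a mysql session.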

neutron

TODO: add information.

glance

TODO: add information.

designate

TODO: add information.

keystone

TODO: add information.

Rotating or revoking keystone fernet tokens

Should you need to rotate or revoke all keystone fernet tokens, follow this procedure:

  • on all cloudcontrol nodes:
rm -rf /etc/keystone/fernet-keys
  • on one cloudcontrol node:
keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
  • on each other cloudcontrol node:
rsync -a --delete rsync://<host where you ran fernet_setup>.wikimedia.org/keystonefernetkeys/* /etc/keystone/fernet-keys/
  • on labweb/cloudweb hosts:
service memcached restart
service apache2 restart
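
The ordering of the steps above matters: keys are removed everywhere first, regenerated on one node, then copied to the rest. The following sketch makes that order explicit by printing a per-host plan. It is a dry run only; the function name fernet_plan is hypothetical, and how the commands are actually executed on each host is left to the operator.

```shell
# Hypothetical helper: print (not run) the per-host steps in order.
# $1: cloudcontrol host where fernet_setup runs; remaining args: the other
# cloudcontrol hosts.
fernet_plan() {
    primary=$1
    shift
    # Step 1: remove the old keys on every cloudcontrol node.
    for host in "$primary" "$@"; do
        printf '%s: rm -rf /etc/keystone/fernet-keys\n' "$host"
    done
    # Step 2: regenerate the keys on one node only.
    printf '%s: keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone\n' "$primary"
    # Step 3: copy the new keys from that node to all the others.
    for host in "$@"; do
        printf '%s: rsync -a --delete rsync://%s.wikimedia.org/keystonefernetkeys/* /etc/keystone/fernet-keys/\n' "$host" "$primary"
    done
    # Step 4: reminder for the web frontends.
    printf 'labweb/cloudweb hosts: service memcached restart; service apache2 restart\n'
}
```

The first argument is the node chosen for fernet_setup, followed by the remaining cloudcontrol nodes, for example fernet_plan cloudcontrol1005 <other cloudcontrol nodes>.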

See also