Portal:Cloud VPS/Admin/Trove

Trove is OpenStack's database-as-a-service project. It creates and manages database instances, each of which implements one database install (e.g. mysql v5.7.29) hosting one or more databases. Each instance consists of a Docker container stored on a Cinder volume and running on a VM. The implementation details are hidden from the tenant that owns the database instance, but the volume and VM can be accessed from the 'trove' service project.


Puppet handles most of the installation and setup, but a few manual steps are needed. This is not (currently) meant to be a complete script, just a list of the pieces not managed by Puppet:

  1. Database creation and syncing
  2. Add keystone endpoints (example endpoint: http://openstack.codfw1dev.wikimediacloud.org:8779/v1.0/%(tenant_id)s )
  3. Add 'trove' project and 'trove' service user. The trove user's password is stored in puppet. The user needs the 'projectadmin' and 'member' roles in the 'trove' project but NOT the 'admin' role.
  4. Guest images need to be installed (details below)
  5. Database docker images may need to be uploaded to our quay.io repo (details below)
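
Steps 1 through 3 can be sketched with the standard `trove-manage` and `openstack` clients roughly as follows. This is an illustration, not the procedure actually used on the cloudcontrols: the function name, the region argument, the service description, and the choice to register all three endpoint interfaces are assumptions. It also assumes the Trove database and its grants already exist and that you are running with admin credentials.

```shell
# Sketch only (hypothetical helper): database sync, Keystone registration,
# and the 'trove' service project/user.
# Usage: trove_keystone_setup <region> <endpoint-url>
trove_keystone_setup() {
    region="$1"
    url="$2"
    if [ -z "$region" ] || [ -z "$url" ]; then
        echo "usage: trove_keystone_setup <region> <endpoint-url>" >&2
        return 2
    fi

    # Step 1: create or upgrade the Trove schema.
    trove-manage db_sync || return 1

    # Step 2: service catalog entry, then one endpoint per interface
    # (registering all three interfaces is an assumption).
    openstack service create --name trove --description "Database service" database || return 1
    for interface in public internal admin; do
        openstack endpoint create --region "$region" database "$interface" "$url" || return 1
    done

    # Step 3: service project and user. The user gets 'projectadmin' and
    # 'member' in the 'trove' project, and deliberately NOT 'admin'.
    openstack project create trove || return 1
    openstack user create --project trove --password-prompt trove || return 1
    for role in projectadmin member; do
        openstack role add --project trove --user trove "$role" || return 1
    done
}
```

Quote the endpoint URL when calling the function, since it contains parentheses, for example `trove_keystone_setup <region> 'http://openstack.codfw1dev.wikimediacloud.org:8779/v1.0/%(tenant_id)s'`. When prompted, enter the password that is stored in puppet.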

guest images

Currently (as of 2021-05-01) we are using the test image builds provided by the Trove project, published at https://tarballs.opendev.org/openstack/trove/images/. Here's an example of installing a Trove guest image:

# wget https://tarballs.opendev.org/openstack/trove/images/trove-master-guest-ubuntu-bionic.qcow2
# qemu-img convert -f qcow2 -O raw ./trove-master-guest-ubuntu-bionic.qcow2 ./trove-master-guest-ubuntu-bionic.raw
# openstack image create --private --project trove --tag trove --disk-format "raw" --property hw_scsi_model=virtio-scsi --property hw_disk_bus=scsi --container-format "ovf" --file ./trove-master-guest-ubuntu-bionic.raw trove-master-guest-ubuntu-bionic
# trove-manage datastore_version_update mysql 5.7.29 mysql $image_id "" 1

Note that only one guest image should be needed for all of Trove; the database type and version switching is handled within the VM by Docker.
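
The last command in the example relies on `$image_id`, which has to be filled in with the Glance ID of the image that was just uploaded. One way to do that, sketched here with a hypothetical helper name, is to look the ID up by image name and pass it straight to `trove-manage`. Passing the datastore name as the manager argument mirrors the mysql example and is an assumption for other datastores.

```shell
# Sketch only (hypothetical helper): register an uploaded guest image
# for one datastore version.
# Usage: register_guest_image <image-name> <datastore> <version>
register_guest_image() {
    image_name="$1"   # e.g. trove-master-guest-ubuntu-bionic
    datastore="$2"    # e.g. mysql
    version="$3"      # e.g. 5.7.29

    # '-f value -c id' makes the client print only the image ID.
    image_id=$(openstack image show -f value -c id "$image_name") || return 1
    if [ -z "$image_id" ]; then
        echo "no image named $image_name" >&2
        return 1
    fi

    # Same positional arguments as the manual command: datastore, version,
    # manager, image ID, packages (empty), active flag.
    trove-manage datastore_version_update "$datastore" "$version" "$datastore" "$image_id" "" 1
}
```

For the example above this would be `register_guest_image trove-master-guest-ubuntu-bionic mysql 5.7.29`.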

Detailed (but possibly incorrect?) instructions can be found on OpenStack's site.

guest config

The central guest config is stored on the cloudcontrols at /etc/trove/trove-guestagent.conf. That file is injected into guests at boot time.

guest containers

When Trove requests a database of <type>:<version>, the guest image will attempt to fetch a Docker image with a corresponding <image:tag>, for example mysql:5.7.29. Images will only be fetched from the WMCS quay.io repo, which can be found here: https://quay.io/organization/wikimedia-cloud-services

To provide a new container to Trove, you will first need write access to that repository, which can be granted by another existing member.

To copy a docker image from Docker Hub and upload it to the quay.io repo, the steps look like this:

# docker pull mysql:5.7.29
# docker login quay.io
   <provide username/password>
# docker tag mysql:5.7.29 quay.io/wikimedia-cloud-services/mysql:5.7.29
# docker push quay.io/wikimedia-cloud-services/mysql:5.7.29
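
The same pull, tag, and push sequence can be wrapped in a small helper so that it works for any `<image>:<tag>` pair. The function names below are hypothetical. The sketch assumes you have already run `docker login quay.io` and that the image name on Docker Hub matches the name wanted on quay.io.

```shell
QUAY_ORG="quay.io/wikimedia-cloud-services"

# Build the quay.io reference for <image>:<tag>.
quay_ref() {
    printf '%s/%s:%s\n' "$QUAY_ORG" "$1" "$2"
}

# Pull from Docker Hub, retag for the WMCS organization, and push.
# Stops at the first step that fails.
# Usage: mirror_to_quay <image> <tag>
mirror_to_quay() {
    image="$1"
    tag="$2"
    target=$(quay_ref "$image" "$tag")
    docker pull "$image:$tag" &&
        docker tag "$image:$tag" "$target" &&
        docker push "$target"
}
```

`mirror_to_quay mysql 5.7.29` runs the same three docker commands as the manual example.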

install issues

A bug in the Victoria Debian packages causes trove-common configuration to fail with a complaint about statting /usr/share/trove-common/api-paste.ini. Creating that file with touch lets configuration proceed.
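
A minimal sketch of that workaround, to be run as root. Re-running `dpkg --configure` afterwards is an assumption about how to resume the interrupted package configuration; an ordinary package install or puppet run may do the same job. The function name and the `PASTE_INI` override are hypothetical.

```shell
# Path the package scripts try to stat; overridable for testing.
PASTE_INI="${PASTE_INI:-/usr/share/trove-common/api-paste.ini}"

# Create the missing file, then let dpkg finish configuring trove-common.
fix_trove_common() {
    touch "$PASTE_INI" &&
        dpkg --configure trove-common
}
```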

Accessing Trove guest VMs

Each Trove database instance is backed by a VM of the same name in the 'trove' project. These VMs are not puppetized like normal cloud-vps VMs; they are built from the Trove guest image, described above. Consequently, they do not have standard ssh access enabled.

These instances have a public key installed for emergency troubleshooting. The key is stored by nova, owned by the 'trove' service user, and named trove-debug. The private keys can be found in the private puppet repo: modules/secret/secrets/ssh/wmcs/trove/openstack-trove-debug-key-codfw1dev and modules/secret/secrets/ssh/wmcs/trove/openstack-trove-debug-key-eqiad1.

To access a VM:

  • First get a copy of that private key on your laptop
  • Then add a security group rule permitting access on port 22 to the VM you want to investigate. Each VM has its own, private security group; VMs do NOT apply the default security group.
  • After all that, you should be able to ssh as the 'ubuntu' user:
ssh -i ~/.ssh/trove_codfw1dev ubuntu@consoletest10.trove.codfw1dev.wikimedia.cloud
  Welcome to Ubuntu 18.04.5 LTS (GNU/Linux 4.15.0-141-generic x86_64)
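
The security group step can be sketched with the standard `openstack` client as below. The helper names are hypothetical, and the sketch assumes your credentials are scoped to the 'trove' project. Limiting the rule to a single source range and deleting it afterwards are suggestions rather than documented policy.

```shell
# Print the security groups attached to a guest VM.
guest_security_groups() {
    openstack server show -f value -c security_groups "$1"
}

# Allow SSH from one source range. Prints the new rule's ID so the rule
# can be removed later.
# Usage: open_ssh <security-group> <source-cidr>
open_ssh() {
    group="$1"
    cidr="$2"   # e.g. the address you will connect from, as a /32
    openstack security group rule create -f value -c id \
        --protocol tcp --dst-port 22 --remote-ip "$cidr" "$group"
}

# Remove the rule again once troubleshooting is done.
# Usage: close_ssh <rule-id>
close_ssh() {
    openstack security group rule delete "$1"
}
```

A typical session would call `guest_security_groups` with the VM name, pass the resulting group to `open_ssh`, ssh in as shown above, and finish with `close_ssh` on the rule ID that `open_ssh` printed.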

Cloud-wide Trove quotas

Because the resources created by Trove are stored in the 'trove' project, the 'trove' project must have quotas big enough to support every database instance and volume cloud-wide. If instance creation is failing for no apparent reason, it's possible that the 'trove' project has hit its quota limits.

The quotas in effect within the trove project are standard nova quotas: instances, cores, ram, storage, and ceph storage.
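
To check whether the 'trove' project is near its limits, and to raise them, something like the following sketch can be used. The helper names are hypothetical and the numbers passed to the second function are up to the operator. Only the compute limits are covered here; volume storage limits are set with separate options.

```shell
# Show configured limits and current absolute usage for the 'trove' project.
show_trove_quota() {
    openstack quota show trove &&
        openstack limits show --absolute --project trove
}

# Raise the compute limits of the 'trove' project.
# Usage: raise_trove_quota <instances> <cores> <ram-in-MB>
raise_trove_quota() {
    instances="$1"
    cores="$2"
    ram_mb="$3"
    openstack quota set --instances "$instances" --cores "$cores" --ram "$ram_mb" trove
}
```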

Adjusting per-project Trove quotas

Each Cloud VPS project has quotas limiting Trove usage: backups, instances, ram, and volumes. These quotas aren't available via the standard quota-management commands; rather, they're hidden under the 'openstack database' command line.

user@cloudcontrol1006:~$ sudo wmcs-openstack database quota show deployment-prep
| Resource  | In Use | Reserved | Limit |
| backups   |      0 |        0 |     2 |
| instances |      1 |        0 |     2 |
| ram       |   1024 |        0 |  4096 |
| volumes   |      1 |        0 |     2 |
user@cloudcontrol1005:~$ sudo openstack database quota update deployment-prep ram 8192
| Field | Value |
| ram   | 8192  |

The instances quota caps the total number of database instances.

The volumes quota caps the total number of GB permitted for database storage.

The ram quota caps the total number of MB of RAM permitted for database instances.

The backups quota limits the number of GB permitted for database volume backups. We currently do not use this feature so the quota is always set to 0.
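
When deciding whether a project needs a quota bump, the useful number is what is still free: limit minus in-use minus reserved. The sketch below computes that from the table printed by `database quota show`. The `quota_headroom` name is hypothetical, and the sketch assumes the column order shown above (Resource, In Use, Reserved, Limit).

```shell
# Read 'database quota show' table output on stdin and print, per resource,
# how much of the limit is still free: limit - in_use - reserved.
quota_headroom() {
    awk -F'|' '
        # Data rows have at least four cells; skip the header and border lines.
        NF >= 5 && $2 !~ /Resource/ {
            gsub(/ /, "", $2)                      # trim the resource name
            printf "%s %d\n", $2, $5 - $3 - $4
        }'
}
```

For the deployment-prep output shown earlier, `sudo wmcs-openstack database quota show deployment-prep | quota_headroom` would print `backups 2`, `instances 1`, `ram 3072`, and `volumes 1`, one per line.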

Common quota issues

Reserved quota does not go down

Sometimes, for unknown reasons, the quota reservation system fails to clean up the reserved counter, leaving the quotas in a state like this:

ssh cloudcontrol1005
dcaro@cloudcontrol1005:~$ sudo -i
root@cloudcontrol1005:~# mysql -u root
mysql:root@localhost [(none)]> use trove_eqiad1

## There are some reservations in the counter
mysql:galera_backup@localhost [trove_eqiad1]> select * from quota_usages where tenant_id='tf-infra-test';
| id                                   | created             | updated             | tenant_id     | in_use | reserved | resource  |
| b3f7b371-ac14-4d02-afbd-f94f1b992c2a | 2023-06-26 19:39:38 | 2023-06-26 19:39:38 | tf-infra-test |      0 |        0 | backups   |
| ad1bcb7e-8b6b-4d36-8382-b24812d1a0f5 | 2023-06-26 17:11:12 | 2023-07-13 09:22:42 | tf-infra-test |      1 |       -4 | instances |
| cc8e7f0b-b102-4c64-aee5-c19924742e11 | 2023-06-26 17:11:12 | 2023-07-13 09:22:42 | tf-infra-test |   2048 |    -4096 | ram       |
| 22714fc9-75a5-4c11-82bf-a3a0bc900424 | 2023-06-26 17:11:12 | 2023-07-13 09:22:42 | tf-infra-test |      2 |        8 | volumes   |
4 rows in set (0.001 sec)

## But none in the reservations table:
mysql:root@localhost [trove_eqiad1]> select * from reservations where status not in ('Committed', 'Rolled Back');
Empty set (0.005 sec)

When this happens, you can manually reset the reserved counter to 0:

mysql:galera_backup@localhost [trove_eqiad1]> update quota_usages set reserved=0 where tenant_id='tf-infra-test' and resource='volumes';
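
In the example state above, the instances and ram rows also carry stale reserved values, so it can be handy to reset every non-zero counter for the tenant in one go. The sketch below only generates the SQL; the function name is hypothetical. It assumes you have already confirmed, as shown above, that the reservations table has no pending rows for the tenant.

```shell
# Print SQL that zeroes every non-zero 'reserved' counter for one tenant
# and then shows the resulting rows. The tenant ID is restricted to a safe
# character set because it is interpolated into the statements.
# Usage: reset_reserved_sql <tenant-id>
reset_reserved_sql() {
    tenant="$1"
    case "$tenant" in
        ''|*[!A-Za-z0-9_-]*)
            echo "invalid tenant id: $tenant" >&2
            return 1
            ;;
    esac
    printf "UPDATE quota_usages SET reserved = 0 WHERE tenant_id = '%s' AND reserved <> 0;\n" "$tenant"
    printf "SELECT resource, in_use, reserved FROM quota_usages WHERE tenant_id = '%s';\n" "$tenant"
}
```

Review the generated statements first, then pipe them into the client on the cloudcontrol, for example `reset_reserved_sql tf-infra-test | mysql -u root trove_eqiad1`.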