Jump to content

Help:Trove database user guide

From Wikitech


This page explains how to add and manage databases within a Cloud VPS project. Cloud Services implements Trove, an OpenStack component, to provide database services for Cloud VPS projects. Toolforge tools may use Trove if they obtain approval through the Toolforge Quota Request process, but most tools should use the user database options in Toolforge.

Why add Trove to a project?

  • Your Cloud VPS project needs persistent data storage and retrieval, but you don't want the burden of directly managing that database, a virtual machine and its install of MySQL/MariaDB/PostgreSQL software. Trove can handle much of this complexity for you.
  • Your tool's data has become too large for ToolsDB. For example, in phab:T323502, the popularity and usage of a tool has grown to the point that it has 160 GB of saved data. This is so much data that it's causing issues for other ToolsDB users. For this tool, "graduating" from ToolsDB to a dedicated Trove instance is the next step in managing growth.

Key concepts

A Trove instance (or database instance) is a database server that may contain one or more databases. A Trove instance is not the same as a Cloud VPS instance, which is a virtual machine (VM). A Trove instance uses Cinder volumes for storage, and a Nova instance for compute. However, database instances and their storage are managed separately from Cloud VPS instances and their Cinder volumes.

A Cloud VPS project can have multiple Trove instances, as long as a project has sufficient Trove quota. Multiple Cloud VPS instances may access the same Trove instance. A Trove instance doesn't count against your project's "instance" quotas, and doesn't show up in your list of "instances", because it's a different kind of instance – a database instance, rather than a VM. Trove storage quota defaults to around 10GB, but you can request more.

Toolforge tools that are approved via quota request to use Trove have a Cloud VPS project that exists only to provide their database instance.

Before you start

To create and manage Trove instances and databases:

  • Cloud VPS users: You must be a project admin for the Cloud VPS project to which you are adding a database.
  • Toolforge users: To use Trove for a Toolforge tool, you must first get approval through the Toolforge Quota Request process.
  • You should know the database type and version of the datastore(s) you intend to use. Trove primarily targets MySQL and MariaDB servers, but has some limited support for Postgres databases.

Launch a Trove instance

Before you can create a database, you must launch a Trove instance in Horizon:

  1. Log in to Horizon and select your project from the drop-down menu at the top.
  2. From the main menu, navigate to Databases -> Instances.
  3. In the upper right side of the screen, click the "Launch Instance" button.
  4. Specify values in the required fields to launch your instance.
  • For Volume Size, choose an option that is slightly more than what you think you might need, so you can avoid resizing later.
  • For Volume Type, select "standard".
  • If your Datastore type is mariadb, select a Flavor with at least 2GB of RAM. (The amount of RAM is indicated by the number after "ram" in the flavor name.)

Create a database

A Trove instance can contain one or more databases. To add a database to an instance:

  • In Horizon (mysql and Mariadb instances only): click on the instance name and go to the "Databases" tab.
  • On the command line: enable root access and use a mysql, psql, or equivalent CLI.

Manage access and users

If you prefer a command line interface for database management (or need more fine-grained control than that offered by Horizon) you can enable root access to an instance.

Enable root access

To use the command line to manage your databases, enable root access in Horizon:

  1. In the "Actions" menu for an instance, select "Manage Root Access".
  2. Click the "Enable root" button for the instance.
  3. Use the resulting credentials to access a root database login.

Access databases from a VM

Horizon provides an example shell command and an example URI as shortcuts for database access. To view these examples, click an Instance name and go to the "Connection Information" section on the Overview tab.

Copy this command and paste it into your shell to get a mysql, psql, or equivalent prompt. Substitute the appropriate username and password. For a root shell, enable root access.

By default you can only access the DB from VMs within the same Cloud VPS project, if you need to connect directly from your desktop/laptop, use ssh tunnels to a VM with access.

Manage users

You can add users to specific databases within a Trove instance, or provide access at the instance level. Database usernames are not the same as Cloud VPS account names.

To add users to an instance:

  1. Go to your list of instances and click the instance name.
  2. Navigate to the "Users" tab. Use the "+ Create User" button in the upper right corner of the screen to add users.
You can't use the "Users" tab in Horizon to configure Postgres database servers. Instead, you must use the psql CLI on the default config database, which is named 'postgres.' To get there, use the Horizon-suggested connection command but add -d postgres. You also can't manage users for Postgres databases through Horizon.

Modify database configuration

Trove databases are configured through OpenStack – not through Puppet or text files. In Horizon, the "Defaults" tab for the Trove instance displays its default configurations. These values are applied to your database at startup, and may be different from the database system's defaults. To change the configuration of your database so that it persists across restarts, follow the steps below to create a Configuration Group and then attach it to an instance.

Create a Configuration Group

  1. Under "Database" in the Horizon menu, select "Configurations".
  2. Select "+ Create Configuration Group" and give your Configuration Group a descriptive name.
  3. In the "datastore" field, select the database type and version you're using.
  4. After you create your Configuration Group, click its name to specify its parameters. The "Add Parameter" menu lists possible parameters for your database, and a text box for your custom values. Some parameters require a restart of the database when you apply them, while others apply immediately without disruption.
Horizon doesn't display which parameter changes require a restart. The quickest place to find it is the source code for the validation rules in OpenStack's GitHub repository. For example, see this template of Mariadb parameters.

Attach Configuration Group

After you add parameters, attach the Configuration Group to an instance:

  1. Return to your list of instances.
  2. Select "Attach Configuration Group" from the dropdown menu for the instance, and choose the custom Configuration Group you want to add.
  3. Your configuration settings now apply persistently to all databases in that instance.

To view which instances a Configuration applies to, navigate to the Configurations section and click the name of a Configuration Group, then select the "Instances" tab.

Manage storage and compute

Resize a volume

Resizing a volume means changing the amount of disk space for a database. WMCS uses Cinder to provide attachable block storage for Trove instances and for Cloud VPS instances, but quota for those volumes is tracked and managed separately. As long as you're not exceeding your quota, you can resize your storage without needing to rebuild your database.

Resize an instance

Resizing an instance means adding cores or RAM to the database server that is running the database software. WMCS uses Nova to provision compute instances. As long as you're not exceeding your quota, you can resize your instance without needing to rebuild your database.

Request a quota increase

By default, any Cloud VPS project should be able to create one or two small Trove instances. If you need additional cores or gigabytes, open a quota request in Phabricator. Database instances don't count against a project's storage or virtualization quotas; Trove has separate quotas to govern database use.

Backup or snapshot your databases

To backup a database, run a database dump using mysqldump (or pg_dump for PostgreSQL), and store the backups on a separate Cinder volume.

WMCS has plans to add Swift storage backend and Trove database backup and snapshot features, but that is not yet available (phab:T276961).

Character sets (MariaDB/MySQL)

You can modify the default character set and collation for a Trove MariaDB/MySQL instance by creating a Configuration Group and adding the variables character_set_server and collation_server.

Please note that if you create a Trove instance using the Horizon interface and specify one or more "Initial databases" to be created with the instance, these databases will not respect the values set in the configuration group, but will instead use some default values (upstream bug report). The workaround is to create the Trove instance without specifying any initial database, then creating a database later: in the Create Database dialog in Horizon you can specify a custom character set and collation.

Dialog shown in Horizon when clicking "Create Database"

Remember your database client (mysql CLI, Python library, etc.) also has to use the right character set. You can use the commands SHOW VARIABLES LIKE 'character%'; and SHOW VARIABLES LIKE 'collation%'; to verify the current settings:

MariaDB [(none)]> SHOW VARIABLES LIKE 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.002 sec)

MariaDB [(none)]> SHOW VARIABLES LIKE 'collation%';
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
| collation_database   | utf8mb4_unicode_ci |
| collation_server     | utf8mb4_unicode_ci |
+----------------------+--------------------+
3 rows in set (0.004 sec)

Troubleshooting

Database instance creation fails with cryptic error messages

A properly functioning Trove instance is quite complicated and relies on many different OpenStack features working in concert: storage, compute, DNS, networking, and Docker. This means that if OpenStack APIs are the least bit fragile, database instance creation may fail with cryptic error messages. Fragility in the orchestration layer does not necessarily mean that your data or database itself is fragile. Contact the WMCS team if you encounter ongoing errors in Trove functionality, but also please do not panic.

Flavor is not supported for datastore version (HTTP 400)

This error appears if you attempt to launch an instance but the Flavor you selected is not allowed for the type of Datastore you selected. By default all instances are allowed, but Trove admins can specify an optional list of "allowed flavors" for each datastore version. This allowlist is maintained in the datastore_version_metadata table in the Trove config database (you can connect to it with mysql -u root from a cloudcontrol host).

Running database instances appear as 'down' in Horizon

Existing, running database instances may appear as 'down' in Horizon if any of OpenStack's communication channels are unreliable. The good news is that Trove and OpenStack have very little interaction with the database once it's up and running. Fragility in the OpenStack orchestration layer does not necessarily mean that your data or database itself is fragile. Contact the WMCS team if you encounter ongoing errors in Trove functionality, but also please do not panic.

Operation not supported for datastore postgresql. (HTTP 422)

This error appears if you click a tab or menu item for an instance that targets a Postgres datastore, but that action isn't supported for Postgres databases. Trove can create and manage Postgres databases, but many runtime configuration options are unsupported.

Internal server error (HTTP 500)

Error: Unable to retrieve database instances. May be resolved by reloading the page. If reload doesn't resolve the issue, use one of the Cloud communication or support channels to get help or file a bug.

Instance does not start after applying a Configuration Group

One of the options in the Configuration Group is probably invalid. Try removing the options in the Configuration Group one at a time, to identify which one is causing the server to fail.

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)