Portal:Cloud VPS/Admin/Galera

From Wikitech

The databases for OpenStack services are stored on a Galera cluster hosted on the cloudcontrol nodes.

Deployment

This cluster does not use custom WMF packages; it runs the standard packages from upstream Debian.

The cluster is active/active/active, which means that writes can be made on any of the cloudcontrol nodes.

Directions for standing up a new cluster are included in puppet/modules/galera/manifests/init.pp.

DB setup

OpenStack services tend to use connection pooling, opening many long-lived connections to each database. For this reason, our Galera config has extremely long connection timeouts and very high connection limits.
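To see what those settings currently are on a node, something like the following sketch could be used. It assumes the relevant knobs are the standard MariaDB variables max_connections, wait_timeout and interactive_timeout; the puppet-managed config may tune others as well. The function names are made up for this example.

```shell
#!/bin/bash
# Sketch: print the connection-related settings of the local MariaDB node.
# Assumption: these three standard MariaDB variables are the interesting ones.
CONN_VARS="max_connections wait_timeout interactive_timeout"

# Emit one SHOW statement per variable (pure helper, touches nothing).
conn_settings_sql() {
    local v
    for v in $CONN_VARS; do
        printf "SHOW GLOBAL VARIABLES LIKE '%s';\n" "$v"
    done
}

# Feed the statements to the local server, using the same root access
# as the mysql shell elsewhere on this page.
show_conn_settings() {
    conn_settings_sql | sudo -i mysql -u root
}
```

Run show_conn_settings on a cloudcontrol node; nothing is executed until the function is called.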

General Operations

Restarting the local mariadb process

Don't let puppet restart this service; restart it by hand as follows.

  • To begin, disable puppet so it can't mess with things.
  • Tell haproxy the database is down, just to be safe, with sudo touch /tmp/galera.disabled
  • sudo systemctl stop mariadb
  • In another shell run sudo journalctl -u mariadb.service -f to verify it cleanly exits. It can take a few moments or quite a while.
  • sudo systemctl start mariadb
  • Again, watch journalctl to see that it comes up alright.
  • Once it is up, access the mysql shell with sudo -i mysql -u root
  • Run SHOW STATUS LIKE "wsrep_local_state_comment"; and SHOW STATUS LIKE "wsrep_ready";. If the first isn't "Joined" or "Synced", the node isn't ready. The second needs to be "ON" or almost all queries against the node will fail. If you are manually managing haproxy to keep the node out of the cluster, don't repool it until you see one of those states.
  • Repool the database in haproxy by removing the /tmp/galera.disabled file.
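
The restart steps above can be sketched as a small bash script. This is an illustration rather than an existing tool: the function names, the poll interval and the number of attempts are all assumptions. Puppet should already be disabled before calling it, and it is still worth watching journalctl in another shell while it runs.

```shell
#!/bin/bash
# Sketch of the manual restart procedure. Nothing runs until
# restart_mariadb is called. Assumes puppet is already disabled.

# Pure helper: succeeds only if the two status values describe a usable node.
# $1 = wsrep_local_state_comment, $2 = wsrep_ready (compared case-insensitively)
galera_node_ready() {
    local state ready
    state=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    ready=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    case "$state" in
        joined|synced) ;;
        *) return 1 ;;
    esac
    [ "$ready" = "on" ]
}

# Read a single status value from the local node.
# -N drops the header row, -B gives tab-separated output.
wsrep_status() {
    sudo -i mysql -u root -N -B -e "SHOW STATUS LIKE '$1';" | awk '{print $2}'
}

# Poll until the node is ready. Defaults (60 checks, 5 s apart) are arbitrary.
wait_until_ready() {
    local attempts=${1:-60} interval=${2:-5} i state ready
    for ((i = 0; i < attempts; i++)); do
        state=$(wsrep_status wsrep_local_state_comment)
        ready=$(wsrep_status wsrep_ready)
        if galera_node_ready "$state" "$ready"; then
            echo "node ready: state=$state wsrep_ready=$ready"
            return 0
        fi
        sleep "$interval"
    done
    echo "node not ready after $attempts checks; leaving it depooled" >&2
    return 1
}

restart_mariadb() {
    sudo touch /tmp/galera.disabled || return 1   # tell haproxy the node is down
    sudo systemctl stop mariadb || return 1       # normally blocks until stopped
    sudo systemctl start mariadb || return 1
    wait_until_ready || return 1                  # on failure the node stays depooled
    sudo rm /tmp/galera.disabled                  # repool in haproxy
}
```

If the node never reaches a ready state, the sketch deliberately leaves /tmp/galera.disabled in place so that haproxy keeps the node out of rotation until someone has investigated.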

Troubleshooting

Please read Portal:Cloud_VPS/Admin/Troubleshooting#Galera.

See also

TODO.