MariaDB/Multiinstance

From Wikitech

Our database hosts can run either a single mariadb instance (single-instance) or multiple (multi-instance). This affects how they are addressed over the network, and the systemd setup within the host itself.

Single-instance hosts

Mariadb runs on port :3306, and is controlled by the mariadb.service systemd unit. Monitoring is provided by the prometheus-mysqld-exporter.service unit. This instance is referred to by internal tools as the plain hostname, e.g. db1159.

Multi-instance hosts

The mariadb port and systemd unit name are determined by the sections the host is in.

The mapping from section to port is controlled by this list. s1 instances on multi-instances hosts run on :3311, m5 multi-instances run on :3325, and so on.

These instances are referred to by internal tools as the plain hostname + the port. E.g. db1000:3311, db1000:3325, etc.

The systemd unit is more straight-forward, it's mariadb@<section>.service. So s4 would be mariadb@s4.service.

Monitoring uses the same scheme. E.g. the exporter for m3 on multi-instance hosts is prometheus-mysqld-exporter@m3.service.

Determining if a host is single- or multi-instance

The simplest clue is to look at the motd on login. For example:

...
db1159 is a Misc DB Server (mariadb::core)
DB section m1
The last Puppet run was at Fri Jun  4 13:11:28 UTC 2021 (12 minutes ago). 
...

This is a single-instance host; there's only one section listed. Compare with:

...
dbstore1005 is a DBStore multi-instance server (mariadb::dbstore_multiinstance)
DB section s6
DB section s8
DB section staging
DB section x1
The last Puppet run was at Fri Jun  4 13:17:58 UTC 2021 (8 minutes ago). 
...

This is a multi-instance host; it has 4 sections listed.

Cleaning up a removed section from a multiinstance host

Currently, puppet doesn't automatically clean up leftover files and configuration when removing or moving away a section from a multiinstance host. This is a list of operations that would likely be wanted to be done manually until this is automated:

section="my_section_to_cleanup"
rm /etc/nagios/nrpe.d/check_mariadb_read_only_${section}.cfg 
rm /etc/nagios/nrpe.d/check_mariadb_replica_io_state_${section}.cfg
rm /etc/nagios/nrpe.d/check_mariadb_replica_sql_lag_${section}.cfg
rmdir /srv/tmp.${section}
mv /srv/sqldata.${section} /srv/sqldata.${section}.bak
rm /etc/mysql/mysqld.conf.d/${section}.cnf
rm /usr/local/bin/mysql.${section}
rm /etc/update-motd.d/06-db-section-${section}
rm /etc/ferm/conf.d/10_${section}_mariadb_internal
rm /etc/ferm/conf.d/10_${section}_orchestrator
rm /etc/ferm/conf.d/10_${section}_mariadb_dba
systemctl reload ferm
rm /etc/default/prometheus-mysqld-exporter@${section}
systemctl disable prometheus-mysqld-exporter@${section}
rm /lib/systemd/system/wmf_auto_restart_prometheus-mysqld-exporter@${section}.service
systemctl disable wmf_auto_restart_prometheus-mysqld-exporter@${section}.service
rm /var/log/wmf_auto_restart_prometheus-mysqld-exporter_${section}/syslog.log
rm /etc/rsyslog.d/20-wmf-auto-restart-prometheus-mysqld-exporter-${section}.conf
rm /lib/systemd/system/wmf_auto_restart_prometheus-mysqld-exporter@${section}.timer
systemctl disable wmf_auto_restart_prometheus-mysqld-exporter@${section}.timer
systemctl daemon-reload
systemctl reset-failed
rm /var/lib/prometheus/.my.${section}.cnf
# /etc/nagios/nrpe.d/check_mysqld.cfg will be overwritten