MariaDB/Upgrading a section

This is the procedure used for upgrading a section to Buster and MariaDB 10.4. If upgrading to another version, exercise caution (and possibly update this banner).

This document assumes that all replicas with no other hosts hanging below them are already upgraded.

Order of upgrades

Upgrade clouddb* hosts.
Upgrade Sanitarium hosts in both DCs
Upgrade Sanitarium primaries in both DCs and ensure the Sanitarium host hangs from the 10.4 one in the active DC
Upgrade the candidate master on the standby DC
Upgrade the backup source in the standby DC (coordinate with Jaime)
Upgrade the master in the standby DC
Upgrade the candidate master in the primary DC
Upgrade the backup source in the primary DC (coordinate with Jaime)
Switch over the primary host in the primary DC to a Buster+10.4 host
Upgrade the old primary and make it a candidate primary

Upgrade procedure

Patch the dhcp file: [example]
Run puppet on install1003 and install2003
Depool the host (if needed) using software/dbtools/depool-and-wait
Silence the host in Icinga (e.g. on a cumin host, cookbook sre.hosts.downtime xxxx.wmnet -D1 -t TXXXXXX -r "reimage for upgrade - TXXXXXX")
Stop MySQL on the host
Run umount /srv; swapoff -a
Run reimage: sudo -E sudo cookbook sre.hosts.reimage xxxx.wmnet -p TXXXXXX
Wait until the host is up
Run systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
Run chown -R mysql. /srv/*; systemctl start mariadb ; mysql_upgrade
Run systemctl restart prometheus-mysqld-exporter.service
Drop the host from Tendril and re-add it; otherwise its metrics will not be updated in Tendril
Check all the tables before starting replication (this can take up to 24h depending on the section)
- In a screen run: mysqlcheck --all-databases
- If any corruption is discovered, fix it with the following: journalctl -xe -u mariadb | grep table | grep Flagged | awk -F "table" '{print $2}' | awk -F " " '{print $1}' | tr -d "\`" | uniq >> /root/to_fix ; for i in `cat /root/to_fix`; do echo $i; mysql -e "set session sql_log_bin=0; alter table $i engine=InnoDB, force"; done
Start the replica
Wait until the host has caught up with replication
Repool the host.
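
The corruption fix in the table-check step packs everything into a single line. A more readable sketch of the same idea, split into two steps, could look like the following. The function names `flagged_tables` and `repair_statements` are hypothetical, and the sketch assumes the journal lines keep the shape the one-liner relies on (a `Flagged` message in which the word `table` is followed by the backquoted table name). It uses `sort -u` instead of `uniq`, so repeated names are dropped even when they are not adjacent, and it only prints the statements rather than running them.

```shell
# Hypothetical helpers; the names are illustrative and not part of any
# existing tooling.

# Read journal lines on stdin and print each flagged table name once.
flagged_tables() {
    grep table | grep Flagged \
        | awk -F "table" '{print $2}' \
        | awk '{print $1}' \
        | tr -d '`' \
        | sort -u
}

# Read table names on stdin and print one repair statement per table.
# sql_log_bin=0 keeps the rebuild out of the binary log, as in the
# one-liner above.
repair_statements() {
    while read -r tbl; do
        [ -n "$tbl" ] || continue
        printf 'set session sql_log_bin=0; alter table %s engine=InnoDB, force;\n' "$tbl"
    done
}
```

One possible use is `journalctl -xe -u mariadb | flagged_tables | repair_statements`, reviewing the printed statements before passing each one to `mysql -e`.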

Upgrading mariadb minor version

The general steps for restarting a mysql instance apply, except that there is no need to reboot the host entirely unless a kernel upgrade requires it. Follow the details at MariaDB/Rebooting_a_host, including how to safely depool, shut down all servers and disable monitoring alerts.

The main changes are:

You should log that maintenance is about to happen:
```
!log Upgrade db1111 T123456
```
The package for the mariadb server must be upgraded, usually:
```
sudo apt upgrade 'wmf-mariadb*'
```
where wmf-mariadb* is the package version you want to upgrade to, e.g. wmf-mariadb104 for WMF's version of MariaDB 10.4. The WMF package is built to avoid side effects: it will not automatically try to stop, restart or alter in any way a running instance, so it can be upgraded at any time, even while a previous version is running. Unless there is a reason to do otherwise (e.g. minimizing upgrade downtime), the upgrade should probably be run after all current instances are shut down.

Start mysql in a safe way, without starting replication automatically and with any old buffer pool dump moved out of the way:

sudo systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
<for each datadir> sudo mv ib_buffer_pool ib_buffer_pool.bak
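
The "for each datadir" step can be sketched as a small loop. This is a minimal illustration, not existing tooling: the function name `stash_buffer_pool_dumps` is hypothetical, and the data directories must be passed in explicitly because their paths depend on the host.

```shell
# Hypothetical helper: move any old buffer pool dump out of the way in
# each data directory given as an argument. Directories that hold no
# dump are reported and skipped.
stash_buffer_pool_dumps() {
    for datadir in "$@"; do
        if [ -f "$datadir/ib_buffer_pool" ]; then
            mv "$datadir/ib_buffer_pool" "$datadir/ib_buffer_pool.bak"
            echo "moved $datadir/ib_buffer_pool"
        else
            echo "no dump in $datadir"
        fi
    done
}
```

It would be called as root with the host's actual data directories as arguments, one per instance.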

mysql_upgrade must be run on every instance after startup and before replication starts. For single instance hosts:

systemctl start mariadb
systemctl status mariadb  # check it started correctly (it is ok to have some errors on first start up due to ongoing upgrade, due to old table formats)
mysql_upgrade

For multiple instance hosts, for each instance:

sudo systemctl start mariadb@<section>
sudo systemctl status mariadb@<section>  # check it started correctly (it is ok to have some errors on first start up due to ongoing upgrade, due to old table formats)
sudo mysql_upgrade -S /run/mysqld/mysqld.<section>.sock

Where section is each of the instances to upgrade on that host (e.g. s1 and s2, x1, s5 and s4, etc.)
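
On hosts with several instances it can help to generate the per-section commands up front and review them before running anything. The following is a minimal sketch under that assumption; the function name `print_upgrade_commands` is hypothetical, and the unit names and socket path are simply the ones used in the steps above.

```shell
# Hypothetical helper: for every section given as an argument, print the
# start, status and mysql_upgrade commands from the steps above.
# Nothing is executed; the output is meant to be reviewed and then run
# by hand, checking the status output before each mysql_upgrade.
print_upgrade_commands() {
    for section in "$@"; do
        echo "sudo systemctl start mariadb@$section"
        echo "sudo systemctl status mariadb@$section"
        echo "sudo mysql_upgrade -S /run/mysqld/mysqld.$section.sock"
    done
}
```

For a host carrying s1 and s2, `print_upgrade_commands s1 s2` prints the three commands for s1 followed by the three for s2.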

After the upgrade, if the mysql database changed, it is important to restart the instance. This can normally be skipped for minor upgrades, but it guarantees that the server starts with the right formatting:
```
sudo systemctl restart mariadb # or sudo systemctl restart mariadb@<section> (for each section upgraded)
```

The rest of the steps to get the server into production state are the same as for a regular reboot/restart: MariaDB/Rebooting_a_host (restart replication, repool, reenable monitoring, safety checks)

This page is a part of the SRE Data Persistence technical documentation