MariaDB/Upgrading a section

From Wikitech
This is the procedure used for upgrading a section to Buster and MariaDB 10.4. If upgrading to another version, exercise caution (and possibly update this banner).
This document assumes that all replicas with no other hosts hanging below them are already upgraded.

Order of upgrades

  • Upgrade clouddb* hosts
  • Upgrade Sanitarium hosts in both DCs
  • Upgrade Sanitarium primaries in both DCs and ensure the Sanitarium host hangs from the 10.4 one in the active DC
  • Upgrade the candidate master on the standby DC
  • Upgrade the backup source in the standby DC (coordinate with Jaime)
  • Upgrade the master in the standby DC
  • Upgrade the candidate master in the primary DC
  • Upgrade the backup source in the primary DC (coordinate with Jaime)
  • Switch over the primary host in the primary DC to a Buster+10.4 host
  • Upgrade the old primary and make it a candidate primary
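
The order above upgrades hosts at the bottom of the replication tree first, so a replica should never end up running an older series than the host it replicates from. A minimal sketch for checking which series a host reports is shown below. The helper names mariadb_series and is_series are hypothetical, and the version string is a sample value of the kind returned by SELECT VERSION(), not output from a real host.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of any WMF tooling.

# Extract the "major.minor" series from a MariaDB version string,
# e.g. as printed by: mysql -BN -e "SELECT VERSION()"
mariadb_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed only if the version string belongs to the expected series.
is_series() {
    [ "$(mariadb_series "$1")" = "$2" ]
}

# Sample version string (assumed value for illustration).
if is_series "10.4.22-MariaDB-log" "10.4"; then
    echo "host reports the 10.4 series"
fi
```

The check only compares the first two dotted components, so it treats all 10.4.x releases alike.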

Upgrade procedure

  • Patch the dhcp file: [example]
  • Run puppet on install1003 and install2003
  • Depool the host (if needed) using software/dbtools/depool-and-wait
  • Silence the host in Icinga (e.g. on a cumin host, cookbook sre.hosts.downtime xxxx.wmnet -D1 -t TXXXXXX -r "reimage for upgrade - TXXXXXX")
  • Stop MySQL on the host
  • Run umount /srv; swapoff -a
  • Run reimage: sudo -E sudo cookbook sre.hosts.reimage xxxx.wmnet -p TXXXXXX
  • Wait until the host is up
  • Run systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
  • Run chown -R mysql. /srv/*; systemctl start mariadb ; mysql_upgrade
  • Run systemctl restart prometheus-mysqld-exporter.service
  • Drop the host from Tendril and re-add it; otherwise its Tendril metrics won't be updated
  • Check all the tables before starting replication (this can take up to 24h depending on the section)
    • In a screen run: mysqlcheck --all-databases
    • If any corruption is discovered, fix it with the following: journalctl -xe -u mariadb | grep table | grep Flagged | awk -F "table" '{print $2}' | awk -F " " '{print $1}' | tr -d "\`" | uniq >> /root/to_fix ; for i in `cat /root/to_fix`; do echo $i; mysql -e "set session sql_log_bin=0; alter table $i engine=InnoDB, force"; done
  • Start the replica
  • Wait until the host is up
  • Repool the host.
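
The corruption-fix one-liner in the table check step can be easier to review when split into two stages: extracting the table names, then generating the rebuild statements. The sketch below follows the same pipeline but only prints the statements instead of running them, and it uses sort -u where the original uses uniq. The function names are hypothetical, and the sample log line (including the enwiki.revision table) is an assumed format for illustration, not a captured log.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; nothing here touches MySQL.

# Read log lines on stdin and print each flagged `db.table` once.
extract_flagged_tables() {
    grep 'Flagged' \
        | awk -F 'table' '{print $2}' \
        | awk '{print $1}' \
        | tr -d '`' \
        | sort -u
}

# Read table names on stdin and print (not execute) the rebuild statement
# for each one, with binary logging disabled for the session.
print_rebuild_statements() {
    while IFS= read -r tbl; do
        [ -n "$tbl" ] || continue
        printf 'SET SESSION sql_log_bin=0; ALTER TABLE %s ENGINE=InnoDB, FORCE;\n' "$tbl"
    done
}

# Assumed log line format, for illustration only.
sample='[ERROR] InnoDB: Flagged corruption of `rev_timestamp` in table `enwiki`.`revision` in CHECK TABLE'
printf '%s\n' "$sample" | extract_flagged_tables | print_rebuild_statements
```

On a real host the input would come from journalctl -xe -u mariadb, and each printed statement could be reviewed before being passed to mysql -e.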

Upgrading mariadb minor version

The general steps for restarting a mysql instance apply, except that there is no need to reboot the host entirely unless a kernel upgrade requires it. Follow the details at MariaDB/Rebooting_a_host, including how to safely depool, shut down all instances, and disable monitoring alerts.

The main changes are:

  1. Log that maintenance is about to happen:
    !log Upgrade db1111 T123456
    
  2. The package for the mariadb server must be upgraded, usually:
    sudo apt upgrade 'wmf-mariadb*'
    
    where wmf-mariadb* is the package for the version you want to upgrade to, e.g. wmf-mariadb104 for WMF's build of MariaDB 10.4. The WMF package is built to avoid side effects: it won't automatically stop, restart, or otherwise alter a running instance, so the upgrade can be run at any time, even while a previous version is executing. But unless there is a reason to do otherwise (e.g. minimizing upgrade downtime), it should probably be run after all current instances are shut down.
  3. Start mysql in a safe way, without starting replication automatically and after removing any old buffer pool dump:
    sudo systemctl set-environment MYSQLD_OPTS="--skip-slave-start"
    <for each datadir> sudo mv ib_buffer_pool ib_buffer_pool.bak
    
  4. mysql_upgrade must be run on every instance after startup and before replication starts. For single-instance hosts:
    systemctl start mariadb
    systemctl status mariadb  # check it started correctly (some errors on first startup are OK while the upgrade is pending, because of old table formats)
    mysql_upgrade
    
    For multiple instance hosts, for each instance:
    sudo systemctl start mariadb@<section>
    sudo systemctl status mariadb@<section>  # check it started correctly (some errors on first startup are OK while the upgrade is pending, because of old table formats)
    sudo mysql_upgrade -S /run/mysqld/mysqld.<section>.sock
    
    where <section> is each of the instances to upgrade on that host (e.g. s1 and s2, x1, s5 and s4, etc.)
  5. After the upgrade, if the mysql database changed, it is important to restart the instance. This can normally be skipped for minor upgrades, but it guarantees the server starts with the right table formats:
    sudo systemctl restart mariadb # or sudo systemctl restart mariadb@<section> (for each section upgraded)
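
For multi-instance hosts, the per-section commands in step 4 can be generated in one pass. The sketch below is a dry run: it only prints the commands for each section so they can be reviewed and run by hand. The function name print_upgrade_commands is hypothetical, and s1 and s2 are example section names.

```shell
#!/usr/bin/env bash
# Sketch only: prints the per-instance commands, executes nothing.
print_upgrade_commands() {
    local section
    for section in "$@"; do
        printf 'sudo systemctl start mariadb@%s\n' "$section"
        printf 'sudo systemctl status mariadb@%s\n' "$section"
        printf 'sudo mysql_upgrade -S /run/mysqld/mysqld.%s.sock\n' "$section"
    done
}

# Example sections; replace with the instances hosted on the machine.
print_upgrade_commands s1 s2
```

Printing rather than executing keeps the status check between start and mysql_upgrade as a manual step, which matches the procedure above.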
    

The remaining steps to bring the server back into production are the same as for a regular reboot/restart: MariaDB/Rebooting_a_host (restart replication, repool, re-enable monitoring, safety checks).