Primary database switchover/Archive

From Wikitech
This page may be outdated or contain incorrect details. Please update it if you can.

The MHA Way

mha is a pretty awesome set of perl scripts written by Yoshinori Matsunobu from facebook: https://code.google.com/p/mysql-master-ha/

Some background

mha relies on replication topologies defined in the global array $role::coredb::config::topology. With the exception of when you're failing over masters, this should always be the same as what is defined in mediawiki configs (wmf-config/db-<site>.php). It consists of the following:

  • 'hosts': a hash that has as keys each datacenter ('pmtpa' and 'eqiad') with an array of all nodes in that datacenter as the values.
  • 'primary_site': a string of what datacenter is currently the primary
  • 'masters': a hash that has as keys each datacenter with a string of the current master in that datacenter as the values.
  • 'snapshot': an array of all snapshots hosts in all datacenters for that shard. These hosts will never be promoted to a master by mha.
  • 'no_master': any host that should never be promoted to a master (such as a researchdb node)

This information is used to create mha configs per datacenter (ex: /etc/mha/s4-pmtpa.cnf) and an mha config that just consists of each site's master (ex: /etc/mha/s4-dc.cnf). These config files will reside on any box with the mha::manager class (at time of writing this was db1017 and db1056).

Using mha for planned failovers

This example will use the scenario of failing over the pmtpa s4 master because that's what I needed to do when I wrote this documentation. The general steps (as done by notpeter) are:

  • Run masterha_check_repl for the appropriate shard/datacenter: masterha_check_repl --conf=/etc/mha/s4-pmtpa.cnf
    • This will fail, saying that the master node in the other datacenter isn't defined, but this is a good way to catch other issues before starting.
  • If you are rotating master in the primary datacenter, set the shard to read-only mode, and record the master status of the primary master.
  • If you are rotating the secondary master, stop slaving on the current secondary master and record the slave status of the secondary master, and set the secondary master to have no slaving information. DO NOT LOSE THIS SLAVING INFORMATION OR YOU'RE TOTALLY FUCKED
  • Run masterha_check_repl again. It should now exit cleanly.
  • Before you run masterha_master_switch, there are some important flags to be aware of:
    • --master_state: This a mandatory parameter. --master_state takes either "dead" or "alive". If "alive" is set, masterha_master_switch starts online master switch operation.
    • --orig_master_is_new_slave: This is not the default, even if master_state=alive. In case of planned failover, you most likely want this.
    • --new_master_host: Use this to specify a new master. This is not required. If not specified, mha will select a suitable candidate (non-snapshot, non-researchdb).
    • --interactive: Can be 0 or 1. By default is 1 (interactive). This is not needed if you know what you're doing. If you don't know what you're doing, ask someone who does :)
  • It is now time to run masterha_master_switch. Example:
    • masterha_master_switch --master_state=alive --interactive=0 --orig_master_is_new_slave --conf=/etc/mha/s4-pmtpa.cnf
  • If failing over the secondary master, start the new secondary master off of the primary master using the slaving information saved above.
  • Push out a new version of wmf-config/db-<site>.php to reflect this topology change.
  • If failing over the primary master, take the shard out of read-only mode.
  • !log the new master binlog and position for the cluster in #wikimedia-operations
  • Edit DNS to update the sX-master or sX-secondary cname
  • Update the topology in puppet.

Using mha for emergency failovers

  • If you're reading this in panic, sorry I haven't written it yet. Use another method.

Known Issues

mha has mostly unusable error messages. Here are some things that have been discovered:

  • mha will not start a failover if any of the candidates are running an older major version of mysql than any of the slaves

The Fairly Easy Way (asher uses this, domas did something similar)

  1. cd to your svn checkout of trunk/tools/switch-master
  2. Have a ~/.my.cnf setup with the root mysql password
  3. Create a config file according to config.sample for the cluster you're switching. Make sure the secondary master in in the slaves section! (it can be added in for the form of sX-secondary)
  4. Edit db.php, moving the new master to position 0.
    1. Commit to the repository in this form (the post-switch version)
    2. Resume editing to comment out the old / current master (to drain it of all connections before the switch instead of sending a bunch of new slave queries to it - this results in a quicker swap) and to add this cluster to readOnlyBySection.
    3. sync-file wmf-config/db.php "switching master for sX to dbXX"
  5. run ./switch and step through it
  6. git checkout wmf-config/db.php (reverts local changes from step 4.2)
  7. sync-file wmf-config/db.php "completed master switch for sX"
  8. make sure to !log the new master binlog and position for the cluster in #wikimedia-operations
  9. edit DNS to update the sX-master cname
  10. edit puppet to update the $masters array in mysql.pp

The easy way (needs some updating)

DO NOT USE THIS IF THE MASTER IS DOWN ALREADY (unless you really know what you are doing)
  • NOTE: This tool doesn't currently switch the secondary master in the replication tree - make sure that is switched as well!! You should also update the $masters section in mysql.pp, and update the $cluster-master or $cluster-secondary dns cnames - they must be accurate for heartbeat monitoring.
cd /home/wikipedia/src/mediawiki/tools/switch-master
php switch.php <old master> <new master>

The hard way

The script above (switch.php) does the following tasks, which you can do manually if you feel that way inclined:

  • Check that the new master has binlog enabled. All slaves should be configured this way, for convenience. All servers, including masters, should have read_only mode on by default.
  • Run RESET MASTER on the new master
  • Log in to the mysql on the old master as root. Check for any long running queries. Kill them. If there is a single long-running query such as a backup, which will restart after you kill it, then get its thread ID ready in a kill command, ready to press enter later.
  • Set read-only mode in db.php, sync it.
  • Set read-only mode on the old master using SET GLOBAL read_only=1;
  • Run FLUSH TABLES on the old master. This will block until all queries complete, so you may need to kill some now.
  • Check for slave lag. Slave lag must be zero on the new master before you continue.
  • Run STOP SLAVE on the new master
  • Run SHOW MASTER STATUS on the new master to get the new replication position
  • Run a CHANGE MASTER TO query on each of the slaves including the old master
  • Set read/write mode on the new master using SET GLOBAL read_only=0;
  • Edit db.php again, turning off read-only mode and simultaneously switching the loads configuration.
  • Run RESET SLAVE on the new master, to prevent it from replicating the old master after restart.

What could be easier than managing a replicated cluster in MySQL?