MariaDB/Switch Datacenter
The week before the switchover
- 7 days before: no more maintenance on the database clusters.
- 6 days before: Enable circular replication between eqiad and codfw.
  - This requires updating section_params in hieradata/common/profile/mariadb.yaml. E.g. gerrit:719168
  - Run the sre.switchdc.databases.prepare cookbook.
- In the new DC:
  - Check and disable GTID on primaries.
  - Check that all replicas have GTID enabled.
  - Check for disabled notifications (icinga)/silences (alertmanager).
  - Check that the query killers are installed and enabled.
  - Review MW weights, comparing them to the old DC.
  - Warm up the caches using queries from the old DC.
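The two GTID checks in the list above can be scripted. The following is a minimal sketch, not the team's actual tooling. It assumes you have already captured the vertical-format output of MariaDB's SHOW SLAVE STATUS from each host (how you collect it is up to you), and it reads only the Using_Gtid field, which MariaDB reports as No, Slave_Pos or Current_Pos. The function names and host names are hypothetical.

```python
def using_gtid(show_slave_status_output):
    """Return the Using_Gtid value from SHOW SLAVE STATUS text, or None.

    None means the field was not found, e.g. because no replication
    is configured on that host.
    """
    for line in show_slave_status_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip() == "Using_Gtid":
            return value.strip()
    return None


def gtid_problems(outputs, primaries):
    """List (host, value) pairs whose GTID state does not match the checklist.

    outputs: dict of hostname -> captured SHOW SLAVE STATUS text.
    primaries: set of hostnames that are primaries in the new DC.
    Primaries are expected to have GTID disabled ("No"); every other
    host is treated as a replica and expected to have it enabled.
    """
    problems = []
    for host, text in sorted(outputs.items()):
        value = using_gtid(text)
        if host in primaries:
            ok = value == "No"
        else:
            ok = value in ("Slave_Pos", "Current_Pos")
        if not ok:
            problems.append((host, value))
    return problems
```

An empty result from gtid_problems means every host matches the checklist. If a primary is reported, MariaDB's own syntax for turning GTID off on a replication connection is STOP SLAVE; CHANGE MASTER TO MASTER_USE_GTID=no; START SLAVE; but whether that is run by hand or through other tooling is not stated on this page.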
The day of the switchover
Before the switchover
- Downtime all db primaries just before the switch, so that read-only alerts won't fire (T285803).
After the switchover
- Manually fix parsercache hosts and x2 in tendril: T266723
- Submit a puppet patch changing host-down alerting:
  - Background: gerrit:736415
  - Move profile::monitoring::is_critical: true from hieradata/role/<old dc>/mariadb/* to hieradata/role/<new dc>/mariadb/
  - Re-run puppet:
    sudo cumin 'A:db-core or A:db-parsercache' 'run-puppet-agent -q'
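To illustrate what the alerting patch changes, here is a minimal sketch that works on the text of a hiera YAML file. It assumes the key appears on a line of its own exactly as written above. The helper names are hypothetical, and the real patch is an ordinary puppet change reviewed in gerrit, not something generated by this code.

```python
KEY_LINE = "profile::monitoring::is_critical: true"


def drop_is_critical(yaml_text):
    """Return yaml_text without the is_critical line (for old DC files)."""
    kept = [line for line in yaml_text.splitlines() if line.strip() != KEY_LINE]
    return "\n".join(kept) + "\n"


def add_is_critical(yaml_text):
    """Return yaml_text with the is_critical line present once (for new DC files)."""
    lines = yaml_text.splitlines()
    if KEY_LINE not in (line.strip() for line in lines):
        lines.append(KEY_LINE)
    return "\n".join(lines) + "\n"
```

Both helpers are safe to apply twice: dropping the key from a file that lacks it, or adding it to a file that already has it, leaves the content unchanged.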
After the switchover
- 2 days after: disable circular replication again:
  - Update section_params in hieradata/common/profile/mariadb.yaml again. E.g. gerrit:721421
  - Run the sre.switchdc.databases.finalize cookbook.
This page is a part of the SRE Data Persistence technical documentation