VRT System/Failover
VRTS has one active host (currently otrs1001) and one replica (vrts2001).
Prerequisites
The host to fail over to should be a proper VRTS replica, meaning it:
- is running the Puppet role(vrts)
- has the same files as the primary in /opt. An rsync setup currently exists for this and can be run using:
sudo /usr/bin/rsync --rsh /usr/local/sbin/sync-vrts-ssl-wrapper -av --progress rsync://otrs1001.eqiad.wmnet/vrts /opt/
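Before failing over, it can help to check whether /opt on the replica already matches the primary without changing anything. The following is a minimal sketch, not part of the documented procedure: it reuses the rsync invocation above and adds rsync's standard --dry-run and --itemize-changes options. The function name vrts_sync_pending and the RSYNC_CMD override are hypothetical and introduced only for illustration.

```shell
# Hypothetical helper: list what rsync WOULD transfer from the primary,
# without modifying /opt on the replica (--dry-run).
# RSYNC_CMD is an assumed override hook; by default the command mirrors
# the one given in the prerequisites. It is left unquoted on purpose so
# that "sudo /usr/bin/rsync" is split into two words.
vrts_sync_pending() {
    ${RSYNC_CMD:-sudo /usr/bin/rsync} \
        --rsh /usr/local/sbin/sync-vrts-ssl-wrapper \
        -a --dry-run --itemize-changes \
        rsync://otrs1001.eqiad.wmnet/vrts /opt/
}
```

Running vrts_sync_pending on the replica should print one line per file that differs from the primary. Empty output suggests the replica is already in sync; otherwise run the real rsync command above first.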
Planned Failover
A planned failover means the old production instance is still responding and working properly. The following steps are needed to fail over to a new host:
- Log in to the VRTS dashboard with an admin account and schedule a new system maintenance window for when you plan to do the failover. This can be done from Admin -> System Maintenance. This matters because one of the critical things to ensure during a failover is that no one is writing to the database. Maintenance mode ensures that only admins can log in to the system, which greatly reduces the number of people actively using it, and admins can easily be asked not to perform any critical tasks during the failover.
- Prepare the DNS patch: in the DNS repo, open the wmnet template and change the record that ticket points to. This is under the "misc services without multiple backends section".
- Ensure your new host is listed as the active_host in the hieradata/role/common/vrts.yaml file. This will ensure that it points to the write database in eqiad. Since there are only two hosts, you can just invert the values of active_host and passive_host.
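The inversion of active_host and passive_host described above can be sketched as a small shell function. This is an illustration only: the exact key names and layout of the hiera file are assumptions (each key on its own unindented line, ending in active_host: or passive_host:, followed by a single value), and swap_vrts_hosts is a hypothetical name. In practice the change is a two-line edit that should go through normal code review.

```shell
# Hypothetical helper: print the given YAML file with the values of the
# active_host and passive_host keys exchanged. The input file is not modified.
# Assumes lines of the form "<prefix>active_host: <value>" with no indentation
# and no trailing comments.
swap_vrts_hosts() {
    awk '
        # First pass over the file: remember the current value of each key.
        NR == FNR {
            if ($1 ~ /active_host:$/)  active  = $2
            if ($1 ~ /passive_host:$/) passive = $2
            next
        }
        # Second pass: print every line, exchanging the two values.
        $1 ~ /active_host:$/  { print $1, passive; next }
        $1 ~ /passive_host:$/ { print $1, active;  next }
        { print }
    ' "$1" "$1"
}

# Example (run from a checkout of the Puppet repo), shown commented out:
#   swap_vrts_hosts hieradata/role/common/vrts.yaml | diff hieradata/role/common/vrts.yaml -
```

Piping the result into diff, as in the commented example, shows exactly which two lines would change before anything is committed.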
Unplanned Failover
An unplanned failover means the old production instance is not responding or has been lost.