From Wikitech

Toolforge Redis runs on three nodes (tools-redis-5/6/7). One of them is the master and the rest are replicas. Keepalived makes sure that the virtual IP address clients connect to is always assigned to the current master.

If the current master goes down, Redis Sentinel should notice within five seconds and automatically fail over to a replica. It might take an additional 10 seconds for the floating IP to move to the new master.

If Sentinel does not fail over to a new node (run redis-cli info replication to check), look into /var/log/redis/redis-sentinel.log on any node that is still up. If the IP address does not move, check sudo systemctl status keepalived and verify that the /usr/local/bin/wmcs-check-redis-master script exits with code 0 on the master and 1 on the replicas.
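
The checks above can be sketched as two small shell helpers. This is only an illustration: the function names redis_role and expected_check_code are hypothetical and are not part of the Toolforge tooling. It assumes that redis-cli info replication reports the node's role in a line of the form role:master or role:slave.

```shell
# Hypothetical helpers, not part of the Toolforge tooling.

# Read `redis-cli info replication` output on stdin and print the role
# field ("master" or "slave"). INFO output uses CRLF line endings, so
# carriage returns are stripped first.
redis_role() {
    tr -d '\r' | awk -F: '$1 == "role" { print $2 }'
}

# Print the exit code that wmcs-check-redis-master is expected to return
# for a given role: 0 on the master, 1 on a replica.
expected_check_code() {
    case "$1" in
        master) echo 0 ;;
        slave)  echo 1 ;;
        *)      echo "unknown role: $1" >&2; return 2 ;;
    esac
}

# Possible usage on a Redis node (commented out so the block only defines helpers):
#   role=$(redis-cli info replication | redis_role)
#   /usr/local/bin/wmcs-check-redis-master; actual=$?
#   [ "$actual" = "$(expected_check_code "$role")" ] || echo "check script disagrees with role $role"
```

If the actual and expected exit codes differ on a node, Keepalived is probably making its decision from a wrong health signal, which would explain a virtual IP that does not move.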

Note that Sentinel requires a quorum to perform any action, which means that it will not function with two of the three nodes down. Additionally, Redis has been configured to not accept writes on the replicas, nor on the master if no replicas are connected.
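
The quorum arithmetic can be made concrete with a short sketch. It assumes that a failover needs agreement from a strict majority of the Sentinel instances; the quorum value actually configured for Toolforge is not stated here, and the function names are hypothetical.

```shell
# Hypothetical helpers illustrating majority arithmetic only.

# Smallest number of Sentinels that forms a strict majority of $1 instances.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# Succeeds if $2 reachable Sentinels out of $1 total can still form a majority.
can_failover() {
    [ "$2" -ge "$(majority "$1")" ]
}

# With three nodes the majority is 2: one node down still leaves a majority,
# two nodes down leaves a single Sentinel, which cannot act on its own.
#   can_failover 3 2 && echo "failover possible"
#   can_failover 3 1 || echo "no majority, Sentinel will not act"
```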

Manual failover

If you need to force a failover or perform other Sentinel actions, you can connect to it using redis-cli on port 26379:

taavi@toolsbeta-redis-1:~$ redis-cli -p 26379

Sentinel commands are listed in the Redis documentation. Use toolforge as the "master name".

The most useful command is sentinel failover toolforge, which forces a failover to any other available node. You can alternatively add the IP address of the node to fail over to.
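
A forced failover can also be run non-interactively. The sketch below wraps the Sentinel commands in small shell functions. The function names and the REDIS_CLI override variable are hypothetical. The port and master name are taken from this page, and sentinel get-master-addr-by-name is the standard Sentinel command for asking which node is currently the master.

```shell
# Hypothetical wrappers, not part of the Toolforge tooling.
SENTINEL_PORT=26379
MASTER_NAME=toolforge
# Allows substituting another client binary; defaults to redis-cli.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# Send a SENTINEL subcommand to the local Sentinel instance.
sentinel() {
    "$REDIS_CLI" -p "$SENTINEL_PORT" sentinel "$@"
}

# Print the address and port of the node Sentinel currently considers the master.
current_master() {
    sentinel get-master-addr-by-name "$MASTER_NAME"
}

# Ask Sentinel to fail over to another available node.
force_failover() {
    sentinel failover "$MASTER_NAME"
}

# Possible usage (commented out so the block only defines helpers):
#   current_master     # note the master before the failover
#   force_failover
#   sleep 15           # rough allowance for failover plus floating IP move
#   current_master     # should now report a different node
```

Comparing the output of current_master before and after is a quick way to confirm that the failover actually happened before checking whether the floating IP has followed.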

See also