Portal:Toolforge/Admin/Runbooks/Redis
This runbook explains how to debug and fix common issues with Toolforge Redis.
Error / Incident
checker.tools.wmflabs.org/toolschecker: Redis set/get is a paging alert that triggers when toolschecker is unable to talk to Redis.
Debugging
SSH to the Redis hosts at tools-redis-X.tools.eqiad1.wikimedia.cloud and check the output of redis-cli info replication.
The Redis service uses a non-standard Systemd unit name: redis-instance-tcp_6379.service
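The replication check above can be condensed into a one-line summary per host. The following is a minimal sketch, not an official Toolforge tool: summarize_replication is a hypothetical helper name, and the usage line assumes redis-cli can reach the local instance without extra connection or authentication flags. It relies only on the role, connected_slaves and master_link_status fields that Redis reports in the replication section of INFO.

```shell
# Hypothetical helper: summarise `redis-cli info replication` read from stdin.
summarize_replication() {
    # Redis separates INFO lines with CRLF, so strip carriage returns first.
    tr -d '\r' | awk -F: '
        $1 == "role"               { role = $2 }
        $1 == "connected_slaves"   { replicas = $2 }
        $1 == "master_link_status" { link = $2 }
        END {
            if (role == "master")
                printf "role=master replicas=%s\n", replicas
            else
                printf "role=%s master_link_status=%s\n", role, link
        }'
}

# Usage on a Redis host (assumption: no extra redis-cli flags are needed):
#   redis-cli info replication | summarize_replication
```

A healthy pair would normally show one host reporting role=master with at least one replica, and the other reporting master_link_status=up.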
Common issues
max number of clients reached
When this happens, the following message is logged: ERR max number of clients reached. Check the logs on all servers with sudo journalctl -g "max number of clients". If the message appears repeatedly, restart the Systemd unit on the hosts where it is logged:
$ sudo systemctl restart redis-instance-tcp_6379.service
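To make "appearing repeatedly" less of a judgement call, the log check and the restart can be combined. This is only a sketch under stated assumptions: needs_restart is a hypothetical helper name, and the default threshold of 3 occurrences and the one-hour window in the usage example are arbitrary illustrations, not values defined by this runbook.

```shell
# Hypothetical helper: reads journal text on stdin and succeeds when the error
# appears at least THRESHOLD times (first argument; default 3 is arbitrary).
needs_restart() {
    threshold="${1:-3}"
    # grep -c prints the match count; it exits non-zero on zero matches,
    # so `|| true` keeps the function usable under `set -e`.
    count=$(grep -c "max number of clients reached" || true)
    [ "$count" -ge "$threshold" ]
}

# Usage on a Redis host (the time window is an assumption):
#   if sudo journalctl -u redis-instance-tcp_6379.service --since "1 hour ago" | needs_restart; then
#       sudo systemctl restart redis-instance-tcp_6379.service
#   fi
```

Run it on each host separately, so that only the hosts actually logging the error are restarted.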
Related information
Support contacts
#wikimedia-cloud-admin is the main communication channel for Toolforge admins.
If Redis is down, follow the Wikimedia Cloud Services team/Incident Response Process.