Help:Toolforge/Redis for Toolforge

Redis is a key-value store that can be used to implement publish/subscribe protocols between processes, maintain persistent queues, and serve as a caching service for data that is expensive to compute. Stored values can be different data structures, such as hash tables, lists, queues, etc. Stored data persists across service restarts.
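
As an illustration of the queue use case, here is a minimal sketch of a first-in, first-out work queue built on a Redis list. It assumes `client` is a `redis.Redis` instance from redis-py; the function names, the JSON encoding, and the queue key are hypothetical choices for this example, not something the Redis service requires.

```python
import json


def enqueue(client, queue_key, payload):
    """Push a JSON-serialisable payload onto the head of the list."""
    client.lpush(queue_key, json.dumps(payload))


def dequeue(client, queue_key, timeout=5):
    """Pop the oldest payload from the tail of the list.

    Blocks for up to `timeout` seconds and returns None if nothing arrived.
    """
    item = client.brpop(queue_key, timeout=timeout)
    if item is None:
        return None
    _key, raw = item  # brpop returns a (key, value) pair
    return json.loads(raw)
```

A producer process would call `enqueue` and a worker process would call `dequeue` in a loop. Pushing with LPUSH and popping with BRPOP gives first-in, first-out ordering.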

Redis instances

A Redis instance that all tools can use is available under the service name redis.svc.tools.eqiad1.wikimedia.cloud, on the standard port 6379. It has been allocated a maximum of 12G of memory, which should be enough for most usage. You can set an expiry time on your keys to limit how long your data stays in Redis; otherwise it will be evicted when the memory limit is exceeded. See the Redis documentation for a list of available commands.
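
Limiting how long data stays in Redis can be sketched with redis-py as follows. This is a minimal example, assuming the redis-py package is installed in your tool's environment; the helper names, the JSON encoding, and the one-hour default are assumptions made for illustration.

```python
import json

REDIS_HOST = "redis.svc.tools.eqiad1.wikimedia.cloud"
REDIS_PORT = 6379


def connect():
    """Return a redis-py client for the shared Toolforge instance."""
    import redis  # imported lazily so the helpers below work without it

    return redis.Redis(host=REDIS_HOST, port=REDIS_PORT, decode_responses=True)


def cache_json(client, key, value, ttl_seconds=3600):
    """Store a value as JSON and let Redis expire it after ttl_seconds."""
    client.set(key, json.dumps(value), ex=ttl_seconds)


def read_json(client, key):
    """Return the cached value, or None if it is missing or has expired."""
    raw = client.get(key)
    return None if raw is None else json.loads(raw)
```

Typical usage would be `client = connect()` followed by `cache_json(client, key, data)`. Because a key can expire or be evicted at any time, callers should treat a `None` result as a cache miss and recompute the value.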

Redis libraries

Most languages have a library for interacting with Redis, for example PHP (phpredis), Python (redis-py), and Perl (perl-redis).

For quick and dirty debugging, you can connect directly to the Redis server with nc -C redis.svc.tools.eqiad1.wikimedia.cloud 6379 and type commands (for example "INFO"). Alternatively, use redis-cli, which is installed on the bastions: redis-cli -h redis.svc.tools.eqiad1.wikimedia.cloud.

Celery

The Redis service currently suffers from frequent connection drops, especially when used with Celery. See T318479.

The Redis service can be used as a broker between a web frontend and a Celery worker, for instance to execute long-running tasks triggered by the web frontend. The worker can run in a Kubernetes container as a continuous job; see the documentation for Kubernetes.

Make sure you use a unique queue name so that your tasks get sent to the right workers.

If you are using Django, and your celery.py file defines the namespace as follows:

app.config_from_object('django.conf:settings', namespace='CELERY')

then you can adapt settings.py as follows:

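REDIS_PASSWORD = ""
REDIS_HOST = "redis.svc.tools.eqiad1.wikimedia.cloud"
REDIS_PORT = 6379
REDIS_DB = 0

# Connection part of the broker URL, in the form :password@host:port/db
REDIS_URL = ':%s@%s:%d/%d' % (
    REDIS_PASSWORD,
    REDIS_HOST,
    REDIS_PORT,
    REDIS_DB,
)

CELERY_BROKER_URL = 'redis://' + REDIS_URL
# With namespace='CELERY', Celery's task_default_queue setting is read from
# CELERY_TASK_DEFAULT_QUEUE.
CELERY_TASK_DEFAULT_QUEUE = "abcdefghijklm"  # Pick a random string here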
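
The broker URL assembled above has the form redis://:password@host:port/db. The same construction can be sketched as a small helper that also percent-encodes the password, which matters only if a non-empty password contains characters such as "@" or "/". The function name `build_redis_url` is hypothetical and not part of Celery or Django.

```python
from urllib.parse import quote


def build_redis_url(host, port=6379, db=0, password=""):
    """Build a redis:// URL of the form redis://:password@host:port/db."""
    # safe="" also encodes "/" so the password cannot break the URL layout.
    encoded_password = quote(password, safe="")
    return "redis://:%s@%s:%d/%d" % (encoded_password, host, port, db)
```

In settings.py, the result could be assigned to CELERY_BROKER_URL, for example `build_redis_url("redis.svc.tools.eqiad1.wikimedia.cloud")`.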

To run the Celery worker, use something like this:

$ toolforge jobs run --continuous --image python3.11 --command "/data/project/your-tool-name/www/python/venv/bin/celery -A yourDjangoApp worker" celery-worker

If your virtual environment is in a different location, adapt the path accordingly.

Security

Redis has no access control mechanism, so other users can accidentally or intentionally read and overwrite the keys you set. Even if you are not worried about security, it is highly probable that multiple tools will try to use the same key (such as lastupdated). To prevent this, it is highly recommended that you prefix all your keys with an application-specific, lengthy, randomly generated secret key.

PLEASE PREFIX YOUR KEYS! We have also disabled the Redis commands that let users 'list' keys. This protection, however, should not be trusted to protect any secret data. For your own protection, do not store plain-text secrets or decryption keys in Redis.

You can generate a prefix by running the following command:

openssl rand -base64 32
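
A comparable prefix can also be produced and applied from Python. The sketch below uses the standard library's secrets module in place of openssl; the helper names and the ":" separator are assumptions made for illustration. Generate the prefix once and store it in your tool's private configuration, since a prefix that is regenerated on every run would make previously stored keys unreachable.

```python
import secrets


def generate_prefix(num_bytes=32):
    """Return a random, URL-safe prefix built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)


def make_key(prefix, name):
    """Namespace a key name with the tool's secret prefix."""
    return "%s:%s" % (prefix, name)
```

For example, `make_key(prefix, "lastupdated")` yields a key that other tools are very unlikely to guess or collide with.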

Disabled Redis commands

Some built-in Redis commands have been disabled in an attempt to make Redis safer for multi-tenant usage:

  • CLIENT
  • CONFIG
  • DEBUG
  • FLUSHALL
  • FLUSHDB
  • KEYS
  • MONITOR
  • RANDOMKEY
  • SCAN
  • SHUTDOWN
  • SLAVEOF
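
Because KEYS and SCAN are disabled, a tool cannot enumerate its own keys after the fact. One possible workaround, sketched here under the assumption that `client` is a `redis.Redis` instance from redis-py, is to record each key name in a Redis set kept under the same prefix. The function names and the `__index__` suffix are hypothetical.

```python
def set_tracked(client, prefix, name, value):
    """Store a value and record its name in the tool's own index set."""
    client.set("%s:%s" % (prefix, name), value)
    client.sadd("%s:__index__" % prefix, name)


def list_tracked(client, prefix):
    """Return the recorded key names in sorted order."""
    return sorted(client.smembers("%s:__index__" % prefix))
```

The two writes in `set_tracked` are not atomic, and the index is not updated when a key expires or is evicted, so treat the returned names as candidates rather than a guarantee that each key still exists.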

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs

Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself

Read stories and WMCS blog posts

Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)