History of job queue runners at WMF
This page documents the operational history of job queue runners at WMF. A jobrunner is a service that continuously processes items from MediaWiki's job queue.
Used from 2009 until 2015.
- Management: Init service (debian wikimedia-job-runner)
- Service: jobs-loop.sh (SVN mediawiki-core@wmf-deployment)
- Orchestration: nextJobDB.php (firstname.lastname@example.org)
- Runner: runJobs.php (email@example.com)
- Queue store: JobQueueDB
- Backend: mc hosts.
The nextJobDB.php script used ad-hoc Memcached logic to orchestrate and aggregate state across the wiki farm.
In 2012, the "wikimedia-job-runner" Debian package and the jobs-loop.sh script were both folded into Puppet.
In 2013, Redis support was developed and the job queue was migrated to it. This involved creating JobQueueRedis, formalising the aggregation logic as JobQueueAggregator with Memcached and Redis implementations, and developing the JobQueueFederated concept. These were deployed later that year.
- Management: Init service and cron restarts (puppet)
- Service: jobs-loop.sh (puppet)
- Orchestration: nextJobDB.php (firstname.lastname@example.org)
- Runner: runJobs.php (email@example.com)
- Queue store: JobQueueFederated + JobQueueAggregatorRedis + JobQueueRedis (wmf-config)
- Backend: rdb10xx hosts.
jobrunner & jobchron
From 2015 to 2017.
The job queue provides a means of deferring work that is too expensive to perform in the context of a web request. In our production environment, it does this by using Redis to enable shared access to a queue. MediaWiki instances serving web requests enqueue operations for asynchronous execution, and a special class of app servers called job runners dequeue and execute them. The master process on the job runners is a service implemented in PHP called jobrunner.
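The enqueue/dequeue split described above can be illustrated with a minimal sketch. It is not the jobrunner implementation: an in-process deque stands in for the shared Redis queue, and the class, function, and handler names are hypothetical.

```python
from collections import deque


class InMemoryJobQueue:
    """Hypothetical stand-in for the shared, Redis-backed queue."""

    def __init__(self):
        self._jobs = deque()

    def push(self, job_type, params):
        # Called on the web request side: record the work, do not run it.
        self._jobs.append({"type": job_type, "params": params})

    def pop(self):
        # Called on the job runner side: take the oldest job, if any.
        return self._jobs.popleft() if self._jobs else None


def handle_web_request(queue, page):
    # The request returns quickly; the expensive part is deferred.
    queue.push("refreshLinks", {"page": page})
    return f"saved {page}"


def run_pending_jobs(queue, handlers):
    # Drain the queue, executing each job with the handler for its type.
    executed = []
    while (job := queue.pop()) is not None:
        handlers[job["type"]](job["params"])
        executed.append(job["type"])
    return executed
```

In production the two sides run on different servers, which is why the queue has to live in a shared store rather than in process memory as it does in this sketch.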
The dispatcher configuration option specifies how a batch of jobs will be run. By default this uses the runJobs.php maintenance script. For Wikimedia specifically, the dispatcher is configured to instead make an HTTP request to an RPC endpoint on the localhost (docroot:/rpc/RunJobs.php). This allows it to make optimal use of HHVM, since a command-line invocation would have a higher startup time and no persistent compilation cache.
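The two dispatch styles can be contrasted with a small sketch. This is an illustration under assumptions, not the jobrunner code: the function name, the mode names, the host value, and the query parameter names are hypothetical, and only the script and endpoint names come from the text above.

```python
from urllib.parse import urlencode


def build_dispatch(mode, wiki, job_type, host="http://127.0.0.1"):
    """Describe how one batch of jobs would be handed off (hypothetical helper)."""
    if mode == "cli":
        # Default style: start a new PHP process for the maintenance script.
        return ["php", "runJobs.php", f"--wiki={wiki}", f"--type={job_type}"]
    if mode == "rpc":
        # Wikimedia style: reuse the already running web server process.
        # The query parameter names here are assumptions for illustration.
        query = urlencode({"wiki": wiki, "type": job_type})
        return f"{host}/rpc/RunJobs.php?{query}"
    raise ValueError(f"unknown dispatch mode: {mode}")
```

The sketch only builds the command or URL. Executing it (for example with a subprocess call or an HTTP client) is left out so that the difference between the two styles stays visible.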
When the jobrunner service starts, it reads configuration values from /etc/jobrunner/jobrunner.conf. This file is generated by Puppet from puppet:/modules/mediawiki/manifests/jobrunner.pp and puppet:/modules/mediawiki/templates/jobrunner/jobrunner.conf.erb. For an overview of configuration options, see the jobrunner.sample.json file.
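As a rough sketch of reading such a file at startup, assuming the configuration is JSON (as the name of the sample file suggests): the default value and the function name below are hypothetical, and the real set of options is the one documented in jobrunner.sample.json.

```python
import json

# Hypothetical default, used when the file does not set a dispatcher.
DEFAULTS = {"dispatcher": "php runJobs.php"}


def load_config(text):
    """Parse JSON configuration text and layer it over the defaults."""
    config = dict(DEFAULTS)
    config.update(json.loads(text))
    return config
```

A caller would pass in the contents of the configuration file, for example `load_config(open(path).read())`, where `path` points at the generated file.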
- Browse to the deployment directory and get the local repo in the state you want to deploy (e.g.
- Once ready, first run scap deploy-log in one terminal (or screen) to start watching the logs.
- In another terminal (or screen), run scap deploy -v "log message here" to start the deployment.
- Follow the instructions as deployment reaches each group of servers. Scap will automatically restart services on active jobrunner servers (e.g. those in the primary DC).
Logging and metrics
The jobrunner service logs to
For the old version of Job queue operational tasks and debugging see Revision 1863966 of Job_queue.
The current system as of 2017. See Kafka Job Queue.