History of job queue runners at WMF/Jobrunner

From Wikitech
This page contains historical information (2015). It is probably no longer true.
See Job queue for current information.

Redis is used for the MediaWiki job queue as a memcached replacement.

The job queue in production is backed by Redis (see Redis for documentation on how to issue Redis commands). The jobs in the queue are run by the Jobrunner service.



Redis queues:

  • rdb1001, rdb1002, rdb1003 and rdb1004 (as of September 2013 – check Puppet for current data).

15 job runners per data center (not counting videoscalers):

  • Eqiad: mw1161-mw1167 (7), mw1299-mw1306 (8). [1]
  • Codfw: mw2153-mw2162 (10), mw2243 (1), mw2247-mw2250 (4). [2]

Job queue keys

The queues have various pools which have names of the form

wikidbname:jobqueue:entrytype:setorlistorhashname (example: commonswiki:jobqueue:webVideoTranscode:h-data)

where the sets, lists and hashes are one of the following:

  • z-claimed -- sorted set of 'claimed' job queue entries to be run
  • z-abandoned -- sorted set of job queue entries that have been retried the maximum number of times and failed, so we gave up
  • z-delayed -- sorted set of delayed entries (jobs set to run at a given time in the future)
  • l-unclaimed -- list of job queue entries waiting to be claimed to be run
  • h-data -- hash containing job id => job info for all jobs

and there may be others.
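
The key layout above can be sketched in Python. This is a minimal illustration, not code from MediaWiki or the Jobrunner service: the names build_key, parse_key, JobQueueKey and KNOWN_SUFFIXES are hypothetical, and the sketch assumes that no component contains a colon.

```python
from typing import NamedTuple

# Suffixes listed on this page; the page notes that there may be others.
KNOWN_SUFFIXES = {"z-claimed", "z-abandoned", "z-delayed", "l-unclaimed", "h-data"}


class JobQueueKey(NamedTuple):
    wiki: str      # wiki database name, e.g. "commonswiki"
    job_type: str  # entry type, e.g. "webVideoTranscode"
    suffix: str    # set, list or hash name, e.g. "h-data"


def build_key(wiki: str, job_type: str, suffix: str) -> str:
    """Join the parts into wikidbname:jobqueue:entrytype:suffix."""
    return f"{wiki}:jobqueue:{job_type}:{suffix}"


def parse_key(key: str) -> JobQueueKey:
    """Split a key of the documented form; reject anything else.

    Assumes (hypothetically) that no component contains a colon.
    """
    parts = key.split(":")
    if len(parts) != 4 or parts[1] != "jobqueue":
        raise ValueError(f"not a job queue key: {key!r}")
    return JobQueueKey(parts[0], parts[2], parts[3])
```

For instance, build_key("commonswiki", "webVideoTranscode", "h-data") reproduces the example key given above.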

Getting counts of queues (lists and sets)

zcard <set-name> gives you the number of items in the set, so e.g.

zcard commonswiki:jobqueue:webVideoTranscode:z-abandoned

llen <list-name> gives you the number of items in the list, so e.g.

llen commonswiki:jobqueue:webVideoTranscode:l-unclaimed
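
Picking the right counting command from the key's prefix can be scripted. The sketch below is hedged: count_entries is a hypothetical helper, and client is assumed to be any object exposing zcard, llen and hlen methods named after the Redis commands (the redis-py client does). The h- branch uses HLEN, which this page does not otherwise mention.

```python
def count_entries(client, key: str) -> int:
    """Return the number of entries under `key`, choosing the command by prefix.

    `client` is assumed to expose zcard/llen/hlen methods named after the
    Redis commands; nothing in this function opens a connection itself.
    """
    suffix = key.rsplit(":", 1)[-1]
    if suffix.startswith("z-"):
        return client.zcard(key)  # sorted set, e.g. z-abandoned
    if suffix.startswith("l-"):
        return client.llen(key)   # list, e.g. l-unclaimed
    if suffix.startswith("h-"):
        return client.hlen(key)   # hash, e.g. h-data (HLEN is an addition here)
    raise ValueError(f"unrecognised job queue key suffix: {suffix!r}")
```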

Getting slices of sets

zrange <set-name> <slice-start> <slice-end> (lrange for lists) gives you the values in the set for the given index range. Indexes are zero-based and both ends are inclusive, so e.g.

zrange commonswiki:jobqueue:webVideoTranscode:z-abandoned 0 50

will give you a list of items that look like

 1) "0e7b1d37856445f7a1d1eab1478f3ffd"
 2) "0f9f569e43324204bf3d8782ba09de63"
 3) "102e7ef6641644e8ac3654f73f3e7f35"
 4) "2e82cd1078044a1a893edbbc5fd24c8a"

etc. These entries are job ids.
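
Because zrange and lrange treat both the start and the end index as inclusive, a range of 0 50 can return up to 51 entries. When walking a large set in pages, the index pairs can be computed as in the sketch below. page_ranges is a hypothetical helper, and the total would come from a zcard or llen call as shown earlier.

```python
from typing import Iterator, Tuple


def page_ranges(total: int, page_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, stop) index pairs that cover `total` items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, total, page_size):
        # Subtract 1 because the stop index is inclusive in zrange/lrange.
        yield start, min(start + page_size, total) - 1


# Example: a set of 120 entries read 50 at a time.
for start, stop in page_ranges(120, 50):
    print(f"zrange commonswiki:jobqueue:webVideoTranscode:z-abandoned {start} {stop}")
```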

Examining the data for a job

Once you have the job ids from a zrange command as above, you can look at the data for a job by asking for the value stored under its id in the h-data hash, e.g.

hget commonswiki:jobqueue:webVideoTranscode:h-data 092cbbc0f75344d1b33dd85c65d33328

This will return a string of serialized data describing the job.
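
The same lookup can be scripted for several job ids at once. This is a sketch under stated assumptions: fetch_job_data is a hypothetical helper, and client is assumed to be any object with an hget(name, key) method mirroring the Redis HGET command (as the redis-py client provides). The serialization format is not described on this page, so the values are returned undecoded.

```python
def fetch_job_data(client, wiki: str, job_type: str, job_ids):
    """Map each job id to its raw serialized value from the h-data hash.

    A value of None means the client returned nothing for that id.
    The blobs are returned as-is; no decoding is attempted.
    """
    data_key = f"{wiki}:jobqueue:{job_type}:h-data"
    return {job_id: client.hget(data_key, job_id) for job_id in job_ids}
```

The job ids would typically come from a zrange or lrange call on one of the other keys for the same wiki and entry type.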

  1. Wikimedia: operations/puppet.git:/conftool-data/node/eqiad.yaml#L10-L25 (on GitHub)
  2. Wikimedia: operations/puppet.git:/conftool-data/node/codfw.yaml#L149-L163 (on GitHub)