History of job queue runners at WMF/Jobrunner

From Wikitech
This page contains historical information. It may be outdated or unreliable.
See Job queue for current information. 2015

Redis is used for the MediaWiki job queue as a memcached replacement.

The job queue in production is backed by Redis (see Redis for documentation on how to issue Redis commands). Jobs in the queue are run by the Jobrunner service.

Status

Cluster

Redis queues:

  • rdb1001, rdb1002, rdb1003 and rdb1004 (as of September 2013 – check Puppet for current data).

15 job runners per data center (not counting videoscalers):

  • Eqiad: mw1161-mw1167 (7), mw1299-mw1306 (8). [1]
  • Codfw: mw2153-mw2162 (10), mw2243 (1), mw2247-mw2250 (4). [2]

Job queue keys

Each queue is stored under several Redis keys, with names of the form

wikidbname:jobqueue:entrytype:setorlistorhashname (example: commonswiki:jobqueue:webVideoTranscode:h-data)

where the final component names one of the following sets, lists or hashes:

  • z-claimed -- set of 'claimed' job queue entries to be run
  • z-abandoned -- set of job queue entries that were retried the maximum number of times and still failed, so they were given up on
  • z-delayed -- set of delayed entries (jobs set to run at a given time in the future)
  • l-unclaimed -- list of job queue entries waiting to be claimed to be run
  • h-data -- hash containing job id => job info for all jobs

and there may be others.
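
The naming scheme above can be sketched in a few lines of Python. This is only an illustration of the key layout described on this page: the names build_key, parse_key and QueueKey are hypothetical helpers, not part of MediaWiki or the Jobrunner service, and the suffix list covers only the entries named above.

```python
from typing import NamedTuple

# Suffixes listed on this page; the page notes that others may exist.
KNOWN_SUFFIXES = {"z-claimed", "z-abandoned", "z-delayed", "l-unclaimed", "h-data"}


class QueueKey(NamedTuple):
    """The three variable parts of a job queue key (hypothetical helper type)."""
    wiki: str
    job_type: str
    suffix: str


def build_key(wiki: str, job_type: str, suffix: str) -> str:
    """Assemble wikidbname:jobqueue:entrytype:setorlistorhashname."""
    return f"{wiki}:jobqueue:{job_type}:{suffix}"


def parse_key(key: str) -> QueueKey:
    """Split a job queue key back into its parts, rejecting other keys."""
    parts = key.split(":")
    if len(parts) != 4 or parts[1] != "jobqueue":
        raise ValueError(f"not a job queue key: {key!r}")
    wiki, _, job_type, suffix = parts
    return QueueKey(wiki, job_type, suffix)
```

For example, build_key("commonswiki", "webVideoTranscode", "h-data") yields the example key shown above.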

Getting counts of queues (lists and sets)

zcard <set-name> returns the number of items in a sorted set (the z- keys), e.g.

zcard commonswiki:jobqueue:webVideoTranscode:z-abandoned

llen <list-name> returns the number of items in a list (the l- keys), e.g.

llen commonswiki:jobqueue:webVideoTranscode:l-unclaimed
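
As a rough sketch, the choice between zcard and llen can be derived from the key's suffix. The helper below is hypothetical and assumes that the z-, l- and h- prefixes reliably indicate a sorted set, a list and a hash; hlen (the standard Redis command for the number of fields in a hash) is included for completeness and is not mentioned elsewhere on this page. The function only builds the command string and does not talk to Redis.

```python
# Assumed mapping from key-suffix prefix to the Redis counting command.
COUNT_COMMANDS = {"z": "zcard", "l": "llen", "h": "hlen"}


def count_command(key: str) -> str:
    """Return the Redis command that counts the entries stored under key."""
    suffix = key.rsplit(":", 1)[-1]      # e.g. "z-abandoned"
    prefix = suffix.split("-", 1)[0]     # e.g. "z"
    if prefix not in COUNT_COMMANDS:
        raise ValueError(f"unknown key type for {key!r}")
    return f"{COUNT_COMMANDS[prefix]} {key}"
```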

Getting slices of sets

zrange <set-name> <slice-start> <slice-end> (lrange for lists) returns the values in the set for the given index range. Indexes are zero-based and both ends are inclusive, so e.g.

zrange commonswiki:jobqueue:webVideoTranscode:z-abandoned 0 50

will give you a list of items that look like

 1) "0e7b1d37856445f7a1d1eab1478f3ffd"
 2) "0f9f569e43324204bf3d8782ba09de63"
 3) "102e7ef6641644e8ac3654f73f3e7f35"
 4) "2e82cd1078044a1a893edbbc5fd24c8a"

etc. These entries are job ids.
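
If the output is captured as text, the job ids can be pulled out of the numbered lines with a small parser. This is a minimal sketch that assumes the listing format shown above (an index, a closing parenthesis, then a quoted value); parse_cli_items is a hypothetical helper name.

```python
import re
from typing import List

# Matches lines such as:  1) "0e7b1d37856445f7a1d1eab1478f3ffd"
_ITEM = re.compile(r'^\s*\d+\)\s+"(.*)"\s*$')


def parse_cli_items(output: str) -> List[str]:
    """Extract the quoted values from a numbered listing, in order."""
    items = []
    for line in output.splitlines():
        match = _ITEM.match(line)
        if match:  # silently skip lines that are not numbered items
            items.append(match.group(1))
    return items
```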

Examining the data for a job

Once you have the job ids from a zrange command as above, you can look at a job's data by asking for the value stored under its id in the h-data hash, e.g.

hget commonswiki:jobqueue:webVideoTranscode:h-data 092cbbc0f75344d1b33dd85c65d33328

This returns a string of PHP-serialized data describing the job (shown here with the quotes escaped), for example:

a:8:{s:4:\"type\";s:17:\"webVideoTranscode\";s:9:\"namespace\";i:6;s:5:\"title\";s:35:\"Sublimation_of_dry_ice_on_water.ogv\";s:6:\"params\";a:2:{s:13:\"transcodeMode\";s:10:\"derivative\";s:12:\"transcodeKey\";s:9:\"480p.webm\";}s:10:\"rtimestamp\";i:0;s:4:\"uuid\";s:32:\"c0959e4ed0a2451b8677119361f7af73\";s:4:\"sha1\";s:31:\"5t4s7oe4nz9jneymz8bovt226somsve\";s:9:\"timestamp\";i:1370265154;}
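
To read such a value outside PHP, a small decoder is enough for the types that appear in the example (arrays, strings and integers). The sketch below is a hypothetical helper, not a complete implementation of the PHP serialization format: it does not handle booleans, floats, null or objects, and it assumes the only escaping in the displayed string is \" for quotes.

```python
from typing import Any, Tuple


def _parse(data: bytes, pos: int) -> Tuple[Any, int]:
    """Parse one value starting at data[pos]; return (value, next position)."""
    tag = data[pos:pos + 1]
    if tag == b"i":
        # i:<integer>;
        end = data.index(b";", pos)
        return int(data[pos + 2:end]), end + 1
    if tag == b"s":
        # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                 # skip the colon and opening quote
        end = start + length
        if data[end:end + 2] != b'";':
            raise ValueError(f"bad string terminator at offset {end}")
        return data[start:end].decode("utf-8"), end + 2
    if tag == b"a":
        # a:<pair count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                   # skip the colon and opening brace
        result = {}
        for _ in range(count):
            key, pos = _parse(data, pos)
            value, pos = _parse(data, pos)
            result[key] = value
        if data[pos:pos + 1] != b"}":
            raise ValueError(f"missing closing brace at offset {pos}")
        return result, pos + 1
    raise ValueError(f"unsupported type tag {tag!r} at offset {pos}")


def unserialize_job(raw: str) -> dict:
    """Decode a job description as displayed above into a Python dict."""
    # Simplification: undo only the \" escaping seen in the example output.
    data = raw.strip().replace('\\"', '"').encode("utf-8")
    value, end = _parse(data, 0)
    if end != len(data):
        raise ValueError("trailing data after serialized value")
    return value
```

Applied to the example above, the result is a dictionary whose "type" entry is "webVideoTranscode" and whose "params" entry is a nested dictionary holding transcodeMode and transcodeKey.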

References

  1. Wikimedia: operations/puppet.git:/conftool-data/node/eqiad.yaml#L10-L25 (on GitHub)
  2. Wikimedia: operations/puppet.git:/conftool-data/node/codfw.yaml#L149-L163 (on GitHub)