From Wikitech
Toolforge tools
Deputy Dispatch
Description Bulk data processor for Deputy users
Keywords copyright, data processing, api, javascript, nodejs, typescript
Author(s) Chlod Alejandro
Maintainer(s) Chlod
Source code
License Apache License 2.0

Dispatch (or Deputy Dispatch) is a Node.js + Express webserver that exposes API endpoints which process large amounts of data from Wikimedia wikis for easier consumption by Deputy. It is meant to centralize and optimize the gathering and processing of bulk data, so that the many users of Deputy do not each make taxing requests to Wikimedia servers.

Dispatch makes requests as a logged-in user, but does not make any edits. It only reads data from the Wikimedia servers; being logged in allows it to query more than an anonymous user would be able to.


Dispatch is primarily used through Deputy. Deputy has been built to work cross-wiki and integrates with Dispatch to support every Wikimedia wiki, with an out-of-the-box configuration that can handle simple copyright management tasks on the wiki.

The Dispatch API can also be used directly. Documentation for the API is automatically generated, and can be found here.

Asynchronous jobs

Some tasks done by Dispatch take a long time to run. Though these usually finish in under 3 minutes, timeouts or network issues can break a connection that is held open for that long. For this reason, tasks which take a while to execute must be run through asynchronous job requests. An initial request is sent to Dispatch (using POST), which returns a job ID. The progress of the job can then be polled using a GET to the /{id}/progress sub-path of that endpoint. Lastly, once the job completes, its result can be accessed with a GET to the /{id} sub-path of that endpoint.

Note that attempting to access the result before the job has completed results in a 409 Conflict HTTP error. The data is usually cached for an hour before being discarded. Refer to the documentation for the task information schema.
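
The request flow above could be driven from a client roughly as follows. This is a minimal sketch, not Dispatch's or Deputy's own client code. It assumes that the initial POST accepts a JSON body and returns a JSON object with an id field; both are assumptions, so check the generated API documentation for the real schema. The names runJob, jobUrls and FetchLike are made up for this illustration. Because the progress schema is not described here, the sketch treats the progress payload as opaque and relies only on the documented 409 Conflict response to tell that a job is still running.

```typescript
// Minimal shape of the fetch-like function this sketch relies on, so that a
// stub can be injected. The global fetch in Node.js 18+ satisfies it.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ status: number; ok: boolean; json(): Promise<unknown> }>;

interface RunJobOptions {
  fetchFn?: FetchLike;                    // defaults to the global fetch
  intervalMs?: number;                    // delay between polls
  maxAttempts?: number;                   // give up after this many polls
  onProgress?: (info: unknown) => void;   // receives the raw progress payload
}

// Build the two sub-paths described above for a given endpoint and job ID.
function jobUrls(endpoint: string, id: string): { progress: string; result: string } {
  const base = endpoint.replace(/\/+$/, "");
  const result = `${base}/${encodeURIComponent(id)}`;
  return { progress: `${result}/progress`, result };
}

async function runJob(
  endpoint: string,
  payload: unknown,
  options: RunJobOptions = {}
): Promise<unknown> {
  const globalFetch = (globalThis as unknown as { fetch?: FetchLike }).fetch;
  const fetchFn = options.fetchFn ?? globalFetch?.bind(globalThis);
  if (!fetchFn) throw new Error("No fetch implementation available");
  const intervalMs = options.intervalMs ?? 2000;
  const maxAttempts = options.maxAttempts ?? 120;

  // 1. Start the job. ASSUMPTION: JSON request body, JSON response with "id".
  const start = await fetchFn(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!start.ok) throw new Error(`Job request failed with HTTP ${start.status}`);
  const { id } = (await start.json()) as { id: string };
  const urls = jobUrls(endpoint, id);

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // 2. Ask for the result; 409 Conflict means the job has not finished yet.
    const res = await fetchFn(urls.result);
    if (res.ok) return res.json();
    if (res.status !== 409) throw new Error(`Job ${id} failed with HTTP ${res.status}`);

    // 3. Optionally report progress, then wait before polling again.
    if (options.onProgress) {
      const progress = await fetchFn(urls.progress);
      if (progress.ok) options.onProgress(await progress.json());
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${id} did not finish after ${maxAttempts} attempts`);
}
```

With the defaults, a call such as runJob(endpointUrl, requestBody, { onProgress: console.log }) polls every two seconds for up to four minutes, a little longer than the usual three-minute upper bound mentioned above. Since results are usually kept for only about an hour, a client should fetch and store the result soon after the job completes.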


The deputy tool runs as a standard Node.js web service. Logging into the tool with become deputy will drop you into the ~/www/js folder so you can get to work immediately.

Deployments are not automatic. As the Wikimedia GitLab instance develops, this may change in the future. For now, the following steps are used to deploy new versions of the tool.

  1. [me@tools-sgebastion-XX] become deputy
    • That's pretty obvious already.
  2. [tools.deputy@tools-sgebastion-XX] git pull
    • Change as needed to download specific tags, branches, etc.
  3. [tools.deputy@tools-sgebastion-XX] webservice shell
    • Getting a shell on the actively-running webservice ensures that the correct Node.js version is being used.
    • When in doubt, you can use a temporary Node.js pod as well.
  4. [tools.deputy@shell-XXXXXXXXXX] cd $HOME/www/js
    • In case you're not there already.
  5. [tools.deputy@shell-XXXXXXXXXX] npm ci
    • This re-downloads dependencies.
  6. [tools.deputy@shell-XXXXXXXXXX] npm run build:tsoa
    • This rebuilds the API routes and documentation.
    • Routes are handled by the package tsoa, hence the name.
    • Contrary to what you'd expect, TypeScript isn't actually transpiled into JavaScript in this step. TypeScript is run directly when the webservice starts (see step 8).
  7. [tools.deputy@shell-XXXXXXXXXX] exit
    • In other words, leave the webservice shell.
  8. [tools.deputy@tools-sgebastion-XX] webservice restart
    • This restarts the webservice, which makes the tool use the new code.
  9. All done, you can now log out and exit.

For deployment issues, you can email or use Special:EmailUser/Chlod Alejandro. If you both break and fix Dispatch (and you're not User:Chlod Alejandro), you get a complimentary chocolate chip cookie.


Instructions on how to get a development set-up of Dispatch can be found at Note that you will need a Toolforge account, because you'll need access to the Wiki Replicas. Attempting to run Dispatch without properly setting up the database connection information will cause a big warning to show up, and all endpoints which rely on the Replicas will return errors.

External links