

This page explains how to run a web server on Toolforge. In our local jargon, these are often called web services or webservices, a name possibly inspired by the webservice command line tool that is used to start, stop, and configure a tool's web server.

Web Service Introduction

Every tool can have a dedicated web server running on either Kubernetes or, if necessary, the Grid Engine.

You are encouraged to run web services on Kubernetes.

The differences between Grid Engine and Kubernetes are described later in this documentation.

Using webservice command

You can use the webservice command to start, stop, restart, and check the status of your tool's web server.

webservice usage
$ become my_cool_tool
$ webservice start

Use webservice --help to get a full list of arguments.

With no other arguments or configuration, webservice start will start a web server using the php7.3 runtime on Kubernetes. The document root for this lighttpd web server is $HOME/public_html. You will have to create this directory yourself.

Common issues

Static file server

Static files in a tool's $HOME/www/static directory are served directly by Toolforge's static file server. This does not require any action on the tool's part: putting the files in the appropriate folder (and making the directory readable) should 'just work'.

You can use this to serve static assets (CSS, HTML, JS, etc) or to host simple websites that don't require a server-side component.

External assets

To preserve the privacy of our users, avoid embedding assets (images, CSS, JavaScript) from servers outside of Wikimedia Foundation control.

  • Toolforge provides an anonymizing reverse proxy to cdnjs.
  • Toolforge provides an anonymizing reverse proxy to Google Fonts.
  • Wikimedia provides maps servers with data from OpenStreetMap.

Using HTTP cookies

Since all tools in the 'tools' project reside under the same domain, prefix the name of any cookie you set with your tool's name, and if possible, add a Path attribute to limit the URLs that the browser will send the cookie back to.

Do not store privacy-related or security information in cookies. Cookies may be read by every other tool your user visits. A simple workaround is to store session information in a database, and use the cookie as an opaque key to that information.
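
As a rough illustration of the cookie advice above, the following Python sketch builds a Set-Cookie header value whose name is prefixed with the tool name and whose Path attribute is limited to the tool's URL prefix. The tool name my_cool_tool, the helper name build_session_cookie, and the "_session" suffix are assumptions made for this example; the cookie value is meant to be an opaque key only, with the real session data kept in a database.

```python
import secrets
from http.cookies import SimpleCookie

TOOL_NAME = "my_cool_tool"  # hypothetical tool name


def build_session_cookie(tool_name, session_key):
    """Return a Set-Cookie header value scoped to a single tool."""
    cookie = SimpleCookie()
    name = f"{tool_name}_session"             # prefix the cookie name with the tool name
    cookie[name] = session_key                # opaque key only; session data stays server-side
    cookie[name]["path"] = f"/{tool_name}/"   # limit the URLs the cookie is sent back to
    cookie[name]["httponly"] = True           # keep the cookie away from page scripts
    cookie[name]["secure"] = True             # only send the cookie over HTTPS
    return cookie[name].OutputString()


# A random key that is meaningless on its own; look it up in your database.
example_header = build_session_cookie(TOOL_NAME, secrets.token_urlsafe(32))
```

The returned string can be sent as the value of a Set-Cookie response header by whatever framework the tool uses.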

Default tool memory limits for Web service jobs

  • Grid Engine: 4GiB
  • Kubernetes: 2GiB for most runtimes (Java's limit is 4GiB).

Requesting additional tool memory

  1. Request more tool memory by opening a Phabricator task
  2. Notify the #wikimedia-cloud IRC channel on freenode that you have filed a request.

A Cloud Services administrator will review your request and can create a /data/project/.system/config/$TOOLNAME.web-memlimit configuration file that will adjust the limit.

Response buffering

An Nginx proxy sits between your webservice and the user. By default, this proxy buffers the response sent from your server. For some use cases, including streaming large quantities of data to the browser, this can be undesirable. Buffering can be disabled on a per-request basis by sending an X-Accel-Buffering: no header in your response.[1]
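
The following is a minimal sketch of how a Python WSGI application might send that header while streaming a response in chunks. It uses only the standard library; the chunk contents and the short sleep are placeholders for real work, and a tool built on Flask or Django would set the same header through that framework's response object instead.

```python
import time


def app(environ, start_response):
    """Stream a response in chunks and ask the proxy not to buffer it."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("X-Accel-Buffering", "no"),  # disable proxy buffering for this response only
    ]
    start_response("200 OK", headers)

    def generate():
        for i in range(3):
            yield f"chunk {i}\n".encode("utf-8")
            time.sleep(0.01)  # stand-in for slow work between chunks

    return generate()
```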

Grid Engine and Kubernetes backends

Toolforge provides two different execution environments for web services: Grid Engine and Kubernetes.

Toolforge administrators recommend that you try using Kubernetes first for new tools and only use the Grid Engine backend if there is a technical limitation that prevents the tool from using Kubernetes.

The Kubernetes backend provides more modern software versions and will eventually be the default environment. The main drawback is that Kubernetes Web services can not spawn additional jobs on the job grid.

Processes Grid Engine and Kubernetes share

  1. Your workflow: you still ssh in, use become, and hack on code as usual
  2. Logging access: locations are the same
  3. Replica DB / Dumps access

Grid Engine

The Grid Engine backend runs your Web service as a grid job on a Debian Stretch grid exec node. This is similar to the way that jsub runs any grid job you submit, but there is a separate exec queue on the grid for running jobs started by webservice.


Kubernetes

Kubernetes (k8s) is a platform for running containers that is slowly replacing the Grid Engine in Toolforge. Kubernetes Web services have access to newer versions of most software than the Grid Engine provides. K8s also provides a more robust system for restarting tools manually or automatically following an application crash.

User visible differences from GridEngine based Web services

  1. Each process runs inside a Docker container, orchestrated by Kubernetes.
    • Provides better resource isolation (one tool can not take down other tools by consuming all RAM or CPU)
    • Better health checking (monitoring built into Kubernetes, not a hack we wrote)
    • Less complex proxy setup, leading to fewer proxy related outages / issues
  2. Containers based on Debian Jessie
    • Newer software versions than those available with Ubuntu Trusty or Precise
    • Better support from Wikimedia TechOps team
  3. Less NFS surface exposed
    • /home is not mounted for web services
    • No /shared - use /data/project/shared instead. The latter works on both Grid Engine and Kubernetes. The former works only on Grid Engine.
    • /public/dumps and /data/scratch are mounted the same way
  4. It is not possible to interact with the Grid Engine from Kubernetes (no jsub...)
  5. Kubernetes backend has specific webservice options:

 -m MEMORY, --mem MEMORY
                       Set higher Kubernetes memory limit
 -c CPU, --cpu CPU     Set a higher Kubernetes cpu limit
 -r REPLICAS, --replicas REPLICAS
                       Set the number of pod replicas to use

Switching between GridEngine and Kubernetes

You can switch between the backends to make sure your code works on both of them.

From GridEngine to Kubernetes

 webservice --backend=gridengine stop
 webservice --backend=kubernetes <type> start

From Kubernetes to GridEngine

webservice --backend=kubernetes stop
webservice --backend=gridengine start

Configuring a default backend

You can choose a default backend for your tool by creating a $HOME/.webservicerc configuration file.

To set your tool's default to Kubernetes, use this syntax:


To set your tool's default to Grid Engine, use this syntax:


Default web server (lighttpd + PHP)



The lighttpd webservice type includes support for running PHP scripts from files with a .php extension in $HOME/public_html using a FastCGI helper process.

Use webservice --backend=kubernetes php7.2 start|stop|restart|shell to run a PHP based webservice on Kubernetes. If you need to, you can also use the legacy php5.6 version. See Kubernetes PHP documentation for more details.

Python (uWSGI)

uWSGI is a Web Server Gateway Interface server for Python2 and Python3 web applications. It is commonly used to run applications built with Flask, Django, or other Python web application frameworks.

  • webservice --backend=kubernetes python3.7: Python 3.7 with a default uwsgi configuration
  • webservice --backend=kubernetes python3.5: Python 3.5 with a default uwsgi configuration (deprecated)
  • webservice --backend=kubernetes python: Python 3.4 with a default uwsgi configuration (deprecated)
  • webservice --backend=kubernetes python2: Python 2 with a default uwsgi configuration (deprecated)
  • webservice --backend=gridengine uwsgi-python: Python 2 on Grid Engine with a default uwsgi configuration
  • webservice --backend=gridengine uwsgi-plain: Python 2 or Python 3 on Grid Engine with a user supplied uwsgi configuration

Default uwsgi configuration

The uwsgi-python, python3.7, python3.5, python, and python2 types share a common uWSGI configuration designed to make it easy to deploy a typical Python webservice. This uses a convention over configuration design with the following expectations:

  • Your application will have a wsgi entry point in $HOME/www/python/src/ in a variable named app (example).
  • Python libraries will be loaded from a virtualenv located in $HOME/www/python/venv.
  • Custom configuration for uWSGI in ini file form will be loaded from $HOME/www/python/uwsgi.ini
    • Examples of configuration parameters can be found in the uWSGI manual.
    • Headers can be added using route = .* addheader:Access-Control-Allow-Origin: *
  • Logs will be written to $HOME/uwsgi.log
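
To make the first expectation above concrete, here is a minimal sketch of a WSGI entry point that exposes a callable in a variable named app. It is a plain standard-library WSGI callable chosen for illustration; the response text is made up, and a real tool would more likely assign a Flask or Django application object to app in its entry point module under $HOME/www/python/src/.

```python
def app(environ, start_response):
    """Minimal WSGI callable; the default uWSGI setup looks for a variable named app."""
    body = b"Hello from Toolforge\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```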

python3.7 (Python3 + Kubernetes)

  • webservice --backend=kubernetes python3.7 start|stop|restart|shell

See Default uwsgi configuration for general information.

This is running Python3.7 with virtualenv support - you must use a virtualenv for installing your libraries.

Using virtualenv with webservice shell

You need to set up and use a new virtualenv. You can do so with the following:

For new projects

First, set up your Python code so that your file lives under ~/www/python/src. Then...

  1. webservice --backend=kubernetes python3.7 shell
  2. mkdir -p ~/www/python
  3. python3 -m venv ~/www/python/venv (on a Toolforge bastion, use virtualenv -p python3 venv)
  4. source ~/www/python/venv/bin/activate
  5. pip install --upgrade pip (This brings in newest pip, which is required for wheel support)
  6. Install the libraries you need (e.g. pip install -r ~/www/python/src/requirements.txt)
  7. exit out of webservice shell
  8. webservice --backend=kubernetes python3.7 start

For Python2 projects, use python2 -m virtualenv in step 3.

Moving an existing project

If you are already running a Python3 Web service using uwsgi-plain on the job grid:

  1. Make a backup of your current venv: mv ~/www/python/venv ~/www/python/venv.gridengine
  2. Move your uwsgi.ini file away as well: mv ~/www/python/uwsgi.ini ~/www/python/uwsgi.ini.gridengine
  3. Follow the instructions in the "For new projects" section above
  4. Before doing webservice --backend=kubernetes python start, you have to do a webservice --backend=gridengine stop
  5. To switch back to gridengine, you can do:
    1. mv ~/www/python/venv ~/www/python/venv.k8s
    2. mv ~/www/python/venv.gridengine ~/www/python/venv
    3. mv ~/www/python/uwsgi.ini.gridengine ~/www/python/uwsgi.ini
    4. webservice --backend=kubernetes stop
    5. webservice --backend=gridengine uwsgi-plain start

The fundamental thing to remember is that virtualenvs created straight on the bastion work only with gridengine, and virtualenvs created inside webservice shell work only with kubernetes.

Once you are done migrating and are happy with it, you can delete your venv & uwsgi.ini backups.

Installing numpy / scipy / things with binary dependencies

If your package with binary dependencies has a manylinux1 wheel, you can directly install it with pip quickly and with minimum hassle. You can check if your package has a manylinux1 wheel by:

  1. Go to
  2. Search for your package name in top right
  3. Find it in the list and click on it
  4. Look for packages that end in the string: cp34-cp34m-manylinux1_x86_64.whl
  5. If it exists, then this package is installable with a binary wheel!
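
The check in step 4 can also be scripted once you have the list of file names published for a package. The sketch below only does the string matching; the function name, the default tags, and the sample file names are assumptions for illustration, and fetching the real file list is left out.

```python
def has_manylinux1_wheel(filenames, python_tag="cp34", abi_tag="cp34m"):
    """Return True if any file name ends with the matching manylinux1 wheel suffix."""
    suffix = f"{python_tag}-{abi_tag}-manylinux1_x86_64.whl"
    return any(name.endswith(suffix) for name in filenames)


# Illustrative file names only; check the real list for your package.
sample_files = [
    "examplepkg-1.0.tar.gz",
    "examplepkg-1.0-cp34-cp34m-manylinux1_x86_64.whl",
]
wheel_available = has_manylinux1_wheel(sample_files)
```

For a newer runtime, pass the tags that match its Python version instead of the defaults.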

You can install it by:

  1. webservice --backend=kubernetes python shell
  2. source ~/www/python/venv/bin/activate
  3. pip install --upgrade pip (This brings in newest pip, which is required for wheel support)
  4. pip install $packagename

Tada! You only need to do the pip install --upgrade pip once, after that you can install manylinux1 packages easily.

Note that this only applies if you are using a package with binary dependencies. Most python packages do not have binary dependencies (are pure python) and do not need this!

Python/Python3.5 (Python3 + Kubernetes)

This works mostly like python3.7, but for Python 3.4 and 3.5 respectively. These are outdated versions of Python that are no longer supported upstream and should not be used for new tools.

Python2 (Python2 + Kubernetes)

  • webservice --backend=kubernetes python2 start|stop|restart|shell

See Default uwsgi configuration for general information.

uwsgi-python (Python2 + Grid Engine)

  • webservice --backend=gridengine uwsgi-python start|stop|restart

See Default uwsgi configuration for general information. Python 3 is not supported by this type, but see the section on uwsgi-plain below for an alternative.

uwsgi-plain (Python3 + Grid Engine)

  • webservice --backend=gridengine uwsgi-plain start|stop|restart

The uwsgi-plain type leaves configuration of the uWSGI service up to the tool's $HOME/uwsgi.ini configuration file. This allows users with unique requirements to tune the uWSGI service to work with their application. One reason to use this is if you must run a Python3 webservice on Grid Engine. A working config for a Python3 Flask app is documented in Phabricator task T104374.

Using a uwsgi app with a default entry point that is not

The default uwsgi configuration for the uwsgi webservice backend expects to find the uwsgi entry point as the variable app loaded from the $HOME/www/python/src/ module. If your application has another entry point, the easiest thing to do is create a $HOME/www/python/src/ module, import your entry point, and expose it as app. See Making a Django app work for an example of this pattern.

Making a Django app work

There is an issue that may currently need a workaround for Django: using utf8mb4 character and collation on your tables may cause issues with length of unique indexes, for instance when using python-social-auth or in your own models that have unique indexes. Using utf8 may cause errors when inserting 4-byte UTF-8 characters. See the issue for specific workarounds.

Setting up

By default, your WSGI entry point module should be in ~/www/python/src/ and contain:

import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "<YOUR-TOOL-NAME>.settings")

app = get_wsgi_application()

To serve static files correctly, place a uwsgi.ini file at ~/www/python/uwsgi.ini and add this setting:

check-static = /data/project/<YOUR-TOOL-NAME>/www/python

and in your Django settings module use:

STATIC_ROOT = os.path.join(BASE_DIR, 'static')

Then deploy your static files into ~/www/python/static


You can find the logs in ~/uwsgi.log on both platforms

node.js web services

  • webservice --backend=kubernetes node10 start|stop|restart|shell
  • webservice --backend=gridengine nodejs start|stop|restart

Node.js can run fairly well on Toolforge including with websocket support. Using --backend kubernetes node10 is recommended so your code is executed with an up-to-date version of node (v10.15.2 as of November 2019). The Grid Engine backend provides an older version of node (v8.11.1 as of November 2019).

  1. Put your node application in $HOME/www/js. This directory path is hardcoded.[1]
  2. Make sure your server starts up properly when npm start is executed. The default way to do this is to name your main script server.js.
  3. Your server should bind to a port that is passed in as an environment variable (PORT). You can access this via process.env.PORT. Without this, your tool will not be found by the Nginx proxy.
  4. Run webservice --backend=kubernetes node10 start to start your webserver (or webservice --backend=kubernetes node10 restart to restart it after a code change).
  5. Find your container's name by running kubectl get pods, and use that name to check your container's logs with kubectl logs -f $MY_CONTAINER_NAME.
  6. PROFIT! :)

This is example code for a node.js web server running as a tool:

var http = require('http');
// IMPORTANT: you must use the PORT environment variable as the port.
var port = parseInt(process.env.PORT, 10);

http.createServer(function (req, res) {
	res.writeHead(200, {'Content-Type': 'text/plain'});
	res.end('Hello World\n');
}).listen(port);

Keeping this in $HOME/www/js/server.js and running webservice --backend kubernetes node10 start should work; you may first need to create $HOME/www/js/package.json containing the text

{
    "scripts": {
        "start": "node server.js"
    }
}

Running npm with webservice shell

To use an up-to-date version of node, e.g. for installing dependencies, run:

  1. webservice --backend=kubernetes node10 shell
  2. cd $HOME/www/js
  3. npm install


  • If you run into errors doing npm install, try LINK=g++ npm install
  • If you can't access the kubectl executable, could it be that you started a webservice shell and didn't exit it?



Tomcat

  • webservice --backend=gridengine tomcat start|stop|restart

Before using Tomcat, you have to set it up by running setup-tomcat. This will create a local Tomcat installation at $HOME/public_tomcat/.

To deploy a Web Application Archive (WAR), move it to $HOME/public_tomcat/webapps/$TOOL.war where $TOOL is the name of your tool. Archive extraction, deployment, and configuration is done automatically by Tomcat. A Tomcat restart may be required. The application will be available at$TOOL/.

To test the Tomcat webservice, you can use the Tomcat sample application (available on

When reading Tomcat tutorials, it is helpful to know that $CATALINA_HOME under our configuration is the $HOME/public_tomcat directory created by setup-tomcat. The default Tomcat classloader will read jar files such as a MySQL JDBC driver jar that are placed in $HOME/public_tomcat/lib (i.e. $CATALINA_HOME/lib).


If your Java application is more complex, the standard memory settings might not work. You might get errors like There is insufficient memory for the Java Runtime Environment to continue and Tomcat will simply stop working. See Help:Toolforge/Web § Memory limit for instructions on getting the runtime memory limit increased.

The settings for the JVM can be modified in public_tomcat/bin/. If the memory setting from JAVA_OPTS is too low, you'll get the well-known OutOfMemoryError from Java. In some cases, Tomcat may no longer stop following an OOM error. Killing the Grid Engine job using qdel may be your only solution.

Play and similar JVM-based frameworks

  • webservice --backend=kubernetes jdk11 start|stop|restart|shell BINARY

Play Framework projects (and other JVM-based projects that have one executable to start the application) can be run on Toolforge. Play Framework uses JDK 8, so we need to use Kubernetes.

In order to work on Toolforge, the following Play configuration changes need to be made:

# Secret key
# ~~~~~
# The secret key is used to secure cryptographics functions.
# If you deploy your application to several instances be sure to use the same key!
# On Toolforge, we will make a startup script that specifies play.crypto.secret
# using a command line option reading from a private file.

# Port
# ~~~~~
# On WMF Toolforge, the port used by kubernetes webservice is 8000/TCP

# HTTP context
# ~~~~~
# Your tool will be available at$TOOLNAME/.
# Play usually expects to be operating at the root of a domain, so this setting is
# required for routes to work properly.

The application secret can be stored in a private file with 440 permissions.

After building the project, start your webservice using webservice --backend=kubernetes jdk11 start '$EXECUTABLE -Dplay.crypto.secret="$(cat /data/project/$TOOLNAME/app_secret)"'. For more details, see User:Sn1per/Play on Tool Labs.

Other / generic web servers

You can easily run other web servers that are not directly supported. This can be accomplished using the generic webservice type on the Grid Engine backend or a runtime specific type on the Kubernetes backend.

  • webservice --backend=gridengine generic start|stop|restart SCRIPT
  • webservice --backend=kubernetes golang start|stop|restart|shell SCRIPT
  • webservice --backend=kubernetes jdk11 start|stop|restart|shell SCRIPT
  • webservice --backend=kubernetes ruby2 start|stop|restart|shell SCRIPT

To start a webserver that is launched by a script at /data/project/toolname/code/server.bash, you would launch it with:

$ webservice --backend=gridengine generic start /data/project/toolname/code/server.bash

Your script will be passed an HTTP port to bind to in an environment variable named PORT. This is the port that the Nginx proxy will forward requests to.
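
As a minimal sketch of the PORT convention just described, a server written in Python with only the standard library could read the variable and bind to it as follows. The handler, the function names, and the choice to listen on all interfaces are assumptions for illustration; any server that binds to the port given in PORT should work in the same way.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET request with a short plain-text body."""

    def do_GET(self):
        body = b"Hello World\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def read_port(environ=None):
    """Return the port number passed in the PORT environment variable."""
    environ = os.environ if environ is None else environ
    return int(environ["PORT"])


def main():
    # Listening on all interfaces is an assumption of this sketch.
    server = HTTPServer(("", read_port()), HelloHandler)
    server.serve_forever()
```

When saving this as a script, add a final main() call; the launcher script passed to webservice would then start it with the Python interpreter.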

Note that your tool will receive URLs that include your tool prefix - e.g. /YOUR_TOOL/index.html instead of /index.html. You may need to adapt your tool configuration to handle this.
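
One possible way to handle the prefix in a Python WSGI application is sketched below: a small wrapper moves the tool prefix from PATH_INFO to SCRIPT_NAME before the request reaches the application. The names strip_tool_prefix and ToolPrefixMiddleware are hypothetical, and whether such a wrapper is needed at all depends on your server and framework, many of which have their own setting for a URL prefix.

```python
def strip_tool_prefix(path, tool_name):
    """Return path without a leading /<tool_name> component."""
    prefix = "/" + tool_name
    if path == prefix:
        return "/"
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    return path  # prefix absent: leave the path untouched


class ToolPrefixMiddleware:
    """WSGI wrapper that moves the tool prefix from PATH_INFO to SCRIPT_NAME."""

    def __init__(self, wsgi_app, tool_name):
        self.wsgi_app = wsgi_app
        self.tool_name = tool_name

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        stripped = strip_tool_prefix(path, self.tool_name)
        if stripped != path:
            # Record the prefix so the application can still build correct links.
            environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + self.tool_name
            environ["PATH_INFO"] = stripped
        return self.wsgi_app(environ, start_response)
```

An application would be wrapped once at startup, for example as ToolPrefixMiddleware(inner_app, "YOUR_TOOL"), where inner_app is your own WSGI callable.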

HHVM (experimental)

It is possible to run HHVM in proxygen mode as Generic webservice. Bryan Davis provided the following script (with some tweaks):

Copy the contents and save it as $HOME/ Then start the HHVM process using webservice --backend=gridengine generic start $HOME/

This has been tested and works. However, this is just an experimental implementation and is not recommended for production tools, especially since the proxygen mode has some drawbacks:

  • Documentation for configuring HHVM's proxygen webserver is lacking upstream. Information can be found, but it requires a lot of digging.
  • No obvious support for alias configuration to easily map to the tool's $HOME/public_html. This can be worked around using hhvm.virtual_host[default][rewrite_rules] settings.
  • Multiple index files (index.php and index.html, for example) are not supported yet (hhvm.server.default_document can be set only once; if set multiple times, only the last instance is used).

Running Hack-coded files

The hhvm.hack.lang.look_for_typechecker option in the above script has been set to false in order to run Hack files without the "Typechecker not running" error. Please don't run hh_client on the bastion or grid servers; use your own HHVM installation instead.

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

  • Discuss and receive general support
  • Receive mail announcements about critical changes: subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
  • Track work tasks and report bugs: use the Phabricator workboard #Cloud-Services for bug reports and feature requests about the Cloud VPS infrastructure itself
  • Learn about major near-term plans: read the News wiki page
  • Read news and stories about Wikimedia Cloud Services: read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)
