Help:Toolforge/Build Service

From Wikitech

Toolforge tools can use the Toolforge Build Service to build a container image that includes the tool code and any dependencies it might require. The container image can then be run as a web service or as a job on the Kubernetes cluster. Using build service images allows for greater flexibility and better performance than the previous-generation, per-language images.

Behind the scenes, the build service uses the Cloud Native Buildpacks standard. Buildpacks allow you to generate a container image without having to add extra scripts or Dockerfiles. The goal of the Build Service is to make the process of deploying code to Toolforge easier and more flexible, by adopting a standard that is documented and maintained by a wider community (Buildpacks are a Cloud Native Computing Foundation-supported initiative).

Overview

The build service currently supports:

  • Building and running any application that uses a single supported language or runtime
    • Compared to the current images, this enables you to use more languages than the ones supported by our custom container images
    • Compared to the current images, Python apps are no longer tied to uWSGI and you can now use modern ASGI-based frameworks like FastAPI
    • Compared to the current images, you can use newer versions of the base languages (e.g. Python 3.12, see Specifying a Python runtime), and benefit when newer versions are added upstream.
  • Running that application as a web service or as a job
  • Pulling the same image locally that is going to run in Toolforge
  • Compiling static frontend assets at build time with Node.js. This allows for projects like Flask + Vue, PHP + React, etc.
  • Installing OS-level packages, as long as they are included with Ubuntu 22.04 (see Installing Apt packages)

Roadmap

Some of the major planned features include:

See also the current limitations section for more details and other differences from the current process.

Quickstart

Prerequisites

If you don't have a tool account yet, you need to create or join one. Detailed instructions are available at Help:Toolforge/Quickstart.

Your tool's code must be accessible in a public Git repository; any public Git host will work (Gerrit, GitLab, ...). You can set one up for free for your tool from the Toolforge admin console.

Procfile

You will need to create a Procfile to configure which commands to run to start your app.

The Procfile is always a simple text file that is named Procfile without a file extension. For example, Procfile.txt is not valid. The Procfile must live in your app’s root directory. It does not function if placed anywhere else.

The Toolforge webservice manager uses the web process type, and for jobs you can use whatever process type you want.

Note that every process type in the Procfile will become an executable command in the container, so don’t use the name of a real command for the process type, otherwise there will be a collision. For instance, a process type to run celery should not be called celery itself, but rather something like run-celery.

Example: Python web service

For example, the Procfile for a Python application using Gunicorn (which needs to be installed via requirements.txt) might look like this:

web: gunicorn --bind 0.0.0.0 --workers 4 app:app
hardcodedmigrate: python -m app.django migrate
migrate: ./migrate.sh

The first entry (web) will be the one used for web services if you start the tool as a web service (NOTE: regardless of its name, currently the first entry found is the one used).

The other entries will be used for jobs, where you can have as many entries as you need for each different job you want to run, with the following behavior (see task T356016 for details):

  • hardcodedmigrate will not allow any extra arguments to be passed at run time, as the arguments are already defined in the Procfile entry.
  • migrate will allow passing arguments, as it has none defined. You can use a wrapper script if you want to hardcode only some of the arguments, or to implement more complex behavior.
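
As noted above, Gunicorn must be declared as a dependency of your application. A minimal requirements.txt sketch (the framework entry shown is illustrative, list whatever your app actually needs):

# requirements.txt, at the root of the repository
gunicorn
# ...plus your application's own dependencies, for example:
flask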

Note that there are some differences with the usual runtime environment, see the Known current limitations section for details.

Testing locally (optional)

To test whether your application will build correctly, you can try it on your local computer using pack. You should be able to build the image and start it, and it should listen on port 8000.

$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 myimage
$ docker run -e PORT=8000 -p 8000:8000 --rm --entrypoint web myimage
# navigate to http://127.0.0.1:8000 to check that it works

If pack is not available for your operating system, you can run it via Docker itself. Note that this is fairly dangerous, as it requires passing the Docker control socket into the container itself, effectively handing the pack container full control over your Docker daemon:

$ docker run -u root -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD":/workspace -w /workspace buildpacksio/pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 myimage

If you're using Node.js in addition to another language, you need to manually specify the buildpacks to use. For example, with PHP and Node.js:

$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder:22 --buildpack heroku/nodejs --buildpack heroku/php myimage

It is currently not possible to install additional Apt packages or locales because the upstream versions are currently incompatible with the cloud native builder image.

Configuration

Environment variables can be used to specify secret contents or configuration values.

Example: OAuth with Python + Django

If you followed Help:Toolforge/My first Django OAuth tool, you can migrate that app by reading the values of SOCIAL_AUTH_MEDIAWIKI_KEY and SOCIAL_AUTH_MEDIAWIKI_SECRET from environment variables instead.

First let's create the environment variables with the right values:

tools.wm-lol@tools-sgebastion-10:~$ toolforge envvars create SOCIAL_AUTH_MEDIAWIKI_SECRET
Enter the value of your envvar (prompt is hidden, hit Ctrl+C to abort):
name                          value
SOCIAL_AUTH_MEDIAWIKI_SECRET  xxxxxxxxxxxxxxx

tools.wm-lol@tools-sgebastion-10:~$ toolforge envvars create SOCIAL_AUTH_MEDIAWIKI_KEY
Enter the value of your envvar (prompt is hidden, hit Ctrl+C to abort):
name                       value
SOCIAL_AUTH_MEDIAWIKI_KEY  xxxxxxxxxxxxxxx

Now you can use them in your settings.py file like:

import os

SOCIAL_AUTH_MEDIAWIKI_KEY = os.environ.get("SOCIAL_AUTH_MEDIAWIKI_KEY", "dummy-default-value")
SOCIAL_AUTH_MEDIAWIKI_SECRET = os.environ.get("SOCIAL_AUTH_MEDIAWIKI_SECRET", "dummy-default-value")
SOCIAL_AUTH_MEDIAWIKI_URL = 'https://meta.wikimedia.org/w/index.php'
SOCIAL_AUTH_MEDIAWIKI_CALLBACK = 'http://127.0.0.1:8080/oauth/complete/mediawiki/'

Build and deploy

If you are sure that your app will build and start on port 8000, then you can log in to login.toolforge.org and start a build as your tool. For example:

$ become mytool
$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/<your-repo>
$ # wait until command finishes

See toolforge build start --help for additional options such as --ref REF to select a specific branch, tag or commit rather than the current HEAD of the given repository.
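
For instance, to build from a specific branch (the branch name here is hypothetical):

$ toolforge build start --ref deploy https://gitlab.wikimedia.org/toolforge-repos/<your-repo>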

You can also pass environment variables using --envvar that will be set at build time to modify the build behavior; see the documentation of each specific buildpack for details. For example, you can pass CLOJURE_CLI_VERSION for Clojure or USE_NPM_INSTALL for Node.js.
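
For example (assuming the KEY=VALUE form for --envvar; check toolforge build start --help for the exact syntax):

$ toolforge build start --envvar USE_NPM_INSTALL=true https://gitlab.wikimedia.org/toolforge-repos/<your-repo>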

Old builds are cleaned up automatically. The system will try to always keep at least one old successful build and a couple of previous failed builds for troubleshooting.

Webservice

To start a web service:

$ toolforge webservice --backend=kubernetes --mount=none buildservice start

Alternatively, put the following in your service.template to make toolforge webservice start work on its own:

backend: kubernetes
type: buildservice
mount: none

To update the code later, trigger a new build with toolforge build start as above; once the build has finished, a normal toolforge webservice restart will suffice to update it.
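
A typical update, using the same repository as above:

$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/<your-repo>
$ # wait until the build finishes
$ toolforge webservice restart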

To see the logs for your web service, use:

$ toolforge webservice --backend kubernetes buildservice logs -f

Job

To use your tool's custom container image with the jobs framework:

$ toolforge jobs run --wait --image tool-test/tool-test:latest --command "migrate" some-job

This will run the migrate command as defined in your Procfile. You could also pass additional arguments, for example --command "migrate --production" would run the script specified in Procfile with the --production argument.

In order to load the execution environment properly, buildservice images use the binary launcher command. A call to launcher will be automatically prepended to the command you provide with the --command argument at execution time, but if that command uses composite commands, you must wrap them in a shell for the environment setup to work as expected:

# this will not work as expected, only the first `env` command will have the proper environment loaded
$ toolforge jobs run --wait --image tool-test/tool-test:latest --command "env; ls -la; nodejs --version"

# Instead you can wrap the commands in a shell execution
$ toolforge jobs run --wait --image tool-test/tool-test:latest --command "sh -c 'env; ls -la; nodejs --version'"

If your job needs to access the Toolforge NFS filesystems, for example to read or write files in the tool's $HOME directory, you must use the --mount=all parameter. See #Using NFS shared storage for details.
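
For example, a sketch of running the migrate job from above with NFS mounted:

$ toolforge jobs run --wait --mount=all --image tool-test/tool-test:latest --command "migrate" some-job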

To see the logs for a specific job:

$ toolforge jobs logs some-job -f

Supported languages

We currently support all the languages included in Heroku's cnb-builder-images builder:

In addition, the following extra build packs are available:

Not all the languages have a tutorial, but you can find some below at Help:Toolforge/Build Service#Tutorials_for_popular_languages.

Note: since 2023-10-30, heroku-builder-classic:22 has been deprecated; this dropped support for Clojure and moved from using the old 'Heroku-style' buildpacks to the new cloud-native compatible ones.

Other features

Installing Apt packages

Sometimes the supported buildpacks alone can't provide all the libraries you want, or let you have PHP and Python installed at the same time.

In those cases you can install custom packages by creating an Aptfile at the root of your repository, listing the packages you want to install one per line (comments are allowed). For example, to install imagemagick and php alongside your Python application, create the following file:

# This file should be placed at the root of your git repository, and named Aptfile
# These will be pulled from the OS repositories
imagemagick
php

NOTE: If you use extra packages, you will not be able to build your project locally, so we encourage you to try your language's preferred package manager instead (pip/bundle/npm/composer/...).

Right now the images are based on Ubuntu 22.04 (jammy), although this can change in the future. You can use the packages.ubuntu.com tool to look up available packages and their versions.

NOTE: Packages are installed in a non-standard root directory (/layers/fagiani_apt/apt). There are environment changes within the container at runtime that try to compensate for this fact, but there might be some issues related to it. It's recommended to use the language buildpack directly instead of installing through APT whenever possible.

Using Node.js in addition to another language

If your application uses Node.js in addition to another language runtime such as Python or PHP, Node.js will be installed alongside that language as long as there is a package.json in the root of the project's repository. A typical use case for this would be an application that uses Node.js to compile its static frontend assets (including any JavaScript code) at build time.
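
A minimal package.json sketch for this use case (the build tooling shown here is illustrative; Heroku's Node.js buildpack typically runs the build script if one is defined, check its documentation for the exact behavior):

{
  "name": "my-tool-frontend",
  "private": true,
  "scripts": {
    "build": "vite build"
  },
  "devDependencies": {
    "vite": "^5.0.0"
  }
}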

NOTE: this will not work automatically when testing locally using pack. See #Testing_locally_(optional) for a workaround you can use.

Adding support for other locales

You can add extra locales to your build by creating a .locales file at the root of your Git repository specifying the list of locales that you want.

For example:

de_DE
fr_FR
zh-hans_CN
zh-hant_TW
ru_RU

Then, when your web service or job builds, it will install and configure those locales. You can see the enabled locales when running the image (either starting a web service or a job):

Enabled locales: de_DE it_IT pl_PL ja_JP en_GB fr_FR zh-hans_CN zh-hans_SG zh-hant_TW zh-hant_HK ru_RU es_ES nl_BE pt_PT

And you can load them by those names, for example in Python:

import locale
locale.setlocale(locale.LC_ALL, "zh-hant_TW")
locale.localeconv()

# output
'zh-hant_TW'
{'int_curr_symbol': 'TWD ', 'currency_symbol': 'NT$', 'mon_decimal_point': '.', 'mon_thousands_sep': ',', 'mon_grouping': [3, 0], 'positive_sign': '', 'negative_sign': '-', 'int_frac_digits': 2, 'frac_digits': 2, 'p_cs_precedes': 1, 'p_sep_by_space': 0, 'n_cs_precedes': 1, 'n_sep_by_space': 0, 'p_sign_posn': 1, 'n_sign_posn': 1, 'decimal_point': '.', 'thousands_sep': ',', 'grouping': [3, 0]}

Migrating an existing tool

Many tools that are now running in Toolforge Kubernetes should also work with the Build Service with just a few simple changes.

If your tool is hosted on a public Git repository, it's possible the Build Service will automatically detect all that is needed to build a working container image of your tool.

However, it's likely you will have to change a few things:

  • You will need to add a Procfile with a `web:` entry to your project specifying the command to start the application (needed even if you are going to be running a job).
  • If you're migrating a Python tool that uses uWSGI, replace it with Gunicorn; see the example above.

Using NFS shared storage

Whether shared storage is mounted or not can be controlled with the --mount argument. Currently the default is to mount everything (--mount=all) for web services, and to disable all mounts (--mount=none) for the Jobs Framework.
The default mount option for web services might be changed in the future, so it is recommended to explicitly specify --mount=all if your tool relies on reading or writing files on NFS shared storage.
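
For example, to start a build service web service with NFS explicitly mounted:

$ toolforge webservice --backend=kubernetes --mount=all buildservice start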

When mounted, NFS directories are located at the same paths as they are for other Kubernetes containers. A runtime difference is that the $HOME environment variable and default working directory do not point to the tool's data directory (/data/project/<toolname>). The $TOOL_DATA_DIR environment variable can be used instead to retrieve the path to the tool's data directory.
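
For example, in Python you could resolve paths under the tool's data directory like this (a minimal sketch; the config file name is hypothetical):

import os
from pathlib import Path

# TOOL_DATA_DIR points at /data/project/<toolname> when NFS is mounted
data_dir = Path(os.environ["TOOL_DATA_DIR"])
config_path = data_dir / "config.yaml"  # hypothetical file in the tool's data directory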

We recommend using the envvars service to create environment variables with secret configuration values like passwords instead of reading configuration files from NFS.

Connecting to ToolsDB and the Wiki Replicas

To connect to the databases, there are two sets of credentials provided through environment variables:

  • TOOL_TOOLSDB_USER and TOOL_TOOLSDB_PASSWORD are the user and password to connect to ToolsDB.
  • TOOL_REPLICA_USER and TOOL_REPLICA_PASSWORD are the user and password to connect to the Wiki Replicas.

NOTE: Even if the credentials are currently the same for the Wiki Replicas and ToolsDB, we strongly recommend using the variables for each specific service, as they might change in the future.
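
As an illustration, a minimal Python sketch connecting to ToolsDB using these variables (assuming pymysql is listed in requirements.txt; the hostname and database name follow the usual ToolsDB conventions, adjust them for your own setup):

import os
import pymysql

# Credentials are injected into the container as environment variables
user = os.environ["TOOL_TOOLSDB_USER"]
connection = pymysql.connect(
    host="tools.db.svc.wikimedia.cloud",  # usual ToolsDB hostname
    user=user,
    password=os.environ["TOOL_TOOLSDB_PASSWORD"],
    database=user + "__mydb",  # ToolsDB databases are conventionally prefixed with the credential user
)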

Tutorials for popular languages

We have created some guides (more will be added) on how to deploy apps built with popular languages and frameworks.

Python

PHP

Ruby

Static site

Rust

.NET

Common problems and solutions

Please add to this section any issues you encountered and how you solved them.

Where are my old builds?

To keep the number of stored builds manageable, we only keep a few for each tool.

You should always have the last few successful and failed builds; if that is not the case, please reach out to us (see #Communication and support).

Can I get an interactive shell inside my container?

Yes! Try toolforge webservice --backend=kubernetes buildservice shell. Once you are in that shell, you will probably need to use launcher to start things like your target runtime's interpreter, as many things are installed in buildpack-specific locations. As an example, if you are in a Python project, launcher python should start a Python REPL with your packages installed.

Troubleshooting

We are actively building the debugging capabilities of the system, so they will be improved soon. Some hints:

  • If the build failed, try toolforge build logs to check the last build logs; you can also try building it locally.
  • If the build passed, but the webservice does not start or work, try toolforge webservice --backend kubernetes buildservice logs to see the logs created by your tool.

If you are unable to figure it out or you have found a bug, please reach out as described in #Communication and support.

Known current limitations

Limited support for multiple languages or runtimes (buildpacks) in single image

It is not possible to use multiple buildpacks in a single image, except for Node.js combined with a single other runtime. If you have a use case for this, please contact us.

Limited number of languages and runtimes available

Currently available buildpacks are limited to those included with the heroku-22 stack and some additional ones curated by the Toolforge admins. It is not currently possible to include arbitrary third-party buildpacks in a build.

No LDAP connection for checking user information

We are currently not using a base image that knows how to query the Developer account LDAP directory. As a result, Unix/Linux commands that rely on it, such as finding the home directory for a given user (expanding ~) or checking which groups a user is in (id <user>), will not work as expected.

$HOME not pointing to the tool home

See #Using NFS shared storage above for storage limitations and changes.

Tool home directory is not available

See #Using NFS shared storage above for storage limitations and changes.

Out of quota

There's a limited amount of space available to store your builds. A recurring job cleans up old images every 5 minutes (see task T336360), and another one garbage collects untagged images every hour. If your build fails and you don't know why, running out of space might be the issue; please open a task or contact us and we'll fix it until the long-term fix is rolled out.

Technical details

This section needs some cleanup.

  • Cloud Native Buildpacks are a specification as well as a piece of code you can put in a repository. A "buildpack" applies user code to a "stack" via a "builder" during the "lifecycle" to get code onto a "platform", which is also the term used to describe the full system.
  • You can read a comparison of building strategies here: Buildpacks vs Jib vs Dockerfile: Comparing containerization methods.

See also the admin documentation

History

The Build Service was discussed for the first time in 2021. Below are some historical discussions that led to its current design and implementation.

The Toolforge admin team invited the first tool maintainers to try the build service in May 2023[citation needed]. An open beta phase was announced in October 2023.

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

  • Discuss and receive general support
  • Stay aware of critical changes and plans
  • Track work tasks and report bugs
    • Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
  • Read stories and WMCS blog posts
    • Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)