Help:Toolforge/Build Service
This page is currently a draft. More information and discussion about changes to this draft can be found on the talk page.
The Toolforge Build Service is currently in Beta. This means that it is likely to change in non-backward-compatible ways and/or suffer from outages and bugs. Although we encourage you to try it out and give feedback, please do not use it for critical tools until it becomes more stable and this banner is removed.
The Build Service brings Cloud Native Buildpacks to Toolforge. Buildpacks are a specification as well as a piece of code you can put in a repository. A "buildpack" applies user code to a "stack" via a "builder" during the "lifecycle" to get code onto a "platform", which is also the term used to describe the full system.
The goal of the Build Service is to make the process for deploying code to Toolforge easier and more flexible, by adopting a standard that is documented and maintained by a wider community (Buildpacks are a CNCF-supported initiative).
Roadmap
Features that are already available
- Building and running any application that is compatible with one of the officially supported Heroku buildpacks (Python has been tested thoroughly; the others might need some tweaks for now, see current limitations). Compared to the current images:
  - You can use more languages than the ones supported by our custom container images.
  - Python apps are no longer tied to uWSGI, so you can now use modern ASGI-based frameworks like FastAPI.
  - You can use newer versions of the base languages (e.g. Python 3.11, see Specifying a Python runtime), and benefit when newer versions are added upstream.
- Running that application as a webservice
  - Limited storage support (see current limitations)
- Running that application as a job
  - Limited storage support (see current limitations)
- Pulling locally the same image that is going to run in Toolforge
Planned features that are not available yet
- Push to deploy
- Multi-language support on the same image (e.g. Python + Node.js)
- Other buildpacks in addition to the Officially supported Heroku buildpacks
- Installing custom system packages on the images
- Full streamlined storage support
- See the current limitations section for more details and other changes with the current process
Quickstart
Prerequisites
If you don't have a tool account yet, you need to create or join one. Detailed instructions are available at Help:Toolforge/Quickstart.
Your tool's code must be accessible in a public Git repository; any public Git repository will work (Gerrit, GitLab, ...). You can set one up for free for your tool from the Toolforge admin console.
Procfile
You will need to create a Procfile to configure which commands to run to start your app. The Toolforge webservice manager uses the web process type, and for jobs you can use whatever process type you want. However, for now, you must have a web process type defined.
Example: Python web service
For example, the Procfile for a Python application using Gunicorn (which needs to be installed via requirements.txt) might look like this:
web: gunicorn --bind 0.0.0.0 --workers 4 app:app
migrate: python3 -m app.migrate
The first entry will be the one used if you start the tool as a webservice (NOTE: currently this is the first entry found, no matter its name).
The other entries can be used for jobs, and you can have as many entries as you need, one for each different job you want to run.
Note that there are some differences from the usual runtime environment; see the Known current limitations section for details.
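The Procfile rules described above can be sketched in Python. This is only an illustration, not the Build Service's actual parser: the function names parse_procfile and webservice_command are hypothetical, and the sketch assumes each entry is a single "name: command" line.

```python
def parse_procfile(text):
    """Return a mapping of process type -> command, in file order."""
    entries = {}
    for raw_line in text.splitlines():
        # Split on the first colon only: commands such as "app:app" contain colons too.
        name, separator, command = raw_line.strip().partition(":")
        # Ignore blank lines and anything that is not a "name: command" pair.
        if separator and name.strip() and command.strip():
            entries[name.strip()] = command.strip()
    return entries


def webservice_command(entries):
    """Pick the entry a webservice would use: currently the first one found."""
    if not entries:
        raise ValueError("Procfile has no entries")
    first_name = next(iter(entries))
    return first_name, entries[first_name]


PROCFILE = """\
web: gunicorn --bind 0.0.0.0 --workers 4 app:app
migrate: python3 -m app.migrate
"""

entries = parse_procfile(PROCFILE)
name, command = webservice_command(entries)
print(f"webservice entry: {name} -> {command}")
print("job entries:", [entry for entry in entries if entry != name])
```

Keeping the web entry first in the file means the webservice picks the same entry whether it is selected by name or by position.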
Testing locally (optional)
To test whether your application will build correctly, you can check on your local computer using pack. You should be able to build the image and start it, and it should listen on port 8000.
$ pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder-classic:22 myimage
$ docker run -p 8000:8000 --rm --entrypoint web myimage
# navigate to http://127.0.0.1:8000 to check that it works
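If you just want something small to try the local build with, a minimal WSGI application that the app:app target from the example Procfile could resolve to might look like the sketch below. The file name app.py, the greeting text and the serve_locally helper are assumptions made for illustration; the sketch uses only the Python standard library, while the Procfile example above additionally needs Gunicorn listed in requirements.txt.

```python
# app.py (hypothetical module name, matching the "app:app" target in the Procfile)
from wsgiref.simple_server import make_server

GREETING = b"Hello from Toolforge!\n"


def app(environ, start_response):
    """Minimal WSGI callable: answer every request with a plain-text greeting."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(GREETING))),
    ]
    start_response("200 OK", headers)
    return [GREETING]


def serve_locally(port=8000):
    """Serve `app` with the standard library server, for quick checks without Gunicorn."""
    with make_server("0.0.0.0", port, app) as server:
        server.serve_forever()
```

Calling serve_locally() from a Python shell starts the app on port 8000 without building an image; inside the built image, the command from the Procfile is what actually serves app.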
If pack is not available for your operating system, you can run it via Docker itself.
Note that this is fairly dangerous, as it requires passing the Docker control socket into the container itself, effectively handing the pack container full control over your Docker daemon:
sudo docker run -u root -v /var/run/docker.sock:/var/run/docker.sock -v "$PWD":/workspace -w /workspace buildpacksio/pack build --builder tools-harbor.wmcloud.org/toolforge/heroku-builder-classic:22 myimage
Build and deploy
If you are sure that your app will build and start on port 8000, you can log in to login.toolforge.org and start a build as your tool. For example:
$ become mytool
$ toolforge build start https://gitlab.wikimedia.org/toolforge-repos/<your-repo>
$ toolforge build show # wait until build passed
See toolforge build start --help for additional options such as --ref REF to select a specific branch, tag or commit rather than the current HEAD of the given repository.
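If you script your builds, the command line above can be assembled in Python before handing it to subprocess. This is a rough sketch: build_start_command is a hypothetical helper, the branch name main is just an example value, and placing --ref before the repository URL is an assumption (check toolforge build start --help for the exact usage).

```python
import shlex


def build_start_command(repo_url, ref=None):
    """Return the argument list for starting a build, optionally pinned to a ref."""
    command = ["toolforge", "build", "start"]
    if ref is not None:
        # --ref selects a branch, tag or commit instead of the repository's current HEAD.
        command += ["--ref", ref]
    command.append(repo_url)
    return command


REPO_URL = "https://gitlab.wikimedia.org/toolforge-repos/<your-repo>"

# Print the commands instead of running them, so this is safe to try anywhere.
print(shlex.join(build_start_command(REPO_URL)))
print(shlex.join(build_start_command(REPO_URL, ref="main")))
```

On the Toolforge bastion, the returned list can be passed to subprocess.run(command, check=True) after running become for your tool.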
Webservice
To start a web service:
$ toolforge webservice --backend kubernetes buildservice start
Alternatively, put the following in your service.template to make toolforge webservice start work on its own:
backend: kubernetes
type: buildservice
To update the code later, trigger a new build with toolforge build start as above; once the build has finished, a normal webservice restart will suffice to update it.
Job
To use with the jobs framework:
$ toolforge jobs run --image tool-test/tool-test:latest --command "migrate" --wait --no-filelog some-job
Supported languages
This list is likely to change in the future.
We currently support all the languages included in Heroku's builder-classic-22 builder:
- Clojure
- Go
- Java
- Node.js
- PHP
- Python
- Ruby
- Scala
Additional documentation for each buildpack, including the list of supported runtime versions, can be found at https://devcenter.heroku.com/articles/buildpacks
Migrating an existing tool
Many tools that are now running in Toolforge Kubernetes should also work with the Build Service with just a few simple changes.
If your tool is hosted on a public Git repository, it's possible the Build Service will automatically detect all that is needed to build a working container image of your tool.
However, it's likely you will have to change a few things:
- You will need to add a Procfile with a `web:` entry to your project specifying the command to start the application (needed even if you are going to be running a job).
- If you're migrating a Python tool that uses uWSGI, replace it with Gunicorn (see the example above).
Using NFS (might change soon)
Currently, NFS directories are mounted on the same paths as before. The only difference is that, for the tool's home directory, you have to use $TOOL_DATA_DIR instead of $HOME.
For secrets and most configuration, an alternative that does not involve NFS is being worked on; it will become the recommended way of setting configuration and secrets (see task T334578).
Example: using a config file with python
If you have a configuration file under /data/project/mytool/myconfig.yaml, you can load it in Python with something like:
import os
import yaml
# $TOOL_DATA_DIR points to the tool's NFS home directory
config_path = os.path.expandvars("$TOOL_DATA_DIR/myconfig.yaml")
with open(config_path) as config_file:
    my_config = yaml.safe_load(config_file)
The working directory has changed
If your tool relied on being run from a certain directory, you'll have to adapt it to run from /workspace.
You can use the environment variable $TOOL_DATA_DIR to get the tool home directory.
Note that previously, the tool's working directory depended on the webservice type: for instance, Python tools ran in the equivalent of $TOOL_DATA_DIR/www/python/src.
Usually this means doing one of the following:
- changing your execution directory from within the code (Python ex. os.chdir(os.path.expandvars("$TOOL_DATA_DIR/www/python/src")))
- prepending your paths with $TOOL_DATA_DIR (Python ex. os.path.expandvars("$TOOL_DATA_DIR/my_file.yaml"))
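The second option above can be wrapped in a small helper so that paths are built in one place. This is a minimal sketch: tool_data_dir and tool_path are hypothetical names, and falling back to the current directory when $TOOL_DATA_DIR is unset is an assumption made so the same code also runs on a local machine.

```python
import os
from pathlib import Path


def tool_data_dir():
    """Return the tool home directory, or the current directory if $TOOL_DATA_DIR is unset."""
    return Path(os.environ.get("TOOL_DATA_DIR", os.getcwd()))


def tool_path(*parts):
    """Build a path below the tool home directory."""
    return tool_data_dir().joinpath(*parts)


print(tool_path("my_file.yaml"))
print(tool_path("www", "python", "src"))
```

Compared with os.chdir(), this leaves the working directory at /workspace, so relative imports and files shipped with the code keep working.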
Flask config
If you use app.config.from_file() in Flask, it will still try to load the config from /workspace (where the code lives), not the current working directory.
You have to load the config file by its absolute path, like so:
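import os
import yaml
# `app` is your existing Flask application object
config_path = 'config.yaml'
if 'TOOL_DATA_DIR' in os.environ:
    # Running on Toolforge: use the absolute path on NFS
    config_path = os.environ['TOOL_DATA_DIR'] + '/www/python/src/config.yaml'
app.config.from_file(config_path, load=yaml.safe_load, silent=True)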
You can skip the os.chdir() in that case.
Tutorials for popular languages
We have created some guides (more will be added) on how to deploy apps built with popular languages and frameworks.
Common problems and solutions
Please add to this section any issues you encountered and how you solved them.
Troubleshooting
We are actively building the debugging capabilities of the system, so they will be improved soon. Some hints:
- If the build failed, try toolforge build logs to check the last build logs; you can also try building it locally.
- If the build passed, but the webservice does not start or work, try toolforge webservice --backend kubernetes buildservice logs to see the logs created by your tool.
If you are unable to figure it out or found a bug, feel free to create a task (see the feedback section) or reach out to us on IRC/etc. (see communications section).
Known current limitations
File permission issues
In the current setup, the tool runs as a different user than the one used to build the image, which might cause issues with certain buildpacks or tools if they try to create or modify files outside of their home directory and/or $TOOL_DATA_DIR.
PHP and Scala buildpacks do not work
See File permission issues above.
No support for multiple language applications (multi-stack) or installing custom packages
We don't yet support having more than one language stack on a single image (ex. PHP + Python, JS + Ruby); exploratory work is underway on how to support multi-stack applications.
Similarly, for the beta we are using the basic run image, so only a very limited set of packages is installed. We are working on a way to allow more flexibility in which packages are available.
Only the Heroku builder with default languages is supported
We currently support a single builder (heroku-classic) and the default languages and buildpacks shipped with it; for the list, see: https://devcenter.heroku.com/articles/buildpacks
For this round of the Beta we are focusing on Python, but all the other languages supported by the Heroku builder are available if you want to help test and add support for them (ex. PHP, Ruby, Clojure, ...).
No LDAP connection for checking user information
We are currently not using a base image that knows how to use the Developer account LDAP directory, so Unix/Linux commands that rely on it, such as finding the home directory of a given user (expanding ~) or checking which groups a user is in (id <user>), will not work as expected.
$HOME not pointing to the tool home
Because of the way buildpacks work, we had to change the directory that $HOME points to. If you want to access the tool's NFS home directory, you can still do so by using the environment variable $TOOL_DATA_DIR (Python ex. os.path.expandvars("$TOOL_DATA_DIR")), or by using the full path (ex. /data/project/mytool), though that is more fragile.
This means that if your code looks for replica.conf (or any other file) in the tool directory, you can access it under $TOOL_DATA_DIR, for example $TOOL_DATA_DIR/replica.conf.
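As an illustration, replica.conf could be read from $TOOL_DATA_DIR along these lines. This is a sketch under assumptions: it supposes the file is an INI-style MySQL option file with a [client] section containing user and password keys, and read_replica_credentials is a hypothetical helper name.

```python
import configparser
import os


def read_replica_credentials(path=None):
    """Return (user, password) from replica.conf, by default the one under $TOOL_DATA_DIR."""
    if path is None:
        path = os.path.expandvars("$TOOL_DATA_DIR/replica.conf")
    # Disable interpolation so that a '%' in the password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(path)
    client = parser["client"]  # assumed section name
    # Strip optional surrounding quotes from the values.
    return client["user"].strip("'\""), client["password"].strip("'\"")
```

Passing an explicit path makes it easy to test the function locally with a dummy file, while the default keeps working inside the image as long as $TOOL_DATA_DIR is set.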
Builds aren't automatically triggered upon commit
Ideally, a new build would be generated automatically when you push a new commit to your tool's Git repository, removing the need to trigger a build manually (though the capability to do so would remain). This "trigger" doesn't exist yet.
Out of quota
There's a limited amount of space available to store your builds. A recurring job cleans up old images every 5 minutes (see task T336360), and another one garbage collects untagged images every hour. If your build fails and you don't know why, running out of space might be the issue; please open a task or contact us and we'll fix it while the long-term fix is being rolled out.
Feedback
If you try to use the Build Service, either to deploy a new tool or to migrate an existing one, we would love to hear your feedback!
You can use this Phabricator template to report a bug, or this one for feature requests.
If you find this documentation page lacking in some way, or if you have more general comments, you can leave a comment in the Talk page.
For any other questions you can always reach us through the other communication channels listed at the bottom of this page.
Contributing
To contribute and/or follow up with the development of the project, take a look at the Ongoing Efforts page and the Contributing page.
History
The Build Service was discussed for the first time in 2021. Below are some historical discussions that led to its current design and implementation.
- Phabricator task
- Enhancement proposal: Toolforge push to deploy
- Enhancement proposal: Toolforge Buildpack Implementation
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia Movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud, the bridged Telegram group, or the bridged Mattermost channel
- Discuss via email after you have subscribed to the cloud@ mailing list