Help:Toolforge/Web

Overview

Every Toolforge tool can run a dedicated website at https://toolname.toolforge.org. Toolforge provides the webservice command, which starts and stops the web server for each tool. Toolforge supports websites written in several programming languages, including PHP, Python, Node.js, Java, Ruby, and others. Toolforge also provides support services that help protect your website's visitors from tracking by third-party services.

The webservice command uses convention over configuration for some aspects of how the website is deployed. You’ll find details for different programming languages below.

Using the webservice command

You can use the webservice command to start, stop, restart, and check the status of a web server.

webservice command example
$ ssh login.toolforge.org
$ become my-cool-tool
$ toolforge webservice start

Use webservice --help to get a full list of arguments.

Without any additional arguments or configuration files, webservice start will currently start a PHP 7.3 Kubernetes container serving content from your tool's $HOME/public_html directory using lighttpd as the web server software.

Webservice templates

The webservice command supports a "template" file that stores arguments (and eventually other structured content) for starting a webservice. The command looks for a --template=... command line argument and falls back to the $HOME/service.template file. Most tools are expected to use $HOME/service.template, though a single tool may also find uses for multiple templates.

A webservice template file is a YAML document. It can contain these settings:

  • backend: the backend to use (equivalent to --backend=...)
  • cpu: the CPU reservation to ask for on Kubernetes (equivalent to --cpu=...)
  • mem: the memory reservation to ask for on Kubernetes (equivalent to --mem=...)
  • release: the operating system to ask for on Grid Engine (equivalent to --release=...)
  • replicas: the number of Pod replicas to use (equivalent to --replicas=...)
  • type: the type of webservice to start (equivalent to the positional TYPE argument)
  • extra_args: extra arguments to pass to the backend (not used by most backends)

By saving the desired startup state in a file, you can stop and start the webservice again with plain webservice stop and webservice start commands, without repeating the arguments.
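
As an illustration, some of the settings listed above could be saved to $HOME/service.template as follows. This is a minimal sketch rather than a recommended configuration: the cpu and mem values, and their Kubernetes-style quantity notation, are assumptions chosen for the example, and the command overwrites any existing template.

```shell
# Write an example template to the default location that webservice checks.
# NOTE: the cpu and mem values below are illustrative assumptions.
cat > "$HOME/service.template" <<'EOF'
backend: kubernetes
cpu: 1
mem: 1Gi
EOF
```

With this file in place, webservice start should pick up these settings without any extra command line arguments.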

Features

Toolforge has an Nginx server configured as a proxy server which handles all inbound requests to your tool's web server. This proxy server takes care of providing TLS termination and then reverse proxies the inbound request to your tool's web service. Web servers running on Kubernetes have a second Nginx proxy server running as the "Ingress" component inside the Kubernetes cluster. See Portal:Toolforge/Admin/Kubernetes/Networking and ingress for detailed information about the network and web request routing used by the Toolforge Kubernetes cluster.

Toolforge also includes a 404 handler service which will respond to HTTP requests for tools which do not exist and tools which are not currently running a web service. This service is implemented as the fourohfour tool which runs on the Kubernetes backend.

Switching between Kubernetes and Grid Engine

The Toolforge Grid Engine is deprecated and planned to be shut down in early 2024. You should migrate any tools still running on the Grid Engine to newer runtimes. For details, see News/Toolforge Grid Engine deprecation.

From Kubernetes to Grid Engine

$ toolforge webservice --backend=kubernetes stop
$ toolforge webservice --backend=gridengine start

From Grid Engine to Kubernetes

$ toolforge webservice --backend=gridengine stop
$ toolforge webservice --backend=kubernetes <type> start

Supported images and languages

We recommend building your own image using the build service. That gives you more control over the setup of the service, lets you use more languages and language-specific build tools, and is simpler for us to maintain.

For any use case that does not fit the build service, we provide some pre-built images with a more limited set of languages:

Default web server (lighttpd + PHP)

See: Help:Toolforge/Web/Lighttpd

PHP

See: Help:Toolforge/Web/PHP

Python

See: Help:Toolforge/Web/Python

Node.js web services

See: Help:Toolforge/Web/Node.js

Java

See: Help:Toolforge/Web/Java

Other / generic web servers

You can run other web servers that are not directly supported by using a runtime-specific type on the Kubernetes backend:

  • webservice --backend=kubernetes golang start|stop|restart|shell SCRIPT
  • webservice --backend=kubernetes jdk17 start|stop|restart|shell SCRIPT
  • webservice --backend=kubernetes perl5.36 start|stop|restart|shell SCRIPT
  • webservice --backend=kubernetes ruby3.1 start|stop|restart|shell SCRIPT

Your script will be passed an HTTP port to bind to in an environment variable named PORT. This is the port that the Nginx proxy will forward requests for https://YOUR_TOOL.toolforge.org/ to. When using the Kubernetes backend, PORT will always be 8000.
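
A start script for one of these types might read PORT as sketched below. This is only an illustration under stated assumptions: the file name start-server.sh is hypothetical, and python3 -m http.server is a stand-in for your real server command that may not be available in every image.

```shell
# Create a hypothetical start script that binds to the port passed in PORT.
cat > "$HOME/start-server.sh" <<'EOF'
#!/bin/sh
# PORT is set by webservice; default to 8000 so the script also runs locally.
port="${PORT:-8000}"
echo "Starting server on port ${port}" >&2
# Stand-in server command (assumption): replace with your own server binary.
exec python3 -m http.server "${port}"
EOF
chmod +x "$HOME/start-server.sh"
```

The script path would then be passed as the SCRIPT argument of one of the commands listed above.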

Common tasks and guides

Hosting large files

Toolforge storage uses NFS, which has limited storage and network bandwidth. If your tool requires a static file larger than 1 GB (for example, a container image or tarball), please store that file in the Download project rather than in your tool's home directory.

The Download project hosts https://download.wmcloud.org, a public read-only web server for large file storage. If you would like a file added, create a Phabricator ticket or contact WMCS staff directly.

Serving static files

Files placed in a tool's $HOME/www/static directory are available directly from the URL tools-static.wmflabs.org/toolname. This requires no action on the tool's part: putting the files in the appropriate folder (and making the directory readable) should just work.

You can use this to serve static assets (CSS, HTML, JS, etc) or to host simple websites that don't require a server-side component.
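
The setup described above can be sketched as follows. The file name hello.txt and the exact permission bits are assumptions for illustration; the only requirements stated here are the directory location and that it is readable.

```shell
# Create the static directory and make it traversable and readable.
mkdir -p "$HOME/www/static"
chmod a+rx "$HOME/www" "$HOME/www/static"

# Add a small example file (hypothetical name) and make it world-readable.
printf 'Hello from a static file\n' > "$HOME/www/static/hello.txt"
chmod a+r "$HOME/www/static/hello.txt"
```

For a tool named toolname, this file would then be expected at tools-static.wmflabs.org/toolname/hello.txt.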

Load external assets using our CDN services

To preserve the privacy of our users, avoid embedding assets (images, CSS, JavaScript) from servers outside of Wikimedia Foundation control.

  • Libraries: Toolforge provides an anonymizing reverse proxy to cdnjs.
  • Fonts: Toolforge provides an anonymizing reverse proxy to Google Fonts.
  • Maps: Wikimedia provides maps servers with data from OpenStreetMap.

Runtime memory limits

  • Kubernetes: 2GiB for most runtimes (Java's limit is 4GiB).

Assigning a custom domain

At the moment, all Toolforge websites must be served under the toolforge.org domain, so it is not possible to assign a custom domain (for example, by pointing DNS records at Toolforge servers).

This restriction also makes the origin of each tool clear.

Requesting additional tool memory

Kubernetes web servers start with a default limit on both runtime memory and CPU. These limits vary slightly based on which runtime language (PHP, Python, Java, etc.) you are using. The --cpu and --mem command line arguments can be used to raise these defaults up to the quota limit for your tool's Kubernetes namespace. See Kubernetes#Quotas and Resources for instructions on requesting an increased quota for your tool.

Response buffering

An Nginx proxy sits between your webservice and the user. By default this proxy buffers the response sent from your server. For some use cases, including streaming large quantities of data to the browser, this can be undesirable. Buffering can be disabled on a per-request basis by sending an X-Accel-Buffering: no header in your response.[1]
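
As a rough sketch, a CGI-style handler could emit the header like this. Whether your webservice type runs CGI scripts is an assumption; the function name stream_response and the three-chunk body are purely illustrative, and the same header can be set from any language or framework.

```shell
# Sketch of a CGI-style response that opts out of proxy buffering.
# Assumption: your web server runs this script as a CGI handler.
stream_response() {
    printf 'Content-Type: text/plain\r\n'
    # Ask the Nginx proxy not to buffer this response.
    printf 'X-Accel-Buffering: no\r\n'
    # A blank line ends the headers.
    printf '\r\n'
    # Emit the body in small pieces, as a streaming endpoint would.
    for i in 1 2 3; do
        printf 'chunk %s\n' "$i"
    done
}

stream_response
```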

/favicon.ico

A default image will be served by the shared proxy layer if your webservice returns a 404 Not Found response when asked for /favicon.ico. This default icon is the same as the one found at https://tools-static.wmflabs.org/toolforge/favicons/favicon.ico.

/robots.txt

A default response will be served by the shared proxy layer if your webservice returns a 404 Not Found response when asked for /robots.txt. The default robots.txt response denies access to all compliant web crawlers. We decided that this "fail closed" approach would be safer than a "fail open" approach that tells all crawlers to crawl all tools.

Any tool that does wish to be indexed by search engines and other crawlers can serve its own /robots.txt content. Please see https://www.robotstxt.org/ for more information on /robots.txt in general.
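
For a tool using the default lighttpd setup, which serves content from $HOME/public_html, opting in to crawling could look like the sketch below. The policy shown (allow all compliant crawlers everywhere) is an example rather than a recommendation; an empty Disallow line is the conventional way to permit everything.

```shell
# Serve a custom /robots.txt from the default document root.
mkdir -p "$HOME/public_html"
cat > "$HOME/public_html/robots.txt" <<'EOF'
# Allow all compliant crawlers to index the whole tool.
User-agent: *
Disallow:
EOF
```

Because the tool then answers /robots.txt itself, the proxy's default deny-all response should no longer apply.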

View web service logs

Some of the language specific web service types will log errors and other information to a file in the tool home directory.

In addition, the output from the webservice command is stored by the Toolforge Kubernetes infrastructure as long as the web service is running. To view these logs, use:

$ toolforge webservice --backend=kubernetes logs

This command also takes some flags:

  • -f to follow logs in real-time
  • -l [number] to only see a specific number of newest log lines

It is intended that these logs will eventually be stored in a more persistent storage system.

References

  1. https://www.nginx.com/resources/wiki/start/topics/examples/x-accel/
