MediaWiki at WMF

From Wikitech

MediaWiki is the collaborative editing software that runs Wikipedia. This page documents its deployment at the Wikimedia Foundation.

Infrastructure

Wikipedia request flow

A Wikipedia web request is processed in a series of steps outlined here (as of April 2020).

  • DNS resolves hostnames like en.wikipedia.org, which ultimately point to an address like text-lb.*.wikimedia.org. The IP addresses behind that name are service IPs handled by LVS, which acts as a direct-routing load balancer to our caching proxies.
    » See also DNS, Global traffic routing, and LVS.
  • The Wikimedia Foundation operates its own content-delivery network. The public load balancers and caching proxies are located in all data centres, including those whose sole role is to be an edge cache (also known as a "PoP").
    » See also Clusters and PoPs.
  • The caching proxies are servers consisting of three layers: TLS termination, frontend caching, backend caching. Each cache proxy server hosts all three of these layers.
    » See also Caching overview.
    • TLS termination and HTTP/2 handling: This is performed by Apache Traffic Server (ATS), internally called ats-tls. Prior to 2020, we used Nginx here.
    • Frontend caching: This is an in-memory HTTP cache (it uses Varnish, and is called "Varnish frontend" or varnish-fe). The LVS load balancers route the request to a random cache proxy server to maximise the amount of parallel traffic we can handle. Because each frontend cache server likely holds the same set of responses in its cache, the logical capacity of the frontend cache is equal to one server's RAM.
    • Backend caching: The backend HTTP caches are routed to by frontend caches in case of a cache miss. Contrary to the frontends, these are routed by a consistent hash, and they also persist their cache on disk (instead of in memory). The backend caches scale horizontally and have a logical capacity equal to the total of all servers. In case of a surge in traffic to a particular page, the frontends should each get a copy and distribute from there. Because of consistent hashing, the same backend cache is always consulted for the same URL. We use request coalescing to avoid multiple requests for the same URL hitting the same backend server. For the backend cache, we use a second layer of ATS (ats-be). Prior to 2020, WMF used a second layer of Varnish (varnish-be) for backend caching.
  • After the cache proxies we arrive at the application servers (that is, if the request was not fulfilled by a cache). The application servers are load-balanced via LVS. Connections between backend caches and app servers are encrypted with TLS, which is terminated on the app server by a local Envoy instance, which in turn hands the request off to the local Apache. Prior to mid-2020, Nginx was used for TLS termination. Apache is in charge of handling redirects, rewrite rules, and determining the document root. It then uses php-fpm to invoke the MediaWiki software on the app servers. The application servers and all other backend services (such as Memcached and MariaDB) are located in "Core services" data centres, currently Eqiad and Codfw.
    » See also Application servers for more about how Apache, PHP7 and php-fpm are configured.
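
The difference between random routing to frontends and consistent-hash routing to backends can be illustrated with a small sketch. This is not the algorithm used by Varnish or ATS in production; the ring construction, the SHA-256 hash, the number of virtual nodes, and the backend names are all assumptions made for illustration. It only shows the property described above: a given URL always maps to the same backend, and removing one backend does not remap URLs owned by the others.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable integer position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with several virtual nodes per server."""

    def __init__(self, servers, vnodes=64):
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, url: str) -> str:
        """Return the server owning the first ring point at or after the URL's hash."""
        idx = bisect.bisect_left(self._points, _hash(url)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical backend names, not real host names.
ring = HashRing(["backend-a", "backend-b", "backend-c"])
url = "https://en.wikipedia.org/wiki/Main_Page"
chosen = ring.lookup(url)

# Repeated lookups for the same URL always pick the same backend.
assert all(ring.lookup(url) == chosen for _ in range(5))
```

Because every URL has exactly one owning backend, each backend stores a distinct slice of the content, which is why the logical capacity of the backend layer is the sum of all servers rather than that of a single one.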

App servers

See Application servers for more about how Apache and php-fpm are configured.

The application servers are divided in the following groups:

Description       | Conftool cluster | Hiera cluster | Purpose
------------------|------------------|---------------|--------
Main app servers  | appserver        | appserver     | Public HTTP from ATS for wiki domains (except requests with X-Wikimedia-Debug, /w/api.php, or /api/rest_v1).
Debug servers     | testserver       | appserver     | Public HTTP from ATS for wiki domains with X-Wikimedia-Debug.
API app servers   | api_appserver    | api_appserver | Public HTTP from ATS for wiki domains with /w/api.php.
Parsoid servers   | parsoid          | parsoid       | Internal HTTP to parsoid-php.discovery.wmnet. Used by RESTBase via /w/rest.php.
Jobrunners        | jobrunner        | jobrunner     | Internal HTTP to jobrunner.discovery.wmnet. Used by ChangeProp-JobQueue via /rpc or /w/rest.php.
Videoscalers      | videoscaler      | jobrunner     | Internal HTTP to videoscaler.discovery.wmnet. Used by ChangeProp-JobQueue via /rpc or /w/rest.php.
Maintenance hosts |                  | misc          | Internal. Used for scheduled and ad-hoc maintenance scripts run from the command-line.
Snapshot hosts    |                  | dumps         | Internal. Used for scheduled work from the command-line relating to XML dumps.

For web requests using Apache, the "Hiera cluster" value is also exposed as $_SERVER['SERVERGROUP'] to PHP.

In Grafana dashboards, Prometheus metrics, and Icinga alerts the cluster field usually refers to the "Hiera cluster" value as well.
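
The public routing rules in the table above can be summarised as a small decision function. This is only a sketch of the table, not the actual ATS routing configuration. The function name is hypothetical, and the precedence of the X-Wikimedia-Debug header over path-based rules is an assumption.

```python
from typing import Mapping, Optional


def pick_conftool_cluster(path: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the Conftool cluster for a public wiki-domain request.

    Returns None for paths that the table says are not served by the
    main app servers without naming another app server group.
    """
    # Assumption: the debug header wins over any path-based rule.
    if any(name.lower() == "x-wikimedia-debug" for name in headers):
        return "testserver"
    if path.startswith("/w/api.php"):
        return "api_appserver"
    if path.startswith("/api/rest_v1"):
        # The table only states that main app servers do not handle this path.
        return None
    return "appserver"


print(pick_conftool_cluster("/w/index.php", {}))
print(pick_conftool_cluster("/w/api.php?action=query", {}))
```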

MediaWiki configuration

For web requests not served by the cache, the request eventually arrives on an app server where Apache invokes PHP via php-fpm.

Document root

Example request: https://en.wikipedia.org/w/index.php

The document root for a wiki domain like "en.wikipedia.org" is /srv/mediawiki/docroot/wikipedia.org.

The /srv/mediawiki directory on app servers comes from the operations/mediawiki-config.git repository, which is cloned on the Deployment server, and then rsync'ed to the app servers by Scap.

The docroot/wikipedia.org directory is mostly empty, except for w/, which is symlinked to a wiki-agnostic directory that looks like a MediaWiki install (in that it has files like "index.php", "api.php", and "load.php"), but actually contains small stubs that invoke "Multiversion".

Multiversion

Multiversion is a WMF-specific script (maintained in the operations/mediawiki-config repo) that inspects the hostname of the web request (e.g. "en.wikipedia.org"), and finds the appropriate MediaWiki installation for that hostname. The weekly Deployment train creates a fresh branch from the latest master of MediaWiki (including any extensions we deploy), and clones it to the deployment server in a directory named like /srv/mediawiki/php-–.

For example, if the English Wikipedia is running MediaWiki version 1.30.0-wmf.5, then "en.wikipedia.org/w/index.php" will effectively be mapped to /srv/mediawiki/php-1.30.0-wmf.5/index.php. For more about the "wikiversions" selector, see Heterogeneous deployment.
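
The mapping in this example can be sketched as a lookup followed by a path join. This is a simplified illustration and not the Multiversion code itself. The `WIKIVERSIONS` dictionary and the `resolve_entry_point` function are hypothetical names, and keying the mapping by hostname is an assumption made to keep the sketch short.

```python
import posixpath

MEDIAWIKI_ROOT = "/srv/mediawiki"

# Hypothetical stand-in for the "wikiversions" selector.
WIKIVERSIONS = {
    "en.wikipedia.org": "1.30.0-wmf.5",
}


def resolve_entry_point(hostname: str, script: str) -> str:
    """Map a request for /w/<script> on a hostname to a versioned file path."""
    version = WIKIVERSIONS[hostname]  # raises KeyError for unknown hostnames
    return posixpath.join(MEDIAWIKI_ROOT, f"php-{version}", script)


print(resolve_entry_point("en.wikipedia.org", "index.php"))
# /srv/mediawiki/php-1.30.0-wmf.5/index.php
```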

The train also creates a stub LocalSettings.php file in this php-… directory. This stub LocalSettings.php file does nothing other than include wmf-config/CommonSettings.php (also in the operations/mediawiki-config repo).

The CommonSettings.php file is responsible for configuring MediaWiki. This includes database configuration (which DB server to connect to, etc.), loading MediaWiki extensions and configuring them, and general site settings (name of the wiki, its logo, etc.).

After CommonSettings.php is done, MediaWiki handles the rest of the request and responds accordingly.

MediaWiki internals

To read more about how MediaWiki works in general, see:

  • Manual:Code on mediawiki.org, about entry points and the directory structure of MediaWiki.
  • Manual:Index.php on mediawiki.org, for what a typical MediaWiki entrypoint does.

Timeouts

Request timeouts

Generally speaking, the app servers allow up to 60 seconds for most web requests (e.g. page views, HTTP GET), and up to 200 seconds for write actions (e.g. edits, HTTP POST).

» See HTTP timeouts#App server for a detailed breakdown of the various timeouts on app servers.
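
As a rough illustration of the general rule above, the two limits can be expressed as a function of the HTTP method. The names are hypothetical, and the real limits are enforced at several layers, as described on the linked HTTP timeouts page. Treating every non-POST method as a read is a simplifying assumption.

```python
READ_LIMIT_SECONDS = 60    # e.g. page views, HTTP GET
WRITE_LIMIT_SECONDS = 200  # e.g. edits, HTTP POST


def request_time_limit(method: str) -> int:
    """Return the approximate app server time limit for an HTTP method."""
    # Assumption: only POST counts as a write action in this sketch.
    if method.upper() == "POST":
        return WRITE_LIMIT_SECONDS
    return READ_LIMIT_SECONDS


print(request_time_limit("GET"), request_time_limit("POST"))
```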

Backend timeouts

MySQL/MariaDB
Setting: MySQL's event_scheduler (core events master, core events replica)
  • Web requests user, running queries on read-only replicas: 60s
  • Web requests user, idle connections on read-only replicas: 60s
  • Web requests user, read-only replicas under connection overload: 10s
  • Web requests user, idle connections on the read-write master: 300s
Type: Wall clock time
Notes: This was added to prevent pileups caused by a single event, and to work around the undesirable behavior of terminated connections whose queries keep running even though no socket remains open to report to. It is implemented with MySQL's event scheduler for legacy reasons; using max_execution_time or an equivalent would probably be preferable.
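
The limits listed above amount to a small policy table. The following sketch expresses it as a pure function. It is not the event scheduler code, which runs inside the database, and the `Connection` type and function names are hypothetical. It also assumes that the 10s overload limit applies to running queries on replicas, which the list above does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Connection:
    role: str      # "replica" or "master"
    idle: bool     # True if no query is currently running
    seconds: int   # wall-clock seconds spent in the current state


def limit_seconds(conn: Connection, overloaded: bool = False) -> Optional[int]:
    """Return the wall-clock limit for a web-request connection, or None if none is listed."""
    if conn.role == "replica":
        if conn.idle:
            return 60
        # Assumption: the overload limit tightens running queries only.
        return 10 if overloaded else 60
    if conn.role == "master" and conn.idle:
        return 300
    return None


def should_kill(conn: Connection, overloaded: bool = False) -> bool:
    """Decide whether a connection has exceeded its listed limit."""
    limit = limit_seconds(conn, overloaded)
    return limit is not None and conn.seconds > limit


print(should_kill(Connection("replica", idle=False, seconds=75)))
```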

See also