MediaWiki On Kubernetes/How it works
What is in a MediaWiki kubernetes pod
A MediaWiki kubernetes pod contains the following containers:
- mediawiki-httpd
  - The apache web server, containing all the apache configuration, the static files, and a basic web configuration
- mediawiki-app
  - The PHP-FPM application server
- mediawiki-mcrouter
  - The memcached proxy from facebook
- tls-proxy
  - The mesh network sidecar, based on envoy
- rsyslog
  - The rsyslog container
- prometheus exporters
  - Exporters for metrics from apache, php-fpm, and mcrouter
mediawiki-httpd (apache)
Resource utilization
resource | requests | limits |
---|---|---|
CPU | 200m | 500m |
Memory | 200Mi | 400Mi |
The docker image layering
The image for the apache web server is called restricted/mediawiki-webserver and is only downloadable in production with the correct credentials, as it contains potentially sensitive data.
This image is built on the deployment hosts during the execution of scap backport on the basis of the docker-registry.wikimedia.org/mediawiki-httpd image, which is in turn based on our docker-registry.wikimedia.org/httpd-fcgi image.
Each layer of this build provides part of the apache configuration.
Specifically, from the basic image up:
- httpd provides the installation of apache, the basic environment variables, the user setup and the basic directories
- httpd-fcgi provides the installation of a standard-configured httpd server with the ability to funnel all requests to a backend running a fcgi application server. It defines a few environment variables we also use in MediaWiki on kubernetes, see the code for details. It also adds configurations used by a fcgi application server, like the logging configuration, forwarding of headers, opening the admin port, and the configuration for two virtual hosts: a default one to handle fcgi requests and a monitoring subsite to handle requests for metrics coming from PHP internals like php-apcu and php-opcache.
- mediawiki-httpd adds the basic configuration for apache for mediawiki, so basically everything you'd find on a normal appserver, minus the virtualhosts. Specifically:
  - The main apache2.conf file, which is equal to the one on our main appservers
  - Configuration for a few modules, same as the one on our main appservers
  - Virtual hosts to overwrite the default one from the base image and the one to handle nonexistent domains
- restricted/mediawiki-webserver includes the specific paths used by MediaWiki as /srv/mediawiki and all the endpoints and static assets we need to serve from the httpd image.
In the chart
We control the following env variables:
- SERVER_NAME = <pod-name>: the value of the Server: header
- LOG_FORMAT = ecs_rsyslog: sends logs to the local rsyslog via UDP
- APACHE_RUN_PORT = .php.httpd.port: the main port apache will listen on
- SERVERGROUP = .php.servergroup: the SERVERGROUP variable we reference in mediawiki-config. It's used to choose various parameters and to tag the metrics with the origin cluster.
- FCGI_MODE = .php.fcgi_mode: whether to talk to fcgi via a unix socket (see below) or a tcp port on localhost
- LOG_SKIP_SYSTEM = 1: suppresses access logging for monitoring calls from prometheus and kubernetes
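As a rough illustration of how chart values map onto these variables, the following Python sketch builds the environment of the httpd container. The function name `httpd_env`, the nested shape of the `values` dictionary, and the example values are assumptions made for this example; the real mapping lives in the chart templates.

```python
def httpd_env(values: dict, pod_name: str) -> dict:
    """Build the httpd container environment from chart values (illustrative)."""
    php = values["php"]
    return {
        "SERVER_NAME": pod_name,                       # value of the Server: header
        "LOG_FORMAT": "ecs_rsyslog",                   # log to the local rsyslog
        "APACHE_RUN_PORT": str(php["httpd"]["port"]),  # from .php.httpd.port
        "SERVERGROUP": php["servergroup"],             # from .php.servergroup
        "FCGI_MODE": php["fcgi_mode"],                 # from .php.fcgi_mode
        "LOG_SKIP_SYSTEM": "1",                        # skip monitoring access logs
    }


# Hypothetical values, for illustration only.
example_values = {
    "php": {
        "httpd": {"port": 8080},
        "servergroup": "kube-example",
        "fcgi_mode": "FCGI_UNIX",
    }
}
example_env = httpd_env(example_values, "mw-example-pod")
```

Environment variables are always strings, which is why the port is converted explicitly.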
The liveness probe just checks that the tcp port (set in php.httpd.port) is open and accepting connections, while the readiness probe fetches /healthz from the metrics port. This way a pod is only ready when apache is up and php-fpm has free worker slots.
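The two probes are performed by the kubelet, not by code we ship, but their logic can be approximated in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and treating only HTTP 200 as healthy is a simplification made for this example.

```python
import http.client
import socket
import urllib.request


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Liveness-style check: can a TCP connection be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthz_ok(host: str, metrics_port: int, timeout: float = 1.0) -> bool:
    """Readiness-style check: does GET /healthz answer with HTTP 200?"""
    url = f"http://{host}:{metrics_port}/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts and HTTP error statuses all mean "not ready".
        return False
```

The split matters because a pod can be alive (apache accepts connections) without being ready (no free php-fpm workers), and only the second condition should remove it from load balancing.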
We mount the following volumes in the container:
- /run/shared is an emptyDir used by this container and the php-fpm container to communicate via unix socket if FCGI_MODE is set to FCGI_UNIX.
- We bind-mount the website definitions from the value mw.sites under /etc/apache2/sites-enabled. Those virtualhosts are injected using templates and a declarative structure for wikis. For standard production installations, this value is included from the /etc/helmfile-defaults/mediawiki/httpd.yaml file, which is generated from puppet directly via the profile::kubernetes::deployment_server::mediawiki::config puppet class.
In addition, we allow optional mounting of two volumes that allow a quick-and-dirty way to modify the behaviour of the pod:
- /srv/mediawiki/w/debug is mounted if the value debug.php.enabled is true, and contains debug endpoints the user wants to inject into the deployment.
- /etc/apache2/conf-enabled/00-aaa.conf, with the content set to the value mw.httpd.additional_config if present, provides apache configurations that will be evaluated before everything else.
Logging
The access logs are sent from apache httpd to the rsyslog running in the same pod via UDP on port 10200 using logger(1). Rsyslog then processes these logs and sends them out. The apache error log is sent to standard error and thus is picked up by rsyslog.
This happens because we set the environment variable LOG_FORMAT to ecs_rsyslog; otherwise the logs would go to stdout. If you also set the DEBUG environment variable to 1, then the loglevel is set to "debug", and mod_log_debug is also loaded. Please note: this setting is only available in the containers; you are not allowed to set it in the chart as the amount of logs produced would overwhelm our logging infrastructure.
Metrics
Metrics from apache are exported using the apache-httpd-exporter prometheus exporter, running in the pod. As of now, we don't analyze the apache logs to export latency metrics as we do on the bare metal servers, but rather rely on envoy's telemetry data like for all the other services.
mediawiki-app (php-fpm)
Resource utilization
resource | requests | limits |
---|---|---|
CPU | 4 | 5 |
Memory | 1000Mi | 2800Mi |
The docker image layering
The image for running the php-fpm daemon and all of the MediaWiki code is called docker-registry.discovery.wmnet/restricted/mediawiki-multiversion and is only downloadable in production with the correct credentials, as it contains private data.
The image is built on the deployment hosts during the execution of scap backport on the basis of the docker-registry.wikimedia.org/php7.4-fpm-multiversion-base image, which is in turn based on our docker-registry.wikimedia.org/php7.4-fpm image.
Each layer of this build provides part of the container functionality.
Specifically, from the basic image up:
- php7.4-cli is the basic docker image for any php application. It sets up the source repository, installs php and all the most common extensions, including our own excimer, and defines a large number of environment variables that are then directly injected in the php.ini file.
- php7.4-fpm installs php7.4-fpm and defines even more environment variables to inject in both the php.ini file (including the OPCACHE and APC settings) and the php-fpm configuration. Of particular note is the FCGI_MODE variable, which decides whether php-fpm listens on a socket or a TCP port by choosing which of the files under pool.d will be included.
- php7.4-fpm-multiversion-base is a thin layer on top of php-fpm: it installs all of our own custom extensions (excimer, luasandbox, wikidiff2, wmerrors) and any additional extensions (yaml) one might need. We also install an smtp null agent (msmtp) to make sure we can send email (more on that below).
- restricted/mediawiki-multiversion is the final image where we install the mediawiki code. It's built on the deployment server, using the make-container-image process like the webserver image. It's typically built incrementally: in order to reduce the amount of bytes transferred, we build the new version of the image as an additional layer on top of the last one, so each layer will contain more or less the patch we want to apply. At some point, one of our conditions (size of the image, number of layers on top of the base image, size of the new layer) is met, and that triggers a full rebuild. That usually happens when we deploy the train, because we then have many new files in a new version branch under /srv/mediawiki/php-XXX, and both building and pushing this image can be very slow.
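The choice between an incremental layer and a full rebuild can be sketched as a simple predicate over the three conditions named above. This is an illustration only: the names are hypothetical and every threshold below is a made-up number, not the value used by the actual build process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RebuildLimits:
    # All thresholds are invented for this example.
    max_image_bytes: int = 10 * 1024**3
    max_layers_on_base: int = 20
    max_new_layer_bytes: int = 1 * 1024**3


def needs_full_rebuild(
    image_bytes: int,
    layers_on_base: int,
    new_layer_bytes: int,
    limits: RebuildLimits = RebuildLimits(),
) -> bool:
    """Return True as soon as any one of the three conditions is exceeded."""
    return (
        image_bytes > limits.max_image_bytes
        or layers_on_base > limits.max_layers_on_base
        or new_layer_bytes > limits.max_new_layer_bytes
    )
```

Under these assumed limits, a small backport adds a small layer and stays incremental, while a train deployment that adds a whole new version branch would exceed the new-layer size and force a full rebuild.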
In the chart
We control quite a few env variables:
- SERVERGROUP, FCGI_MODE: same as for the httpd container
- FCGI_URL derives from FCGI_MODE and gets injected in the php-fpm configuration
- FCGI_ALLOW is used to list clients authorized to call php-fpm, and the list is limited to 127.0.0.1, so other containers in the pod.
- PHP__*: all these env variables can modify the php.ini file and thus php's behaviour.
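The way a listen address follows from FCGI_MODE can be pictured with the sketch below. Only the FCGI_UNIX mode name and the socket path /run/shared/fpm-www.sock appear on this page; the name of the TCP mode, the port number, and the function itself are assumptions made for illustration, so check the image sources for the real derivation of FCGI_URL.

```python
FPM_SOCKET = "/run/shared/fpm-www.sock"  # socket path in the shared emptyDir
ASSUMED_TCP_MODE = "FCGI_TCP"            # assumption: name of the TCP mode
ASSUMED_TCP_PORT = 9000                  # assumption: conventional php-fpm port


def fpm_listen_address(fcgi_mode: str) -> str:
    """Map an FCGI_MODE value to the address php-fpm would listen on."""
    if fcgi_mode == "FCGI_UNIX":
        return FPM_SOCKET
    if fcgi_mode == ASSUMED_TCP_MODE:
        # Loopback only, matching the FCGI_ALLOW restriction to 127.0.0.1.
        return f"127.0.0.1:{ASSUMED_TCP_PORT}"
    raise ValueError(f"unknown FCGI_MODE: {fcgi_mode!r}")
```

Failing loudly on an unknown mode is a choice made for this sketch; it avoids silently starting php-fpm on an address apache does not expect.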
Of note is the fact that if you set the php.devel_mode value to true, opcache revalidation will be turned on, with checks for every request, so that any change to php files will be picked up by php-fpm.
The liveness probe is just a check that the tcp socket is reachable or that the file /run/shared/fpm-www.sock is a unix socket.
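The unix-socket half of that check amounts to inspecting the file type, which can be sketched in Python as follows (the TCP half is a plain connection attempt). The function name is hypothetical, and the actual probe is defined in the chart, not in Python.

```python
import os
import stat


def is_unix_socket(path: str) -> bool:
    """Return True if path exists and is a unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing file, permission problem, etc.: treat as not live.
        return False
```

Note that this only proves the socket file exists with the right type, not that php-fpm is still accepting requests on it, which is consistent with the probe being described as "just a check".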
We mount the following volumes in the container:
- /etc/wikimedia-cluster: a bind-mounted file containing the name of the datacenter we're in. This is used by MediaWiki.
- /var/www: if mw.mail_host is set, we configure our null mailer agent to send email with some configuration in the home directory of the www-data user.
- /run/shared: same as for the httpd container
- /etc/wmerrors contains files defined via the mw.wmerrors value as filename:content yaml pairs. This value in production is fetched from /etc/helmfile-defaults/mediawiki/httpd.yaml, which is generated by puppet injecting the fatal-error.php file defined in puppet.
- /var/log/php-fpm, if the value mw.logging.rsyslog is true
In addition, we allow optional mounting of /srv/mediawiki/w/debug, which offers a quick-and-dirty way to inject code for debugging purposes: the configmap will contain one file per key of the value debug.php.contents, if debug.php.enabled is true. This allows us to deploy new endpoints to a pod that can be used for debugging purposes. For example, given this configuration for the mw-debug deployment, when using the WikimediaDebug extension, you will be able to reach the code you injected using a URL like https://en.wikipedia.org/w/debug/geoip.php (on any wiki).
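The mapping from chart values to injected files can be sketched as follows. The function name and the nested dictionary layout (mirroring the debug.php.enabled and debug.php.contents value paths) are assumptions for this example, and the PHP snippet in the usage data is a placeholder, not real debugging code.

```python
import posixpath

DEBUG_DIR = "/srv/mediawiki/w/debug"


def debug_files(values: dict) -> dict:
    """Return {absolute path: file body} for every injected debug endpoint."""
    debug_php = values.get("debug", {}).get("php", {})
    if not debug_php.get("enabled", False):
        # Nothing is mounted unless debug.php.enabled is true.
        return {}
    contents = debug_php.get("contents", {})
    return {posixpath.join(DEBUG_DIR, name): body for name, body in contents.items()}


# Placeholder content, for illustration only.
example = {"debug": {"php": {"enabled": True, "contents": {"geoip.php": "<?php // ..."}}}}
example_files = debug_files(example)
```

With this input, the single key geoip.php becomes the file /srv/mediawiki/w/debug/geoip.php, which is what makes the /w/debug/geoip.php URL reachable.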
Logging
When rsyslog is enabled (mw.logging.rsyslog is set to true), php-fpm logs both its error log and its slow log to an emptyDir shared with the rsyslog container. Otherwise, both are logged to stdout and picked up unstructured by our standard k8s logging pipeline.
These logs are processed by the local rsyslog (see below) and sent to logstash. Of particular importance are the php-fpm slowlogs (NDA restricted), which allow you to see where MediaWiki is spending time executing code for requests lasting more than 5 seconds.
Metrics
Metrics from php-fpm are collected via the php-fpm exporter sidecar container. Given the interface it uses (php-fpm status page) doesn't provide all the metrics we want, like opcache/apcu status, we've created a "monitoring vhost" running on a separate port, 9181, to be used by all monitoring.
This sends requests to the php backend that extracts the relevant metrics.
mediawiki-mcrouter
Resource utilization
resource | requests | limits |
---|---|---|
CPU | 200m | 700m |
Memory | 100Mi | 200Mi |
The docker image
The README for the image does a good job explaining how the image can be configured, so there is no major point adding any further information here.
In the chart
We mostly use the configurations baked into the cache module for our helm charts. Right now that allows for declarative specification of mcrouter pools. In production, we set up the same pools that we use on-premises. See memcached for MediaWiki for further details about how those routes are organized.
As an important aside: given how mcrouter monitors its configuration using inotify, when we change the configuration for mcrouter we don't need a rolling restart of the pods.
Logging
Standard kubernetes logging applies to the mcrouter containers.
Metrics
Metrics are collected by the mcrouter prometheus exporter, the same way as they're collected on-premises.
tls-proxy (envoy)
Resource utilization
resource | requests | limits |
---|---|---|
CPU | 200m | 750m |
Memory | 100Mi | 350Mi |
Please refer to the page about the service proxy mesh for details about how it works and how to add new services to it.
Syncing with puppet
Several parts of the mesh configuration are kept in sync with puppet:
- The list of potential listeners is under the services_proxy key, and is populated in /etc/helmfile-defaults/general-<cluster>.yaml, as defined in the puppet class profile::kubernetes::deployment_server::global_config
- The list of active listeners is under the discovery.listeners key, and is populated in /etc/helmfile-defaults/mediawiki/tlsproxy.yaml, as defined in the puppet class profile::kubernetes::deployment_server::mediawiki::config
- The error page to serve from envoy in case of connection failure is under the mesh.error_page key. It is defined in /etc/helmfile-defaults/mediawiki/tlsproxy.yaml, and the content is currently generated by the mediawiki::errorpage_content puppet define, included by the puppet class profile::kubernetes::deployment_server::mediawiki::config
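The relationship between the two listener lists can be pictured as a selection: the active listeners pick entries out of the catalogue of potential ones. The sketch below assumes that services_proxy is a mapping keyed by listener name and that discovery.listeners is a list of names; both shapes, the function name, and the decision to fail on unknown names are assumptions for illustration.

```python
def active_listeners(services_proxy: dict, listeners: list) -> dict:
    """Select the definitions of the active listeners from the full catalogue."""
    unknown = [name for name in listeners if name not in services_proxy]
    if unknown:
        # An active listener without a definition cannot be configured.
        raise KeyError(f"listeners not defined in services_proxy: {unknown}")
    return {name: services_proxy[name] for name in listeners}
```

For instance, with a hypothetical catalogue `{"svc-a": {"port": 6001}, "svc-b": {"port": 6002}}` and the active list `["svc-b"]`, only the definition of svc-b would be handed to the proxy configuration.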
rsyslog
Resource utilization
resource | requests | limits |
---|---|---|
CPU | 100m | 1 |
Memory | 200Mi | 300Mi |
The docker image
The docker image is a very simple bullseye based rsyslog image. Nothing special or fancy about it.
In the chart
We install rsyslog if the value mw.logging.rsyslog is set to true. We pass to it as env variables a few chunks of kubernetes metadata, so that those can be used in the log messages. We run rsyslog as www-data, because we need to share the directory /var/log/php-fpm with the php container in order to parse the slowlog and error log. The configuration files for rsyslog are installed under /etc/rsyslog.d via a configmap.
This rsyslog handles various log sources we didn't think we could manage with the node-local rsyslog we're all used to. Specifically:
- The apache httpd access logs, which are sent in json format over UDP to port 10200 on rsyslog, then mangled and shipped to logstash over the mediawiki.httpd.accesslog kafka topic.
- The php-fpm error log, which is fetched from the shared directory /var/log/php-fpm and parsed according to a custom ruleset. These logs are also shipped to kafka and then logstash.
- The php-fpm slowlog, which is very important for allowing us to understand what's slowing down requests in production. It is parsed using a relatively obscure ruleset, transformed to proper ECS format, and shipped to kafka over the mediawiki.php-fpm.slowlog topic, then collected in a logstash dashboard. It's important to note that php-fpm slowlogs are a terrible fit for rsyslog, or really for any other logging system. One reason is that its log field separator, an empty line, is prepended to the log line, and not appended. This results in interesting issues that are already outlined in the chart.
- The MediaWiki logs, which MediaWiki sends via UDP to port 10514; we just ship out whatever we receive directly to logstash, like we do on-prem.
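The slowlog separator problem is easier to see in code. The Python sketch below groups lines into entries when the blank line comes before each entry; it is not the rsyslog ruleset used in the chart, and the sample lines only approximate the shape of a php-fpm slowlog for illustration.

```python
from typing import Iterable, Iterator, List


def split_slowlog(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group slowlog lines into entries; a blank line precedes each entry."""
    entry: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.strip() == "":
            # The blank line announces the NEXT entry, so only now do we
            # learn that the previous entry is complete.
            if entry:
                yield entry
                entry = []
            continue
        entry.append(line)
    # The last entry never gets a trailing separator: it can only be
    # emitted once the input ends.
    if entry:
        yield entry


# Made-up sample data resembling a slowlog.
sample = [
    "",
    "[01-Jan-2024 00:00:00]  [pool www] pid 10",
    "script_filename = /srv/mediawiki/w/index.php",
    "[0x0000000000000001] sleep() /srv/mediawiki/w/index.php:1",
    "",
    "[01-Jan-2024 00:00:07]  [pool www] pid 11",
    "script_filename = /srv/mediawiki/w/api.php",
]
entries = list(split_slowlog(sample))
```

On a live log file the input never ends, so a parser of this kind holds the most recent entry until another slow request arrives, which is one way a prepended separator makes streaming log shippers awkward.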