WikimediaDebug
WikimediaDebug is a set of tools for debugging and profiling MediaWiki web requests in a production environment.
You can use WikimediaDebug through the accompanying browser extension, or from the command line using the X-Wikimedia-Debug header.
So long as WikimediaDebug is on, all your requests will be handled as a Varnish cache miss.
Browser usage
Install the extension for Firefox or Chrome:
The extension creates a button with the Wikimedia community icon, usually near the upper right corner of the browser window.
To start debugging:
- Go to a Wikimedia site, e.g. https://en.wikipedia.org,
- Click the icon,
- Toggle the "On/Off" switch.
The extension automatically deactivates itself after 15 minutes. This makes sure you don't keep generating debug data indefinitely, or unintentionally flood someone else's debugging session with pageviews.
In situations where installing an extension is not feasible (e.g. issues specific to mobile Chrome), you can visit Special:WikimediaDebug to enable debugging via a cookie. This is more limited but gets handled the same way as the header.
Command-line usage
Force Varnish to skip the cache and pass the request to the MediaWiki_On_Kubernetes mwdebug release:
$ curl -H 'X-Wikimedia-Debug: backend=k8s-mwdebug' https://meta.wikimedia.org/wiki/Main_Page
Same as above, but profile a request using Tideways and publish the profile to XHGui:
$ curl -H 'X-Wikimedia-Debug: backend=k8s-mwdebug; profile' https://meta.wikimedia.org/wiki/Main_Page
Request a page with MediaWiki configured for read-only mode:
$ curl -H 'X-Wikimedia-Debug: backend=k8s-mwdebug; readonly' https://meta.wikimedia.org/wiki/Main_Page
Options
The following attributes can influence the backend request:
backend=…
: Choose the mwdebug server that Varnish will route your request to. It can be set to 1 in Beta Cluster, where the only backend acts as both appserver and mwdebug server.
forceprofile
(Inline profile): Capture a trace profile and append it to the web response.
log
(Verbose logs): Enable all MediaWiki debug log groups at verbose level, submitted to Logstash for querying.
profile
(XHGui): Record a trace for performance analysis and publish the result to XHGui.
readonly
: Request a page from MediaWiki with read-only mode enabled, to simulate how your code behaves when the database is read-only due to replica lag or scheduled maintenance.
shorttimeout
: Simulate how your code handles execution timeouts (e.g. from mediawiki/libs/RequestTimeout, T293568). When enabled, $wgRequestTimeLimit is set to 2s. This option is available via command-line use only.
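For command-line use, the header value can be assembled from a backend and a list of attributes. The following is a minimal sketch and not part of WikimediaDebug itself: wmdebug_header is a hypothetical helper name, and it assumes that several attributes can be joined with "; " in the same way as the curl examples above.

```shell
# Hypothetical helper (not shipped with WikimediaDebug).
# Usage: wmdebug_header BACKEND [ATTRIBUTE...]
# Prints an X-Wikimedia-Debug header value such as "backend=k8s-mwdebug; profile".
wmdebug_header() (
    value="backend=$1"
    shift
    # Append each attribute, separated by "; " as in the curl examples.
    for attr in "$@"; do
        value="$value; $attr"
    done
    printf '%s\n' "$value"
)
```

For example, to request a page in read-only mode with verbose logs:
$ curl -H "X-Wikimedia-Debug: $(wmdebug_header k8s-mwdebug readonly log)" https://meta.wikimedia.org/wiki/Main_Page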
Available backends
You can simulate how your code behaves in a secondary data center (Multi-DC), by selecting a backend in a DC other than the current primary. You can use either of the datacenters through this list at any time, even if that datacenter is currently inactive or depooled. If you pick a backend in a datacenter that is not currently the primary DC, some actions may be read-only, disabled or slower.
The default is now to serve debug requests from a specific MediaWiki installation running on our Kubernetes cluster. As all production traffic is now served by MediaWiki_On_Kubernetes, this should be the primary backend used to test changes.
The following application servers are dedicated to WikimediaDebug use:
- k8s-mwdebug (mw-on-k8s, default).
- 2x mwdebug in Eqiad
- 2x mwdebug in Codfw
- k8s-mwdebug-eqiad (mw-on-k8s, specifically target the eqiad deployment).
- k8s-mwdebug-codfw (mw-on-k8s, specifically target the codfw deployment).
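To get a rough impression of how a page behaves in each datacenter, you could request the same URL through both DC-specific backends and compare the results. This is only a sketch: compare_dcs is a hypothetical function name, the backend names are taken from the list above, and the timings from a single request are indicative at best.

```shell
# Hypothetical helper: request URL once per DC-specific backend and
# print the HTTP status code and total request time reported by curl.
# Usage: compare_dcs URL
compare_dcs() (
    url=$1
    for backend in k8s-mwdebug-eqiad k8s-mwdebug-codfw; do
        printf '%s: ' "$backend"
        # -s: no progress meter, -o /dev/null: discard the body,
        # -w: print selected transfer details after the request completes.
        curl -s -o /dev/null \
            -H "X-Wikimedia-Debug: backend=$backend" \
            -w '%{http_code} %{time_total}s\n' \
            "$url"
    done
)
```

Usage:
$ compare_dcs https://meta.wikimedia.org/wiki/Main_Page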
Request profiling
Excimer UI provides detailed and interactive flame graphs on a per-request basis. To use it, enable the "Excimer UI" option (excimer attribute). This will capture PHP trace logs from the current request (using Excimer), and send them to Excimer UI at performance.wikimedia.org, where the visualisation is powered by Speedscope. Excimer UI launched in May 2023 (T291015).
To try out Speedscope with MediaWiki data, open this example profile.
Speedscope views
The default view is Time Order, which renders calls in chronological order on a timeline from left to right.
The Left Heavy view re-arranges each row such that calls to the same child on the next row are grouped together. For example, if a function A calls functions B and C alternatingly, in Left Heavy view, all the calls from A->B are grouped together. This allows you to clearly identify visually where the majority of time is spent at each level.
The Sandwich view sorts each individual function in a list. This allows you to find patterns globally, regardless of which function is the parent or where it sits in the tree. This is analogous to a "reverse flame graph". For example, if many different unrelated parts of your code all end up using the same low-level library for a particular part of their logic, then that library likely would not show up prominently in either of the other views, as its time would be divided across many different branches. In Sandwich mode, you can sort by self time to identify the hottest functions that perform the lowest level of work. These tend to be functions relating to reading files, caches, and databases.
XHGui profiling
XHGui provides a call tree for a given web request with aggregations of execution time, call count, and memory usage for all called functions. The capturing and storing for XHGui is powered by php-tideways (formerly known as XHProf). Our instance, configured via wmf-config, makes your profiles publicly accessible at https://performance.wikimedia.org/xhgui/.
To explore XHGui and its features, open this example profile.
Capture a profile from a browser:
- Enable the "XHGui" option.
- Click the "Find in XHGui" link in the WikimediaDebug popup. This takes you directly to a list of profiles matching your Request ID (usually only 1 match).
- Click the sharable permalink (e.g. the timestamp link, or "GET" method link) to view the recorded profile.
Capture from the command line using curl:
- Set the profile attribute.
- Find your profile at https://performance.wikimedia.org/xhgui/.
Plaintext request profile
WikimediaDebug can deliver a plaintext-format profile directly from the web server. Enable the "Inline profile" option (forceprofile attribute) on any MediaWiki web request (including index.php, api.php, and load.php), and MediaWiki will append a plain text profile to the web response.
Example in the browser: Open any page on Wikipedia and view source (example URL). Then, enable WikimediaDebug with the "Inline profile" option, and reload the browser tab. At the end of the response should be a call-graph summary, like this:
/*
100.00% 1437.125 1 - main()
87.21% 1253.268 1 - ResourceLoader::respond
79.31% 1139.756 1509 - ResourceLoaderModule::getVersionHash
77.88% 1119.292 3 - ResourceLoader::getCombinedVersion
..
Example using cURL:
# Production
$ curl -H 'X-Wikimedia-Debug: 1; forceprofile' 'https://test.wikipedia.org/w/load.php?modules=startup&only=scripts&raw=1'
# Beta Cluster
$ curl -H 'X-Wikimedia-Debug: 1; forceprofile' 'https://en.wikipedia.beta.wmflabs.org/w/load.php?modules=startup&only=scripts&raw=1'
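Because the profile is appended to the regular response body, it can be handy to strip everything else when working from a terminal. The sketch below keeps only lines that look like profile rows; profile_rows is a hypothetical helper name, and the row layout (a percentage, a time, a call count, a "-" separator, then the function name) is an assumption based on the sample output above.

```shell
# Hypothetical helper: read a response on stdin and print only the lines
# that look like profile rows, e.g. "87.21% 1253.268 1 - ResourceLoader::respond".
profile_rows() {
    # Field 1 must be a percentage and field 4 the "-" separator.
    awk '$1 ~ /^[0-9]+\.[0-9]+%$/ && $4 == "-"'
}
```

Usage:
$ curl -s -H 'X-Wikimedia-Debug: 1; forceprofile' 'https://test.wikipedia.org/w/load.php?modules=startup&only=scripts&raw=1' | profile_rows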
Plaintext CLI profile
You can generate a plaintext-format profile from any MediaWiki maintenance script by setting the --profiler=text option. Note that in Wikimedia production, maintenance scripts must be run via the mwscript command. For performance reasons, the required debug tools are only installed on mwdebug servers, so while we normally run maintenance scripts from an mwmaint host, when debugging make sure to connect your terminal to a debug host first. (In the Beta Cluster, you can use any MW server.)
krinkle@mwdebug$ mwscript showSiteStats.php --wiki=nlwiki --profiler=text
Number of articles: 2122688
Number of users : 1276507
<!--
100.00% 114.964 1 - main()
…
22.42% 25.776 1 - ShowSiteStats::execute
16.61% 19.096 2 - Wikimedia\Rdbms\LoadBalancer::getServerConnection
4.80% 5.522 1 - Maintenance::shutdown
4.41% 5.065 1 - Wikimedia\Rdbms\Database::initConnection
3.07% 3.530 1 - DeferredUpdates::doUpdates
2.66% 3.061 1 - Wikimedia\Rdbms\Database::select
2.38% 2.739 1 - Wikimedia\Rdbms\Database::query
1.95% 2.240 1 - section.SELECT * FROM `site_stats` LIMIT N
1.48% 1.700 1 - Wikimedia\Rdbms\DatabaseMysqli::mysqlConnect
…
-->
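Text profiles from larger scripts can run to hundreds of rows. As a rough way to focus on the expensive entries, the sketch below prints only rows at or above a given percentage. profile_at_least is a hypothetical helper name; it assumes the row layout shown in the sample above and that the profile is written to standard output.

```shell
# Hypothetical helper: print profile rows whose percentage is at least MIN.
# Usage: some-command | profile_at_least MIN
profile_at_least() {
    # "$1 + 0" converts a field such as "22.42%" to the number 22.42.
    awk -v min="$1" '$1 ~ /^[0-9]+\.[0-9]+%$/ && $4 == "-" && ($1 + 0) >= (min + 0)'
}
```

Usage:
krinkle@mwdebug$ mwscript showSiteStats.php --wiki=nlwiki --profiler=text | profile_at_least 4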
Verbose logging
Setting the log attribute in the X-Wikimedia-Debug header (“Verbose log” checkbox in the extension) will cause MediaWiki to be maximally verbose, recording all log messages on all channels (regardless of whether or not they are otherwise enabled in wmf-config).
These messages will end up in Logstash at https://logstash.wikimedia.org/app/dashboards#/view/mwdebug and in /srv/mw-log/XWikimediaDebug.log on the mwlog host. See Logs#mw-log for more information.
To view the logs of a specific web request only, the browser extension adds a "Find in Logstash" link, which opens Logstash with a filter for the request ID of the current page view. You can also construct this URL manually, by navigating to https://logstash.wikimedia.org/app/dashboards#/view/x-debug and entering a search query like reqId:"…"
Debug logging (CLI)
If you are investigating problems with a specific appserver (i.e. not an mwdebug host) or otherwise can't rely on Logstash, you can use the MW_DEBUG_LOCAL environment variable to send verbose logs to the /tmp/wiki.log file on the same server. It is recommended to depool the host first.
- If you can exercise the problem from eval.php, then consider using its -d option to print verbose logs directly to your terminal's stdout. See also Manual:Eval.php on mediawiki.org.
- If you can exercise your debug scenario via a specific maintenance script, then MW_DEBUG_LOCAL can be passed to mwscript as follows. For example, the equivalent of mwscript version.php testwiki is:
sudo -u www-data MW_DEBUG_LOCAL=1 php /srv/mediawiki/multiversion/MWScript.php version.php --wiki testwiki
- If you need to exercise your code from a web request, you can follow Debugging in production#Pushing code to a debug server to temporarily modify /srv/mediawiki/wmf-config/logging.php and replace if ( getenv( 'MW_DEBUG_LOCAL' ) ) with if ( true ).
Remember to clean up afterwards by running scap pull to reset the server, which undoes local changes, and repool the server if you depooled it.
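For the maintenance-script case above, the long command is easy to mistype. The sketch below only prints the command for a given script and wiki so that you can review it before running it yourself. mw_debug_local_cmd is a hypothetical helper name; the command layout is copied from the version.php example above, and arguments are printed as-is without shell quoting.

```shell
# Hypothetical helper: print (do not run) the MW_DEBUG_LOCAL form of
# "mwscript SCRIPT WIKI [ARGS...]", following the version.php example.
# Usage: mw_debug_local_cmd SCRIPT WIKI [ARGS...]
mw_debug_local_cmd() (
    script=$1
    wiki=$2
    shift 2
    printf 'sudo -u www-data MW_DEBUG_LOCAL=1 php /srv/mediawiki/multiversion/MWScript.php %s --wiki %s' "$script" "$wiki"
    # Append any remaining script arguments unchanged.
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
)
```

Running mw_debug_local_cmd version.php testwiki prints the same command as in the example above. Verbose logs from that run would then be written to /tmp/wiki.log on the same server.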
Staging changes
This section will walk you through a common WikimediaDebug debugging workflow.
It worked on your machine. It worked on the Beta Cluster. And the unit tests pass. But will it work in production? You can reduce the risk of unexpected breakage by deploying your change to a debugging server in production and using WikimediaDebug to verify its behavior.
Follow the instructions on How to deploy code, but stop after you have completed step 2 ("Get the code on the deployment host"). Now sync your change to one of the debug backends in a new tab by SSHing into it and running scap pull:
you@laptop:~$ ssh mwdebug1001.eqiad.wmnet
you@mwdebug1001:~$ scap pull
Now, enable the WikimediaDebug extension in your browser, and select the same backend you just pulled your code to. Your browser requests will be routed to this backend, allowing you to verify that your changes are working correctly prior to deployment.
Limitations
Jobs enqueued on WikimediaDebug servers will be run on the regular jobrunners, not on their origin server (phabricator:T255706#9422250).
As of April 2024, WikimediaDebug is only available in desktop browsers. The Firefox extension does not work with mobile Firefox, and mobile Chrome does not support extensions at all.
Code steward
- Maintained by Release Engineering Team
- Live chat (IRC): #wikimedia-releng
- Issue tracker: Phabricator (Report an issue)
See also
- Debugging in production: Info on how to push code to a debug backend and for testing non-HTTP code (e.g. maintenance scripts)
- Release Engineering/Runbook/WikimediaDebug: Internal guide for publishing releases.
- Arc Lamp, daily flame graphs from live traffic on MediaWiki production servers.
Further reading
- Flame graphs arrive in WikimediaDebug! (techblog), Timo Tijhof, 2023.
- WikimediaDebug v2 is here! (techblog), Timo Tijhof, 2019.
- Example: Use WikimediaDebug in Pywikibot via the extra_headers option