WikimediaDebug

From Wikitech
The WikimediaDebug popup in active state.

WikimediaDebug is a set of tools for debugging and profiling MediaWiki web requests in a production environment.

You can use WikimediaDebug through the accompanying browser extension, or from the command-line using the X-Wikimedia-Debug header.

So long as WikimediaDebug is on, all your requests will be handled as a Varnish cache miss.

Browser usage

Install the extension for Firefox or Chrome:

WikimediaDebug supports dark mode!

The extension creates a button with the Wikimedia community icon, usually near the upper right corner of the browser window.

To start debugging:

  1. Go to a Wikimedia site, e.g. https://en.wikipedia.org,
  2. Click the icon,
  3. Toggle the "On/Off" switch.

The extension automatically deactivates itself after 15 minutes. This makes sure you don't keep generating debug data indefinitely, or unintentionally flood someone else's debugging session with pageviews.

Without a browser extension

In situations where installing an extension is not feasible (e.g. issues specific to mobile Chrome), you can visit Special:WikimediaDebug to enable debugging via a cookie. This is currently more limited (the UI doesn't expose any of the options) but gets handled the same way as the header.

Command-line usage

Force Varnish to skip the cache and pass the request to the MediaWiki_On_Kubernetes mwdebug release:

$ curl -H 'X-Wikimedia-Debug: backend=k8s-mwdebug' https://meta.wikimedia.org/wiki/Main_Page

Same as above, but profile a request using Tideways and publish the profile to XHGui:

$ curl -H 'X-Wikimedia-Debug: backend=k8s-mwdebug; profile' https://meta.wikimedia.org/wiki/Main_Page

Request a page with MediaWiki configured for read-only mode:

$ curl -H 'X-Wikimedia-Debug: backend=k8s-mwdebug; readonly' https://meta.wikimedia.org/wiki/Main_Page

Options

The following attributes can influence the backend request:

  • backend=…: Choose the mwdebug server that Varnish will route your request to. It can be set to 1 in Beta Cluster, where the only backend acts as both appserver and mwdebug server.
  • forceprofile (Inline profile): Capture a trace profile and append it to the web response.
  • log (Verbose logs): Verbosely enable all MediaWiki debug log groups, submitted to Logstash for querying.
  • profile (XHGui): Record a trace for performance analysis and publish the result to XHGui.
  • readonly: Read-only mode requests a page from MediaWiki with read-only mode enabled, to simulate how your code behaves when the database is read-only due to replica lag or scheduled maintenance.
  • shorttimeout: Simulate how your code handles execution timeouts (e.g. from mediawiki/libs/RequestTimeout, T293568). When enabled, $wgRequestTimeLimit is set to 2s. This option is available via command-line use only.
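
The attributes above are combined into a single header value. The following is a minimal sketch of a helper that builds that value. It assumes, as in the examples under "Command-line usage", that attributes are separated by "; ", and additionally that several attributes can be listed together. The function name wmdebug_header is hypothetical and not part of any existing tool.

```shell
#!/bin/sh
# Build an X-Wikimedia-Debug header value from a backend name followed by
# zero or more attributes. wmdebug_header is a hypothetical helper name.
wmdebug_header() {
    value="backend=$1"
    shift
    # Append each remaining argument using the "; " separator.
    for attr in "$@"; do
        value="$value; $attr"
    done
    printf '%s\n' "$value"
}

# Example use (needs network access, so it is left commented out):
# curl -H "X-Wikimedia-Debug: $(wmdebug_header k8s-mwdebug profile)" \
#     https://meta.wikimedia.org/wiki/Main_Page
```

Building the value in one place keeps the quoting simple when the same options are reused across several requests.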

Available backends

You can simulate how your code behaves in a secondary data center (Multi-DC), by selecting a backend in a DC other than the current primary. You can use either of the datacenters through this list at any time, even if that datacenter is currently inactive or depooled. If you pick a backend in a datacenter that is not currently the primary DC, some actions may be read-only, disabled or slower.

The default is now to serve debug requests from a specific MediaWiki installation running on our Kubernetes cluster. As all production traffic is now served by MediaWiki_On_Kubernetes, this should be the primary backend used to test changes.

The following application servers are dedicated to WikimediaDebug use:

  • k8s-mwdebug (mw-on-k8s, default).
  • 2x mwdebug in Eqiad
  • 2x mwdebug in Codfw
  • k8s-mwdebug-eqiad (mw-on-k8s, specifically target the eqiad deployment).
  • k8s-mwdebug-codfw (mw-on-k8s, specifically target the codfw deployment).
  • k8s-mwdebug-next (primary DC, plus -eqiad and -codfw equivalents), targeting the k8s "next" deployments dedicated to early testing for major changes (e.g., new versions of PHP).
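
To compare how a page behaves in each data center, the same request can be sent once per backend. The sketch below maps a data center name to the matching k8s-mwdebug backend from the list above. The helper name backend_for_dc is hypothetical, and the commented-out loop is only an illustration of how it might be used.

```shell
#!/bin/sh
# Map a data center name to the matching k8s-mwdebug backend.
# backend_for_dc is a hypothetical helper name.
backend_for_dc() {
    case ${1-} in
        eqiad|codfw) printf 'k8s-mwdebug-%s\n' "$1" ;;
        '')          printf 'k8s-mwdebug\n' ;;  # no argument: default backend
        *)           printf 'unknown data center: %s\n' "$1" >&2
                     return 1 ;;
    esac
}

# Compare status code and total time per data center (needs network access):
# for dc in eqiad codfw; do
#     curl -s -o /dev/null -w "$dc %{http_code} %{time_total}s\n" \
#         -H "X-Wikimedia-Debug: backend=$(backend_for_dc "$dc")" \
#         https://meta.wikimedia.org/wiki/Main_Page
# done
```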

Request profiling

Excimer UI provides detailed and interactive flame graphs on a per-request basis. To use it, enable the "Excimer UI" option (excimer attribute). This will capture PHP trace logs from the current request (using Excimer), and send them to Excimer UI at performance.wikimedia.org, where the visualisation is powered by Speedscope. Excimer UI launched in May 2023 (T291015).

To try out Speedscope with MediaWiki data, open this example profile.

Speedscope with a MediaWiki request profile loaded.

Speedscope views

The default view is Time Order, which renders calls in chronological order on a timeline from left to right.

The Left Heavy view re-arranges each row such that calls to the same child on the next row are grouped together. For example, if a function A calls functions B and C alternately, then in Left Heavy view all the calls from A->B are grouped together. This lets you see at a glance where the majority of time is spent at each level.

The Sandwich view lists each individual function in a sortable table. This allows you to find patterns globally, regardless of which parent function made the call or where the function sits in the tree. This is analogous to a "reverse flame graph". For example, if many different unrelated parts of your code all end up using the same low-level library for a particular part of their logic, then that library likely would not show up prominently in either of the other views, as its time would be divided across many different branches. In Sandwich mode, you can sort by self time to identify the hottest functions that perform the lowest level of work. These tend to be functions relating to reading files, caches, and databases.

Sandwich mode

XHGui profiling

XHGui profile with detailed memory and stack trace profiling, captured from the PHP runtime during a MediaWiki request.

XHGui provides a call tree for a given web request with aggregations of execution time, call count, and memory usage for all called functions. The capturing and storing for XHGui is powered by php-tideways (formerly known as XHProf). Our instance, which is configured via wmf-config, makes your profiles publicly accessible at https://performance.wikimedia.org/xhgui/.

To explore XHGui and its features, open this example profile.

Capture a profile from a browser:

  • Enable the "XHGui" option.
  • Click the "Find in XHGui" link in the WikimediaDebug popup. This takes you directly to a list of profiles matching your Request ID (usually only 1 match).
  • Click the sharable permalink (e.g. the timestamp link, or "GET" method link) to view the recorded profile.

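Capture a profile from the command line using cURL: send the profile attribute in the X-Wikimedia-Debug header, as shown under "Command-line usage" above.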

Plaintext request profile

WikimediaDebug can deliver a plaintext-format profile directly from the web server. Enable the "Inline profile" option (forceprofile attribute) on any MediaWiki web request (including index.php, api.php, and load.php), and MediaWiki will append a plain text profile to the web response.

Example in the browser: Open any page on Wikipedia and view source (example URL). Then, enable WikimediaDebug with the "Inline profile" option, and reload the browser tab. At the end of the response should be a call-graph summary, like this:

/*
 100.00% 1437.125      1 - main()
  87.21% 1253.268      1 - ResourceLoader::respond
  79.31% 1139.756   1509 - ResourceLoaderModule::getVersionHash
  77.88% 1119.292      3 - ResourceLoader::getCombinedVersion
  ..

Example using cURL:

# Production
$ curl -H 'X-Wikimedia-Debug: 1; forceprofile' 'https://test.wikipedia.org/w/load.php?modules=startup&only=scripts&raw=1'

# Beta Cluster
$ curl -H 'X-Wikimedia-Debug: 1; forceprofile' 'https://en.wikipedia.beta.wmflabs.org/w/load.php?modules=startup&only=scripts&raw=1'
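
Inline profiles can be long, so it can help to filter them down to the most expensive entries. The sketch below reads a plaintext profile on standard input and prints only the rows at or above a given percentage. It assumes the row layout shown in the example above (percentage, time, call count, name). The helper name profile_hotspots is hypothetical.

```shell
#!/bin/sh
# Print profile rows whose share of total time is at least MIN percent
# (default 50). Reads a plaintext profile on stdin.
# profile_hotspots is a hypothetical helper name.
profile_hotspots() {
    min=${1:-50}
    awk -v min="$min" '
        # Profile rows start with a percentage such as "87.21%".
        $1 ~ /^[0-9.]+%$/ {
            pct = $1
            sub(/%$/, "", pct)
            if (pct + 0 >= min + 0) print
        }
    '
}

# Example use (needs network access, so it is left commented out):
# curl -s -H 'X-Wikimedia-Debug: 1; forceprofile' \
#     'https://test.wikipedia.org/w/load.php?modules=startup&only=scripts&raw=1' \
#     | profile_hotspots 75
```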

Plaintext CLI profile

You can generate a plaintext-format profile from any MediaWiki maintenance script by setting the --profiler=text option. Note that in Wikimedia production, maintenance scripts must be run via the mwscript command. For performance reasons, the required debug tools are only installed on mwdebug servers. So while we normally run maintenance scripts from an mwmaint host, when debugging make sure to connect your terminal to a debug host first. (In the Beta Cluster, you can use any MW server.)

krinkle@mwdebug$ mwscript showSiteStats.php --wiki=nlwiki --profiler=text
Number of articles:  2122688
Number of users   :  1276507

<!--
100.00% 114.964    1 - main()
 …
 22.42% 25.776     1 - ShowSiteStats::execute
 16.61% 19.096     2 - Wikimedia\Rdbms\LoadBalancer::getServerConnection
  4.80% 5.522      1 - Maintenance::shutdown
  4.41% 5.065      1 - Wikimedia\Rdbms\Database::initConnection
  3.07% 3.530      1 - DeferredUpdates::doUpdates
  2.66% 3.061      1 - Wikimedia\Rdbms\Database::select
  2.38% 2.739      1 - Wikimedia\Rdbms\Database::query
  1.95% 2.240      1 - section.SELECT * FROM `site_stats` LIMIT N 
  1.48% 1.700      1 - Wikimedia\Rdbms\DatabaseMysqli::mysqlConnect
  …
-->

Verbose logging

This feature requires a WMF or Volunteer NDA Wikimedia developer account.

Setting the log attribute in the X-Wikimedia-Debug header (“Verbose log” checkbox in the extension) will cause MediaWiki to be maximally verbose, recording all log messages on all channels (regardless of whether or not they are otherwise enabled in wmf-config).

These messages will end up in Logstash at https://logstash.wikimedia.org/app/dashboards#/view/mwdebug and in /srv/mw-log/XWikimediaDebug.log on the mwlog host. See Logs#mw-log for more information.

To view the logs of a specific web request only, the browser extension adds a "Find in Logstash" link, which opens Logstash with a filter for the request ID of the current page view. You can also construct this URL manually, by navigating to https://logstash.wikimedia.org/app/dashboards#/view/x-debug and entering a search query like reqId:"…"
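
When working from the command line, the search query can be built from the response headers. The sketch below is illustrative only. It assumes that the response carries the request ID in an x-request-id header, which is not stated on this page and should be checked against a real response. The helper name reqid_query is hypothetical.

```shell
#!/bin/sh
# Read HTTP response headers on stdin and print a Logstash search query
# for the request ID. ASSUMPTION: the ID is in an "x-request-id" header.
# reqid_query is a hypothetical helper name.
reqid_query() {
    awk '
        tolower($1) == "x-request-id:" {
            id = $2
            sub(/\r$/, "", id)  # strip the CR that ends HTTP header lines
            printf "reqId:\"%s\"\n", id
            exit
        }
    '
}

# Example use (needs network access, so it is left commented out):
# curl -s -o /dev/null -D - \
#     -H 'X-Wikimedia-Debug: backend=k8s-mwdebug; log' \
#     https://meta.wikimedia.org/wiki/Main_Page | reqid_query
```

If no matching header is present, the function prints nothing, which makes a wrong assumption easy to spot.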

Debug logging (CLI)

If you are investigating problems with a specific appserver (i.e. not an mwdebug host) or otherwise can't rely on Logstash, you can use the MW_DEBUG_LOCAL environment variable to send verbose logs to the /tmp/wiki.log file on the same server. It is recommended to depool the host first.

  • If you can exercise the problem from eval.php, then consider using its -d option to print verbose logs directly to your terminal's stdout. See also Manual:Eval.php on mediawiki.org.
  • If you can exercise your debug scenario with a specific maintenance script, then pass MW_DEBUG_LOCAL to the underlying MWScript.php command. For example, the equivalent of mwscript version.php testwiki is:
sudo -u www-data MW_DEBUG_LOCAL=1 php /srv/mediawiki/multiversion/MWScript.php version.php --wiki testwiki

Remember to clean up afterwards by running scap pull to reset the server which undoes local changes, and repool the server if you depooled it.

Staging changes

This section will walk you through a common WikimediaDebug debugging workflow.

It worked on your machine. It worked on the Beta Cluster. And the unit tests pass. But will it work in production? You can reduce the risk of unexpected breakage by deploying your change to a debugging server in production and using WikimediaDebug to verify its behavior.

Follow the instructions on How to deploy code, but stop after you have completed step 2 ("Get the code on the deployment host"). Now sync your change to one of the debug backends, in a new tab, by SSHing into it and running scap pull:

you@laptop:~$ ssh mwdebug1001.eqiad.wmnet
you@mwdebug1001:~$ scap pull

Now, enable the WikimediaDebug extension in your browser, and select the same backend you just pulled your code to. Your browser requests will be routed to this backend, allowing you to verify that your changes are working correctly prior to deployment.

Limitations

Jobs enqueued on WikimediaDebug servers will be run on the regular jobrunners, not on their origin server (phabricator:T255706#9422250).

As of April 2024, WikimediaDebug is only available in desktop browsers. The Firefox extension does not work with mobile Firefox, and mobile Chrome does not support extensions at all. You can use Special:WikimediaDebug instead on mobile.
