This page documents metrics provided by Wikimedia Performance. We collect metrics through real user monitoring, synthetic tests, and some backend instrumentation. The aim is to get a feeling for the user experience and to find user experience regressions.
- Main article: Performance/Real user monitoring
These are real-user metrics (RUM) collected on a sample of MediaWiki pageviews, using standard APIs that are built-in to every web browser. Through the webperf services, these eventually end up in Graphite under the
We collect two kinds of metrics:
- Milestones during a page load. Each is an offset from the start of the page load, and thus the total time elapsed up to that instant.
- Durations for specific portions of a page load. These measure from the start to end of that particular operation.
responseStart: From navigationStart to here (Navigation Timing).
domInteractive: From navigationStart to here (Navigation Timing).
domComplete: From navigationStart to here (Navigation Timing).
loadEventStart: From navigationStart to here (Navigation Timing).
loadEventEnd: From navigationStart to here (Navigation Timing). Also known as "page load time (PLT)" or "On load", which typically corresponds with the browser's page loading indicator.
firstPaint: From navigationStart to first-paint (Paint Timing).
firstContentfulPaint: From navigationStart to first-contentful-paint (Paint Timing).
dns: Computed as
domainLookupEnd - domainLookupStart, our intermediary layer labels this "dnsLookup".
unload: Computed as
unloadEventEnd - unloadEventStart.
redirect: Computed as
redirectEnd - redirectStart, our intermediary layer labels this "redirecting".
tcp: Computed as
connectEnd - connectStart. (As per the spec, browsers include the TLS handshake for HTTPS).
ssl: Computed as
connectEnd - secureConnectionStart. (As per the spec, browsers report this as subset of
request: Computed as
responseStart - requestStart.
response: Computed as
responseEnd - responseStart.
processing: Computed as
domComplete - responseEnd.
onLoad: Computed as
loadEventEnd - loadEventStart.
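The milestone and duration definitions above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used by the webperf services: it assumes a beacon arrives as a dictionary of Navigation Timing attributes in milliseconds, and the function name `compute_metrics` and the sample values are hypothetical.

```python
from typing import Dict

# Milestones are reported as offsets from navigationStart.
MILESTONES = [
    "responseStart",
    "domInteractive",
    "domComplete",
    "loadEventStart",
    "loadEventEnd",
]

# Durations: metric name -> (end attribute, start attribute).
DURATIONS = {
    "dns": ("domainLookupEnd", "domainLookupStart"),
    "unload": ("unloadEventEnd", "unloadEventStart"),
    "redirect": ("redirectEnd", "redirectStart"),
    "tcp": ("connectEnd", "connectStart"),
    "ssl": ("connectEnd", "secureConnectionStart"),
    "request": ("responseStart", "requestStart"),
    "response": ("responseEnd", "responseStart"),
    "processing": ("domComplete", "responseEnd"),
    "onLoad": ("loadEventEnd", "loadEventStart"),
}


def compute_metrics(timing: Dict[str, int]) -> Dict[str, int]:
    """Derive milestone offsets and durations from raw timing attributes."""
    start = timing["navigationStart"]
    metrics = {name: timing[name] - start for name in MILESTONES if name in timing}
    for name, (end, begin) in DURATIONS.items():
        if end in timing and begin in timing:
            metrics[name] = timing[end] - timing[begin]
    return metrics


# Hypothetical sample values, in milliseconds.
sample = {
    "navigationStart": 1000,
    "unloadEventStart": 0, "unloadEventEnd": 0,
    "redirectStart": 0, "redirectEnd": 0,
    "domainLookupStart": 1010, "domainLookupEnd": 1030,
    "connectStart": 1030, "secureConnectionStart": 1050, "connectEnd": 1100,
    "requestStart": 1100, "responseStart": 1300, "responseEnd": 1350,
    "domInteractive": 1600, "domComplete": 1900,
    "loadEventStart": 1900, "loadEventEnd": 1910,
}
metrics = compute_metrics(sample)
print(metrics["loadEventEnd"], metrics["tcp"], metrics["ssl"])  # 910 70 50
```

In this sample the `ssl` duration (50 ms) is a subset of the `tcp` duration (70 ms), because both end at `connectEnd`.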
See phab:T104902 for how we validate incoming data:
- We don't filter out the value 0 (zero).
- When negative numbers are encountered due to browser bugs, we reject the entire beacon, not just that one data point for that one metric. This ensures fair comparison between two metrics by representing the same pageview samples. We measure how often this happens in Grafana: EventLogging-schema / NavigationTiming via the
nonCompliant counter. The details of this are logged by webperf-navtiming to
- Milestone diagram: https://www.w3.org/TR/resource-timing-2/#attribute-descriptions
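The validation rules above (keep zeros, reject the whole beacon on any negative value) can be illustrated with a minimal sketch. This is not the webperf-navtiming implementation. The function names and the beacon shape are assumptions made for illustration, and only the `nonCompliant` counter name is taken from the text above.

```python
from typing import Dict, List, Optional, Tuple

Beacon = Dict[str, float]


def validate_beacon(metrics: Beacon) -> Optional[Beacon]:
    """Return the beacon unchanged, or None if it must be rejected entirely."""
    for value in metrics.values():
        if value < 0:
            # A single negative value invalidates every metric in the beacon,
            # so that all metrics are computed over the same pageview samples.
            return None
    # Zero is a legitimate value and is kept.
    return metrics


def process(beacons: List[Beacon]) -> Tuple[List[Beacon], int]:
    """Split beacons into accepted ones and a nonCompliant count."""
    accepted = []
    non_compliant = 0
    for beacon in beacons:
        if validate_beacon(beacon) is None:
            non_compliant += 1
        else:
            accepted.append(beacon)
    return accepted, non_compliant


accepted, non_compliant = process([
    {"dns": 0, "tcp": 20},   # kept: zero is not filtered out
    {"dns": -5, "tcp": 20},  # rejected as a whole, including the valid tcp value
])
print(len(accepted), non_compliant)  # 1 1
```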
We measure the overall time it takes to process an edit submission. To save an edit in MediaWiki means to create or change a wiki page.
Backend Save Timing
Backend Save Timing measures time spent in MediaWiki PHP, from the process start (
REQUEST_TIME_FLOAT) until the response is flushed to the web server for sending to the client (
PRESEND). The instrumentation resides in the WikimediaEvents extension (source), and is published to Graphite under
The metric is plotted in Grafana: Backend Save Timing Breakdown, and includes slices by account type (bot vs human), by entry point (index.php wikitext editor, vs api.php for VisualEditor and bots), and by page type or namespace (Wikipedia content, or Wikidata entity, or discussion pages).
Frontend Save Timing
Frontend Save Timing is measured as the time from pressing "Publish changes" in a user interface in a web browser (e.g. submitting the edit page form) until that browser receives the first byte of the server response that will render the confirmation page (e.g. the article with their edit applied and a "Post-edit" message).
This is implemented as
navigationStart (the click to submit the form over HTTP POST) to
responseStart (the first byte after the server has finished processing the edit, redirected, and responded to the subsequent GET).
When investigating Save Timing metrics, it may be useful to correlate with:
- performance.wikimedia.org: Flame Graphs, which show where in the MediaWiki codebase time is spent during particular operations. Use the "api" graph for edits via the API, and the "fn-EditAction" graph for edits via index.php.
- Grafana: Application Servers RED, which measures the overall load and throughput of our web servers by cluster (e.g. appservers for index.php, api_appservers for api.php) and by method (e.g. POST is mostly edits).
- Grafana: Edit Count, which measures the overall rate of edits being saved. This count is derived directly from the Backend Save Timing metric, and thus corresponds fully. Any edit save that we measure for backend save timing is counted here, and vice-versa.
Synthetic tests metrics
The synthetic tests collect metrics from the browser, as real user monitoring does, plus extra metrics obtained by recording a video of the screen while the page loads and then analysing when different parts are painted. With the synthetic tests you get a video, screenshots, and metrics.
Synthetic tests can also collect trace logs from Chrome and Firefox, so you can see how much time is spent in CSS/JS and in individual functions.
You can see which metrics we get from synthetic tests in the synthetic test drill-down dashboard.
FirstVisualChange: The time when something is first painted within the viewport.
Logo: The time when the Wikipedia logo is painted.
Heading: The time when the first h1/h2 heading is painted at its final position within the viewport.
LargestImage: The time when the largest image is painted at its final position within the viewport.
SpeedIndex: The average time at which visible parts of the page are displayed. It is expressed in milliseconds and depends on the size of the viewport.
LastVisualChange: The time when the last paint happens within the viewport.
VisualComplete85: The time when 85% of the content within the viewport is painted.
VisualComplete95: The time when 95% of the content within the viewport is painted.
VisualComplete99: The time when 99% of the content within the viewport is painted.
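To make the relationship between these visual metrics concrete, here is a small sketch that derives SpeedIndex and the VisualComplete thresholds from a visual progress curve. It follows the commonly used definition of SpeedIndex as the area above the visual completeness curve, and is not the code our synthetic testing tools run. The input format (a list of `(time_ms, percent_complete)` samples starting at time 0 with 0%) and the function names are assumptions for illustration.

```python
from typing import List, Optional, Tuple

# Each sample: (time in ms since navigation start, percent visually complete).
VisualProgress = List[Tuple[int, int]]


def speed_index(progress: VisualProgress) -> float:
    """Sum the 'not yet painted' share of the viewport over time, in ms.

    Visual completeness is treated as a step function that holds its
    value until the next sample.
    """
    total = 0
    for (t, pct), (t_next, _) in zip(progress, progress[1:]):
        total += (100 - pct) * (t_next - t)
    return total / 100


def visual_complete(progress: VisualProgress, threshold: int) -> Optional[int]:
    """Return the first time at which completeness reaches the threshold."""
    for t, pct in progress:
        if pct >= threshold:
            return t
    return None


# Hypothetical progress curve for one page load.
progress = [(0, 0), (500, 40), (1000, 85), (1500, 100)]
print(speed_index(progress))          # 875.0
print(visual_complete(progress, 85))  # 1000
print(visual_complete(progress, 99))  # 1500
```

A page that paints most of its viewport early gets a lower SpeedIndex than one that reaches 100% at the same time but paints late, which is why the metric complements LastVisualChange.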
Google Web Vitals
Read more about Google Web Vitals.
- Largest Contentful Paint (LCP) - marks the point in the page load timeline when the page's main content has likely loaded - a fast LCP helps reassure the user that the page is useful.
- First Input Delay (FID) - measures load responsiveness by quantifying the experience users feel when trying to interact with unresponsive pages - a low FID helps ensure that the page is usable.
- Cumulative Layout Shift (CLS) - measures visual stability by quantifying how often users experience unexpected layout shifts - a low CLS helps ensure that the page is delightful.
You can find a full definition of the metrics collected.
You can explore these on our Grafana: Chrome User Experience dashboard.
Google collects metrics within its Chrome browser from all people who have "opted-in" by syncing with their Google account. These are used by Google Search as a real-world signal of how a website performs in practice. The data Google collects from its Chrome users is publicly available through the Chrome User Experience Report.
In order to keep track of how Wikipedia is doing from Google's point of view, we import a copy of this data once a day from the Google API and store it in a Graphite instance.
The daily crawl runs on the
gpsi.webperf.eqiad1.wikimedia.cloud server, where we run a couple of tests and record whether we are rated slow, moderate, or fast. The data is collected using the Sitespeed.io CrUX plugin (a similar setup to Performance/Synthetic testing).
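As a rough illustration of what a slow/moderate/fast rating means, the sketch below classifies a metric value against Google's published Web Vitals thresholds, which Google applies to the 75th percentile of page loads. It is not what the CrUX plugin or the Google API does internally. The function name is hypothetical, and mapping Google's "good / needs improvement / poor" buckets onto "fast / moderate / slow" is an assumption made here.

```python
from typing import Dict, Tuple

# (upper bound for "good", upper bound for "needs improvement").
# LCP and FID are in milliseconds; CLS is a unitless score.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "LCP": (2500, 4000),
    "FID": (100, 300),
    "CLS": (0.1, 0.25),
}


def classify(metric: str, value: float) -> str:
    """Classify a 75th percentile value as fast, moderate, or slow."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "fast"
    if value <= needs_improvement:
        return "moderate"
    return "slow"


print(classify("LCP", 2400))  # fast
print(classify("FID", 200))   # moderate
print(classify("CLS", 0.3))   # slow
```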