MediaWiki Engineering/Guides/Frontend performance practices

This is a guide to understanding and improving frontend web performance. We describe which aspects of a web page affect page loads, and how changes can influence those aspects.

Getting started

Metrics

We primarily focus on the following three metrics:

  • Visual rendering time: The page should render to completion as quickly as possible. Measured on page views using the Paint Timing API (first-contentful-paint), and synthetically using Browsertime (Last Visual Change, SpeedIndex, Cumulative Layout Shift).
  • Total page load time: The load indicator in web browsers. This waits for the download and rendering of the HTML, and the download and initial execution of all sub-resources (JavaScript, CSS, images). Observed through the Navigation Timing API (loadEventEnd).
  • Responsiveness: The page should be available to respond to human interaction at all times. There should not be background or periodic code execution that causes lag or freezes the main thread for prolonged periods of time. Observed through the number and duration of entries reported by the Long Tasks API.
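
For illustration, the sketch below shows how the underlying browser APIs can be read from page JavaScript. This is only an illustration of the APIs, not the instrumentation Wikimedia actually uses:

  // Paint Timing API: first-contentful-paint.
  new PerformanceObserver( function ( list ) {
      list.getEntries().forEach( function ( entry ) {
          console.log( entry.name, entry.startTime );
      } );
  } ).observe( { type: 'paint', buffered: true } );

  // Navigation Timing API: loadEventEnd. Read it shortly after the load
  // event, because the entry is only finalised once load handlers return.
  window.addEventListener( 'load', function () {
      setTimeout( function () {
          var nav = performance.getEntriesByType( 'navigation' )[ 0 ];
          console.log( 'loadEventEnd', nav.loadEventEnd );
      } );
  } );

  // Long Tasks API: main-thread blockages longer than 50ms.
  new PerformanceObserver( function ( list ) {
      list.getEntries().forEach( function ( entry ) {
          console.log( 'longtask', entry.duration );
      } );
  } ).observe( { type: 'longtask' } );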

Principles

Performance principles, in order of their relative importance:

  1. Users. First comes the end-user's overall experience and perceived performance. User experience is measured using our key metrics (visual rendering, page load time, responsiveness); this includes backend response latency.
  2. Developers. Strive for the best developer productivity as long as it doesn't compromise the user experience (or, for developer-facing tools, don't make it harder or less likely for that experience to be realised). Developer experience encompasses ease of learning, cost of maintaining, and debugging ability. We make it easy to do the right thing, and make other things possible.
  3. Servers. We generally don't prioritise reductions in backend server costs unless they translate directly to improved end-user experience (such as faster backend response, less client-side processing, or smaller download size). Reducing server costs such as memory usage, disk activity, and CPU utilisation is great, but if you find that your code could produce a faster or smaller response by utilising more resources in a shorter period of time, that could be even better!

These are inspired by the W3C Design Principles.[1]

When ResourceLoader launched in 2011, it needed only 9 Varnish CDN servers and 4 MediaWiki backend servers to serve 90,000 requests per second at peak with a 99.82% edge cache-hit ratio.[2] These numbers are a celebration of a fast end-user experience, not low server costs. The JavaScript code was developed such that it was safe to liberally cache and re-use CDN responses, thus incurring very few simultaneous backend requests.

Shipping frontend assets

ResourceLoader is the delivery system for CSS and JavaScript files in MediaWiki. Among other things, it takes care of localisation, bundling, minification, and caching.

General approach

Load little to no JavaScript code upfront, especially on page views. Use page and account metadata to your advantage on the server side, through conditional loading of modules: determine whether the specific URL and account actually need a given module before queueing it.

We aim to offer people a smooth experience, where interface components render without delay, or render progressively. Use fixed, predictable, or visually contained layouts by presenting information and interfaces such that their dimensions are constant in HTML or CSS. This avoids moving or pushing down elements after they become visible on the page (known as layout shifts, or reflows).

We enforce that modules cannot affect the initial rendering of the page, particularly "above the fold" (the top portion of the page that's initially visible on the screen). This is handled by ResourceLoader's startup architecture, which processes the module queue asynchronously.

Familiarise yourself with Compatibility policy and the Architecture Principles. Our architecture is modelled after the web itself. Every page starts in "Basic" mode, where only the HTML is rendered. Assistive technology needs to understand the information and structure based on HTML semantics alone. CSS can be assumed to succeed for visual readers and should be used for presentation and to convey visual meaning only. We aim to be fairly aggressive in raising JavaScript requirements for modern browsers, which reduces costs of development and maintenance, and also reduces payload size ("page weight"). This aim is only achievable when components start out with a solid and functional Basic experience, with server-rendered access to information, and traditional request-response cycles for contributing to the wiki.

The JavaScript requirements for the "Modern" layer are implemented via a feature test in the startup module, inspired by the "cutting the mustard" approach.
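
For illustration, such a test looks roughly like the sketch below. This is a simplified example, not the exact check performed by MediaWiki's startup module (the real check evolves along with the Compatibility policy):

  // Simplified sketch of a "cutting the mustard" feature test.
  function isCompatible() {
      return !!(
          // Baseline DOM and ES5 support.
          'querySelector' in document &&
          'addEventListener' in window &&
          Array.prototype.forEach &&
          // ES6 syntax support, detected by compiling a small sample.
          ( function () {
              try {
                  // eslint-disable-next-line no-new-func
                  new Function( 'return (a = 0) => a;' );
                  return true;
              } catch ( e ) {
                  return false;
              }
          }() )
      );
  }

  if ( isCompatible() ) {
      // Proceed to load the "Modern" JavaScript layer.
      // Otherwise, the page remains in the Basic (HTML and CSS) experience.
  }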

Embrace that every page starts with basic HTML and CSS, and that JavaScript adds optional layers that may or may not arrive. Whether and when these layers arrive depends on numerous factors, which may vary over time even for the same person, including:

  • Due to contexts outside our canonical website:
    • offline re-use of our content, such as Kiwix, Archive.org, archive.today, and IPFS. For example, this archived tweet struggles to reconstruct the original page due to non-trivial JavaScript URLs, downloading over 100 MB of JavaScript in a crash loop.
    • external re-use, such as Apple Dictionary/Lookup, and Apple Siri, third-party mobile apps for Wikipedia, and alternative Wikipedia reader sites (e.g. Wikiwand).
    • search engines, especially non-Google ones. Crawling the entire Internet through full JS-enabled virtual browsers is extremely expensive and even Google doesn't do so consistently.
  • Due to personal circumstance or upstream vendor decisions:
    • device age and capability.
    • browser choice. The anonymous Tor Browser disables JavaScript by default when in safe mode.[3] The "Reader" mode in Firefox and Safari extracts pure HTML content, either from the DOM after effectively suspending JavaScript, or from the original server-sent HTML.
    • intervention by browser vendors. E.g. Chrome on Android sometimes auto-disables JS if the connection appears slow.[4]
    • browser preferences. People may disable JavaScript and grant permission only when it appears needed and trusted on a per-site basis.
  • Due to failure or chance:
    • server and network stability. No server or Internet connection is, or promises to be, 100% reliable end-to-end. One or more JavaScript requests can fail. HTML and CSS are the earliest and highest-priority resources; requiring additional requests to succeed for a non-broken interface multiplies these small failure probabilities into a noticeably higher overall failure rate.[5]
    • network speed. The JavaScript request can be so slow or large that it won't finish downloading (and executing) before you are done reading the page. Or, it may hit a timeout in the browser, on the cellular network of the phone carrier, roaming, at our CDN, or on our backend servers.
    • interference by browser extensions. At any given moment there are extensions that knowingly or unknowingly break code on the page in some form or another.

If you render or visualise information client-side only, it is de facto inaccessible to these environments. Also consider how information and graphs are crawled by search engines. If it doesn't have a URL or isn't discoverable by an HTML anchor link, it's probably not crawlable, searchable, findable, or sharable. The 2018 MW-Graph extension not only left a hole in our articles in all the above environments, it also meant the graphs were no longer part of Google Images, and could no longer be shared as image URLs through instant messengers. The question is, do we develop to be functional by default, or do we fail anytime anyone encounters a deviation from the presumed norm? We are all the 1%, at different times.[6]

HTTP caching

Improving the cacheability of responses to web requests used in the critical path is expected to have the following impact:

  • First views: None.
  • Repeat views:
    • For any resource:
      • Consume less bandwidth (reduces mobile data costs).
      • Consume less power (fewer cell-radio activations required).
    • For CSS files:
      • Reduce time for visual rendering. Cached stylesheets load faster without a network roundtrip, allowing rendering to start and/or complete sooner.
      • Reduce time to domComplete and loadEventEnd metrics. Stylesheets are sub-resources required for DOM completion.
    • For JavaScript files:
      • Reduce "Time to Interactive". Cached scripts take less time to load, parse, and compile. Browsers may store the compiled bytecode from previous page views. As of writing (Dec 2019), the bottleneck in loading JavaScript code is often not download or execution of JS, but the parsing/compilation of JS. Allowing this to be cached, or reducing in amount, can benefit page load time more than optimising how fast it executes.

Size of HTML payload

The first 14 KB (per TCP slow start) should ideally contain everything needed to render the skin layout and a little bit of the article text. It should also allow the browser to render that initial layout in a way that isn't later moved around or otherwise invalidated (additional components may appear later, but existing components should not move).

Examples of how to improve this:

  • Reduce the amount of per-page header bloat, e.g. <link> tags, RLQ, mw.config, mw.loader.
  • Better minification for inline CSS, inline JavaScript and HTML within the <head>.
  • Arranging the HTML to ensure layout and start of content render first.

This threshold was reached and validated for the Vector skin in 2019 (T231168).

Size of stylesheets

Reducing the size of the main (blocking) stylesheet, as referenced from the HTML head, is expected to have the following impact:

  • All views (first view, and repeat views):
    • Improvement of all paint metrics. Smaller stylesheets load and parse faster, allowing rendering to start and complete sooner.
    • Reduce time to domComplete and loadEventEnd metrics. Stylesheets are sub-resources required for DOM completion.
    • Consume less bandwidth (reduces mobile data costs).
    • Consume less power (reduce CPU time for stylesheet parsing, especially data URIs. See T121730 for ongoing work).

Size of scripts

(Figure: The page loading process for MediaWiki. Note where the "Startup module" sits.)

Size of startup manifest

Reduce the amount of code contained in the Startup module by keeping the number of distinct module bundles low. In general, additional scripts should be added to an existing module bundle instead of creating new ones (see blog post and Grafana).

We generally recommend that any given extension or core feature register no more than three modules. If you find yourself needing more, please reach out to the Wikimedia Performance Team, who may be able to help you find an alternative approach or improve the platform to work better for you. See also Developing with ResourceLoader on mediawiki.org.

To help quantify the cost of modules, and to help with code maintenance more generally, we organise frontend assets in subdirectories by module bundle. See also code conventions, Best practices for extensions#File structure, and T193826: Organise files in directories by module name.

Size of module scripts

Reducing the size of scripts is expected to impact all views (first view, and repeat views):

  • Reduce "Time to Interactive". Less code to download, parse, and execute.
  • Reduce time to domComplete and loadEventEnd metrics. These scripts are sub-resources required for DOM completion.
  • Consume less bandwidth (reduces mobile data costs).

(The above does not apply to scripts that are lazy-loaded from a human interaction after document-ready.)
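
As an illustration of that exception, a feature can fetch its module bundle only when someone actually interacts with it. A minimal sketch using mw.loader, where the module name and its export are hypothetical:

  // Lazy-load a module bundle from a click handler, after document-ready.
  // 'myextension.dialog' is a hypothetical module name.
  $( '#my-button' ).on( 'click', function () {
      mw.loader.using( 'myextension.dialog' ).then( function ( require ) {
          var dialog = require( 'myextension.dialog' );
          dialog.open();
      } );
  } );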

Scripts run in one of three overall phases. Each phase depends on the outcome of previous phases (they run serially). Earlier phases have a bigger impact when reduced in cost, compared to later phases, because they allow subsequent phases to start their work sooner.

  1. Inline scripts in HTML <head> (including page configuration, page module queue, and the async request to the Startup module).
  2. The Startup manifest (including module manifest, dependency tree, and the mw.loader client).
  3. The page modules (the source code of modules loaded on the current page, and their dependencies).

Latency

Reduce the time it takes for the browser to receive the response to a request it makes.

Impact on first views and repeat views:

  • Reduce time overall (for the feature to load, or for the action to be performed).
  • Consume less power (making more efficient use of the network means less idle time and thus fewer cell-radio activations required).

Different strategies to achieve this:

The fastest request is no request

  • … render server-side if possible, to reduce the need for additional JavaScript payloads.
  • … the request for an interface icon can be skipped entirely by using @embed on the CSS background image. (How and when).

Speed up the response

  • … by improving backend response times on the server,
  • … by allowing the response to be cached by the CDN in a nearby datacenter,
  • … by reducing the size of the response.

Start the request earlier

Make the request start earlier. For example:

  • … by making requests in parallel instead of serially one-after-the-other.
    • If the same code is in charge of multiple requests, the $.when() function can be used to make parallel requests.
    • If multiple pieces of code are in charge of their own requests, consider returning a Promise and letting the other code proceed immediately to make its own request. Then, only once you truly need the data from the first Promise (or need to wait for its response), call its then() method. You can also use $.when() to return a Promise that auto-resolves when multiple other Promises have settled. See the sketch after this list.
  • … by hinting the browser directly about your intentions. This has the benefit of not needing any changes to your code! By the time your HTML, CSS, or JS makes a related request, the result of these hints will automatically be used.
  • … by making the cache-miss request invisible with stale-while-revalidate. If you let the CDN and browsers cache something for 24 hours, the first request after those 24 hours delays the client while it waits for the cache-miss response. By setting stale-while-revalidate you allow the browser to use the cached copy one more time, while in the background it fetches a fresh response for next time. For example, you could allow one stale response for up to 7 days. Or, if responses must stay within 24 hours, you could shorten the regular cache period to 12 hours and allocate the remaining 12 hours to stale-while-revalidate, adding up to 24 hours.
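
For example, the difference between serial and parallel requests looks roughly like the sketch below. The getPageData, getUserData, and render functions are hypothetical placeholders for code that returns a Promise (such as an API call) and code that consumes the results:

  // Anti-pattern: serial requests. The second request only starts
  // after the first one has completed.
  getPageData().then( function ( page ) {
      return getUserData().then( function ( user ) {
          render( page, user );
      } );
  } );

  // Better: start both requests immediately, then wait for both.
  var pagePromise = getPageData(); // request starts now
  var userPromise = getUserData(); // starts immediately as well
  $.when( pagePromise, userPromise ).then( function ( page, user ) {
      render( page, user );
  } );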

Avoid image embedding, sometimes

Main article: mw:ResourceLoader/Architecture#Embedding

We generally recommend you avoid @embed in new code. The documented performance benefits of @embed have not changed since its introduction with ResourceLoader in 2010, but as of 2019 we no longer recommend @embed for general use.[7] This is due to numerous costs associated with @embed that stand separate from its benefits, as well as numerous improvements to alternatives that don't have these drawbacks.

When is embedding still worthwhile? We do recommend @embed for SVG icons up to 0.3 KB (before compression). These icon files are so small that embedding them is an unqualified net win, mainly because their contents are close in size to the URL that would otherwise reference them (e.g. /w/path/to/MyExtension/resources/my-module/path/to/foobar.svg?version). After all, a URL to an image is also data that you pay for. The 0.3 KB threshold represents four times the size of a typical icon URL. Embedding an SVG twice the size of its URL is a net win because downloading it by URL would mean paying for both the URL and the underlying data, plus delay from the network and the overhead of request and response headers. SVGs also compress by about 50% under gzip, with an additional 12% compression specific to CSS-embedded contexts.

Drawbacks to consider:

  1. Image embedding delays first paint.
    • It is generally preferable to improve the first contentful paint (by not embedding data in the stylesheet that is not needed for first paint) and thus have the icon show up slightly later, rather than to keep the entire screen blank for longer by forcing the icon to download as part of the render-blocking stylesheet, even though embedding would avoid a minor delay in the icon's appearance.
  2. Embedded images make suboptimal use of browser cache.
    • The list of style modules queued on a given page is variable. Special:RecentChanges and Special:Search queue different stylesheets, and articles also differ from each other with "w:Barcelona" containing a Kartographer map, and "w:Video" containing a TMH video player.
    • When referencing an icon by URL instead of embedding it, you get to load it instantly from the browser cache (even faster than an embed!) on subsequent page views. If it is embedded, it has to be downloaded multiple times, each time as part of a different stylesheet bundle.

Changes in the bigger-picture trade-off:

  • In HTTP/2, the base cost of an individual request has been reduced, thus becoming more competitive with an embed that carries no such overhead.
  • In HTTP/2, browsers no longer serially delay and restrict the start of concurrent requests (HTTP/1 typically permitted only around five or six concurrent requests per host). Thus the mere presence of image requests no longer delays other requests in the background, and the images themselves no longer queue behind other requests. Previously, this meant an icon would start its request well after the browser had discovered it.
  • Our key metrics (§ Metrics) now place more value on visual completion and total page load time. Cellular bandwidth remains an important aspect, but access to and cost of bandwidth is no longer the metric above all others.
  • All major browsers support rel=preload, which in practice is almost as fast as embedding and avoids the drawbacks above.

Avoid preloading, sometimes

Avoid using rel="preload" in link tags or HTTP headers on the HTML response as this can cause congestion and competition against the HTML resource itself, as well as delay critical CSS resources needed for initial rendering. In MediaWiki these resources are already linked from the <head> and naturally discovered and pre-fetched by the browser as soon as possible, by the browser's lookahead parser.

Preloading is useful for resources that a browser cannot discover early, for example when a resource is fetched indirectly through CSS or JavaScript. To start the download of such late-discovered resources earlier than usual, consider preloading closer to the time they are needed, to avoid this competition.

As an example, the Wikipedia logo is a CSS background image on an <a> element in the sidebar. When the browser encounters an image tag in HTML, it immediately downloads it, regardless of whether the HTML tag will be parsed or made visible. When the browser parses and applies a stylesheet, however, it ignores background image URLs because the CSS rule in question does not apply yet. (Rules are only applied if the rule matches the state of an element in the DOM, and when the browser first renders an article it typically has not yet reached the HTML of the sidebar).

The "natural" point where the browser will start the download for the logo file is when that sidebar HTML has been reached and rendered to the screen (first without logo). This means that until the HTML is fully downloaded and visually rendered, the browser will not even start downloading the logo. We improved this by emitting a preload header from the CSS request that tells the browser to start the download right away, because we know it will be needed soon. You can read more about this in our blog post.

How to:

  • Emit Link: rel=preload headers from a CSS or JS server response, preferably the same response that contains the code where this will later be used.
  • Dynamically create an element like <link rel="preload"> in JavaScript, for example from a click handler that has to load several resources. If there's a technical reason that you cannot safely start these resources at the same time through their normal means (e.g. if you have to execute them serially in a specific order), you can start them with a preload. When the actual Ajax request or link/script/img tag is created later, the browser will automatically re-use the on-going (or finished) preload request. See the sketch after this list.
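
For example, a dynamic preload might look roughly like the sketch below; the file path is purely illustrative:

  // Hint the browser to start fetching a resource that it cannot
  // otherwise discover yet.
  function preloadImage( url ) {
      var link = document.createElement( 'link' );
      link.rel = 'preload';
      link.as = 'image';
      link.href = url;
      document.head.appendChild( link );
  }

  // From a click handler that will soon need this image via CSS or JS:
  preloadImage( '/w/extensions/MyExtension/resources/icons/spinner.svg' );

  // The equivalent server-side hint, emitted as an HTTP response header:
  //   Link: </w/extensions/MyExtension/resources/icons/spinner.svg>; rel=preload; as=image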

Processing cost of scripts

Reduce the amount of time spent in executing JavaScript code during the critical path. This includes:

  • Defer work that does not need to happen before rendering.
  • Split work into smaller idle-time chunks, to avoid blocking the main thread for too long (non-interactive "jank").
  • Rearrange code so that there are no style reads after style writes in the same event loop. The browser naturally alternates between a cycle of JS execution and a cycle of style computation and rendering. If styles are changed in the main JS cycle and then read back within that same cycle, the browser is forced to pause the script and perform an ad-hoc render cycle before it can resume the script. A sketch follows this list.
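
For illustration, the sketch below avoids interleaved style reads and writes (often called "layout thrashing"). The items variable is a hypothetical array of objects that each hold a DOM element:

  // Anti-pattern: interleaved reads and writes. Each read after a write
  // forces the browser to recalculate layout inside the loop.
  items.forEach( function ( item ) {
      var width = item.el.offsetWidth;              // read (forces layout)
      item.el.style.height = ( width / 2 ) + 'px';  // write (dirties layout)
  } );

  // Better: batch all the reads first, then do all the writes.
  var widths = items.map( function ( item ) {
      return item.el.offsetWidth;                   // reads only
  } );
  items.forEach( function ( item, i ) {
      item.el.style.height = ( widths[ i ] / 2 ) + 'px'; // writes only
  } );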

Expected impact on all views:

  • Reduce "Time to Interactive". Less JS execution before the page is ready to use. Less uninterrupted execution that can block interactions. Fewer "forced synchronous layouts" which significantly slow down code execution.
  • Reduce time to loadEventEnd metric. The load event is blocked until scripts have finished both their downloads and their initial execution.

Meta

This guideline was originally drafted in February 2016 after T127328.

Notes

  1. W3C Design Principles, 26 November 2007.
  2. For more details and the data source for ResourceLoader backend performance, refer to mw:ResourceLoader/Architecture#Backend performance.
  3. Tor Browser - JavaScript and Flash, torproject.org.
  4. JavaScript isn’t always available and it’s not the user’s fault by Adam Silver (2019).
  5. Everyone has JavaScript, right? by Stuart Langridge (2015).
  6. Why availability matters by Stuart Langridge (2015).
  7. T121730: Audit use of @embed by Timo Tijhof (2019), phabricator.wikimedia.org.