Performance/Guides/Frontend performance practices


This is a guide to understand and improve frontend web performance. We describe which aspects of a web page impact page loads, and how changes can influence those aspects.

Introduction

The page loading process for MediaWiki.

Metrics

We primarily focus on the following three metrics:

  • Visual rendering: Most of the page should render to completion as quickly as possible. Observed through the Paint Timing API (first-contentful-paint).
  • Total page load time: The load indicator in web browsers. This waits for the download and rendering of the HTML document, and the download and initial execution of all subresources (JavaScript, CSS, images). Observed through the Navigation Timing API (loadEventEnd).
  • Responsiveness: The page should be able to respond to human interaction at all times. There should not be background or periodic code execution that causes lag or freezing of the main thread for prolonged periods of time. Observed through the Long Tasks API.
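
As an illustration of how the three metrics above can be observed in a browser, here is a minimal sketch built on the standard PerformanceObserver interface. The function names (observeMetrics, summariseEntry) and the shape of the reported records are assumptions made for this example; they are not part of MediaWiki.

```javascript
// Entry types behind the three metrics: Paint Timing, Navigation Timing, Long Tasks.
const METRIC_ENTRY_TYPES = [ 'paint', 'navigation', 'longtask' ];

// Turn a performance entry into a small { metric, ms } record (hypothetical shape).
function summariseEntry( entry ) {
	if ( entry.entryType === 'paint' ) {
		// entry.name is 'first-paint' or 'first-contentful-paint'.
		return { metric: entry.name, ms: entry.startTime };
	}
	if ( entry.entryType === 'navigation' ) {
		// loadEventEnd stays 0 until the load event has finished.
		return { metric: 'loadEventEnd', ms: entry.loadEventEnd };
	}
	// Long task: the main thread was busy for `duration` milliseconds.
	return { metric: 'longtask', ms: entry.duration };
}

// Subscribe to every metric the current environment supports.
// Returns the list of entry types that were actually observed.
function observeMetrics( report, Observer = globalThis.PerformanceObserver ) {
	const supported = ( Observer && Observer.supportedEntryTypes ) || [];
	const observed = [];
	for ( const type of METRIC_ENTRY_TYPES ) {
		if ( !supported.includes( type ) ) {
			continue;
		}
		const observer = new Observer( ( list ) => {
			list.getEntries().forEach( ( entry ) => report( summariseEntry( entry ) ) );
		} );
		// buffered: true also delivers entries recorded before we subscribed.
		observer.observe( { type, buffered: true } );
		observed.push( type );
	}
	return observed;
}
```

In a browser console, calling observeMetrics( console.log ) would print one record per paint, navigation, or long task entry. Browsers that lack one of the entry types simply skip it.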

Principles

Main article: mw:ResourceLoader/Architecture#Principles

Performance principles, in order of their importance:

  1. User. (Perceived performance and overall user experience. This includes backend response latency.)
  2. Developer. (Engineering productivity; ease of learning, maintaining, and debugging.)
  3. Server efficiency. (Such as disk space, memory usage, CPU load, number of servers, etc.)

Shipping frontend assets

Deliver CSS and JavaScript fast (bundled, minified, and avoiding duplication) while retaining the benefits of caching. This is all done for you by ResourceLoader.

ResourceLoader is the delivery system for bundling and loading CSS/JavaScript files in MediaWiki.

General principles

When loading a module, it must not affect the initial rendering of the page, particularly "above the fold" (the top portion of the page that's initially visible on the device's screen). Load little to no JavaScript code upfront. Make the most of page and account metadata on the server-side through conditional loading. Anticipate whether or not a specific account, loading a specific URL, needs the module to do something. See loading modules for more information.
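
One way to "load little to no JavaScript code upfront" is to fetch a module only when the person first interacts with the feature. The sketch below assumes a loader object that behaves like mw.loader (its using() method returns a promise). The helper name loadOnFirstInteraction and the module name in the usage note are hypothetical.

```javascript
// `loader` is expected to behave like mw.loader: using( name ) returns a promise.
function loadOnFirstInteraction( element, loader, moduleName, onReady ) {
	let pending = null;
	element.addEventListener( 'click', ( event ) => {
		// Start the download on the first click only; later clicks reuse it.
		if ( !pending ) {
			pending = loader.using( moduleName );
		}
		pending.then( () => onReady( event ) );
	} );
}
```

In MediaWiki this might be called as loadOnFirstInteraction( button, mw.loader, 'ext.example.dialog', openDialog ), where both the module name and openDialog are placeholders. Nothing is downloaded for people who never click the button.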

People should have a smooth experience; interface components should render progressively. Preserve positioning of elements (e.g. avoid pushing down content in a reflow).

Familiarise yourself with Compatibility policy and the Architecture Principles. Our architecture is modelled after the web itself. Every page load starts in "Basic" mode, where only the HTML is rendered. Assistive technology needs to understand the information and structure based on the semantics alone. CSS can be assumed to succeed for visual design and should be used for presentation and to convey visual meaning only (though keep in mind that stylesheets degrade well in Grade C browsers). We aim to be fairly aggressive in raising Grade A requirements to modern browsers, which reduces maintenance cost and payload overhead. This aim is only achievable when components start out with a solid and functional base experience, with server-rendered access to information, and traditional request-response cycles for contributing to the wiki.

Embrace that every page load starts in Basic, and that the Grade A JavaScript layer is an optional one that may or may not arrive. Whether it arrives depends on numerous factors, and may vary from page to page even for the same person. These factors include:

  • Due to contexts outside our canonical website:
    • offline re-use of our content, such as Kiwix, Archive.org, archive.today, and IPFS. For example, this archived tweet is struggling to reconstruct the original page due to non-trivial JavaScript URLs, downloading over 100MB of JavaScript in a crash-loop.
    • external re-use, such as Apple Dictionary/Lookup, and Apple Siri, third-party mobile apps for Wikipedia, and alternative Wikipedia reader sites (e.g. Wikiwand).
    • search engines, especially non-Google ones. Crawling the entire Internet through full JS-enabled virtual browsers is extremely expensive and even Google doesn't do so consistently.
  • Due to choice or upstream decisions:
    • device age and capability.
    • browser choice. The anonymous Tor Browser disables JavaScript by default when in safe mode.[1] The "Reader" mode in Firefox and Safari extracts pure HTML content, either from the DOM after effectively suspending JavaScript, or from the original server-sent HTML.
    • intervention by browser vendor. E.g. Chrome on Android sometimes auto-disables JS if the connection appears slow.[2]
    • browser preferences. People may disable JavaScript and grant permission only when it appears needed and trusted on a per-site basis.
  • Due to failure or chance:
    • server and network stability. No server or Internet connection is, or promises to be, 100% up end-to-end. One or more JavaScript requests can fail. HTML+CSS are the earliest and highest priority resources, and requiring other requests to succeed for a non-broken interface would compound the failure rate by multiplying small probabilities.[3]
    • network speed. The JavaScript request can be so slow or large that it won't finish downloading (and executing) before you are done reading the page. Or, it may hit a timeout in the browser, cellphone carrier, roaming, our CDN, or a backend server.
    • interference by browser extension. At any given moment there are plugins that knowingly or unknowingly break code on the page in some form or another.

If you render or visualise information client-side only, it is de facto inaccessible to these environments. Also consider how information and graphs are crawled by search engines. If it doesn't have a URL or isn't discoverable by HTML anchor link, it's probably not crawlable, searchable, findable, or sharable. The 2018 MW-Graph extension not only left a hole in our articles in all the above environments, it also meant the graphs were no longer part of Google Images, and could no longer be shared through instant messengers as an image URL. The question is, do we develop to be functional by default, or do we fail anytime anyone encounters a deviation from the presumed norm? We are all the 1%, at different times.[4]
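
The remark about "multiplying small probabilities" under server and network stability can be made concrete with a small calculation. The figures below are hypothetical and assume that requests fail independently of each other, which is a simplification.

```javascript
// Chance that all of `requestCount` independent requests succeed,
// if each one succeeds with probability `perRequestSuccess`.
function allRequestsSucceed( perRequestSuccess, requestCount ) {
	return Math.pow( perRequestSuccess, requestCount );
}

// Hypothetical numbers: each request succeeds 99% of the time.
const htmlOnly = allRequestsSucceed( 0.99, 1 ); // 0.99
const htmlPlusFourMore = allRequestsSucceed( 0.99, 5 ); // about 0.951
```

Under these assumptions, an interface that only works once four additional requests have succeeded fails for roughly 5 in 100 page views instead of 1 in 100.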

HTTP caching

Improving the cacheability of responses to web requests used in the critical path is expected to have the following impact:

  • First views: None.
  • Repeat views:
    • For any resource:
      • Consume less bandwidth (reduces mobile data costs).
      • Consume less power (fewer cell-radio activations required).
    • For CSS files:
      • Reduce time for visual rendering. Cached stylesheets load faster without a network roundtrip, allowing rendering to start and/or complete sooner.
      • Reduce time to domComplete and loadEventEnd metrics. Stylesheets are subresources required for DOM completion.
    • For JavaScript files:
      • Reduce "Time to Interactive". Cached scripts take less time to load, parse, and compile. Browsers may store the compiled bytecode from previous page views. At the time of writing (Dec 2019), the bottleneck in loading JavaScript code is often not the download or execution, but the parsing and compilation. Allowing this to be cached, or reducing the amount of code, can benefit page load time more than optimising how fast it executes.

Latency

Reduce the time it takes for the browser to receive the response to a request it makes.

Impact on first views and repeat views:

  • Reduce time overall (for the feature to load, or for the action to be performed).
  • Consume less power (making more optimal use of the network means less idle time and thus fewer cell-radio activations required).

Different strategies to achieve this:

The fastest request is no request

  • … render server-side if possible to reduce need for additional JavaScript payloads.
  • … the request for an interface icon can be skipped entirely by using @embed on the CSS background image. (How and when).

Speed up the response

  • … by improving backend response times on the server,
  • … by allowing the response to be cached by the CDN in a nearby datacenter,
  • … by reducing the size of the response.

Start the request earlier

For example:

  • … by making requests in parallel instead of serially one-after-the-other.
    • If the same code is in charge of multiple requests, the $.when() function can be used to make parallel requests.
    • If multiple pieces of code are in charge of their own requests, consider returning a Promise and letting the other code proceed immediately to make its own request. Then, only once you truly need the data from the first Promise (or need to wait for its response), call its then() method. You can also use $.when() to return a Promise that resolves once all of the given Promises have resolved.
  • … by hinting the browser directly about your intentions. This has the benefit of not needing any changes to your code! By the time your HTML, CSS, or JS, makes a related request, the result of these hints will automatically be used.
  • … by making the request invisible with stale-while-revalidate. If you let the CDN and browsers cache something for 24 hours, the first request after those 24 hours delays the client while it waits for the cache-miss response. By setting stale-while-revalidate, you allow the browser to use the cached copy one last time while it fetches the new value in the background for use next time. For example, you could allow one stale response for up to 7 days. Or, if it must be within 24 hours, you could shorten the regular cache period to 12 hours and then allow 12 hours of stale responses, adding up to 24 hours.
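
To illustrate the first bullet above (parallel instead of serial requests), here is a minimal sketch. It uses the native Promise.all() as a stand-in for $.when(), and the function and parameter names (loadSerially, loadInParallel, fetchUser, fetchOptions) are hypothetical. Each fetch function is assumed to start a request and return a Promise.

```javascript
// Serial: the second request only starts after the first has finished.
async function loadSerially( fetchUser, fetchOptions ) {
	const user = await fetchUser();
	const options = await fetchOptions();
	return { user, options };
}

// Parallel: both requests start right away. We only wait at the point
// where the data from both is actually needed.
function loadInParallel( fetchUser, fetchOptions ) {
	const userPromise = fetchUser();
	const optionsPromise = fetchOptions();
	return Promise.all( [ userPromise, optionsPromise ] )
		.then( ( [ user, options ] ) => ( { user, options } ) );
}
```

Both functions resolve with the same value. The difference is that the parallel version takes about as long as the slower of the two requests, rather than the sum of both.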

Avoid image embedding (sometimes)

Main article: mw:ResourceLoader/Architecture#Embedding

We generally recommend avoiding use of @embed in new code. The documented performance benefits of @embed have not changed since its introduction with ResourceLoader in 2010, but as of 2019 we no longer recommend @embed for general use.[5] This is due to numerous costs associated with @embed that stand separate from its relative benefits, as well as numerous improvements to alternatives that don't have these drawbacks.

When is embedding still worthwhile? We do still recommend @embed for SVG icons simple enough to be smaller than 0.3KB (before compression). The reason is that their contents are highly compressible under gzip, after which the SVG content is on par with the URL itself. After all, even when you don't embed the image, the URL to the image (e.g. /w/path/to/my/extension/resources/some.example.module/path/to/my/images/foobar.svg?version) is also data, and we ultimately save nothing if we use a URL to refer to an image that is smaller than the URL itself.

Drawbacks to consider:

  1. Image embedding delays first paint.
    • It is generally preferable to improve the first contentful paint (by keeping data that is not needed for first paint out of the stylesheet) and have the icon show up slightly later, than to keep the entire screen blank for longer by forcing the icon to download as part of the render-blocking stylesheet. This holds even though embedding would avoid a brief flash where the icon is absent.
  2. Embedded images make suboptimal use of browser cache.
    • The list of style modules queued on a given page is variable. Special:RecentChanges and Special:Search queue different stylesheets, and even articles differ from each other with "Barcelona" containing a Kartographer map, and "Video" containing a TMH video player.
    • By referencing an icon by URL instead of embedding it, you get to load it instantly from the browser cache (even faster than an embed) on subsequent page views. If it is embedded, then it often has to be downloaded multiple times, each time as part of a different stylesheet bundle.

Changes in bigger picture trade-off:

  • In HTTP/2, the base cost of an individual request (separate from the image's size) has been reduced, thus becoming slightly more competitive with the embedded compression benefit.
  • In HTTP/2, browsers no longer serially delay and restrict the starting of concurrent requests. Image requests therefore have much less effect on other requests in the background, and are themselves no longer delayed as much by other requests, which previously made it likely for an icon to start its request much later than the browser discovered it.
  • Our key metrics (§ Metrics) now place more value on visual completion and total page load time. It remains important to be considerate of people's access to and costs of cellular bandwidth, but bandwidth is no longer the metric above all others.
  • All major browsers support rel=preload which is almost as fast as embedding in practice, and avoids the above drawbacks.
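
The size threshold described earlier in this section (embed only SVG icons smaller than 0.3KB before compression) can be checked mechanically. The sketch below is an illustration only: the helper names are hypothetical, and it assumes that "0.3KB" means 300 bytes.

```javascript
// Assumption: "0.3KB" is read as 300 bytes. Adjust if you count 1 KB as 1024 bytes.
const EMBED_THRESHOLD_BYTES = 300;

// Size of a string in bytes when encoded as UTF-8.
function byteLength( text ) {
	return new TextEncoder().encode( text ).length;
}

// True if the uncompressed SVG source is small enough to consider @embed.
function shouldEmbedSvg( svgSource ) {
	return byteLength( svgSource ) < EMBED_THRESHOLD_BYTES;
}
```

A check like this could run in a build step or code review script. Icons over the threshold would then be referenced by URL instead of embedded.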

Avoid preloading (sometimes)

Avoid using rel="preload" in link tags or HTTP headers on the HTML response as this can cause congestion and competition against the HTML resource itself, as well as delay critical CSS resources needed for initial rendering. In MediaWiki these resources are already linked from the <head> and naturally discovered and pre-fetched by the browser as soon as possible, by the browser's lookahead parser.

Preloading is useful for resources that a browser cannot discover early. For example when a resource is fetched indirectly through CSS or JavaScript. To start the download of late-discovered resources earlier than usual, consider preloading closer to the time they are needed to avoid this competition.

As an example, the Wikipedia logo is a CSS background image on an <a> element in the sidebar. When the browser encounters an image tag in HTML, it immediately downloads it, regardless of whether the HTML tag will be parsed or made visible. When the browser parses and applies a stylesheet, however, it ignores background image URLs because the CSS rule in question does not apply yet. (Rules are only applied if the rule matches the state of an element in the DOM, and when the browser first renders an article it typically has not yet reached the HTML of the sidebar).

The "natural" point where the browser will start the download for the logo file is when that sidebar HTML has been reached and rendered to the screen (first without logo). This means that until the HTML is fully downloaded and visually rendered, the browser will not even start downloading the logo. We improved this by emitting a preload header from the CSS request that tells the browser to start the download right away, because we know it will be needed soon. You can read more about this in our blog post.

How to:

  • Emit Link: rel=preload headers from a CSS or JS server response, preferably the same response that contains the code where this will later be used.
  • Dynamically create an element like <link rel="preload"> in JavaScript. For example, from a click handler that has to load several resources. If there's a technical reason that you cannot safely start these resources at the same time using their normal means (e.g. if you have to execute them serially in a specific order), you can start them with a preload. When the actual Ajax request or link/script/img tag is made later, the browser will magically re-use the on-going (or finished) preload request.
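
Both approaches from the list above can be sketched as follows. The helper names and the URLs in the usage note are hypothetical. The first function only builds the value of a Link response header, and sending it is left to the server. The second takes the page's document object as a parameter.

```javascript
// Value for a `Link` response header, for example sent along with a CSS response.
function preloadLinkHeader( url, asType ) {
	return '<' + url + '>;rel=preload;as=' + asType;
}

// Start a preload from script, for example inside a click handler.
// `doc` is the page's document object.
function preloadFromScript( doc, url, asType ) {
	const link = doc.createElement( 'link' );
	link.rel = 'preload';
	link.as = asType;
	link.href = url;
	doc.head.appendChild( link );
	return link;
}
```

For example, preloadLinkHeader( '/static/images/example-logo.svg', 'image' ) yields a header value for an image, and preloadFromScript( document, '/w/example.js', 'script' ) starts fetching a script that is requested through its normal means later on.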

Size of HTML payload

The first 14 KB (per TCP slow-start) should ideally contain everything needed to render the skin layout and a little bit of the article text. It should also allow the browser to render that initial layout in a way that isn't later moved around or otherwise invalidated (additional components may appear later, but existing components should not move).
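
The 14 KB figure can be derived from common TCP defaults, as sketched below. The numbers are assumptions about a typical connection: an initial congestion window of 10 segments (per RFC 6928) and a maximum segment size of 1460 bytes. The helper names are hypothetical.

```javascript
// Bytes a server can send in the first round trip, before any acknowledgement.
function initialFlightBytes( segments = 10, segmentBytes = 1460 ) {
	return segments * segmentBytes;
}

// Number of bytes of `html` up to and including `marker`, or -1 if absent.
// The marker could be, for instance, the opening tag of the content area.
function bytesUpTo( html, marker ) {
	const index = html.indexOf( marker );
	if ( index === -1 ) {
		return -1;
	}
	return new TextEncoder().encode( html.slice( 0, index + marker.length ) ).length;
}
```

With these defaults, initialFlightBytes() returns 14600, which is roughly 14 KB. Comparing bytesUpTo() against that budget is a conservative check, because it counts uncompressed bytes while responses are usually compressed on the wire. It also ignores response headers and TLS overhead, which use up part of the same budget.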

Examples of how to improve this:

  • Reduce amount of per-page header bloat, e.g. <link> tags, RLQ, mw.config, mw.loader.
  • Better minification for inline CSS, inline JavaScript and HTML within the <head>.
  • Arranging the HTML to ensure layout and start of content render first.

See also T231168 for on-going work in this area.

Size of stylesheets

Reducing the size of the main (blocking) stylesheet loaded by the HTML is expected to have the following impact:

  • All views (first view, and repeat views):
    • Improvement of all paint metrics. Smaller stylesheets load and parse faster, allowing rendering to start and complete sooner.
    • Reduce time to domComplete and loadEventEnd metrics. Stylesheets are subresources required for DOM completion.
    • Consume less bandwidth (reduces mobile data costs).
    • Consume less power (reduce CPU time for stylesheet parsing, especially data URIs. See T121730 for on-going work).

Size of scripts

Note where the "Startup module" sits.

Reducing the size of scripts is expected to impact all views (first view, and repeat views):

  • Reduce "Time to Interactive". Less code to download, parse, and execute.
  • Reduce time to domComplete and loadEventEnd metrics. These scripts are subresources required for DOM completion.
  • Consume less bandwidth (reduces mobile data costs).

(The above does not apply to scripts that are lazy-loaded from a human interaction after document-ready.)

Scripts run in one of three overall phases. Each phase depends on the outcome of the previous phases (they run serially). Reducing the cost of an earlier phase has a bigger impact than reducing the cost of a later one, because it allows subsequent phases to start their work sooner.

  1. Inline scripts in HTML <head> (including page configuration, page module queue, and the async request to the Startup module).
  2. The Startup manifest (including module manifest, dependency tree, and the mw.loader client).
  3. The page modules (the source code of modules loaded on the current page, and their dependencies).

Size of startup manifest

Reduce the amount of code contained in the Startup module by keeping the number of distinct module bundles low. In general, additional scripts should be added to an existing module bundle instead of creating new ones (see blog post and Grafana).

To help quantify the cost of modules, and to help with code maintenance more generally, we organize frontend assets in subdirectories by module bundle. See also code conventions, Best practices for extensions#File structure, and T193826.

Processing cost of scripts

Reduce the amount of time spent executing JavaScript code during the critical path. This includes:

  • Deferring work that does not need to happen before rendering.
  • Splitting up work in smaller idle-time chunks to avoid blocking the main thread for too long (non-interactive "jank").
  • Re-arranging code so that there are no style reads after style writes in the same turn of the event loop. The browser naturally alternates between a cycle of JS execution and a cycle of style computation and rendering. If styles are changed in the main JS cycle and then read back within that same cycle, the browser is forced to pause the script and perform an ad-hoc render cycle before being able to resume the script.
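
The second bullet above (splitting work into smaller idle-time chunks) can be sketched as follows. This is a minimal illustration, not an existing MediaWiki utility. The name processInChunks and the default chunk size are assumptions. The scheduler defaults to requestIdleCallback where available and falls back to setTimeout otherwise.

```javascript
// Run `fn` when the browser is idle, or shortly afterwards as a fallback.
const defaultSchedule = typeof requestIdleCallback === 'function' ?
	( fn ) => requestIdleCallback( fn ) :
	( fn ) => setTimeout( fn, 0 );

// Process `items` in small batches, yielding to the browser between batches
// so that rendering and input handling can happen in between.
function processInChunks( items, handleItem, chunkSize = 50, schedule = defaultSchedule ) {
	let index = 0;
	function runChunk() {
		const end = Math.min( index + chunkSize, items.length );
		for ( ; index < end; index++ ) {
			handleItem( items[ index ] );
		}
		if ( index < items.length ) {
			schedule( runChunk );
		}
	}
	// Even the first batch is deferred, so the caller never blocks.
	schedule( runChunk );
}
```

A suitable chunk size depends on how expensive each item is. The aim is for every batch to finish well within the time budget of a single frame.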

Expected impact on all views:

  • Reduce "Time to Interactive". Less execution before the page is ready. Less uninterrupted execution that blocks interactions. Fewer "forced synchronous layouts", which significantly slow down code execution.
  • Reduce time to loadEventEnd. The load event is held back until the initial scripts have finished executing.
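
The "forced synchronous layouts" mentioned above typically come from interleaving style reads and writes. The sketch below contrasts the two orderings for a hypothetical task: making each box as tall as it is wide. The function names are made up for this example, and `elements` is assumed to be an array of DOM elements (for instance from Array.from( document.querySelectorAll( … ) )).

```javascript
// Interleaved: every read after a write forces the browser to recompute
// layout in the middle of the script.
function squareInterleaved( elements ) {
	for ( const el of elements ) {
		const width = el.offsetWidth; // read
		el.style.height = width + 'px'; // write
	}
}

// Batched: all reads happen first, then all writes. The browser can
// compute layout once, in its normal rendering cycle.
function squareBatched( elements ) {
	const widths = elements.map( ( el ) => el.offsetWidth ); // reads
	elements.forEach( ( el, i ) => {
		el.style.height = widths[ i ] + 'px'; // writes
	} );
}
```

Both versions produce the same result as long as setting one element's height does not change another element's width. Only the order of reads and writes differs.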

Meta

This guideline was originally drafted in February 2016 after T127328.

  1. Tor Browser - JavaScript and Flash, torproject.org.
  2. JavaScript isn’t always available and it’s not the user’s fault by Adam Silver (2019).
  3. Everyone has JavaScript, right? by Stuart Langridge (2015).
  4. Why availability matters by Stuart Langridge (2015).
  5. T121730: Audit use of @embed by Timo Tijhof (2019), phabricator.wikimedia.org.