Performance/Synthetic testing

Synthetic performance testing means using a browser on a server or phone somewhere in the world to load a web page and collect performance metrics. Together with real user measurements (collecting performance metrics from real users), this is one of the two ways we collect web performance metrics for Wikipedia.

The reason we use synthetic performance testing is to be able to find performance regressions in our code. But we also use the data we collect as a wayback machine, where we can go back in time and see what kind of content we were serving.

To get stable metrics that let us find regressions, we need a stable environment for our tests:

  • Run the browser and tool on a stable server or on a dedicated mobile phone. At the moment we run our tools on AWS and on Android phones.
  • The tests need stable connectivity: the connection must behave the same on every run so that network variability does not affect the metrics. We use tc to get that.
  • We need a stable browser. We keep track of browser updates and make sure the browser behaves the same across runs.
  • The pages we test should have stable performance. Depending on how a page is built, its metrics can be more or less stable.
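The connectivity point above can be sketched with tc. This is a minimal, illustrative example of shaping a test machine's connection using the netem and tbf queueing disciplines; the interface name (eth0) and the latency and bandwidth values are assumptions for the sketch, not the profiles we actually run in production.

```shell
# Illustrative only: shape outgoing traffic on eth0 so every test run
# sees the same latency and bandwidth (must be run as root).
# The values below are examples, not our production connectivity profiles.

# Add a fixed 50 ms delay with netem.
tc qdisc add dev eth0 root handle 1: netem delay 50ms

# Chain a token bucket filter under netem to cap bandwidth at 5 Mbit/s.
tc qdisc add dev eth0 parent 1: handle 2: tbf rate 5mbit burst 32kbit latency 400ms

# Remove the shaping when the test run is done.
tc qdisc del dev eth0 root
```

Because the shaping is applied on the test machine itself, every run of the browser sees identical network conditions, which is what makes the metrics comparable between runs.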


We use two different ways of measuring the performance of Wikipedia:

  • We use Browsertime to measure the full performance journey from the browser to the server and back. We use it to collect user journeys (a user visiting multiple pages).
  • We use WebPageReplay together with Browsertime to focus on front end performance. We use that to measure one page's performance with an empty browser cache. WebPageReplay is a traffic replay proxy that we use to get rid of server and internet flakiness.
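To make the replay proxy idea concrete, here is a hedged sketch of a WebPageReplay record/replay cycle, assuming the standalone wpr tool from the Catapult project; the ports and archive path are illustrative, not our production configuration.

```shell
# Illustrative WebPageReplay cycle (Catapult's wpr tool).
# Ports and the archive path are example values.

# 1. Record: browse the page once through the proxy to capture all responses.
wpr record --http_port=8080 --https_port=8081 /tmp/enwiki.wprgo

# 2. Replay: serve the recorded responses, so later test runs never touch
#    the real servers or the internet.
wpr replay --http_port=8080 --https_port=8081 /tmp/enwiki.wprgo
```

During replay the browser is pointed at the proxy ports, so any variation in the metrics comes from the front end code rather than from server or network flakiness.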

Browsertime is the engine that controls the browser and runs the tests. Browsertime is also used by Mozilla to measure the performance of Firefox. WebPageReplay is used by the Chrome team to keep track of Chrome's performance.
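As a feel for what a Browsertime run looks like, here is an illustrative command-line invocation; it assumes Browsertime is installed (for example via npm), and the page URL and iteration count are examples, not our actual test configuration.

```shell
# Illustrative Browsertime run: load a page five times in Chrome and
# collect timing and visual metrics. Assumes "npm install -g browsertime".
browsertime -b chrome -n 5 https://en.wikipedia.org/wiki/Barack_Obama
```

Each iteration produces metrics (and optionally video and screenshots) that can then be pushed to a metrics backend such as Graphite.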

Both tools' tests are configured in

We also use Browsertime/WebPageReplay to collect metrics from the Android browser. Those tests are configured in


We've been using Browsertime since 2017 and you can read about our setup. We collect metrics and store them on our Graphite instance. You can see all pages and metrics on our page drill down dashboard.

You can choose different types of testing with the device dropdown:

  • desktop - tests where we test desktop Wikipedia
  • emulatedMobile - tests where we test mobile Wikipedia using a desktop browser, emulating a mobile phone
  • android - tests mobile Wikipedia using real mobile phones

The next dropdown, Test type, chooses which tests to see. It can be first view tests (with a cold browser cache), warm cache views, WebPageReplay tests, or different user journeys.

We also alert on those metrics, using the WebPageReplay tests for desktop, emulated mobile, and Android, and the first view (cold cache) tests on desktop.


We also collect some overall metrics from the Chrome User Experience Report to keep track of how Wikipedia is doing from Chrome's point of view. We run a couple of tests daily and record whether we are slow, moderate, or fast. You can see those metrics on the Chrome User Experience dashboard.

The data is collected using the CrUX plugin.

When to use what tool/test?

If you want to test the mobile version of Wikipedia, you should run both our Android tests and our emulated mobile tests. What's good about running Android tests is that you know the performance on that specific Android device for sure, and we can say things like: the first visual change of the Barack Obama page on English Wikipedia regressed by ten percent on a Moto G 5 phone.

If you want to find small frontend regressions, testing with WebPageReplay should be your thing. However, at the moment we only run one-page (first view, cold cache) tests with WebPageReplay.

If you want to test user journeys, test them directly against the Wikipedia servers using Browsertime. If you are not sure which tests to use, please reach out to the performance team and we will help you!