Performance/WebPageReplay

From Wikitech

Background

To get more stable metrics in our synthetic testing, we have been trying out mahimahi, mitmproxy and WebPageReplay to record and replay Wikipedia. For mahimahi we used a version patched by Gilles on top of Benedikt Wolters' HTTP/2 fork (https://github.com/worenga/mahimahi-h2o). With mitmproxy and WebPageReplay we use the default versions. The work has been done in T176361.

We have put mahimahi on ice because it is too much of a hack to get HTTP/2 working at the moment, while WebPageReplay supports HTTP/2 out of the box. mitmproxy worked fine but offered no clear benefit over WebPageReplay.

Replaying vs non replaying

Let us compare how the metrics look when running WebPageTest versus WebPageReplay (Chrome).

  • Compare emulated mobile First Visual Change on Obama
  • Compare emulated mobile Speed Index on Obama
  • Compare First Visual Change on desktop using WPT vs WebPageReplay
  • Compare Speed Index on desktop using WPT vs WebPageReplay

WebPageReplay setup

The current setup that collects the data for https://grafana.wikimedia.org/dashboard/db/webpagereplay is a Docker container, configured as in https://github.com/soulgalore/browsertime-replays/tree/master/webpagereplay:

WebPageReplay setup

Running on AWS (instance type c4.large) we get stable metrics. We have tried running the same code on WMCS, bare metal and Google Cloud and in all those cases the metrics stability over time was at least 2 to 4 times worse than AWS. This difference remains unexplained and probably lies somewhere in AWS's secret sauce (custom hypervisor, custom kernel).

On desktop we use 30 frames per second for the video and get a metric stability span of 33 ms for First Visual Change. That is one frame of accuracy, since at 30 fps one frame represents 33.33 ms. Speed Index's stability span is a little wider but still acceptable (less than 50 points, depending on the content).

For emulated mobile, we use 60 frames per second and get the same First Visual Change and Speed Index stability spans as desktop at 30 fps. We run both desktop and mobile with 100 ms of simulated latency during the replays.
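
The frame-accuracy arithmetic above can be sanity-checked in the shell; frame_ms below is a hypothetical helper, not part of the setup:

```shell
# Milliseconds per captured video frame: visual metrics taken from the video
# can only be as accurate as the frame duration. frame_ms is a hypothetical
# helper that rounds 1000/fps to the nearest integer millisecond.
frame_ms() {
  echo $(( (1000 + $1 / 2) / $1 ))
}

frame_ms 30   # desktop video: one frame is ~33 ms
frame_ms 60   # emulated mobile video: one frame is ~17 ms
```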

Servers

We run tests from three servers at the moment:

  • 50.19.169.203 - runs tests on English, Swedish, French and Dutch Wikipedia
  • 35.174.76.194 - runs tests on German, Spanish, Japanese, Chinese and Russian Wikipedia
  • 34.205.254.252 - runs tests on beta, deployment group 0 and deployment group 1

Access

Access the servers with the pem file:

# English, Swedish, French and Dutch Wikipedia
ssh -i "webpagereplay.pem" ubuntu@50.19.169.203
# Beta, group0 and group 1
ssh -i "webpagereplay.pem" ubuntu@34.205.254.252
# German, Spanish, Japanese, Chinese and Russian Wikipedia
ssh -i "webpagereplay.pem" ubuntu@35.174.76.194

Setup a new server

Here are the details of our current setup. We currently run the tests on a c4.large VM on AWS using Ubuntu 16.

Install

To make it work, we need to do five things:

  1. Install Docker and grant your user the privileges needed to start containers.
  2. Install Node.js + npm (latest LTS).
  3. Install bttostatsv: npm install bttostatsv -g
  4. Install directory-to-s3: npm install directory-to-s3 -g
  5. Export AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY with the S3 credentials for the user that will run the tests. You can find them in .bashrc on the other servers.
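
A quick way to verify the five prerequisites on a new server is a small check script. This is a sketch; the check helper is ours, not part of the setup:

```shell
# Sketch: verify the prerequisites listed above on a freshly set-up server.
# check() is a hypothetical helper; the AWS keys are read from the environment.
check() {
  command -v "$1" >/dev/null 2>&1 && echo "$1 OK" || echo "$1 MISSING"
}

check docker
check node
check npm
check bttostatsv
check directory-to-s3

if [ -n "$AWS_ACCESS_KEY_ID" ] && [ -n "$AWS_SECRET_ACCESS_KEY" ]; then
  echo "AWS credentials OK"
else
  echo "AWS credentials MISSING"
fi
```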
Reboot

Also make sure the script starts on server restart. Run crontab -e and add:

@reboot rm /home/ubuntu/browsertime.run;/home/ubuntu/run.sh

That will remove the run file and restart everything if the server reboots.

Job setup

We run this job in an infinite loop; when we want to update the script, we remove the control file. This is the script we use to test the English Wikipedia.

#!/bin/bash
CONTROL_FILE=/home/ubuntu/browsertime.run
LOG_FILE=/tmp/webpagereplay.log
exec > $LOG_FILE 2>&1

# Always verify that the script isn't already running
if [ -f "$CONTROL_FILE" ]
then
  echo "$CONTROL_FILE exists, do you have running tests?"
  exit 1;
else
  touch $CONTROL_FILE
fi

# Default settings
CONTAINER=sitespeedio/browsertime:3.5.0
CHROME_RUNS=11
FIREFOX_RUNS=11
MOBILE_RUNS=5
CHROME_FRAMERATE=30
MOBILE_FRAMERATE=60
FIREFOX_FRAMERATE=30
WIKI=enwiki
DOCKER_SETUP="--cap-add=NET_ADMIN --shm-size=2g -v /etc/localtime:/etc/localtime:ro --name browsertime"
LATENCY=100
declare -a DESKTOP_URLS=(https://en.wikipedia.org/wiki/Barack_Obama https://en.wikipedia.org/wiki/Facebook https://en.wikipedia.org/wiki/Sweden https://en.wikipedia.org/wiki/Aretha_Franklin https://en.wikipedia.org/wiki/Metalloid)

declare -a MOBILE_URLS=(https://en.m.wikipedia.org/wiki/Barack_Obama https://en.m.wikipedia.org/wiki/Facebook https://en.m.wikipedia.org/wiki/Sweden https://en.m.wikipedia.org/wiki/Aretha_Franklin https://en.m.wikipedia.org/wiki/Metalloid)


function cleanup() {
  docker system prune --all --volumes -f
}

function control() {
  if [ -f "$CONTROL_FILE" ]
  then
    echo "$CONTROL_FILE found. Make another run ..."
  else
    echo "$CONTROL_FILE not found - stopping after cleaning up ..."
    cleanup
    echo "Exit"
    exit 0;
  fi
}

function sendMetrics() {
    # $? here is the exit code of the Browsertime Docker run that just finished
    if [ $? -eq 0 ]
    then
        bttostatsv result/browsertime.json $GRAPHITE_PREFIX.$GRAPHITE_KEY https://www.wikimedia.org/beacon/statsv >> /tmp/s.log 2>&1
        sleep 3
        sudo mkdir -p data/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/$DATE
        sudo cp result/screenshots/1.jpg data/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/latest.jpg
        sudo cp result/browsertime.har.gz data/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/latest.har.gz
        sudo mv result/* data/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/$DATE
        directory-to-s3 -d data webpagereplay-wikimedia
        sudo rm -fR data
    else
        echo 'Browsertime returned an error, not sending metrics'
    fi
    # Always clean up the result folder between runs
    sudo rm -fR result
}

function runChrome() {
    FRAMERATE=$CHROME_FRAMERATE
    RUNS=$CHROME_RUNS
    BROWSER=chrome
    TYPE=desktop
    GRAPHITE_PREFIX=browsertime.enwiki.$TYPE.$BROWSER.anonymous.replay.$LATENCY
    GRAPHITE_KEY=$(basename $URL)
    DATE=`date '+%Y-%m-%d-%H-%M'`
    docker run $DOCKER_SETUP --rm -v "$(pwd)":/browsertime -e REPLAY=true -e LATENCY=$LATENCY $CONTAINER -b $BROWSER -n $RUNS --resultDir result --cacheClearRaw --videoParams.framerate $FRAMERATE --connectivity.alias $LATENCY --chrome.timeline true --gzipHar --videoParams.nice 8 --videoParams.createFilmstrip false --resultURL https://s3.amazonaws.com/webpagereplay-wikimedia/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/$DATE/ --screenshot true $URL
    
    sendMetrics
}

function runFirefox() {
    FRAMERATE=$FIREFOX_FRAMERATE
    RUNS=$FIREFOX_RUNS
    BROWSER=firefox
    TYPE=desktop
    GRAPHITE_PREFIX=browsertime.enwiki.$TYPE.$BROWSER.anonymous.replay.$LATENCY
    GRAPHITE_KEY=$(basename $URL)
    DATE=`date '+%Y-%m-%d-%H-%M'`
    docker run $DOCKER_SETUP --rm -v "$(pwd)":/browsertime -e REPLAY=true -e LATENCY=$LATENCY $CONTAINER --resultDir result -n $RUNS -b $BROWSER --cacheClearRaw --videoParams.framerate $FRAMERATE --connectivity.alias $LATENCY --videoParams.nice 8 --videoParams.createFilmstrip false --resultURL https://s3.amazonaws.com/webpagereplay-wikimedia/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/$DATE/ --gzipHar true --screenshot true $URL
    
    sendMetrics
}

function runMobile() {
    FRAMERATE=$MOBILE_FRAMERATE
    RUNS=$MOBILE_RUNS
    BROWSER=chrome
    TYPE=mobile
    GRAPHITE_PREFIX=browsertime.enwiki.$TYPE.$BROWSER.anonymous.replay.$LATENCY
    GRAPHITE_KEY=$(basename $URL)
    DATE=`date '+%Y-%m-%d-%H-%M'`
    docker run $DOCKER_SETUP --rm -v "$(pwd)":/browsertime -e REPLAY=true -e LATENCY=$LATENCY $CONTAINER --resultDir result -b $BROWSER -n $RUNS --cacheClearRaw --videoParams.framerate $FRAMERATE --chrome.mobileEmulation.deviceName 'iPhone 6' --videoParams.nice 8 --connectivity.alias $LATENCY --gzipHar --chrome.timeline true --videoParams.createFilmstrip false --resultURL https://s3.amazonaws.com/webpagereplay-wikimedia/$WIKI/$TYPE/$BROWSER/$LATENCY/$GRAPHITE_KEY/$DATE/ --screenshot true $URL
    sendMetrics
}

while true
do
  echo "Run desktop tests 100"
  for URL in "${DESKTOP_URLS[@]}"
  do
    runChrome
    control
    runFirefox
    control
  done

  echo "Run emulated mobile tests 100"
  for URL in "${MOBILE_URLS[@]}"
  do
    runMobile
  done

  sleep 30
  control
  cleanup
done

Start and restart

Start the script: nohup /home/ubuntu/run.sh &

Restart: first remove /home/ubuntu/browsertime.run, then tail the log and wait for the script to exit. Then start as usual.
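
The restart procedure can be scripted; wait_for_exit is a hypothetical helper that polls the log for the "Exit" line the script prints when it stops:

```shell
# Sketch of the restart procedure, using the paths from this page.
CONTROL_FILE=/home/ubuntu/browsertime.run
LOG_FILE=/tmp/webpagereplay.log

# Poll the log until the script has printed its final "Exit" line.
wait_for_exit() {
  until grep -q '^Exit$' "$1" 2>/dev/null; do
    sleep 1
  done
}

# Usage (commented out so the sketch is safe to source):
# rm -f "$CONTROL_FILE"          # 1. signal the loop to stop
# wait_for_exit "$LOG_FILE"      # 2. block until it has cleaned up
# nohup /home/ubuntu/run.sh &    # 3. start as usual
```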

Store the data

The metrics, videos, screenshots and HAR files are sent to S3, where they are kept for one week.

http://webpagereplay-wikimedia.s3-website-us-east-1.amazonaws.com/

Log

You can find the log file at /tmp/webpagereplay.log. There you can find all log entries from Browsertime.

Upgrade to a new version

Firefox and Chrome are bundled in the Docker container. When there's a new version, check the changelog and update like this:

  1. SSH to the server.
  2. Remove the run file: rm /home/ubuntu/browsertime.run
  3. Wait for the tests to finish by tailing the log and looking for "Exit": tail -f /tmp/webpagereplay.log
  4. Update to the new Docker container by editing the run file (nano /home/ubuntu/run.sh) and changing x.y.z in sitespeedio/browsertime:x.y.z to the new version.
  5. Start the tests again (the new container will be downloaded automatically): nohup /home/ubuntu/run.sh &
  6. Go to the WebPageReplay Grafana dashboard and add an annotation with the tag webpagereplay and a message describing what you upgraded.
  7. Keep watching the graphs and verify that everything looks OK.
  8. Upgrade the other servers to the new version.
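
The version bump in step 4 can also be done with sed instead of nano. Shown here against a sample line rather than the real run.sh, with 3.6.0 as a hypothetical new version:

```shell
# Sketch: bump the pinned Browsertime container version with sed.
# On the server you would run sed -i on /home/ubuntu/run.sh instead of
# piping a sample line. 3.6.0 is a hypothetical new version.
NEW_VERSION=3.6.0
line='CONTAINER=sitespeedio/browsertime:3.5.0'
echo "$line" | sed "s|browsertime:[0-9.]*|browsertime:$NEW_VERSION|"
# prints: CONTAINER=sitespeedio/browsertime:3.6.0
```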

Alerts

We also run alerts on the metrics we collect from WebPageReplay. Check out Performance/WebPageReplay/Alerts.