WebPageTest is a web performance tool that uses real browsers to access web pages and collect timing metrics. The killer feature of WebPageTest is the metric called SpeedIndex – a measure of how fast the above-the-fold content is displayed. The Wikimedia Performance Team runs a private instance of WebPageTest at http://wpt.wmftest.org on AWS, and you can view the metrics we collect at https://grafana.wikimedia.org/dashboard/db/webpagetest.
Background
Synthetic testing, on the other hand, tries to minimize the factors that can affect the metrics, so that we can pinpoint the correlation between code changes and their impact on the metrics. Synthetic tests run from the same location, with the same latency and the same browser, and measure the same way every time. Using WebPageTest as our synthetic tool has another advantage: SpeedIndex, currently the best way to measure when the above-the-fold content is ready for the user. The downside of synthetic testing is that the metrics don't come from real users.
We use the NavigationTiming extension to add our own script that collects RUM metrics, we run a private instance of WebPageTest to collect synthetic metrics, and we run Browsertime/WebPageReplay to collect synthetic metrics under isolated conditions.
Setup
The current setup looks like this:
We have one instance at the moment, called us-east. We don't use auto scaling since it's broken on Linux, see https://github.com/WPO-Foundation/wptagent/issues/56.
Usage
Install the CLI
We use a small wrapper script that talks to the WebPageTest API using https://github.com/marcelduran/webpagetest-api. It collects the metrics and reports them as CSV/JSON or sends them to Graphite/statsv.
You can clone the project https://github.com/wikimedia/wpt-reporter or install the wrapper with
npm install wpt-reporter -g
Running a test using the CLI
If you have installed the project, you can simply run it like this:
wpt-reporter --webPageTestKey WPT_API_KEY --reporter json --webPageTestHost wpt.wmftest.org --location us-east:Chrome https://en.wikipedia.org/wiki/Facebook
It will send a request to our WebPageTest instance and start a test. Make sure to change the WPT_API_KEY value to your secret key.
If you want to see what you can configure and the default values, run:
All parameters that you send to the tool are passed on to the WPT API. There are a lot of things you can configure and here is the full list.
Send to statsv
Sending statistics to statsv is disabled by default. Turn it on like this (by setting the reporter to statsv and setting a valid endpoint):
wpt-reporter --webPageTestKey WPT_API_KEY --endpoint YOUR_STATSV_ENDPOINT --reporter statsv --webPageTestHost wpt.wmftest.org --location us-east:Chrome https://en.wikipedia.org/wiki/Facebook
Store as CSV
If you want to verify that your changes are faster than your current version, you can test the changes with WebPageTest and store the result as a CSV file. You choose where the CSV data is stored with the file option. If the file doesn't exist, it is created with a first line containing the column names of the metrics. If the file exists, the new metrics are appended on a new line.
wpt-reporter --webPageTestKey WPT_API_KEY --webPageTestHost wpt.wmftest.org --reporter csv --file myresult.csv --location us-east:Chrome https://en.wikipedia.org/wiki/Facebook
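The create-or-append behaviour described above can be sketched in a few lines of shell. This only illustrates the behaviour and is not the wrapper's actual code; the function name append_csv_row and the column names in the usage note are made up for the example.

```shell
#!/bin/bash
# Sketch of the create-or-append behaviour of the csv reporter.
# append_csv_row is a hypothetical helper, not part of wpt-reporter.
append_csv_row() {
  local file="$1" header="$2" row="$3"
  # A new file first gets one line with the column names.
  if [ ! -e "$file" ]; then
    printf '%s\n' "$header" > "$file"
  fi
  # Every call appends the metrics on a new line.
  printf '%s\n' "$row" >> "$file"
}
```

Calling append_csv_row myresult.csv "url,SpeedIndex" "https://en.wikipedia.org/wiki/Facebook,1200" twice leaves one header line and two data lines, which is why you should use a new file name for each test condition.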
If you want to test mobile pages, you can fake the user agent and set viewport and size using our own WebPageTest instance. If you want to test with a real mobile phone, you can use the Motorola G phones that are available on WebPageTest.org.
Set view port and screen size
You can fake a mobile user agent (only when using Chrome). With emulateMobile you get a Chrome mobile user agent, a 640x960 screen, 2x scaling and a fixed viewport.
wpt-reporter --webPageTestKey WPT_API_KEY --webPageTestHost wpt.wmftest.org --emulateMobile true --reporter json --location us-east:Chrome https://en.m.wikipedia.org/wiki/Denali%E2%80%93Mount_McKinley_naming_dispute
Set your User Agent string
If you want to set your own user agent, use --userAgent (it will only work if you use Chrome).
wpt-reporter --webPageTestKey WPT_API_KEY --webPageTestHost wpt.wmftest.org --location us-east:Chrome --userAgent "Mozilla/5.0(iPhone;U;CPUiPhoneOS4_0likeMacOSX;en-us)AppleWebKit/532.9(KHTML,likeGecko)Version/4.0.5Mobile/8A293Safari/6531.22.7" --reporter json https://en.m.wikipedia.org/wiki/Denali%E2%80%93Mount_McKinley_naming_dispute
Use a real mobile phone
If you want to use a real mobile phone, you can do that by using the public instance at WebPageTest.org with the Motorola G. Make sure that you use the WebPageTest key for the public instance (and not the key for our private instance).
wpt-reporter --webPageTestKey WPT_ORG_API_KEY --webPageTestHost http://www.webpagetest.org --location "Dulles_MotoG:Motorola G - Chrome" --reporter json https://en.m.wikipedia.org/wiki/Facebook
Test with 2g connectivity
The current version of the WebPageTest API doesn't have a shorthand for 2g, but you can set the connectivity yourself and simulate 2g. Just make sure to also increase the timeout, because running multiple tests on 2g takes time.
wpt-reporter --webPageTestKey WPT_API_KEY --webPageTestHost wpt.wmftest.org --bandwidthDown 35000 --bandwidthUp 32000 --latency 1300 --timeout 2400 --connectivity custom --reporter json --emulateMobile true --location us-east:Chrome https://en.m.wikipedia.org/wiki/Facebook
If you want to try out your own settings, you can check out the WebPageTest ini file for inspiration.
Test multiple URLs
You can test multiple URLs by choosing the --batch option. Running a batch will fetch all the URLs and the configuration from a file. You supply the path and name of the file and the file needs to have one URL/run on each line. It looks like this:
## Test the Facebook page 15 times
--webPageTestKey <%WMF_WPT_KEY> --webPageTestHost wpt.wmftest.org --runs 15 --median SpeedIndex --reporter json --location us-east:Chrome https://en.wikipedia.org/wiki/Facebook
## And then test Barack 31 and use SpeedIndex as median pick
--webPageTestKey <%WMF_WPT_KEY> --webPageTestHost wpt.wmftest.org --runs 31 --median SpeedIndex --reporter json --location us-east:Chrome https://en.wikipedia.org/wiki/Barack_Obama
Look closely at the webPageTestKey value. The value <%WMF_WPT_KEY> will be replaced by the environment variable named WMF_WPT_KEY. Running in node, the value will be replaced with process.env.WMF_WPT_KEY. If the variable isn't found, there will be an error logged. You can create your own variables in the script files and feed the values with environment variables following the same pattern.
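To make the substitution rule concrete, here is a minimal sketch of how a placeholder such as <%WMF_WPT_KEY> could be expanded from the environment. It is not the wrapper's own implementation (that runs in node and uses process.env); the function name expand_placeholders is hypothetical, and the sketch assumes that variable names consist of letters, digits and underscores and that the variables are exported.

```shell
#!/bin/bash
# Expand <%NAME> placeholders in one batch line from exported environment variables.
# expand_placeholders is a hypothetical helper that mimics the documented behaviour.
expand_placeholders() {
  printf '%s\n' "$1" | awk '
    {
      out = ""
      # Handle the leftmost placeholder, then continue with the rest of the line.
      while (match($0, /<%[A-Za-z_][A-Za-z0-9_]*>/)) {
        name = substr($0, RSTART + 2, RLENGTH - 3)
        if (!(name in ENVIRON)) {
          print "error: environment variable " name " is not set" > "/dev/stderr"
          exit 1
        }
        out = out substr($0, 1, RSTART - 1) ENVIRON[name]
        $0 = substr($0, RSTART + RLENGTH)
      }
      print out $0
    }'
}
```

With WMF_WPT_KEY exported, passing a line from the batch file to expand_placeholders prints the line with the key filled in. A missing variable produces an error message and a non-zero exit status, similar to the error the wrapper logs.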
Changing and testing a batch file
When we run the tests in Jenkins we run batch scripts located here: https://github.com/wikimedia/performance-WebPageTest/tree/master/scripts/batch
When you want to add a URL to test or change anything, you need to test it locally. You can easily do that by following the example in each batch.
Here's an example of running a batch locally.
WMF_WPT_KEY=OUR_SECRET_KEY STATSV_ENDPOINT=http://127.0.0.1 WPT_RUNS=1 WMF_WPT_LOCATION=us-east bin/index.js --batch scripts/batch/desktop.txt
When you add a new test, make sure to check out the screenshots on WebPageTest to see that it worked as expected (the user is logged in etc). There's also a bash script that will test all batch files, so if you make changes to the current batch scripts, please run it like this before you commit your changes:
When your changes have been approved and Jenkins has pulled them, make sure that the jobs in Jenkins work fine:
Run on Jenkins
We use Jenkins to continuously run the tests (https://integration.wikimedia.org/ci/job/performance-webpagetest-linux-wmf/ and https://integration.wikimedia.org/ci/job/performance-webpagetest-wpt-org/). Jenkins runs the batch files and to be able to do that, you need to do three things:
- Git clone the project: https://gerrit.wikimedia.org/r/performance/WebPageTest.git and use the master branch refs/heads/master
- Add the binding for the environment variables using Bindings and share them as Secret Text. You need to set up WMF_WPT_KEY, WPT_ORG_WPT_KEY, WPT_USER & WPT_USER_PASSWORD
- Run the tests in an execute shell build step:
#!/bin/bash
declare -i RESULT=0
# These tests run on our own Linux WebPageTest instance
export STATSV_ENDPOINT="https://www.wikimedia.org/beacon/statsv"
export WPT_RUNS="5"
export WPT_MOBILE_RUNS="5"
export WMF_WPT_LOCATION="us-east"
npm install --production
./bin/index.js --batch ./scripts/batch/mobile.txt
RESULT+=$?
./bin/index.js --batch ./scripts/batch/desktop.txt
RESULT+=$?
./bin/index.js --batch ./scripts/batch/login-mobile.txt
RESULT+=$?
./bin/index.js --batch ./scripts/batch/login-desktop.txt
RESULT+=$?
./bin/index.js --batch ./scripts/batch/second-view-mobile.txt
RESULT+=$?
./bin/index.js --batch ./scripts/batch/second-view-desktop.txt
RESULT+=$?
exit $RESULT
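The build step above adds up the exit codes of all batches. Because a shell exit status is reduced modulo 256, a sum could in principle wrap around to 0. A variant that only records whether any batch failed might look like the sketch below; run_batches is a made-up name and not part of the repository.

```shell
#!/bin/bash
# Run a command once per batch file and fail if any of the runs failed.
# run_batches is a hypothetical helper; the first argument is the command to run.
run_batches() {
  local runner="$1" failed=0 batch
  shift
  for batch in "$@"; do
    # Remember the failure but keep going so every batch still runs.
    "$runner" --batch "$batch" || failed=1
  done
  return "$failed"
}
```

In the Jenkins step this could be used as run_batches ./bin/index.js ./scripts/batch/mobile.txt ./scripts/batch/desktop.txt followed by exit $?.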
We automatically run wpt a couple of times a day to collect metrics and send them to statsv. If you want to look at a specific test, go to http://wpt.wmftest.org/testlog/30/ and choose Show tests from all users. You will then see all the test runs for the last 30 days. You can change the time span by changing View and choosing Update list.
Test as an authenticated user (use scripting)
In the world of WebPageTest you can either test a specific URL or write a script that performs a couple of interactions, such as accessing a page, logging in the user and accessing another page. You can also choose when to start collecting metrics. You can read more about the setup here: https://sites.google.com/a/webpagetest.org/docs/using-webpagetest/scripting
Scripts follow the same pattern as batch files, meaning you can have variables in your script that are replaced with the values of environment variables at run time. Name your variable <%YOUR_NAME_HERE> and it will be replaced with process.env.YOUR_NAME_HERE.
Using custom metrics
How to test new features/changes
One idea behind using WebPageTest is that we can also test new features and changes, and have a good way of measuring metrics before and after a change. If you want to collect metrics for a change, it is important that you run the test many times so that the median value reflects the change. Many times means at least 31 times :) Always choose an odd number of runs; that way a real run is picked as the median.
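The reason for the odd number of runs can be shown with a small sketch: with an odd count, the middle value of the sorted list is a value that was actually measured in one of the runs. The helper name median_run and the numbers in the usage note are made up for illustration.

```shell
#!/bin/bash
# Print the median of an odd number of integer measurements (e.g. SpeedIndex values).
# median_run is a hypothetical helper used only for illustration.
median_run() {
  local count=$#
  if [ "$count" -eq 0 ] || [ $((count % 2)) -eq 0 ]; then
    echo "error: need an odd number of values, got $count" >&2
    return 1
  fi
  # With an odd count the middle line of the sorted list is a value from a real run.
  printf '%s\n' "$@" | sort -n | sed -n "$(( (count + 1) / 2 ))p"
}
```

For example, median_run 1200 900 1500 prints 1200. With an even number of values there is no single middle run, so the helper refuses to pick one.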
If you want to run a large number of tests, you need to set up a new agent instance, otherwise they will compete with our hourly runs.
Choose to report the result as CSV. You will then have one file with all the URLs you tested under one condition. Switch the condition, create another CSV file and import the two files into a program that handles CSV files. Each row of the CSV file contains the tested URL and the different metrics. This is a good approach:
- Create a batch file with the URLs you want to test. Make sure to parameterize things like connectivity, number of runs and so on. One line in your batch file can look like this:
--webPageTestKey <%WMF_WPT_KEY> --webPageTestHost wpt.wmftest.org --median SpeedIndex --location <%WMF_WPT_LOCATION>:Chrome --label chrome --runs <%WPT_RUNS> --connectivity <%WPT_CONNECTIVITY> --reporter csv --file <%CSV_FILE_NAME> --timeout <%WPT_TIMEOUT> https://en.wikipedia.org/wiki/Main_Page
- Add the rest of the URLs that you want to test in the file by copy/pasting the row and changing the URL at the end.
- Set up your environment so that you export the environment variables. The first time you run it, make sure to use only one run and good connectivity, like cable, so that you get through the test quickly and can see that all URLs and the configuration are OK. The timeout is high by default, but if you test with 2g, for example, the test will take a long time to finish, so you need to increase the timeout value (you define it in seconds and the default is 1200, that is 20 minutes).
- Verify the CSV file. Does it look the way you need it? Good, keep going.
- Change the number of runs to the number you want (and the connectivity) and run the script again. This can take time. Keep your terminal open; each test is submitted when the previous one has finished.
- Ok we have some numbers, now configure the change so that when you access the test URL we have the new feature/change. Make sure to change the name of the CSV file, so we keep two different ones.
- Run the test again.
- Now you have the numbers! Let's compare them. If you need help understanding them, talk to the performance team and we will help you analyze them.
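If you prefer the command line to a spreadsheet for a first comparison, the following sketch prints the median of one column of a CSV file. It assumes a simple CSV with a header row and no quoted commas. The function name csv_column_median is hypothetical, and the column name you pass must match whatever header your file actually contains (SpeedIndex is only an assumed example).

```shell
#!/bin/bash
# Print the median of one named column in a simple CSV file
# (header row first, no quoted commas). Hypothetical helper for illustration.
csv_column_median() {
  local file="$1" column="$2"
  awk -F, -v col="$column" '
    NR == 1 {
      # Find the index of the requested column in the header row.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "error: no column named " col > "/dev/stderr"; exit 1 }
      next
    }
    $idx != "" { print $idx }
  ' "$file" | sort -n | awk '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      if (NR % 2) { print v[(NR + 1) / 2] }
      else { print (v[NR / 2] + v[NR / 2 + 1]) / 2 }
    }'
}
```

Run it once on the CSV file from before the change and once on the file from after the change, for example csv_column_median before.csv SpeedIndex and csv_column_median after.csv SpeedIndex, and compare the two numbers. If the file contains several URLs, filter it down to one URL first so that you compare like with like.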
Caution: Choose what to see on the result page
By default WebPageTest picks the median run based on pageLoadTime. That's not optimal because we want to focus on SpeedIndex or start render time. By adding parameters to the URL of the result page, you can choose which run and metric are used to pick the run that is shown. Choose which metric to use and whether you want the median or the fastest run:
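As a sketch, a result URL with such parameters could be built like this. The query parameters medianMetric and medianRun are, to our knowledge, the ones WebPageTest accepts on result pages, but treat them as an assumption and verify them against the WebPageTest version you run. The helper name result_url and the test id in the usage note are placeholders.

```shell
#!/bin/bash
# Build a result page URL that selects the shown run by a given metric.
# result_url is a hypothetical helper; medianMetric/medianRun are assumed to be
# supported by the WebPageTest version you run.
result_url() {
  local host="$1" test_id="$2" metric="$3" pick="${4:-median}"
  local url="${host%/}/result/${test_id}/?medianMetric=${metric}"
  # Ask for the fastest run instead of the median run when requested.
  if [ "$pick" = "fastest" ]; then
    url="${url}&medianRun=fastest"
  fi
  printf '%s\n' "$url"
}
```

For example, result_url http://wpt.wmftest.org TEST_ID SpeedIndex prints a URL that picks the median run by SpeedIndex; add fastest as a fourth argument to pick the fastest run instead.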
WebPageTest and AWS
WebPageTest consists of two separate entities: a server and one or more agents. On AWS there are ready-made AMIs (prepared images) for both, so it is easy to click and deploy.
WebPageTest can run headless or not. Headless in this context means that there is no GUI available to start a test; you then need to use the API to submit tests.
Setup the server
It can be hard to find the right AMI. For a server in us-west we use AMI id ami-d7bde6e7.
- Find the right AMI (ami-d7bde6e7) under Images/AMI and pick it (make sure to choose Public images).
- Choose Launch and use type t2.micro. Make sure to choose Next: Configure Instance Details
- Go to the Advanced Details section and add the configuration for the server. Make sure you change all the secret placeholders to the real values and choose Next: Choose storage
- Nothing you need to do here, choose Next: Tag Instance
- Add the tag Name with the value: WebPageTest Server and then Next: Configure Security Group
- Change the SSH access to only be our IP range.
- Add access for HTTP by choosing Add Rule - use the dropdown and choose HTTP and keep the rest of the values default.
- Choose Review and launch and then Launch. You will be asked to choose an existing key pair or create a new one. Create a new pair (name it webpagetest) and download it. You will need the keys to be able to SSH to the server so make sure to download them.
- Attach the Elastic IP to the new server: go to NETWORK & SECURITY/Elastic IPs, choose the IP and then Associate Address (we use ip 184.108.40.206).
- Use the tag WebPageTest Server in the instance field and choose Associate.
- You should now be able to access http://wpt.wmftest.org and see the "headless" start page.
Log in to the server
ssh -i webpagetest.pem firstname.lastname@example.org
These are the configuration details that you use in the Advanced Details section.
email@example.com
ec2_key=SECRET
ec2_secret=SECRET
; the key used when starting tests
api_key=SECRET
; no GUI for submitting tests, but we can check the results
headless=1
; Define maximums runs per URL
maxruns=51
; Quality of images, lets define this to something good
iq=80
; save png full-resolution screen shots
pngss=1
; automatically update the agent when a new version is available
agentUpdate=http://cdn.webpagetest.org/
; needed for autoscaling
EC2.default=us-east-1
; keep an instance up and running
EC2.us-east-1.min=1
EC2.us-east-1.max=2
; how long to keep tests locally before sending them to S3
archive_days=0
; archiving to s3 (using the s3 protocol, not necessarily just s3)
archive_s3_server=s3.amazonaws.com
archive_s3_key=SECRET
archive_s3_secret=SECRET
archive_s3_bucket=wpt-wikimedia
WebPageTest can automatically store the test results on S3 and that is perfect for us so we can drop the server instance whenever we want.
To set up S3 (these are the instructions for doing it the first time):
- Log into the AWS console and choose S3
- Choose Create a bucket
- Add a Bucket name and name it wpt-wikimedia (the bucket name needs to correspond to the property archive_s3_bucket when you configure the server).
- Add the Region. We use Oregon and that matches the configuration property archive_s3_server key.
- Choose Create and we have created the bucket.
- The next step is to set up the properties on the bucket, meaning giving access for HTTP traffic and allowing the server to upload the test results.
- Choose your bucket (webpagetest) and choose Properties/Permissions.
- Choose Add more permissions and add Authenticated Users as Grantee and give it Upload/Delete permissions.
- Then we need to configure how long we want to store the data.
- Choose Lifecycle and Add rule.
- Apply the rule for the whole bucket
- Choose Permanently Delete and 370 days.
- Choose Review and add a Rule Name: Permanently remove tests after 370 days
- Choose Create and Activate Rule
We are now finished setting up S3.
WebPageTest is stateless and stores everything on file. To be able to find old tests, WebPageTest uses a log file. The log file is not backed up to S3, so to be able to find old tests if the server is dropped, we need to store the logs on a separate disk.
- Choose Elastic Block Store / Volumes
- Choose Create Volume
- Choose a Size in GiB (the lowest 30GB will do fine)
- Choose Availability Zone. Use the same as the server
- Leave everything else as the default and choose Create.
- Choose the radio button for the newly created volume and the Tag label.
- Add a new Tag with the key Name and the value WebPageTest logs and choose Save.
- Make sure the volume is selected with the radio button and choose Action/Attach Volume.
- Choose WebPageTest server as the instance and use the default Device and choose Attach.
- The volume is now attached to our server, the next step is to login to the server and make sure that the logs are stored on the device.
- Use the pem file for the server and log in: ssh -i NAME.pem ubuntu@SERVER_IP (change NAME.pem to your pem file and SERVER_IP to the real IP). Then follow these instructions: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-using-volumes.html
- Follow the instructions and mount the device to /data
- Now your volume is mounted, the next step is to change WebPageTest log dir to a symbolic link to a directory that exists on the volume. If you haven't done any tests, the directory should be empty except for a .htaccess file.
- Make your new directory on the mounted device: sudo mkdir /data/logs
- Move the access file: sudo mv /var/www/webpagetest/www/logs/.htaccess /data/logs
- Remove the old one: sudo rm -fR /var/www/webpagetest/www/logs/
- Make the symbolic link: sudo ln -s /data/logs /var/www/webpagetest/www/logs
- Make sure we have the right owner for the directory: sudo chown -h www-data:www-data /data/logs
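The last steps of the list can be collected into one small function, sketched below with the directories as parameters so it can be tried in a scratch directory first. Unlike the manual steps, this sketch moves every existing file (not only .htaccess) to the new location instead of deleting the old directory, and it leaves out sudo and the chown step. The name relocate_logs is hypothetical.

```shell
#!/bin/bash
# Move a log directory onto another volume and leave a symbolic link behind.
# relocate_logs is a hypothetical helper; on the server the directories would be
# /var/www/webpagetest/www/logs and /data/logs, and the commands need sudo.
relocate_logs() {
  local old_dir="$1" new_dir="$2"
  # Nothing to do if the link is already in place.
  if [ -L "$old_dir" ]; then
    return 0
  fi
  mkdir -p "$new_dir" || return 1
  if [ -d "$old_dir" ]; then
    # Move everything, including dotfiles such as .htaccess.
    find "$old_dir" -mindepth 1 -maxdepth 1 -exec mv {} "$new_dir"/ \;
    # rmdir fails if something could not be moved, so no data is lost silently.
    rmdir "$old_dir" || return 1
  fi
  ln -s "$new_dir" "$old_dir"
}
```

Because the function returns early when the link already exists, running it a second time is harmless.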
Depending on the AMI image, it could be that we are missing connectivity profiles: 3GFast, 3GSlow and 2G. If they are missing, you should add them in /var/www/webpagetest/www/settings/connectivity.ini
[3GFast]
label="Mobile 3G - Fast (1.6 Mbps/768 Kbps 150ms RTT)"
bwIn=1600000
bwOut=768000
latency=150
plr=0
timeout=120
[3GSlow]
label="Mobile 3G - Slow (780 Kbps/330 Kbps 200ms RTT)"
bwIn=780000
bwOut=330000
latency=200
plr=0
timeout=240
[2G]
label="Mobile 2G (280 Kbps/256 Kbps 800ms RTT)"
bwIn=280000
bwOut=256000
latency=800
plr=0
timeout=300
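A quick way to see which of these profiles are missing is to look for their section headers in the ini file. The sketch below does that with grep; the function name missing_profiles is made up, and it assumes each section header stands alone on its line without trailing whitespace.

```shell
#!/bin/bash
# Print the connectivity profiles that have no [section] in an ini file.
# missing_profiles is a hypothetical helper used only for illustration.
missing_profiles() {
  local file="$1" profile
  shift
  for profile in "$@"; do
    # -F matches the section header literally, -x requires the whole line to match.
    grep -qxF "[$profile]" "$file" || printf '%s\n' "$profile"
  done
}
```

On the server you would call it as missing_profiles /var/www/webpagetest/www/settings/connectivity.ini 3GFast 3GSlow 2G. If it prints nothing, all three profiles are already defined.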
The username and password for the master AWS account are recorded in iron:/srv/passwords/aws-webpagetest. Please avoid using this account directly. Ask an existing maintainer to create an IAM user for you instead. The IAM users sign-in link for this account is https://wikimedia.signin.aws.amazon.com/console .
If something isn't working for you on the WebPageTest server instance, you can find the logs here (yep, they are in different locations):
/var/www/webpagetest/www/cli/archive.log
/var/www/webpagetest/www/logs/
/var/www/webpagetest/www/ec2/log
/var/www/webpagetest/www/log
/var/log/nginx/error.log
Restart the server
If for some reason you want to restart the server (normally if you manually changed settings in /var/www/webpagetest/www/settings/settings.ini) restart nginx:
sudo service nginx restart
Archive old tests
Old test data will automatically be sent to S3. But we also need to remove old tests; the easiest way to do that is to run the archive page. We do it in the crontab. Edit the crontab (as the ubuntu user): crontab -e
SHELL=/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
0 * * * * curl -sS http://wpt.wmftest.org/cli/archive.php >> /tmp/cron.txt
We then run archiving every hour.
We use Amazon CloudWatch to keep track of the disk space of the WebPageTest server. You need to install a couple of libraries to get it up and running, follow the instructions. Then add one line to your crontab to start sending the metrics to Amazon.
AWS_CREDENTIAL_FILE=~/.aws/awscreds.txt
*/5 * * * * ~/aws-scripts-mon/mon-put-instance-data.pl --disk-space-used --disk-space-avail --disk-path=/ --from-cron
Then set up an alarm for the disk space. The current alarm warns (sends an email to the web perf list) when we have only 2 GB of free disk space left.
Install the wrapper on the server
You only need to do this if you don't want to run the wrapper on Jenkins. First we install node & git to be able to get and run our wrapper:
sudo apt-get update && sudo apt-get install -y git nodejs npm && sudo ln -s /usr/bin/nodejs /usr/local/bin/node
Install the latest version of the wrapper
npm install wpt-reporter -g
Add jobs to run automatically
To schedule jobs we use Jenkins.
Setup the agents
We have an agent up and running in us-east on EC2 and it is called us-east. We use the Docker version of the agent that you can get from https://hub.docker.com/r/wikimedia/wptagent.
- In the AWS console, choose to Launch instance and choose Ubuntu Server 18.04 LTS .
- Choose the instance size c5.xlarge.
- Choose the WebPageTestAgent.pem file and start the server.
The next step is to log in to the server and install Docker. Follow the official Docker install instructions: https://docs.docker.com/install/linux/docker-ce/ubuntu/
Then you can create a start script on the server (start.sh):
#!/bin/bash
VERSION=Mozilla_Firefox_66.0.2-Google_Chrome_73.0.3683.86_-2019-04-03
sudo modprobe ifb numifbs=1
sudo docker run -d \
  -e SERVER_URL="http://wpt.wmftest.org/work/" \
  -e LOCATION="us-east" \
  -e KEY="SECRET" \
  --cap-add=NET_ADMIN \
  --shm-size=2g \
  --name wptagent \
  --init \
  -v /etc/localtime:/etc/localtime:ro \
  wikimedia/wptagent:$VERSION
You need to change the VERSION to the latest tagged version on https://hub.docker.com/r/wikimedia/wptagent and change the SECRET to the secret configured on the WebPageTest server (called location_key in /var/www/webpagetest/www/settings/settings.ini). The LOCATION needs to match the location configured on the server.
You do that in /var/www/webpagetest/www/settings/locations.ini. That file is parsed with the ec2_locations.ini file and the result is the configured agents.
[locations]
1=Hosted_Linux
default=Hosted_Linux
[Hosted_Linux]
1=us-east
label="Linux US east 1"
default=us-east
group=Desktop
[us-east]
browser=Chrome,Chrome Beta,Chrome Canary,Firefox,Firefox Nightly,Opera,Opera Beta,Opera Developer
label="Linux US east"
You can read more about how to configure the locations.ini file at https://github.com/WPO-Foundation/webpagetest/blob/master/www/settings/locations.ini.sample
When you have changed the locations.ini file, restart nginx on the WebPageTest server:
sudo service nginx restart
Then you can start your agent:
And then verify that you can see your instance at http://wpt.wmftest.org/getLocations.php?f=html
Connect to an agent
You can ssh to the agent with the WebPageTestAgent.pem file. You can find the IP of the agent on AWS.
Timeout and agents not responding
A couple of times we have seen one of the agents just stop working. You can tell because all tests time out, and if you go to http://wpt.wmftest.org/ and check the latest finished results, the report will say that the agent couldn't be contacted by the server. To fix that, log in to the AWS console, go to EC2 management and make sure you are in the "US East" region, choose the agent (it is named WebPagetest Agent), and choose Instance state -> Restart.
Bulk test
If you want to test changes before and after, it's super important to run the test many times to get correct values; use WPTBulkTest for that. Make sure to set up a new agent for your bulk test!
Using http://wpt.wmftest.org/
At the moment our test instance is busy running our continuous performance tests that we graph on https://grafana.wikimedia.org/dashboard/db/webpagetest. We run one test agent to minimize costs. If you want to use wpt.wmftest.org to run your own one-shot tests, I (firstname.lastname@example.org) can help you with that. You will need the key for the instance and need to choose which location you want to use, and then I can help you verify that the location is set up with the correct instance type.
Alert setup
We run automatic tests every hour (you can find the tests here https://github.com/wikimedia/wpt-reporter/tree/master/scripts/batch). We mainly test the English Wikipedia: 3 desktop URLs using Chrome, the same URLs using Firefox and the mobile version on emulated mobile. You can see how we graph the metrics (and alert on regressions): https://grafana.wikimedia.org/dashboard/db/webpagetest-alerts
Our main focus is testing with an empty browser cache, but we also run tests with multiple page views (first hit one page and then another) and as authenticated users. The metrics are too unstable at the moment to add alerts but we hope we can do that in the future.