Pixel

From Wikitech

Pixel is a tool that the Wikimedia Foundation uses to detect UI regressions. It is currently hosted at https://pixel.wmcloud.org/. If you run into problems, check out Pixel/Runbook.

Maintainers

The code is maintained by the Wikimedia Foundation QTE Team. It was originally created by the Web Team.

Current setup

The current production version runs on production.pixel.eqiad1.wikimedia.cloud and can be accessed through https://pixel.wmcloud.org/.

How it works

The machine is hosted on Cloud VPS. You can SSH into it using:

ssh production.pixel.eqiad1.wikimedia.cloud

High level overview

  • The git repository is checked out inside /home/pixel/pixel
  • A cron job runs the bash script /home/pixel/pixel.sh, which executes the Pixel runAll command to create index.html and the report pages. The priority passed to the script depends on the hour of the day. According to the crontab on this page, priority 1 jobs run every two hours (except at 10:00 and 12:00 UTC), priority 2 jobs run once a day at 10:00 UTC, and priority 3 jobs run once a day at 12:00 UTC.
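
The mapping from hour of day to priority can be sketched as a small shell function. This is only an illustration derived from the crontab documented on this page: the function name priority_for_hour is hypothetical and is not part of Pixel or pixel.sh, and the real scheduling is done by cron.

```shell
#!/bin/bash
# Hypothetical helper (not part of pixel.sh): print the priority that the
# crontab on this page would start at a given UTC hour, or print nothing
# if no runAll job starts at that hour.
priority_for_hour() {
  local hour=$((10#$1))   # force base 10 so "08" is not read as octal
  case "$hour" in
    10) echo 2 ;;
    12) echo 3 ;;
    0|2|4|6|8|14|16|18|20|22) echo 1 ;;
    *) ;;                 # odd hours: no runAll job scheduled
  esac
}

priority_for_hour 08   # prints 1
priority_for_hour 10   # prints 2
priority_for_hour 13   # prints nothing
```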

Code

~/.bashrc

Used by the cron job to load the correct Node.js version.

export NONINTERACTIVE=true
export MW_SERVER=https://en.wikipedia.org
export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm
[ -s "$NVM_DIR/bash_completion" ] && \. "$NVM_DIR/bash_completion"  # This loads nvm bash_completion

/home/pixel/pixel

The code is a clone of https://github.com/wikimedia/pixel.

/home/pixel/pixel.sh

The script generates the reports and index page.

#!/bin/bash

# At the moment the script runs as root (as on the old pixel server). The reason
# is that the output of the docker containers is stored as root, and the
# pixel.js script cannot otherwise access those files.
#
# The script is called from the pixel user's crontab.
whoami

export PATH=/usr/bin/:/home/pixel/.nvm/versions/node/v18.17.0/bin/:$PATH
source /home/pixel/.bashrc  &&
cd /home/pixel/pixel || exit 1

export PIXEL_REPORT_DIRECTORY=/mnt/pixel-data/reports
export NONINTERACTIVE=true
export NVM_DIR="/home/pixel/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"
nvm use || exit 1

priority=$1

if [ -z "$priority" ]; then
  echo "Error: Priority argument is missing"
  exit 1
fi

# Make sure we pick up the latest tag
current_tag=$(git describe --tags --exact-match 2>/dev/null || echo "")
git fetch --tags
latest_tag=$(git tag --sort=-creatordate | head -n 1)
if [ "$latest_tag" != "$current_tag" ]; then
    echo "New tag: $latest_tag. Switch to that tag."
    git checkout "tags/$latest_tag"
else
    echo "Already on the latest tag: $current_tag"
fi

echo -e "\n\n\n"
echo -e "\e[32m===================================================\e[0m"
echo "Starting Pixel 'runAll --priority $priority'"
echo -e "\e[32m===================================================\e[0m"
echo "UTC datetime: $(date -u +"%Y-%m-%d %H:%M:%S %Z")"
echo "PDT datetime: $(TZ='America/Los_Angeles' date +"%Y-%m-%d %H:%M:%S %Z")"
echo -e "\e[34m---------------------------------------------------\e[0m"
echo -e "\n\n\n"

start_time=$SECONDS

node pixel.js runAll --priority "$priority" --directory "$PIXEL_REPORT_DIRECTORY"
exit_status=$?
end_time=$SECONDS

duration=$((end_time - start_time))
hours=$((duration / 3600))
minutes=$(((duration % 3600) / 60))
seconds=$((duration % 60))

echo "Exit code $exit_status"
echo -e "\e[34m---------------------------------------------------\e[0m"
printf "Finished Pixel 'runAll --priority %s'\nDuration (%s): %02d:%02d:%02d\n" "$priority" "$priority" $hours $minutes $seconds
echo -e "\e[32m===================================================\e[0m"
echo -e "\n\n\n"

timestamp=$(date +%s)

nc -w 5 wpt-graphite.wmftest.org 2003 <<< "pixel.jobprio.${priority}.failure ${exit_status} ${timestamp}"


if [ "$exit_status" -eq 0 ]; then
    nc -w 5 performance-testing-graphite.wmftest.org 2003 <<< "pixel.jobprio.${priority}.duration ${duration} ${timestamp}"
fi

REPORT_DIR="/home/pixel/pixel/report/"
CURRENT_TIME=$(date +%s)
find "$REPORT_DIR" -type f -name "report.json" | while read -r FILE; do
  ID=$(jq -r '.id' "$FILE")
  PASS=$(jq '.tests | map(select(.status=="pass")) | length' "$FILE")
  FAIL=$(jq '.tests | map(select(.status=="fail")) | length' "$FILE")
  FILE_TIME=$(stat -c %Y "$FILE")
  if [ "$ID" != "MediaWiki" ]; then
    echo "Sending $ID pass[$PASS] fail[$FAIL] for time ${FILE_TIME}"
    nc -w 2 performance-testing-graphite.wmftest.org 2003 <<< "pixel.job.${ID}.pass ${PASS} ${FILE_TIME}"
    nc -w 2 performance-testing-graphite.wmftest.org 2003 <<< "pixel.job.${ID}.fail ${FAIL} ${FILE_TIME}"
  fi
done
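
The metrics in the script above are sent using Graphite's plaintext protocol, where each line has the form "metric path, value, Unix timestamp" separated by spaces, and the script feeds that line to port 2003 with nc. A minimal sketch of building such a line, without sending anything, is shown here. The function name graphite_line is an assumption made for illustration and does not exist in pixel.sh.

```shell
#!/bin/bash
# Hypothetical helper: format one Graphite plaintext-protocol line.
# Arguments: metric path, value, Unix timestamp (defaults to now).
graphite_line() {
  local path=$1 value=$2 timestamp=${3:-$(date +%s)}
  printf '%s %s %s\n' "$path" "$value" "$timestamp"
}

# Same metric name pattern as pixel.sh uses for the job duration.
priority=1
graphite_line "pixel.jobprio.${priority}.duration" 1234 1700000000
# prints: pixel.jobprio.1.duration 1234 1700000000
```

To actually send the line, the output could be piped to nc in the same way the script does, for example to nc -w 5 performance-testing-graphite.wmftest.org 2003.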

su crontab -l

pixel.sh is passed the priority:

PATH=/usr/bin/:/home/pixel/.nvm/versions/node/v20.11.1/bin/:$PATH
0 0,2,4,6,8,14,16,18,20,22 * * * sudo su -c "/home/pixel/pixel.sh 1 >> /var/log/pixel/pixel.log 2>&1"
0 10 * * * sudo su -c "/home/pixel/pixel.sh 2 >> /var/log/pixel/pixel.log 2>&1"
0 12 * * * sudo su -c "/home/pixel/pixel.sh 3 >> /var/log/pixel/pixel.log 2>&1"
45 4 * * * sudo su -c "cd /home/pixel/pixel && ./optimize-pngs.sh >> /var/log/pixel/pixel-optimize-png.log 2>&1"
10 5 * * * sudo su -c "cd /home/pixel/pixel && ./rebuild.sh >> /var/log/pixel/pixel-rebuild.log 2>&1"
35 5 * * * /home/pixel/archive.sh >> /var/log/pixel/pixel-archive.log 2>&1
45 23 * * * sudo su -c "find /mnt/pixel-data/reports/* -mtime +60 -exec rm {} \; >/dev/null 2>&1"

Script          UTC           PDT           Minute      Hour
p1              12:00 AM      05:00 PM      0           0
p1              02:00 AM      07:00 PM      0           2
p1              04:00 AM      09:00 PM      0           4
optimize-pngs   04:45 AM      09:45 PM      45          4
rebuild         05:10 AM      10:10 PM      10          5
archive         05:35 AM      10:35 PM      35          5
p1              06:00 AM      11:00 PM      0           6
p1              08:00 AM      01:00 AM      0           8
p2              10:00 AM      03:00 AM      0           10
p3              12:00 PM      05:00 AM      0           12
p1              02:00 PM      07:00 AM      0           14
p1              04:00 PM      09:00 AM      0           16
p1              06:00 PM      11:00 AM      0           18
p1              08:00 PM      01:00 PM      0           20
p1              10:00 PM      03:00 PM      0           22
find            11:45 PM      04:45 PM      45          23
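
The PDT column in the table is the UTC time minus seven hours. A small sketch of that conversion follows. The name utc_to_pdt is a hypothetical helper, and the fixed seven-hour offset is only valid while daylight saving time is in effect (PST is UTC minus eight).

```shell
#!/bin/bash
# Hypothetical helper: convert a UTC hour and minute to a 12-hour PDT time.
# Assumes a fixed UTC-7 offset (PDT); it does not handle PST.
utc_to_pdt() {
  local hour=$((10#$1)) minute=$((10#$2))
  local pdt_hour=$(( (hour + 24 - 7) % 24 ))
  local suffix=AM
  if [ "$pdt_hour" -ge 12 ]; then suffix=PM; fi
  local display=$(( pdt_hour % 12 ))
  if [ "$display" -eq 0 ]; then display=12; fi
  printf '%02d:%02d %s\n' "$display" "$minute" "$suffix"
}

utc_to_pdt 10 0    # prints 03:00 AM (the p2 run)
utc_to_pdt 23 45   # prints 04:45 PM (the find cleanup)
```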

Dashboard

Metrics from the server are pushed to Graphite and displayed on the Grafana dashboard at https://grafana.wikimedia.org/d/lC3anj1Iz/pixel.

Setup

The server is set up using these instructions and runs a tagged version of pixel.

There is also a beta server, beta.pixel.eqiad1.wikimedia.cloud, that runs the same setup except that it runs the latest commit of pixel instead of a tag. You can access it through https://pixel-beta.wmcloud.org/.

User

Pixel runs as the user pixel. You can switch to the pixel user with sudo su - pixel. Pixel runs are triggered through the crontab of the pixel user. The git repository is cloned in the user's home directory, a tag is checked out, and the tests run from that directory. When you log in to the server, you will see information about the setup.

Update to a new pixel

Documented in the Runbook.

Volumes

There are two volumes attached to the server:

  • /mnt/docker - Docker uses the volume for all Docker files.
  • /mnt/pixel-data - All result files from pixel.

Logs

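Pixel logs to /var/log/pixel/pixel.log, and the cleanup script logs to /var/log/pixel/pixel-clean.log. As the crontab shows, the optimize-pngs, rebuild and archive jobs log to pixel-optimize-png.log, pixel-rebuild.log and pixel-archive.log in the same directory.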
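
Before changing the retention period used by the find cleanup entry in the crontab, it can be useful to list what would be deleted. The sketch below wraps the same -mtime +60 test in a function that only prints matching files. The function name list_expired_reports and the -type f restriction are assumptions for illustration and are not part of the current setup.

```shell
#!/bin/bash
# Hypothetical dry-run helper: list files older than N days (default 60)
# under a directory instead of deleting them.
list_expired_reports() {
  local dir=$1 days=${2:-60}
  find "$dir" -type f -mtime +"$days" -print
}

# Example against the report directory used by the crontab on this page
# (only runs if the directory exists).
if [ -d /mnt/pixel-data/reports ]; then
  list_expired_reports /mnt/pixel-data/reports 60
fi
```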

Workarounds

There are a couple of hacks in the current setup that we should fix:

  • When Pixel's Docker containers run, the output files are owned by root, and the pixel script (started by the user pixel) does not have write access to them. As a workaround, the pixel user runs the script as root via sudo su. This issue is tracked in T365419.
  • The report directory (where pixel writes the reports) is a symlink to the volume that has been set up for pixel data. However, /report/ is committed to the pixel repo (with all files within the folder ignored). One fix would be to gitignore the whole folder. Another would be to make the output folder configurable in pixel.js; however, at the moment there is a mix between using the start script's folder (__dirname in Node.js) and setting up the report directory using docker-compose. This is tracked in T365422.
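
To check the second point on a server, one can verify that the report directory inside the checkout really resolves to the data volume. This is a minimal sketch: the function name resolves_to is hypothetical, the exact link target used in the usage example is an assumption based on the paths mentioned on this page, and readlink -f is assumed to be the GNU coreutils version.

```shell
#!/bin/bash
# Hypothetical check: succeed if path $1 is a symlink whose fully resolved
# target equals the fully resolved path $2.
resolves_to() {
  local link=$1 expected=$2
  [ -L "$link" ] && [ "$(readlink -f "$link")" = "$(readlink -f "$expected")" ]
}

# Usage on the Pixel server (paths taken from this page; the link target
# is an assumption).
if resolves_to /home/pixel/pixel/report /mnt/pixel-data/reports; then
  echo "report directory points at the data volume"
else
  echo "report directory is not a symlink to the data volume"
fi
```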