Backport windows/Deployers
This document is intended to provide detailed instructions for deployers as to how to run the backport windows. Hopefully, this document will prove useful to new deployers as well as provide a place for more experienced deployers to take notes on any tips and tricks they have discovered in the course of doing deployments.
General advice before you start
- Claim the window early to avoid confusion.
- When jouncebot pings deployers in #wikimedia-operations, say so if you want to run that window, for example:
I can do the deploys today!
- Try to think out loud and be explicit.
- If you are nervous about deploying a particular patch, mention it to the patch owner. It's better to have a conversation than to quietly fret over patches. If the patch is high risk and you're not comfortable deploying it, you can decline, and if no deployers are available, the patch owner can reschedule.
- Be prepared.
- Open the relevant SSH connections and browser tabs before you start deploying code; see the relevant section in this document for details.
- If the patch requires a maintenance script be run afterwards, make sure that the patch owner has provided for this, or that you are comfortable running the script yourself; see the maintenance scripts section of this document for how to run them and some examples, or Wikimedia_site_requests#Common_tasks_that_need_a_maintenance_script for a longer list of changes that require scripts, and more details about them.
- Learn how manual deployment works and then don't do it. Use scap backport instead.
- Release the window early when done to mark the rest of the window free, using one of the following or something to that effect:
!log UTC morning deploys done
!log UTC afternoon deploys done
!log UTC late deploys done
- Add git information to your terminal prompt. The git-prompt script is available on deployment servers. There are instructions for use in the comments at the beginning of the script. One simple way to use it is to add the following to your ~/.profile:
GIT_PS1_SHOWUNTRACKEDFILES=1
GIT_PS1_SHOWDIRTYSTATE=1
GIT_PS1_SHOWUPSTREAM="auto verbose"
. /etc/bash_completion.d/git-prompt
PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
- Enable the Git configuration option status.submoduleSummary. Submodules have limited visibility in git status and it's easy to miss the git submodule update step. Git can show you a short summary of submodule changes in git status. Enable it by executing:
you@deploy1002:~$ git config --global status.submoduleSummary true
Note that some submodules may not be brought up to date by git submodule update, and so a given repository may appear "dirty". This is the normal state of the repository. See https://phabricator.wikimedia.org/T229285 for further related discussion.
Deploying patches outside scheduled deployment windows
Sometimes there's a need to deploy a patch outside of a scheduled window. There's a formal policy for emergency deployments. But deployers may also ship non-emergency patches outside of a dedicated window, as long as they do so thoughtfully. You should keep the following things in mind when doing these kinds of deploys:
- Communicate on -operations (or on _security). You should clearly announce when you're about to deploy and when you're done deploying. You should also remember to scroll up to see if someone else is deploying, since merging patches might take a while.
- Use jouncebot: nowandnext to determine if anything else is going on, and ensure that you have enough time to not step on any scheduled windows.
- Only deploy patches where you're confident in your ability to troubleshoot any issues that might come up, since there will be fewer people around to help.
- You still need approval from RelEng and SRE to deploy on Fridays and weekends.
- If in doubt: stick to the normal windows.
SSH Connections and Error Logs: Set up before deploying
When running the window, you'll want to see what's on the calendar, and watch error logs as you deploy so that you can be sure nothing you have just deployed is broken. Also, there are several machines on which you may need to run commands depending on the nature of the deploy; it's good to open all SSH connections before you have to think about them.
A script to set up these browser tabs and terminals automatically is available.
Browser tabs
- Deployments calendar, links to patches that are scheduled for the window.
- MediaWiki versions tool, to check what versions may need a backport.
- Zuul Status dashboard, to watch the progress of CI for the patches being merged.
- Logstash: mwdebug, ensure no warnings or errors appear on the debug host when the patch owner does their verification.
- Logstash: mediawiki-errors, ensure no new errors appear after patch is deployed to all servers.
Terminal tabs
- mwlog1002.eqiad.wmnet - to run logspam-watch, which you may wish to keep an eye on during the window.
- You may also run logspam for a one-off listing of recent errors, suitable for grepping.
- deployment.eqiad.wmnet (which is an alias to a deployment host in the currently primary data center) - This is where you stage changes. Once there you need to start a tmux or screen session; if you've never used either, try tmux. Both tmux and screen are terminal multiplexers: if you lose connection in the middle of a deploy, your terminal session stays running.
you@laptop$ ssh mwlog1002.eqiad.wmnet
you@mwlog1002:~$ logspam-watch
you@laptop$ ssh deployment.eqiad.wmnet
you@deployment-host:~$ tmux new -s deployment
you@deployment-host:~$ cd /srv/mediawiki-staging
Occasionally you may also need to run maintenance scripts on a maintenance server; maintenance.eqiad.wmnet is an alias to a maintenance host in the current primary data center. (At some point, this will be obsoleted by mwscript-k8s, currently under development at T341553.)
you@laptop$ ssh maintenance.eqiad.wmnet
you@mwmaint1002:~$
Previously, you also had to have a terminal on an mwdebug host. This is no longer necessary when using scap backport, because that command deploys the change to mwdebug servers automatically.
Using scap backport
Most deployments are a single command: scap backport <change_number_or_url>.
This command will backport Gerrit patches for operations/mediawiki-config, any currently deployed MediaWiki version, extension, skin, or submodule of any of the above. It can deploy single patches, or multiple changes together. It handles merging the Gerrit patch (if needed), deploying to the test servers, waiting for confirmation, and then syncing to all appservers (including Kubernetes).
See Scap Backport Deployments for more details.
See below for detailed instructions on how to manually deploy backports.
Merging and applying patches
The deploy-commands tool, for any given gerrit change, produces a list of commands for deployment and revert that can be copy-pasted, and this tool is automatically linked to patches added to the Deployment Calendar via Module:Gerrit. Don't copy-paste without reading carefully first, though; you should double-check the output in all cases.
You will be merging patches either for a wmf branch of mediawiki repositories (core, or a WMF-installed extension or skin), or the operations/mediawiki-config repo.
Check the MediaWiki versions tool to confirm which branches may need a backport for a given patch.
Merging patches
When +2ing patches, it's often helpful to have the Zuul Dashboard open:
- to ensure that Zuul is picking up your changes,
- to see how long (approximately) the test will take before it auto-merges.
It's a good practice to put Backport as the comment when you +2, before you click Publish Comments, to ensure that there is a record of why you approved the patch.
Fetching patches
After the patch has merged in Gerrit, you need to fetch it down to deployment.eqiad.wmnet. Make sure that the code you fetch down to deployment.eqiad.wmnet is the code you expected to fetch down.
Use git log -p HEAD..@{u} after you git fetch to check that the patches you fetched down are the same ones that you +2'd. If they aren't, poke the person that wrote the patch in #wikimedia-operations to figure out what to do with the fetched code. It's always better to ask than to do something silently and unilaterally.
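The check described above can be partly automated. The following is a minimal sketch, not an existing tool: the function name check_change_ids is hypothetical, and it assumes the commits carry the usual Gerrit Change-Id footer. It only confirms that the expected changes are present; you should still read the full git log -p output to spot anything unexpected.

```shell
# Hypothetical helper: read `git log HEAD..@{u}` output on stdin and verify
# that every expected Gerrit Change-Id (passed in full as an argument) appears.
check_change_ids() {
    local log missing=0 id
    log=$(cat)
    for id in "$@"; do
        # Fixed-string match against the "Change-Id:" footer of each commit message.
        if ! printf '%s\n' "$log" | grep -qF "Change-Id: $id"; then
            echo "missing from fetched commits: $id" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the deployment host, after `git fetch` (the ID is a placeholder):
#   git log HEAD..@{u} | check_change_ids [CHANGE-ID]
```

The function returns non-zero if any expected Change-Id is absent, so it can gate the following git rebase step in a script.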
The process is slightly different for operations/mediawiki-config, mediawiki/core, mediawiki/extensions and mediawiki/skins.
operations/mediawiki-config
you@deploy1002:/srv/mediawiki-staging$ git status
you@deploy1002:/srv/mediawiki-staging$ git fetch
you@deploy1002:/srv/mediawiki-staging$ git log -p HEAD..@{u}
you@deploy1002:/srv/mediawiki-staging$ git rebase
mediawiki/core
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git fetch
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git log -p HEAD..@{u}
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git rebase
mediawiki/extensions and mediawiki/skins
As soon as the change to an extension or skin gets merged, Gerrit bumps the associated submodule in the wmf/* branch of mediawiki/core. To deploy, you thus just have to fetch that parent repository, verify the difference between the current state (HEAD) and the tracked remote branch (@{u}), rebase and update the submodule:
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git fetch
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git log -p HEAD..@{u}
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git rebase
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status [extensions|skins]/[NAME]
you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git submodule update [extensions|skins]/[NAME]
Security Patches
Refer to How to deploy code -> Security patches for information about security patches.
Deploying changes
Canary
After changes have been fetched and otherwise git-wrangled on deployment.eqiad.wmnet, changes can be fetched down to mwdebug1002.eqiad.wmnet and tested via the X-Wikimedia-Debug header.
you@mwdebug1002:~$ scap pull
After changes have been fetched, ask the patch owner to test the changes on mwdebug1002.eqiad.wmnet using X-Wikimedia-Debug.
Full deployment
After a change has been tested on mwdebug1002.eqiad.wmnet it can be deployed to all machines. To deploy the code you will run: scap sync-file <file> [message for SAL]. The code path passed to scap sync-file should be relative to /srv/mediawiki-staging.
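Because the path must be relative to /srv/mediawiki-staging, it can help to normalize whatever path you have at hand before passing it to scap sync-file. The following is a minimal sketch; the function name staging_relative_path is hypothetical and is not part of scap.

```shell
# Hypothetical helper: turn an absolute path under /srv/mediawiki-staging
# into the relative path that scap sync-file expects.
staging_relative_path() {
    local staging=/srv/mediawiki-staging path="$1"
    case "$path" in
        "$staging"/*)
            # Strip the staging directory prefix.
            printf '%s\n' "${path#"$staging"/}"
            ;;
        /*)
            # Any other absolute path is outside the staging tree.
            echo "outside $staging: $path" >&2
            return 1
            ;;
        *)
            # Already relative: pass through unchanged.
            printf '%s\n' "$path"
            ;;
    esac
}

staging_relative_path /srv/mediawiki-staging/wmf-config/InitialiseSettings.php
# prints: wmf-config/InitialiseSettings.php
```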
The message you type after the file or directory name to be synced will appear in the Server Admin Log — wikitext is legal and can be useful. Copy/pasting the wikitext for that backport item from the Deployments calendar is easy. If the Gerrit change has an associated Phabricator task, mention the task ID in the message as appropriate. This will trigger Stashbot to reply back on tasks and indicate that the associated change was synced. You can use the backport-summary script (locally, not on the deployment host) to build the summary based on the Gerrit change URL.
you@deploy1002:/srv/mediawiki-staging$ scap sync-file [FILE|FOLDER] 'Backport: [[gerrit:[GERRIT-NUMBER]|[COMMIT-MESSAGE] ([PHABRICATOR-TASK])]]'
operations/mediawiki-config
Example:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file wmf-config/InitialiseSettings.php 'Config: [[gerrit:444901|Enable FileExporter for sourceswiki (T198594)]]'
mediawiki/core
Example:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file php-1.32.0-wmf.12/includes/cp/ChronologyProtector.php 'Backport: [[gerrit:445113|rdbms: fix value of ChronologyProtector::POSITION_COOKIE_TTL ([T194403])]]'
mediawiki/extensions and mediawiki/skins
Example:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file php-1.32.0-wmf.12/extensions/WikimediaEvents 'Backport: [[gerrit:445377|Add event logging for WMDE fundraising banners (T197571)]]'
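The message format used in the examples above can be assembled by a small helper, which reduces quoting mistakes when typing it by hand. This is only a sketch: the function name sal_message is hypothetical, and it is not the backport-summary script mentioned earlier.

```shell
# Hypothetical helper: build a SAL message in the format used by the examples above.
# Usage: sal_message PREFIX GERRIT_NUMBER COMMIT_SUBJECT [PHABRICATOR_TASK]
sal_message() {
    local prefix="$1" change="$2" subject="$3" task="${4:-}"
    if [ -n "$task" ]; then
        # Mention the task ID so that Stashbot can reply on the task.
        printf '%s: [[gerrit:%s|%s (%s)]]\n' "$prefix" "$change" "$subject" "$task"
    else
        printf '%s: [[gerrit:%s|%s]]\n' "$prefix" "$change" "$subject"
    fi
}

# Print the message for inspection before using it with scap sync-file.
sal_message Config 444901 'Enable FileExporter for sourceswiki' T198594
# prints: Config: [[gerrit:444901|Enable FileExporter for sourceswiki (T198594)]]
```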
Purging
See also Multicast HTCP purging#One-off purge.
When a patch for mediawiki-config changes a file in /static, its public url must be purged from Varnish. For performance reasons, unversioned files in static have unconditional caching for up to a year. They rely on manual purges to propagate updates. This purge must be done with en.wikipedia.org as hostname of the url, regardless of which wiki the file relates to. This is because the cache for /static is shared between all wikis, and the canonical form internally for it is en.wikipedia.org.
- View url in browser. Example: https://en.wikipedia.org/static/images/project-logos/newikibooks.png
- Purge the url from a Maintenance server:
you@mwmaint1002:~$ echo 'https://en.wikipedia.org/static/images/project-logos/newikibooks.png' | mwscript purgeList.php
- Refresh url in browser.
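Since the purge must always use en.wikipedia.org as the hostname, a url copied from another wiki needs its hostname rewritten first. The following is a minimal sketch of that rewrite; the function name static_purge_url is hypothetical, and the ne.wikibooks.org url is only an illustrative input.

```shell
# Hypothetical helper: rewrite any wiki's /static/ url to the canonical
# en.wikipedia.org form that must be used for the purge.
static_purge_url() {
    local url="$1"
    case "$url" in
        https://*/static/*)
            # Drop the scheme and original hostname, keep the path after /static/.
            printf 'https://en.wikipedia.org/static/%s\n' "${url#https://*/static/}"
            ;;
        *)
            echo "not a /static url: $url" >&2
            return 1
            ;;
    esac
}

static_purge_url 'https://ne.wikibooks.org/static/images/project-logos/newikibooks.png'
# prints: https://en.wikipedia.org/static/images/project-logos/newikibooks.png
```

On a maintenance server, the output could then be piped into mwscript purgeList.php as shown in the steps above.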
Verification
After the sync and any purge is finished, monitor logs and ask patch-owner to, once again, test their changes to confirm the patch was deployed successfully. Make sure the patch-owner verifies it with X-Wikimedia-Debug turned off.
Reverting
If a patch doesn't work as expected, or causes errors, it will have to be reverted.
Process
Use scap backport --revert <change_number_or_url> to automatically revert and deploy code, or follow the instructions below. Note that the revert command will wait for the patches to merge before deploying, so in an emergency it may be ideal to revert manually.
For either process to work without git prompting you for authentication information on each use, you will need to add some configuration on the deployment server. The simplest configuration is done via a $HOME/.netrc file which git will automatically read:
touch $HOME/.netrc
chmod go-rwx $HOME/.netrc
vim $HOME/.netrc
- Add a line like:
machine gerrit.wikimedia.org login [USERNAME] password [PASSWORD]
- The [USERNAME] and [PASSWORD] placeholders should be replaced with data from https://gerrit.wikimedia.org/r/settings/#HTTPCredentials
A more complex configuration involves having an ssh public-private key pair connected to your Gerrit account on the deployment server and $HOME/.gitconfig pushInsteadOf settings to rewrite the https://... git URL to an ssh://... equivalent.
Manual revert
Revert the commit causing errors locally on the deployment server:
you@deploy1002:[FOLDER]$ git revert [SHA1]
If the patch being reverted is a merge commit you will have to supply -m:
you@deploy1002:[FOLDER]$ git revert [SHA1] -m1
Push code live BEFORE pushing patches to Gerrit:
you@deploy1002:/srv/mediawiki-staging$ scap sync-file [FILE/FOLDER] 'Backport: Revert "[[gerrit:[NUMBER]|[TEXT] (T[NUMBER])]]"'
Push revert patch to Gerrit:
you@deploy1002:[FOLDER]$ git push origin HEAD:refs/for/[BRANCH]%topic=revert-[SHA1]
You will be prompted for your Gerrit https username and password if you have not done the $HOME/.netrc setup described above.
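The push target in the command above combines the branch and the reverted commit into one refspec, which is easy to mistype under pressure. As a sketch, a helper could build it for you; the function name revert_refspec is hypothetical, and the branch and SHA1 in the example are placeholders chosen for illustration.

```shell
# Hypothetical helper: build the Gerrit refspec used when pushing a manual revert.
# Usage: revert_refspec BRANCH SHA1
revert_refspec() {
    local branch="$1" sha1="$2"
    # %% is a literal percent sign in a printf format string.
    printf 'HEAD:refs/for/%s%%topic=revert-%s\n' "$branch" "$sha1"
}

revert_refspec wmf/1.32.0-wmf.12 abc1234
# prints: HEAD:refs/for/wmf/1.32.0-wmf.12%topic=revert-abc1234
```

The result can be passed to git push, for example: git push origin "$(revert_refspec [BRANCH] [SHA1])".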
Maintenance scripts
During the course of the window, you may encounter a patch that needs a maintenance script to be run as part of deployment. As noted earlier, maintenance scripts are run from mwmaint1002.eqiad.wmnet or mwmaint2002.codfw.wmnet.
For convenience, the most frequently run maintenance scripts are presented below.
tmux or screen
For long running scripts, it is recommended they are run in tmux or screen.
If you want to save terminal output, you can use script.
you@mwmaint1002:~$ script file_name
you@mwmaint1002:~$ # run command
...
you@mwmaint1002:~$ cat file_name
...
If you prefer tmux:
you@mwmaint1002:~$ tmux new -s 'backport'
...
you@mwmaint1002:~$ exit
If you need to leave in the middle, you can press ctrl-b d to detach.
If you prefer screen:
you@mwmaint1002:~$ screen -D -RR backport
...
you@mwmaint1002:~$ exit
If you need to leave in the middle, you can press ctrl-a d to detach.
Run a maintenance script on a group of wikis
See Wikimedia binaries#mwscriptwikiset.
Run a maintenance script on all wikis
See Wikimedia binaries#foreachwiki.
createExtensionTables
This script sources SQL files to create the tables for most extensions we deploy.
For example, to create the PageAssessments tables on ar.wikipedia:
you@mwmaint1002:~ $ mwscript extensions/WikimediaMaintenance/createExtensionTables.php arwiki pageassessments
namespaceDupes
When a new namespace is added to an existing wiki, the namespaceDupes maintenance script should be run for that wiki. First do a dry run of namespaceDupes for the wiki (without --fix) to check if there are pages that need fixing. Then append --fix to fix namespace duplication:
you@mwmaint1002:~ $ mwscript namespaceDupes.php testwiki
you@mwmaint1002:~ $ mwscript namespaceDupes.php testwiki --fix
updateCollation
When the default collation is changed for a wiki, the updateCollation.php maintenance script will need to be run:
you@mwmaint1002:~$ mwscript updateCollation.php --wiki=iswiki --previous-collation=<VALUE>
Replace <VALUE> with the collation name that was previously configured for the wiki in $wgCategoryCollation.
The default collation in MediaWiki is uppercase, so in most cases when a wiki switches to a different collation the previous one can be specified as --previous-collation=uppercase.
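The defaulting rule above can be captured in a small wrapper that prints the command to run, falling back to uppercase when no previous collation is given. This is a sketch only; the function name update_collation_cmd is hypothetical, and the collation name in the second call is a placeholder rather than a verified value. It prints the command instead of running it so that you can review it first.

```shell
# Hypothetical helper: print the updateCollation.php command for a wiki,
# defaulting the previous collation to MediaWiki's default, uppercase.
# Usage: update_collation_cmd WIKI [PREVIOUS_COLLATION]
update_collation_cmd() {
    local wiki="$1" previous="${2:-uppercase}"
    printf 'mwscript updateCollation.php --wiki=%s --previous-collation=%s\n' "$wiki" "$previous"
}

update_collation_cmd iswiki
# prints: mwscript updateCollation.php --wiki=iswiki --previous-collation=uppercase

# With an explicit previous collation (placeholder value):
update_collation_cmd iswiki some-collation
```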