Backport windows/Deployers

From Wikitech

This document is intended to provide detailed instructions for deployers as to how to run the backport windows. Hopefully, this document will prove useful to new deployers as well as provide a place for more experienced deployers to take notes on any tips and tricks they have discovered in the course of doing deployments.

General advice before you start

  • Claim the window early to avoid confusion.
When jouncebot pings deployers in #wikimedia-operations and you want to run that window, say so: "I can do the deploys today!"
  • Try to think out loud and be explicit.
If you are nervous about deploying a particular patch, mention it to the patch owner. It's better to have a conversation than to quietly fret over patches. If the patch is high risk and you're not comfortable deploying it, you can decline, and if no deployers are available, the patch owner can reschedule.
  • Be prepared.
Open the relevant SSH connections and browser tabs before you start deploying code; see the relevant section in this document for details.
If the patch requires a maintenance script be run afterwards, make sure that the patch owner has provided for this, or that you are comfortable running the script yourself; see the maintenance scripts section of this document for how to run them and some examples, or Wikimedia_site_requests#Common_tasks_that_need_a_maintenance_script for a longer list of changes that require scripts, and more details about them.
  • Learn how manual deployment works and then don't do it. Use scap backport instead.
  • Release the window early when done to mark the rest of the window free.
!log UTC morning deploys done, !log UTC evening deploys done, !log UTC late deploys done, or something to that effect.
  • Add git information to your terminal prompt. The git-prompt command is available on deployment servers. There are instructions for use in the comments at the beginning of the script. One simple way to use it is to add the following to your ~/.profile:
    GIT_PS1_SHOWUNTRACKEDFILES=1
    GIT_PS1_SHOWDIRTYSTATE=1
    GIT_PS1_SHOWUPSTREAM="auto verbose"
    . /etc/bash_completion.d/git-prompt
    PS1='[\u@\h \W$(__git_ps1 " (%s)")]\$ '
    
  • Enable Git configuration status.submoduleSummary. Submodules have limited visibility in git status and it's easy to miss the git submodule update step. Git can show you a short summary of submodule changes in git status. Enable it by executing:
    you@deploy1002:~$ git config --global status.submoduleSummary true
    
The deployment of security patches for extensions does not entail a git submodule update, and so a given repository may appear "dirty". This is the normal state of the repository. See https://phabricator.wikimedia.org/T229285 for further related discussion.

Deploying patches outside scheduled deployment windows

Sometimes there's a need to deploy a patch outside a scheduled window. There's a formal policy to follow when doing this. Though quite a few people with deployment access do this for their own patches, you still need to be very careful. You should keep the following things in mind when doing these kinds of deploys:

  • Communicate on -operations (or on _security). Clearly announce when you're about to deploy and again when you're done. Also scroll up to check whether someone else is deploying, since merging patches might take a while.
  • Use jouncebot: nowandnext to determine if anything else is going on, and ensure that you have enough time to not step on any scheduled windows.
  • Only deploy patches where you're confident in your ability to troubleshoot any issues that might come up, since there will be fewer people around to help.
  • You still need approval from RelEng and SRE to deploy on Fridays and weekends.
  • If in doubt: stick to the normal windows.

SSH Connections and Error Logs: Set up before deploying

When running the window, you'll want to see what's on the calendar, and watch error logs as you deploy so that you can be sure nothing you have just deployed is broken. Also, there are several machines on which you may need to run commands depending on the nature of the deploy; it's good to open all SSH connections before you have to think about them.

A script to set up these browser tabs and terminals automatically is available.

Browser tabs

Terminal tabs

  • mwlog1002.eqiad.wmnet - to run logspam-watch, which you may wish to keep an eye on during the window.
    • You may also run logspam for a one-off listing of recent errors, suitable for grepping.
  • deployment.eqiad.wmnet (which is an alias to a deployment host in the currently primary data center) - This is where you stage changes.
you@laptop$ ssh mwlog1002.eqiad.wmnet

you@mwlog1002:~$ logspam-watch
you@laptop$ ssh deployment.eqiad.wmnet

you@deploy1002:~$ cd /srv/mediawiki-staging

Occasionally you may also need to run maintenance scripts on a maintenance server; maintenance.eqiad.wmnet is an alias to a maintenance host in the current primary data center. (At some point, this will be obsoleted by mwscript-k8s, currently under development at T341553.)

you@laptop$ ssh maintenance.eqiad.wmnet

you@mwmaint1002:~$
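
The terminal setup above could be scripted in several ways. The following is a minimal sketch and is not the setup script mentioned earlier. It only prints the tmux commands that would open one window per host named in this section, so they can be reviewed before running. The function names, the window labels and the default session name backport are assumptions made for illustration.

```shell
# Hosts taken from this section; the short labels are arbitrary window names.
backport_hosts() {
  printf '%s %s\n' \
    logs mwlog1002.eqiad.wmnet \
    deploy deployment.eqiad.wmnet \
    maint maintenance.eqiad.wmnet
}

# Print (do not run) the tmux commands that open one SSH window per host.
# $1 is the tmux session name; it defaults to "backport".
backport_tmux_plan() {
  session=${1:-backport}
  first=1
  backport_hosts | while read -r name host; do
    if [ "$first" -eq 1 ]; then
      # The first host creates the detached session.
      printf 'tmux new-session -d -s %s -n %s "ssh %s"\n' "$session" "$name" "$host"
      first=0
    else
      # Later hosts become additional windows in that session.
      printf 'tmux new-window -t %s: -n %s "ssh %s"\n' "$session" "$name" "$host"
    fi
  done
  printf 'tmux attach -t %s\n' "$session"
}
```

Run backport_tmux_plan on your laptop to inspect the commands; once they look right, eval "$(backport_tmux_plan)" executes them in the current terminal.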

Previously, you also had to have a terminal on an mwdebug host. This is no longer necessary when using scap backport – that command deploys the change to mwdebug servers automatically.

Most deployments are a single command: scap backport <change_number_or_url>.

This command backports Gerrit patches for operations/mediawiki-config, for any currently deployed MediaWiki version, and for any extension, skin, or submodule of the above. It can deploy a single patch or several changes together. It merges the Gerrit patch (if needed), deploys to the testservers, waits for confirmation, and then syncs to all appservers (including Kubernetes).

See Scap Backport Deployments for more details.

See below for detailed instructions on how to manually deploy backports.

Merging and applying patches

The deploy-commands tool, for any given gerrit change, produces a list of commands for deployment and revert that can be copy-pasted, and this tool is automatically linked to patches added to the Deployment Calendar via Module:Gerrit. Don't copy-paste without reading carefully first, though; you should double-check the output in all cases.

You will be merging patches either for a wmf branch of mediawiki repositories (core, or a WMF-installed extension or skin), or the operations/mediawiki-config repo.

Check the MediaWiki versions tool to confirm which branches may need a backport for a given patch.

Merging patches

When +2ing patches, it's often helpful to have the Zuul Dashboard open:

  • to ensure that Zuul is picking up your changes,
  • to see how long (approximately) the test will take before it auto-merges.

It's a good practice to put Backport as the comment when you +2 before you click Publish Comments to ensure that there is a record of why you approved the patch.

Fetching patches

After the patch has merged in Gerrit, you need to fetch it down to deployment.eqiad.wmnet. Make sure that the code you fetch down to deployment.eqiad.wmnet is the code you expected to fetch down.

Use git log -p HEAD..@{u} after you git fetch to check that the patches you fetched are the same ones that you +2'd. If they aren't, ask the person who wrote the patch in #wikimedia-operations what to do with the fetched code. It's always better to ask than to do something silently and unilaterally.

The process differs slightly between operations/mediawiki-config, mediawiki/core, and mediawiki/extensions or mediawiki/skins.

operations/mediawiki-config

you@deploy1002:/srv/mediawiki-staging$ git status

you@deploy1002:/srv/mediawiki-staging$ git fetch

you@deploy1002:/srv/mediawiki-staging$ git log -p HEAD..@{u}

you@deploy1002:/srv/mediawiki-staging$ git rebase

mediawiki/core

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git fetch

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git log -p HEAD..@{u}

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git rebase

mediawiki/extensions and mediawiki/skins

As soon as the change to an extension or skin gets merged, Gerrit bumps the associated submodule in the wmf/* branch of mediawiki/core. To deploy, you thus just have to fetch that parent repository, verify the difference between the current state (HEAD) and the tracked remote branch (@{u}), rebase and update the submodule:

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git fetch

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git log -p HEAD..@{u}

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git rebase

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git status [extensions|skins]/[NAME]

you@deploy1002:/srv/mediawiki-staging/php-[VERSION]$ git submodule update [extensions|skins]/[NAME]
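
The three fetch sequences above share the same shape: status, fetch, review, rebase, and for extensions or skins a submodule update. A minimal sketch of that shape follows. The helper names run and fetch_and_review and the DRY_RUN variable are hypothetical and not part of any deployed tooling. The confirmation prompt is an addition that forces a pause after reviewing the log.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# fetch_and_review DIR [SUBMODULE_PATH]
# DIR is the repository to update; SUBMODULE_PATH is only needed for
# extensions and skins (for example extensions/[NAME]).
fetch_and_review() {
  dir=$1
  sub=${2:-}
  run git -C "$dir" status || return 1
  run git -C "$dir" fetch || return 1
  # Review exactly what the rebase would bring in before applying it.
  run git -C "$dir" log -p 'HEAD..@{u}' || return 1
  if [ -z "${DRY_RUN:-}" ]; then
    printf 'Fetched commits match what you +2ed? Rebase now? [y/N] '
    read -r answer
    [ "$answer" = "y" ] || return 1
  fi
  run git -C "$dir" rebase || return 1
  if [ -n "$sub" ]; then
    run git -C "$dir" status "$sub" || return 1
    run git -C "$dir" submodule update "$sub"
  fi
}
```

With DRY_RUN=1 set, fetch_and_review only lists the commands, which is a safe way to check the sequence before running it for real.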

Security Patches

Refer to How to deploy code -> Security patches for information about security patches.

Deploying changes

Canary

After changes have been fetched and otherwise git-wrangled on deployment.eqiad.wmnet, changes can be fetched down to mwdebug1002.eqiad.wmnet and tested via the X-Wikimedia-Debug header.

you@mwdebug1002:~$ scap pull

After the changes have been fetched, ask the patch owner to test them on mwdebug1002.eqiad.wmnet using X-Wikimedia-Debug.
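
For a quick command-line check alongside the patch owner's browser test, the debug header can be sent with curl. The sketch below assumes the header value takes the form backend=<host>, which is not stated in this document and should be verified first. The function name debug_header and the URL in the comment are placeholders.

```shell
# Build the request header that routes a request to one debug host.
# ASSUMPTION: the header value format "backend=<host>" is not taken from
# this document; confirm it before relying on it.
debug_header() {
  printf 'X-Wikimedia-Debug: backend=%s\n' "$1"
}

# Example call (the URL is a placeholder; use a page affected by the patch):
# curl -sS -o /dev/null -w '%{http_code}\n' \
#   -H "$(debug_header mwdebug1002.eqiad.wmnet)" \
#   'https://test.wikipedia.org/wiki/Main_Page'
```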

Full deployment

After a change has been tested on mwdebug1002.eqiad.wmnet it can be deployed to all machines. To deploy the code you will run: scap sync-file <file> [message for SAL]. The code path passed to scap sync-file should be relative to /srv/mediawiki-staging.

The message you type after the file or directory name to be synced will appear in the Server Admin Log — wikitext is legal and can be useful. Copy/pasting the wikitext for that backport item from the Deployments calendar is easy. If the Gerrit change has an associated Phabricator task, mention the task ID in the message as appropriate. This will trigger Stashbot to reply back on tasks and indicate that the associated change was synced. You can use the backport-summary script (locally, not on the deployment host) to build the summary based on the Gerrit change URL.

you@deploy1002:/srv/mediawiki-staging$ scap sync-file [FILE|FOLDER] 'Backport: [[gerrit:[GERRIT-NUMBER]|[COMMIT-MESSAGE] ([PHABRICATOR-TASK])]]'
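
Because the path given to scap sync-file must be relative to /srv/mediawiki-staging, it can help to normalise paths before building the command. The following is a small sketch; the function name staging_relpath is hypothetical.

```shell
STAGING_ROOT=/srv/mediawiki-staging

# Print a path relative to the staging root.
# Absolute paths inside the root are stripped, relative paths pass through,
# and absolute paths outside the root are rejected.
staging_relpath() {
  case $1 in
    "$STAGING_ROOT"/*) printf '%s\n' "${1#"$STAGING_ROOT"/}" ;;
    /*) echo "outside $STAGING_ROOT: $1" >&2; return 1 ;;
    *) printf '%s\n' "$1" ;;
  esac
}
```

For example, staging_relpath /srv/mediawiki-staging/wmf-config/InitialiseSettings.php prints wmf-config/InitialiseSettings.php.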

operations/mediawiki-config

Example:

you@deploy1002:/srv/mediawiki-staging$ scap sync-file wmf-config/InitialiseSettings.php 'Config: [[gerrit:444901|Enable FileExporter for sourceswiki (T198594)]]'

mediawiki/core

Example:

you@deploy1002:/srv/mediawiki-staging$ scap sync-file php-1.32.0-wmf.12/includes/cp/ChronologyProtector.php 'Backport: [[gerrit:445113|rdbms: fix value of ChronologyProtector::POSITION_COOKIE_TTL (T194403)]]'

mediawiki/extensions and mediawiki/skins

Example:

you@deploy1002:/srv/mediawiki-staging$ scap sync-file php-1.32.0-wmf.12/extensions/WikimediaEvents 'Backport: [[gerrit:445377|Add event logging for WMDE fundraising banners (T197571)]]'
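
The log messages in the examples above all follow one pattern, which can be generated rather than typed by hand. This is a minimal sketch and is not the backport-summary script mentioned earlier. The function name sal_message and its argument order are assumptions.

```shell
# sal_message PREFIX CHANGE SUMMARY [TASK]
# PREFIX  - for example "Backport" or "Config"
# CHANGE  - Gerrit change number
# SUMMARY - first line of the commit message
# TASK    - optional Phabricator task ID, for example T197571
sal_message() {
  prefix=$1
  change=$2
  summary=$3
  task=${4:-}
  if [ -n "$task" ]; then
    printf '%s: [[gerrit:%s|%s (%s)]]\n' "$prefix" "$change" "$summary" "$task"
  else
    printf '%s: [[gerrit:%s|%s]]\n' "$prefix" "$change" "$summary"
  fi
}
```

The output can be passed straight to scap, for example: scap sync-file wmf-config/InitialiseSettings.php "$(sal_message Config 444901 'Enable FileExporter for sourceswiki' T198594)".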

Purging

See also Multicast HTCP purging#One-off purge.

When a patch for mediawiki-config changes a file in /static, its public URL must be purged from Varnish. For performance reasons, unversioned files in /static are cached unconditionally for up to a year and rely on manual purges to propagate updates. The purge must use en.wikipedia.org as the hostname of the URL, regardless of which wiki the file relates to, because the cache for /static is shared between all wikis and en.wikipedia.org is its canonical internal form.

you@mwmaint1002:~$ echo 'https://en.wikipedia.org/static/images/project-logos/newikibooks.png' | mwscript purgeList.php
  • Refresh url in browser.
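
Since the hostname must always be en.wikipedia.org, building the purge URL from the changed file's path avoids purging under the wrong host. The sketch below assumes that a file at static/... in the mediawiki-config repository is served at /static/... on the site. The function name static_purge_url is hypothetical.

```shell
# Print the canonical purge URL for a file under static/.
# ASSUMPTION: repository path static/<x> maps to public URL /static/<x>.
static_purge_url() {
  path=${1#/}   # tolerate a leading slash
  case $path in
    static/*) printf 'https://en.wikipedia.org/%s\n' "$path" ;;
    *) echo "not under static/: $1" >&2; return 1 ;;
  esac
}
```

On the maintenance host this could be combined with the command above: static_purge_url static/images/project-logos/newikibooks.png | mwscript purgeList.php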

Verification

After the sync and any purge have finished, monitor the logs and ask the patch owner to test their changes once more to confirm the patch was deployed successfully. Make sure the patch owner verifies it with X-Wikimedia-Debug turned off.

Reverting

If a patch doesn't work as expected, or causes errors, it will have to be reverted.

Revert on the deployment host and sync FIRST to minimize downtime/errors, even if scap automatically detected the errors and aborted the full sync.

Process

Use scap backport --revert <change_number_or_url> to automatically revert and deploy code, or follow the instructions below. Note that the revert command will wait for the patches to merge before deploying, so in an emergency it may be ideal to revert manually.

For either process to work without git prompting you for authentication information on each use, you will need to add some configuration on the deployment server. The simplest configuration is done via a $HOME/.netrc file which git will automatically read:

  1. touch $HOME/.netrc
  2. chmod go-rwx $HOME/.netrc
  3. vim $HOME/.netrc
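
The steps above do not show what goes into the file. The sketch below creates a template in the standard netrc format (machine, login, password on one line). The host gerrit.wikimedia.org and the use of a Gerrit HTTP password are assumptions here, and the capitalised placeholders must be replaced by hand. The function name write_netrc_template is hypothetical.

```shell
# Create a netrc template readable only by the owner.
# $1 is the target file; it defaults to $HOME/.netrc.
# ASSUMPTION: host name and credential type are not taken from this document.
write_netrc_template() {
  target=${1:-$HOME/.netrc}
  if [ -e "$target" ]; then
    echo "$target already exists; edit it by hand instead" >&2
    return 1
  fi
  (
    # umask in a subshell so the caller's umask is left untouched.
    umask 077
    printf 'machine %s login %s password %s\n' \
      gerrit.wikimedia.org YOUR_GERRIT_USERNAME YOUR_GERRIT_HTTP_PASSWORD > "$target"
  )
}
```

The function refuses to overwrite an existing file, so it is safe to run when a .netrc is already in place.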

A more complex configuration involves having an ssh public-private key pair connected to your Gerrit account on the deployment server and $HOME/.gitconfig pushInsteadOf settings to rewrite the https://... git URL to an ssh://... equivalent.

Manual revert

Revert the commit causing errors locally on the deployment server:

you@deploy1002:[FOLDER]$ git revert [SHA1]

If the patch being reverted is a merge commit you will have to supply -m:

you@deploy1002:[FOLDER]$ git revert [SHA1] -m1

Push code live BEFORE pushing patches to Gerrit:

you@deploy1002:/srv/mediawiki-staging$ scap sync-file [FILE/FOLDER] 'Backport: Revert "[[gerrit:[NUMBER]|[TEXT] (T[NUMBER])]]"'

Push revert patch to Gerrit:

you@deploy1002:[FOLDER]$ git push origin HEAD:refs/for/[BRANCH]%topic=revert-[SHA1]

You will be prompted for your Gerrit https username and password if you have not done the $HOME/.netrc setup described above.
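
The push target in the last step is easy to mistype under pressure. A minimal sketch that builds it from the branch and the reverted commit is shown here. The function name revert_refspec is hypothetical, and the branch name in the usage note is only an example.

```shell
# revert_refspec BRANCH SHA1
# Print the refspec used to push a local revert to Gerrit for review,
# tagged with a topic that names the reverted commit.
revert_refspec() {
  branch=$1
  sha=$2
  # %% prints a literal percent sign, which Gerrit uses to separate options.
  printf 'HEAD:refs/for/%s%%topic=revert-%s\n' "$branch" "$sha"
}
```

Usage, after the revert has been synced: git push origin "$(revert_refspec wmf/1.32.0-wmf.12 [SHA1])"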

Maintenance scripts

During the course of the window, you may encounter a patch that needs a maintenance script to be run as part of deployment. As noted earlier, maintenance scripts are run from mwmaint1002.eqiad.wmnet or mwmaint2002.codfw.wmnet.

For convenience, the most frequently run maintenance scripts are presented below.

tmux or screen

Long-running scripts should be run inside tmux or screen so that they survive a dropped connection.

If you want to save terminal output, you can use script.

you@mwmaint1002:~$ script file_name
you@mwmaint1002:~$ # run command
...
you@mwmaint1002:~$ cat file_name
...

If you prefer tmux:

you@mwmaint1002:~$ tmux new -s 'backport'
...
you@mwmaint1002:~$ exit

If you need to leave in the middle you can do ctrl-b d to detach.

If you prefer screen:

you@mwmaint1002:~$ screen -D -RR backport
...
you@mwmaint1002:~$ exit

If you need to leave in the middle you can do ctrl-a d to detach.

Run a maintenance script on a group of wikis

See Wikimedia binaries#mwscriptwikiset.

Run a maintenance script on all wikis

See Wikimedia binaries#foreachwiki.

createExtensionTables

Sources SQL files to create the tables for most extensions we deploy.

For example, to create the PageAssessments tables on ar.wikipedia:

you@mwmaint1002:~$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php arwiki pageassessments

namespaceDupes

When a new namespace is added to an existing wiki, the namespaceDupes maintenance script should be run for that wiki. First do a dry run of namespaceDupes for the wiki (without --fix) as a sanity check. Then append --fix to fix namespace duplication:

you@mwmaint1002:~$ mwscript namespaceDupes.php testwiki
you@mwmaint1002:~$ mwscript namespaceDupes.php testwiki --fix
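
The dry-run-then-fix sequence can be wrapped so that --fix is never run without first looking at the dry-run output. This is a sketch only; the function name namespace_dupes is hypothetical, and it assumes mwscript is available on the PATH as on the maintenance host.

```shell
# namespace_dupes WIKI
# Run namespaceDupes as a dry run, then ask before applying --fix.
namespace_dupes() {
  wiki=$1
  # Dry run first: reports duplicates without changing anything.
  mwscript namespaceDupes.php "$wiki" || return 1
  printf 'Apply fixes on %s? [y/N] ' "$wiki" >&2
  read -r answer
  if [ "$answer" != "y" ]; then
    echo "Skipped --fix" >&2
    return 1
  fi
  mwscript namespaceDupes.php "$wiki" --fix
}
```

Calling namespace_dupes testwiki reproduces the two commands above with a confirmation step in between.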

updateCollation

When the default collation is changed for a wiki, the updateCollation.php maintenance script will need to be run:

you@mwmaint1002:~$ mwscript updateCollation.php --wiki=iswiki --previous-collation=<VALUE>

Replace <VALUE> with what the wiki's previously configured collation name was in $wgCategoryCollation.

The default collation in MediaWiki is uppercase, so in most cases where a wiki switches to a different collation, the previous one can be specified as --previous-collation=uppercase.

resetAuthenticationThrottle

See Increasing account creation threshold.