How to deploy code

From Wikitech
This article is mainly about deployment of changes to MediaWiki code.

Introduction

  • All configuration and utilities are in version control (in the operations/mediawiki-config.git repository)
  • Each version of MediaWiki (e.g. 1.20wmf1) is in a branch of the mediawiki/core.git repository, with submodules for the extensions deployed in that version.
  • This mediawiki-config repository is checked out on the deployment host tin at /a/common, with each branch of the MediaWiki codebase and its extensions checked out in /a/common/php-1.XXX subdirectories
  • Sync scripts synchronize that working copy on tin onto /usr/local/apache/common on hundreds of servers.

Basic common sense

  • Be careful. Breaking the site is surprisingly easy!
    • Don't make deployment changes from a development directory; instead, use a separate clean git clone just for deployments.
    • Check git status constantly (or set your shell prompt to show the info).
  • If you're deploying code written by someone else, ask them to be around during deployment so they can troubleshoot if necessary.
  • Make sure you know about anything hairy, such as additional prerequisites (e.g. schema changes) or potential complications when rolling back.
  • Perform operations in the right order. For example, if you're deploying code affecting the databases, you should create or edit SQL tables before deploying a change requiring these tables.
  • Join #wikimedia-operations and #wikimedia-tech on Freenode and be available before and after all changes.

Deployment requirements

  • Getting Deploy access
    • Cluster account request through RT - (requires manager and/or sr dev approval)
      • If you can ssh into tin, and ssh into a random srv box (e.g. srv300) from there, you already have this.
    • Join/read the Ops mailing list
    • Recommended: Ask an experienced deployer to tag along once or twice before attempting your own.
  • Deployment branch access requested
  • Common sense. See above
  • Some shiny code
  • A window of time in which to deploy (one that doesn't overlap with anyone else's window). Deployments is the calendar for planning and recording activities in these windows.
  • A clean local git repository of mediawiki/core (use ssh for speed), in which you have set up git review using git review -s
  • Be present on IRC. #wikimedia-tech and #wikimedia-operations are two places where people will come to yell at you if something goes wrong; you should be able to hear them.

Step 1: get the code in the deployment branch

Before you can deploy anything, it has to be in the deployment branch(es). Our deployment branches are named wmf/1.MAJORwmfMINOR where MAJOR and MINOR are numbers that increase over time as new branches are cut. A new branch with an incremented MINOR number is cut at the start of each deployment cycle, and after each tarball release MAJOR is incremented and MINOR is reset to 1. Strict access control is enforced on the deployment branches, but you should have access to them if you are a deployer. On the deployment host tin, the checkout of each deployment branch is in /a/common/php-1.MAJORwmfMINOR .
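
The naming scheme above can be sketched with a few shell helpers. This is only an illustration: branch_to_dir, next_cycle_branch and next_release_branch are hypothetical names, not part of the deployment tooling, and they assume branch names that follow the wmf/1.MAJORwmfMINOR pattern exactly.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the deployment tooling) illustrating the
# wmf/1.MAJORwmfMINOR naming scheme.

# Print the checkout directory on tin for a deployment branch name.
branch_to_dir() {
    # Strip the leading "wmf/" and prepend the checkout prefix.
    printf '/a/common/php-%s\n' "${1#wmf/}"
}

# Print the branch cut at the start of the next deployment cycle (MINOR + 1).
next_cycle_branch() {
    version=${1#wmf/1.}          # e.g. "20wmf1"
    major=${version%%wmf*}       # e.g. "20"
    minor=${version##*wmf}       # e.g. "1"
    printf 'wmf/1.%swmf%s\n' "$major" "$((minor + 1))"
}

# Print the first branch after a tarball release (MAJOR + 1, MINOR reset to 1).
next_release_branch() {
    version=${1#wmf/1.}
    major=${version%%wmf*}
    printf 'wmf/1.%swmf1\n' "$((major + 1))"
}

branch_to_dir wmf/1.20wmf1          # prints /a/common/php-1.20wmf1
next_cycle_branch wmf/1.20wmf1      # prints wmf/1.20wmf2
next_release_branch wmf/1.20wmf12   # prints wmf/1.21wmf1
```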

Note that in most cases the cluster will be running on two deployment branches, with some wikis running version N and some running version N+1. To see what versions the cluster is currently running, on tin execute:

$ /a/common/multiversion/activeMWVersions

To see which wiki is running which version, inspect /a/common/wikiversions.json (public mirror) or look at Special:Version.

If your code or change needs to go live to all wikis, you will need to change all deployment branches that are in use. An easy way to see all of the versions currently in use is to log onto the deployment host (tin) and run mwversionsinuse from the command line. You can also run mwversionsinuse --withdb to see a wiki that is running each version.

NOTE: All examples on this page assume there is a single deployment branch called wmf/1.20wmf1 checked out on the cluster in php-1.20wmf1. You may need to adapt the examples to use a different branch name. If you are updating multiple deployment branches, simply repeat the steps for each deployment branch separately.

NOTE: Also, all git examples assume you have a clean working copy, that is, you have no uncommitted changes. To verify this, run git status; it should say nothing to commit (working directory clean) or nothing added to commit but untracked files present. The exact wording varies between git versions. If you are doing git-fu with a dirty working copy, there is a high probability you will screw things up, so don't do that unless you know what you're doing.
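
If you would rather script the clean-working-copy check than read git status by eye, something like the following might do. It is a minimal sketch: porcelain_is_clean is a hypothetical helper name, and it assumes the input is the output of git status --porcelain, where untracked files are listed with a "??" prefix.

```shell
#!/bin/sh
# Hypothetical helper: reads 'git status --porcelain' output on stdin and
# succeeds if it lists nothing other than untracked files ("??" entries).
porcelain_is_clean() {
    awk 'length($0) > 0 && substr($0, 1, 2) != "??" { dirty = 1 }
         END { exit (dirty ? 1 : 0) }'
}

# Demonstration with canned input; in a real checkout you would run
#   git status --porcelain | porcelain_is_clean
printf '?? notes.txt\n' | porcelain_is_clean && echo "clean (untracked files only)"
printf ' M api.php\n'   | porcelain_is_clean || echo "dirty: uncommitted changes"
```

Untracked files are treated as acceptable here because the note above also accepts the "untracked files present" message.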

Case 1a: core changes

You are deploying changes to MediaWiki core. This should be rare because core is updated from master every two to three weeks, but in some cases it might be necessary. For core changes, you will simply need to push or submit changes to the wmf/1.20wmf1 branch in core. The most common use case is to take a commit that is already in the repository somewhere (usually in master, sometimes a commit that's still pending review) and cherry-pick it into the deployment branch, so only that case is documented below.

To cherry-pick a commit into the deployment branch, do the following things locally:

$ cd mediawiki/core      # go to your checkout of mediawiki/core.git
 
# Set up a local wmf/1.20wmf1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.20wmf1 branch,
# you can skip this step
$ git branch --track wmf/1.20wmf1 origin/wmf/1.20wmf1
Branch wmf/1.20wmf1 set up to track remote branch wmf/1.20wmf1 from origin.
 
# Switch to the wmf/1.20wmf1 branch and update it from the remote
$ git checkout wmf/1.20wmf1
$ git pull
$ git submodule update --init --recursive
 
# Cherry-pick a commit from master, identified by its patch set hash
$ git cherry-pick ffb1b38ad83927606c539ac941e9f3eb2653a840
 
# If there are conflicts, this is how you fix them:
# - run 'git status' to see which files are conflicted
# - start fixing conflicted files using your favorite editor
# - use 'git add filename' to tell git you've fixed the conflicts in a file
# - once all conflicts are resolved, commit the result using 'git commit'
 
# Submit your cherry-pick commit for review
$ git review
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployment group

Case 1b: extension changes

You are deploying changes to an extension, but you don't just want to deploy master. Instead, you want to deploy the code that is currently deployed, plus your change. If you do want to just update your extension to master, read the extension update section instead.

Starting with 1.23wmf10, all deployed extensions have automatically-created wmf/1.xxwmfyy branches. Each of these extension branches should be in sync with the corresponding submodule pointer in the corresponding core branch. To deploy an extension update, you make changes to this branch, then update the submodule pointer in core.

Updating the deployment branch

Just like in core, the most common use case for updating a deployment branch is to cherry-pick changes from master. You can do this using the Cherry Pick To button in Gerrit, or from the command line as follows:

$ cd mediawiki/extensions/MyCoolExtension      # go to your extension checkout
 
# Set up a local wmf/1.20wmf1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.20wmf1 branch, you can skip this step
$ git branch --track wmf/1.20wmf1 origin/wmf/1.20wmf1
Branch wmf/1.20wmf1 set up to track remote branch wmf/1.20wmf1 from origin.
 
# Switch to the wmf/1.20wmf1 branch and update it from the remote
$ git checkout wmf/1.20wmf1
$ git pull
 
# Cherry-pick a commit from master, identified by its hash
$ git cherry-pick 176ffdd3b71e463d3ebaa881a6e77b82acba635d
# If there are conflicts, this is how you fix them:
# run 'git status' to see which files are conflicted
# start fixing conflicted files
# use 'git add filename' to tell git you've fixed the conflicts in a file
# once all conflicts are resolved, commit the result using 'git commit'
 
# Submit your commit for review
# Note: 'wmf/1.20wmf1' is the name of the remote branch you are pushing to, not the name of your local tracking
# branch (although in this example they are the same).
$ git review wmf/1.20wmf1
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployment group

You can repeat this process multiple times to commit or cherry-pick multiple changes.

Updating the submodule

After all of the commits you submitted to the deployment branch in the step above have been merged, you will need to update the submodule in core:

$ cd mediawiki/core            # Go to your checkout of core
 
# Set up a local wmf/1.20wmf1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.20wmf1 branch, you can skip this step
$ git branch --track wmf/1.20wmf1 origin/wmf/1.20wmf1
Branch wmf/1.20wmf1 set up to track remote branch wmf/1.20wmf1 from origin.
 
# Switch to the wmf/1.20wmf1 branch and update it
$ git checkout wmf/1.20wmf1
$ git pull
# Update the extension submodules. This may take a while when you run it for the first time
$ git submodule update --init --recursive
 
# cd to your extension's submodule
$ cd extensions/MyCoolExtension
# Update it from the remote
$ git fetch
# Point the submodule to the remote wmf/1.20wmf1 branch
$ git checkout origin/wmf/1.20wmf1
HEAD is now at 96052d0... (bug 34885) Blah blah blah
 
# cd back to the main repo
$ cd ../..
# Check that you have updated the submodule correctly
# The diff should only show your submodule change, nothing else
$ git diff
diff --git a/extensions/MyCoolExtension b/extensions/MyCoolExtension
index 6a6eaab..96052d0 160000
--- a/extensions/MyCoolExtension
+++ b/extensions/MyCoolExtension
@@ -1 +1 @@
-Subproject commit 6a6eaabcdae29201c0f3e7bbcbac2d1953c574a5
+Subproject commit 96052d0d9733f89d5f8fd43c6ba05ecc756ea1ec
 
# Commit the submodule update and submit it for review
$ git commit -a -m "Update MyCoolExtension"
$ git review
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployment group

Case 1c: extension update

You are deploying an update to your extension, updating it to the current master. If you want to apply changes to your extension but don't want to update it to exactly the current master state, read the extension changes section instead.

NOTE: This workflow is deprecated. You should generally cherry-pick changes instead unless you have a good reason to update to master. When in doubt, talk to experts.

First, you need to update the deployment branch to master:

$ cd mediawiki/extensions/MyCoolExtension      # go to your extension checkout
 
# Set up a local wmf/1.20wmf1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.20wmf1 branch, you can skip this step
$ git branch --track wmf/1.20wmf1 origin/wmf/1.20wmf1
Branch wmf/1.20wmf1 set up to track remote branch wmf/1.20wmf1 from origin.
 
# Switch to the wmf/1.20wmf1 branch and update it from the remote
$ git checkout wmf/1.20wmf1
$ git pull
 
# Advance to master
$ git merge --ff-only origin/master
 
# Push the update back up
$ git push origin wmf/1.20wmf1

If you get errors during any of these steps, STOP and talk to experts. You're probably trying to do something strange, like updating an extension to master while there are already cherry-picks in the same branch.

Now you just need to update the submodule using the instructions above, and you're done.

Case 1d: new extension

You are adding an entirely new extension that wasn't deployed before, and you're deploying from master (if you need to deploy something other than the master state, that's possible, but it generally shouldn't be done for an initial deployment; master should just be clean and deployable).

Your new extension has been deployed and tested on the mw:beta cluster for weeks, right? Otherwise, STOP and talk to experts.

You need to add a submodule to the core deployment branch:

$ cd mediawiki/core            # Go to your checkout of core
 
# Set up a local wmf/1.20wmf1 branch that tracks the remote
# You only need to do this once for each branch; if you've already got a wmf/1.20wmf1 branch, you can skip this step
# If you get an error, try 'git remote update' first
$ git branch --track wmf/1.20wmf1 origin/wmf/1.20wmf1
Branch wmf/1.20wmf1 set up to track remote branch wmf/1.20wmf1 from origin.
 
# Switch to the wmf/1.20wmf1 branch and update it
$ git checkout wmf/1.20wmf1
$ git pull
# Update the extension submodules. This may take a while when you run it for the first time
$ git submodule update --init --recursive
 
# Add a submodule for your extension
$ git submodule add https://gerrit.wikimedia.org/r/p/mediawiki/extensions/MyCoolExtension.git extensions/MyCoolExtension
Cloning into extensions/MyCoolExtension...
# Check the diff. Make sure the .gitmodules entry is in line with the others, and check that the subproject commit hash points to master
$ git diff --cached
diff --git a/.gitmodules b/.gitmodules
index 3ab3d48..9a4cc66 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -346,3 +346,6 @@
 [submodule "extensions/PrefSwitch"]
        path = extensions/PrefSwitch
        url = https://gerrit.wikimedia.org/r/p/mediawiki/extensions/PrefSwitch.git
+[submodule "extensions/MyCoolExtension"]
+       path = extensions/MyCoolExtension
+       url = https://gerrit.wikimedia.org/r/p/mediawiki/extensions/MyCoolExtension.git
diff --git a/extensions/MyCoolExtension b/extensions/MyCoolExtension
new file mode 160000
index 0000000..46727ad
--- /dev/null
+++ b/extensions/MyCoolExtension
@@ -0,0 +1 @@
+Subproject commit 46727ad74adda33621323deb2bebdc2527cb4917
 
# Commit the submodule addition and submit it for review
$ git commit -a -m "Add MyCoolExtension"
$ git review
# If you don't want or need this to be reviewed, you can +2 your own
# commit if you are in the wmf-deployment group

When adding a new extension to one branch, you also need to add the extension to any other branches in use on the cluster (typically the wmf{N-1} branch), even if the extension will not be enabled on any wikis running that branch. Otherwise the localization cache builder will complain.

When adding (and removing) an extension, you need to update the files wmf-config/extension-list and default.conf, see #Add new extension to extension-list and default.conf

Step 2: get the code on tin

Now switch to tin. Use ssh -A so you can connect to gerrit and other hosts (or set up ssh proxying in your .ssh/config).

catrope$ ssh -A tin

Once the code is in the deployment branch, you simply run git pull on tin to get it there. However, before doing so, make sure that no unexpected changes will show up (see #Problem: undeployed code).

catrope@tin:~$ cd /a/common/php-1.20wmf1
catrope@tin:/a/common/php-1.20wmf1/$ git fetch
catrope@tin:/a/common/php-1.20wmf1/$ git log HEAD..origin/wmf/1.20wmf1

This will list the commits that would be pulled by 'git pull'. If there are other changes besides yours, go yell at the culprit. Otherwise you're OK to pull your changes into the deployment directory. You must always rebase in case there are security patches locally committed on tin.

catrope@tin:/a/common/php-1.20wmf1/$ git rebase origin/wmf/1.20wmf1

If you're updating an extension, check to see if there are existing security patches for the extension. Doing a submodule update will overwrite the security patches, and they need to be applied before syncing the extension to the apaches.

csteipp@tin:/a/common$ cd php-1.23wmf10/extensions/MyCoolExtension
 
# See if there are any patches with the "SECURITY:" prefix
csteipp@tin:/a/common/php-1.23wmf10/extensions/MyCoolExtension$ git log --oneline -3
905e1c2 SECURITY: Fix some bad stuff
cb6783a Localisation updates from https://translatewiki.net.
108dbea Localisation updates from https://translatewiki.net.
 
# Save a copy of the patch to your home directory
csteipp@tin:/a/common/php-1.23wmf10/extensions/MyCoolExtension$ git format-patch --stdout HEAD~1 > ~/MyCoolExtensionSecurity.patch
 
# Do the submodule update like normal:
csteipp@tin:/a/common/php-1.23wmf10/extensions/MyCoolExtension$ cd ../..
csteipp@tin:/a/common/php-1.23wmf10/$ git submodule update --init --recursive extensions/MyCoolExtension
 
# Re-apply the security patch
csteipp@tin:/a/common/php-1.23wmf10/$ cd extensions/MyCoolExtension
csteipp@tin:/a/common/php-1.23wmf10/extensions/MyCoolExtension$ git apply --check ~/MyCoolExtensionSecurity.patch
# If the above didn't return any errors, then you can am the patch. If there were conflicts, you'll need to rebase or merge the patch
csteipp@tin:/a/common/php-1.23wmf10/extensions/MyCoolExtension$ git am ~/MyCoolExtensionSecurity.patch
# Security patch should now be applied. Check that it shows up at the top of the log with:
csteipp@tin:/a/common/php-1.23wmf10/extensions/MyCoolExtension$ git log --oneline -5
905e1c2 SECURITY: Fix some bad stuff
c672d43 Some feature
1378723 Another feature
cb6783a Localisation updates from https://translatewiki.net.
108dbea Localisation updates from https://translatewiki.net.
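
The "look for a SECURITY: prefix" step above could be scripted roughly as follows. This is a sketch, not an existing tool: security_commits is a hypothetical helper name, and it assumes the input has the "<abbreviated hash> <subject>" shape produced by git log --oneline.

```shell
#!/bin/sh
# Hypothetical helper: reads 'git log --oneline' output on stdin and prints
# only the entries whose subject starts with "SECURITY:". Its exit status is
# non-zero when there are none.
security_commits() {
    grep -E '^[0-9a-f]+ SECURITY:'
}

# Demonstration with the sample log from this page; in the extension
# directory you would run: git log --oneline -3 | security_commits
security_commits <<'EOF'
905e1c2 SECURITY: Fix some bad stuff
cb6783a Localisation updates from https://translatewiki.net.
108dbea Localisation updates from https://translatewiki.net.
EOF
# prints: 905e1c2 SECURITY: Fix some bad stuff
```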

If no "SECURITY:" patches are in the log, or if this is a new extension, then you can simply update the extension submodule with:

catrope@tin:/a/common/php-1.20wmf1/$ git submodule update --init --recursive extensions/MyCoolExtension

In the output, you should see the commit ID from the work you did locally in Step 1.

Trying tin's code on testwiki

If the wmf branch you just updated is the one that test.wikipedia.org is on (view its Special:Version or grep testwiki /a/common/wikiversions.json), and you want to test your code on test.wikipedia.org before deploying it everywhere else, then you will need to sync it to testwiki's Apache server. If you view the source of a testwiki page, it will have a comment like <!-- Served by mw1017 in 0.182 secs. -->, so update that server:

catrope@tin:/a/common/php-1.20wmf1/$ ssh mw1017
catrope@mw1017:~$ sync-common

Your code will then be live on testwiki.

The cluster machines have a cache for i18n messages for each release that does not get updated by this; it seems the only way to update these messages is to do a full scap.

Step 3: configuration and other prep work

In certain cases, you'll have to change how Wikimedia sites are configured. We generally have the same codebase everywhere, but with different configurations for each wiki.

Maybe you are just changing one configuration variable. Or, perhaps you are adding a brand-new extension, or activating an extension on some wiki where it's never been before. For all of these cases, and more, you'll have to make the changes to the config files to get the desired results.

Configuration files live in their own revision-controlled repository, operations/mediawiki-config. The big difference is that the configuration files are not tied to releases: there is no 1.20wmf1 branch for configuration. This means you cannot commit a configuration change and have it "roll out" across wikis on the release train; it has to work with all branches in use. In general, if you're not in operations, you should make changes to a local copy of this repository (as explained in How to do a configuration change#In your own repo via gerrit), submit them for gerrit review with a -1 comment to avoid early deployment, then during your deployment window +2 them and get them on tin.

Everything that follows is just a convenient way to make config changes.

If you're deploying an extension or feature that can be switched off, it's usually best to leave it switched off while you deploy and carefully switch it on after that using a simple configuration change (this is called a dark launch). Even if you do this, you should build any configuration infrastructure (e.g. $wmg variable, adding entry in InitialiseSettings with default false) at this time so all you'll have to do later is flip a switch.

For specific preparations, see the sections below as well as How to do a schema change and How to do a configuration change. It is best to perform schema changes before making config changes.

Add a configuration switch for an extension

In /a/common/wmf-config/CommonSettings.php, add:

if ( $wmgEnableMyExtension ) {
  require_once( "$IP/extensions/MyExtension/MyExtension.php" );
  // Set config vars if needed
 
  // If you want to export config vars through InitialiseSettings.php, you need to set $wmgMyExtensionThingy there and do
  #$wgMyExtensionThingy = $wmgMyExtensionThingy;
}

In /a/common/wmf-config/InitialiseSettings.php, add something like:

'wmgEnableMyExtension' => array(
  'default' => false,
  'eswikibooks' => true,
  // etc.
),
// If needed, set $wmgMyExtensionWhatever vars here too

If your extension requires a large-ish amount of configuration, consider putting it in a separate file instead. Currently, AbuseFilter, LiquidThreads and FlaggedRevs do this.

For more documentation on these files and their formats, see Configuration files.

Add new extension to extension-list and default.conf

When adding a new extension, you need to add it to the extension list, or its i18n messages won't get picked up. For more information about this setup, see Configuration files.

  1. cd /a/common/wmf-config
  2. Edit extension-list and add the path to the extension setup file on a line by itself
  3. commit the change
  4. Run scap. Make sure your extension is only enabled on testwiki at this point

After adding a new extension to the deployment branch, you also have to add it to make-wmf-branch/default.conf in mediawiki/tools/release.git, most likely in the $normalExtensions array, so it'll be picked up when the deployment branch is rebranched.

Disabling an extension

Conversely, when you disable an extension, remove it from wmf-config/extension-list and make-wmf-branch/default.conf.

Reedy commented:

If you’re wanting to disable an extension on the cluster, please DO NOT remove it from current deployment branches. Git gets upset and breaks things like git submodule update.
Per Stackoverflow, there isn’t a “git submodule rm foo”, and it’s just a pain for other people to have to clean up their working copies.
So in future, if you’re wanting to disable and remove an extension from production, it’s fine to do so in InitialiseSettings.php/CommonSettings.php, and even remove it from extension-list, but do not remove it from the core deployment branch. Instead, remove it from make-wmf-branch, and as long as the commit is merged before I (or whoever) makes the deployment branch, it won’t be branched for further usage.

Getting configuration changes on tin

If you made configuration changes to your local mediawiki-config repository, then once they are merged in gerrit you need to get them on tin. This is similar to step 2, but there's no deployment branch. It's covered in How to do a configuration change#In your own repo via gerrit.

Step 4: synchronize the changes to the cluster

Small changes: sync individual files

If your change only touches one or a few files or directories and does not change i18n messages, you can sync the files/dirs individually with sync-file or sync-dir as appropriate, rather than having to run scap. This is preferable because a scap run always shakes the cluster up a bit and takes longer to complete, while a sync-file run is very lightweight. However, sync-file is only capable of synchronizing files within directories that already exist on the cluster, so it won't work with newly added directories. Also, sync-file only synchronizes one file at a time, and creates a log entry each time. Using it repeatedly (e.g. with a for loop) to sync multiple files is fine, as long as there aren't too many of them (say, no more than ~5).

To sync a single file, run sync-file [path to file] [summary]. To sync a directory, run sync-dir [path to directory] [summary]. The IRC logmsgbot uses the summary to log your sync in #wikimedia-operations, from where it'll go to the server admin log and the identi.ca and Twitter feeds.
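
A for loop over a handful of files might look like the sketch below. Note that sync_files and the SYNC_FILE_CMD variable are hypothetical conveniences introduced here for illustration; only the sync-file [path] [summary] calling convention comes from this page. The example paths are illustrative as well.

```shell
#!/bin/sh
# Hypothetical wrapper around the sync-file command.
# Usage: sync_files 'summary' path1 [path2 ...]
# Paths must be relative to /a/common, so cd there before a real run.
# Set SYNC_FILE_CMD=echo for a dry run that only prints the arguments.
sync_files() {
    summary=$1
    shift
    # This page suggests keeping such loops to about five files.
    if [ "$#" -gt 5 ]; then
        echo "too many files ($#); consider sync-dir or scap" >&2
        return 1
    fi
    for file in "$@"; do
        "${SYNC_FILE_CMD:-sync-file}" "$file" "$summary" || return 1
    done
}

# Dry run: print the path and summary that would be passed for each file.
SYNC_FILE_CMD=echo
sync_files 'API security fix' php-1.20wmf1/api.php php-1.20wmf1/index.php
```

Stopping at the first failure is a design choice of this sketch; the real sync-file may report per-host errors without failing, so check its output as usual.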

  • PITFALL: The path argument has to be relative to the common directory, not to the current directory. To preserve your sanity (and tab-completion functionality), always cd to /a/common before running sync-file or sync-dir.
  • PITFALL: If the summary argument contains spaces, you'll have to put it in quotes or only the first word is used. If your summary contains a $, you'll either have to escape it or put your summary in single quotes, to prevent bash's variable expansion from messing it up.
  • PITFALL: sync-file and sync-dir do not work correctly for syncing i18n changes. They will appear to work, but the i18n changes won't take effect. To sync i18n changes, you must use scap.
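
The quoting pitfall is easy to reproduce without touching the cluster. In the sketch below, count_args is a hypothetical stand-in for sync-file that only reports what it received, and wgFoo is an arbitrary variable assumed to be set in your shell.

```shell
#!/bin/sh
# count_args is a hypothetical stand-in for sync-file: it reports how many
# arguments it received and what the last one was.
count_args() {
    n=$#
    for last in "$@"; do :; done
    printf '%s args, last: %s\n' "$n" "$last"
}

wgFoo=oops   # pretend this variable happens to be set in your shell

# Unquoted: the summary is split into separate words.
count_args php-1.20wmf1/api.php Fix $wgFoo default     # 4 args, last: default
# Double quotes: one argument, but $wgFoo is expanded by the shell.
count_args php-1.20wmf1/api.php "Fix $wgFoo default"   # 2 args, last: Fix oops default
# Single quotes: one argument, passed through literally.
count_args php-1.20wmf1/api.php 'Fix $wgFoo default'   # 2 args, last: Fix $wgFoo default
```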

When running sync-file or sync-dir, you'll usually see about half a dozen errors from broken servers (sample output below). We should fix things so this doesn't happen, but in the meantime consider it normal behavior. If you see unexpected output, ask in #wikimedia-operations. sync-file or sync-dir usually completes within a few seconds, but in rare cases it may hang on a broken Apache for 1 or 2 minutes.

catrope@tin:/a/common$ sync-file php-1.20wmf1/api.php 'API security fix'
No syntax errors detected in /a/common/php-1.20wmf1/api.php
copying to apaches
mw60: ssh: connect to host mw60 port 22: Connection timed out
srv189: ssh: connect to host srv189 port 22: Connection timed out
srv174: ssh: connect to host srv174 port 22: Connection timed out
srv266: ssh: connect to host srv266 port 22: Connection timed out

More complex changes: sync everything

If you're adding directories, changing many files, changing i18n messages, or otherwise have a reason why sync-file wouldn't work or would be impractical, you'll have to run scap, which syncs everything and rebuilds caches. scap logs to the server admin log, and reports in #wikimedia-operations (without !log) when it finishes.

awjrichards@tin:/a/common$ scap 'Log message here'
Checking syntax...
Copying to tin...Done.
Updating serialized data files...
Warning: messages are no longer serialized by this makefile.
Updating ExtensionMessages-1.19.php...
Updating ExtensionMessages-1.20wmf1.php...
Updating LocalisationCache for 1.19...
Updating LocalisationCache for 1.20wmf1...
...snip...

Running scap can take anywhere from 15 to 60 minutes; the LocalisationCache rebuilds (usually two of them, one for each deployed wmf version) can also take a while. There is usually a load spike and a few hiccups on the cluster immediately after scapping, but that's normal, as long as it subsides a few minutes after scap finishes running.

Alternative to scap

Because scap takes such an incredibly long time these days, here's an alternative set of commands you can use if you want to deploy i18n changes but want to avoid running scap:

# Rebuild the i18n cache
$ mw-update-l10n
 
# Sync the i18n changes. The first parameter is the version number.
# If updating both versions, run this twice
$ sync-l10nupdate-1 1.20wmf12
 
# Sync your code
$ sync-dir php-1.20wmf12/extensions/Whatever

This is not a magic time saver though. Syncing one version of the cache takes about 15 minutes. If you need to update two versions, even this method takes 30 minutes in addition to deploying the actual code.

Test and monitor your live code

Is it doing what you expected? Unfortunately, testwiki is not like a real wiki: extensions respond to trigger hooks, CentralNotice or Common.js might affect the browser environment, etc. No one environment can simulate all the wikis that we operate, so test your change afterwards on a live wiki to confirm. test2.wikipedia.org is a test wiki that operates as a member of the cluster. Keep in mind also that different projects are configured differently, have different extensions enabled, use different alphabets, etc.; it can be worthwhile to double-check your changes on multiple projects, particularly to ensure that character encoding and right-to-left formatting are behaving as expected. Also remember that the caching infrastructure on the cluster is likely different from your local or testing environments; keep the different production caching layers and strategies in mind as you're assessing your changes in production.

WMF uses open-source tools including ganglia, graphite, and icinga to monitor its production cluster; you should review their output post-deploy for unexpected spikes. Ganglia plots MediaWiki exceptions/fatals, currently under the node "vanadium.eqiad.wmnet", which tallies them. The most useful view is the last two hours' worth of exceptions and misc. fatals; the 24-hour version is http://tinyurl.com/n3twd8k . (In ganglia graphs, the 'm' means "milli-somethings per second", so a peak of 50m is 0.05 exceptions per second, or one exception/fatal every 20 seconds.)
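
The "milli" arithmetic can be double-checked with a one-liner. This is a small sketch; milli_rate is a hypothetical helper name, and it expects a non-zero reading.

```shell
#!/bin/sh
# Hypothetical helper: convert a ganglia reading in "milli-events per second"
# into events per second and the average number of seconds between events.
# The reading must be non-zero.
milli_rate() {
    awk -v m="$1" 'BEGIN {
        printf "%.3f per second, one every %.0f seconds\n", m / 1000, 1000 / m
    }'
}

milli_rate 50    # prints: 0.050 per second, one every 20 seconds
```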

All PHP error logs are routed to the server fluorine in /a/mw-log. Exceptions and fatals happen constantly, so you need to get a sense of changes over time. For example, to see trends in "Maximum execution time exceeded" errors this month, you might run

fluorine$ cd /a/mw-log/archive
fluorine$ zgrep -c 'Maximum execution time' fatal.log-201304*

All apache error logs are (still) routed to fenari in /home/wikipedia/syslog.

For a summary of all of the logs in use, see Logs.

Don't leave town

Even if your deploy appears to be working, it's important to be reachable in the hours immediately following your deploy. Ideally, stay online and in IRC channels like #wikimedia-tech and #wikimedia-operations for a couple of hours. Update Deployments with what happened in your deployment window.

If you must go offline, let people know how to reach you (and keep your mobile phone or other communications device on your person). You can use /away messages on IRC, or perhaps send a short email to the ops list.

If you are on Wikimedia staff, now might be a great time to check if your contact info is up to date. If you aren't on staff, ask a staffer to add your contact info to that page, under "Important volunteers".

A note on JavaScript and CSS

Since we have ResourceLoader, there is no need to e.g. manually do a "build" (to re-minify/re-cache static files). ResourceLoader does this automatically on demand. Depending on when the timestamp cache gets a cache miss, it can take up to five minutes for that to occur.

Occasionally ResourceLoader trips up and does not re-cache files correctly on the bits servers. The symptom of this is typically that stale minified files are served from bits, but if you add ?debug=true to the URL RL serves the new content. Fixing this issue requires that you touch the files in question and then re-sync them to the cluster.

Security patches

The last step in fixing security issues in MediaWiki before releasing the fixes publicly is deploying the patches on the cluster. When this happens:

  • All patches / fixes will be committed changes in the local repo
  • An email will be sent to the Engineering list to notify everyone that the patches are there, and where the raw patches live on tin in case they need to be modified or reapplied

Please do not revert these. If you are unsure if local, committed changes are security related, please ask someone in platform privately. Please do not discuss the patches publicly (including IRC). In most cases the commit message and knowing the files the commit affected would be enough for a malicious person to figure out the vulnerability.

When there are security patches in deployment, please rebase them on top of any changes you are deploying. This makes it easier to see what's been deployed (no more "Merge branch...." commits), and makes the fact that security patches are live immediately clear.

The only times that these should interfere with your deployment is if the changes conflict. In this case, please contact someone from the platform team to work out the best way to handle the situation.

Creating a Security Patch

  1. Create a Bugzilla security report if one does not already exist.
  2. Create and test your patch locally (preferably on a branch); then commit locally. Do not commit the patch to gerrit.
  3. Create the patch by running git format-patch HEAD~1, which will produce a patch file for your latest commit in your working directory.
  4. Upload the patch to bugzilla.
  5. Apply the patch in the current/affected wmf branches on tin by git am < patchfile
  6. Send an email to the engineering list stating that there is a security patch on the cluster and where your patch file lives on tin.
  7. Work with Chris to make sure the vulnerability is resolved and that your patch makes it into the next security release.

Problem: undeployed code

If you need to deploy something but you find undeployed changes or local changes that are not security fixes, revert all of them and !log your revert, then proceed to your deploy.

If they are uncommitted live hacks (as in, not even in gerrit), the polite thing is to stash them, so you don't erase someone's work forever.

Background

Roan commented in October 2012:

The problem is that sometimes, people merge things into a deployment branch and then don't deploy them. This is a terrible habit that should be squashed. If you merge something into a wmf branch, you have a responsibility to either deploy it yourself very soon, make sure that someone deploys it very soon, or revert it if you can't make those things happen. The deployment branch should reflect the current state of the cluster, except during those brief moments where something is about to be deployed or in the process of being deployed.

If you are concerned about other commits being pulled in (which should never happen, unless someone has been naughty), then in Step 2 you can run git fetch followed by git log HEAD..origin/wmf/1.20wmf1. This will list the commits that would be pulled by 'git pull'. In that list, it should be easy to spot commits that aren't yours and identify the person to yell at. If you run 'git pull' and it ends up pulling things you didn't expect, you can use 'git log' to examine what happened, and 'git reflog' (or the output of 'git pull') to find the hash of the commit you were at before pulling, so you can roll back to it if needed. But if this happens to you, feel free to start yelling at people and/or asking for help.
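
To spot commits that aren't yours in that list, you could filter the log by author. The following is a sketch under assumptions: commits_by_others is a hypothetical helper, the email addresses are made up, and the input is assumed to come from git log with a format of author email, abbreviated hash and subject.

```shell
#!/bin/sh
# Hypothetical helper: reads "<author email> <hash> <subject>" lines on stdin
# and prints only the commits that were not authored by the given address.
commits_by_others() {
    awk -v me="$1" '$1 != me'
}

# Demonstration with canned input; on tin, after 'git fetch', you would run
#   git log --format='%ae %h %s' HEAD..origin/wmf/1.20wmf1 | commits_by_others you@example.org
commits_by_others you@example.org <<'EOF'
you@example.org 96052d0 (bug 34885) Blah blah blah
someone@example.org 46727ad Unrelated change nobody deployed
EOF
# prints: someone@example.org 46727ad Unrelated change nobody deployed
```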