Obsolete:1.17 deployment plan

This page contains historical information. It may be outdated or unreliable.

Outline for deployment of 1.17

General plan

1.17 has two features that are complicated from a deployment perspective:

  • Improved category collation
  • Resource Loader

The category collation code is not likely to change the load characteristics of the site, but is difficult to back out once we deploy it. Resource Loader could have a large impact on site performance, but is relatively easy to back out. Deploying the two together would leave us running code with unknown performance characteristics that we could not back out.

We decided the safest path would be to:

  1. Make category collation configurable, so that we could remain on the legacy collation code for a bit longer
  2. Deploy 1.17 with legacy collation (dark launching the new category collation code)
  3. Enable new category collation some time after 1.17 launches

This plan makes it relatively straightforward to back out Resource Loader in case we need to do so for performance reasons.
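
As an illustration of the "configurable collation" idea, here is a minimal sketch of a switch that keeps the legacy behaviour as the default while the new code ships unused (a dark launch). All names below (`LEGACY`, `NEW`, `sort_category_members`) and the two stand-in sort-key functions are hypothetical and chosen for this example only; they are not MediaWiki's actual configuration or collation code.

```python
# Hypothetical sketch: identifiers and collation rules are illustrative only,
# not MediaWiki's real settings or sort-key logic.

LEGACY = "legacy"
NEW = "new"


def legacy_sort_key(title: str) -> str:
    # Stand-in for the old behaviour: order by raw code points.
    return title


def new_sort_key(title: str) -> str:
    # Stand-in for an improved collation: case-insensitive ordering.
    return title.casefold()


SORT_KEY_FUNCS = {LEGACY: legacy_sort_key, NEW: new_sort_key}


def sort_category_members(titles, collation=LEGACY):
    """Sort titles using whichever collation the configuration selects.

    The default stays on the legacy path, so deploying the new code
    changes nothing until the switch is flipped.
    """
    return sorted(titles, key=SORT_KEY_FUNCS[collation])
```

With the default left at `LEGACY`, `sort_category_members(["apple", "Banana", "cherry"])` keeps the old ordering; passing `collation=NEW` opts in to the new one. Flipping such a switch back only restores the read path. It would not undo rows already rewritten under the new scheme, which is why enabling collation is planned as a separate, later step.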

Preparation work

Development

Dev tasks for ops:

  • Make a 1.17wmf1 deployment branch (Tim/Roan)
    • Update extension list in make-wmf-branch/default.conf
  • Check if extensions need schema updates (unassigned)
  • Make it possible to conditionally deploy collation support (Tim)
  • Build a test infrastructure to validate 1.17 deployment - prototype (Priyanka)
  • Turn on profiler on 1.17 during testing (Priyanka & Roan)
  • Figure out 1.17 post-deployment shifts for devs so that there's continuous coverage immediately after deploy (RobLa)
  • Make sure collation code handles having only default values gracefully (Tim)

Operations

Prior Testing

Schedule

Current target (subject to change based on data center move):

  • Tuesday, February 8, 2011 at 07:00:00 UTC
    • 1.17 deployment (Only core and existing extensions)
  • Date TBD
    • Collation made live
  • Date TBD
    • New extensions go live

Deployment Steps/Sequence

Below is the checklist for this deployment. See How to deploy code for details on the checklist. Each item should have an owner and a scheduled time.

  1. Finish/check database schema updates. (Tim/Ryan)
    • Procedure: apply the changes to a slave database first, then promote that slave to master
  2. Get the code on fenari (owner? time?)
  3. Configuration and other prep work
    • Add a configuration switch for new extensions (owner? time?)
    • Add new extensions to extension-list (owner? time?)
  4. scap (owner? time?)
  5. 24x7 developer coverage for the first few days after deployment
  6. Once ops is happy that we won't need to back out, enable the new category collation feature.
  7. Run maintenance/updateCollation.php. Previous testing (r69961) indicates that this will take at least a few days to run.

Backing out

Category collation changes probably cannot be backed out due to the nature of the changes. Other database additions/changes should have no effect on prior changes.

Risks & Mitigations

Identified risks and mitigations:

  • DB errors on a partially deployed version of 1.17
    • Make sure code handles having only default values gracefully
    • Perform DB backups before any changes
  • Load testing of bits and varnish
  • Deploying the new categorylinks collation code before running updateCollation.php will create some extra write pressure on the DB servers, because the categorylinks rows will automatically be upgraded on edit. There is a small chance of this leading to uncontrollable replication lag.
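
One common way to limit the write pressure described in the last item is to run the bulk upgrade in small batches and pause whenever replication lag climbs. The sketch below illustrates that pattern only; it is not a description of what updateCollation.php does. The function name, the callbacks `upgrade_row` and `get_lag_seconds`, and the default batch size and lag threshold are all assumptions made for this example.

```python
import time


def backfill_in_batches(row_ids, upgrade_row, get_lag_seconds,
                        batch_size=100, max_lag=5.0, sleep=time.sleep):
    """Upgrade rows in small batches, pausing while replication lag is high.

    row_ids:         list of row identifiers still on the old collation
    upgrade_row:     hypothetical callback that rewrites one row
    get_lag_seconds: hypothetical callback returning current slave lag
    """
    upgraded = 0
    for start in range(0, len(row_ids), batch_size):
        # Let the slaves catch up before issuing the next batch of writes.
        while get_lag_seconds() > max_lag:
            sleep(1.0)
        for row_id in row_ids[start:start + batch_size]:
            upgrade_row(row_id)
            upgraded += 1
    return upgraded
```

Throttling of this kind trades total run time for stability: a job that already takes days will take longer, but the lag check bounds how far the slaves can fall behind because of the batch job itself. It does not control the extra writes triggered by ordinary edits.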