We want to avoid overburdening people who are single points of failure (SPoFs). If they are unavailable when something breaks, the broken state can last for a long time.

Questions to answer if you’re thinking about merging

  • Can you roll back this change without lasting impact?
    • A recovery plan is required; writing one helps identify our capacity to recover from a failure
    • THIS IS A KEY QUESTION: if you can’t answer it, you shouldn’t deploy
  • Is specialized knowledge required to support this change in production?
    • Are there multiple people with this knowledge?
  • Is there a way to increase confidence about the correctness of this change?
    • Reviews (design, code, etc.)
    • Test coverage (unit tests, integration tests)
    • Manual testing (e.g. Beta, Vagrant, Docker)
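
As a rough illustration only, the questions above can be read as a small decision procedure. The sketch below is not part of any existing tooling: the class, the field names, and the "fewer than two people" threshold are all assumptions made for the example. The one rule taken directly from the list is that a missing rollback answer is treated as a blocker.

```python
from dataclasses import dataclass, field


@dataclass
class MergeChecklist:
    """Answers to the pre-merge questions (all names here are hypothetical)."""
    can_roll_back: bool                   # can the change be rolled back without lasting impact?
    needs_specialized_knowledge: bool     # is special expertise needed to support it in production?
    people_with_knowledge: int = 0        # how many people have that expertise
    confidence_steps: list[str] = field(default_factory=list)  # reviews, tests, manual testing


def concerns(c: MergeChecklist) -> list[str]:
    """Return reasons to hold off; an empty list means no concern was found."""
    found = []
    if not c.can_roll_back:
        # The key question: without an answer, the change should not deploy.
        found.append("BLOCKER: no rollback or recovery plan")
    if c.needs_specialized_knowledge and c.people_with_knowledge < 2:
        # Assumed threshold: "multiple people" is read as at least two.
        found.append("SPoF: fewer than two people can support this change")
    if not c.confidence_steps:
        found.append("no reviews or testing recorded")
    return found


# Example: rollback is possible, but only one person can support the change.
change = MergeChecklist(
    can_roll_back=True,
    needs_specialized_knowledge=True,
    people_with_knowledge=1,
    confidence_steps=["code review", "unit tests"],
)
for line in concerns(change):
    print(line)
```

In this example the only concern reported is the single point of failure, which is a prompt to share the knowledge before merging rather than an automatic block.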

Heightened Awareness

Please be mindful if you have work on the train.

If you currently have something in the changelog or in master, please reach out.

If for any reason you are worried about the impact of your change, you can use the train blocker process (tl;dr: select the task for the current deployment train, then make a subtask of that train task via Edit Related Tasks… Edit Subtasks).

Office hours

There will be IRC office hours in #wikimedia-office to help with initial questions.

Mondays - 17:00 UTC