Deployments/Covid-19

From Wikitech

Why

We want to avoid overburdening people who are single points of failure (SPoFs). If one of them is unavailable when something breaks, the broken state could last for a long time.

Questions to answer if you’re thinking about merging

  • Can you roll back this change without lasting impact?
    • A recovery plan is required, as it helps identify our capacity for recovering from a failure.
    • THIS IS A KEY QUESTION: if you can’t answer it, you shouldn’t deploy.
    • Please ensure that if your recovery plan requires anyone besides the train conductor or patch owner, that person or team is notified and confirms availability during the deployment window. If they do not confirm availability, notify Release Engineering so that the patch can be postponed until availability is confirmed.
  • Is specialized knowledge required to support this change in production?
    • Are there multiple people with this knowledge?
  • Is there a way to increase confidence about the correctness of this change?
    • Reviews (design, code, etc.)
    • Test coverage (unit tests, integration tests)
    • Manual testing (e.g. Beta, Vagrant, Docker)

Heightened Awareness

Please be mindful if you have work on the train.

If you currently have something in the changelog or in master, please reach out.

If for any reason you feel worried about the impact of your change you can either:

Please note that we are reinstating train-blocker retrospectives to support our continuous improvement efforts. See the retrospectives workboard for more information about current and past retrospectives.

Office hours

IRC office hours will be held in #wikimedia-office to help with initial questions.

Mondays - 17:00 UTC