Incident documentation/20201006-cloud-vps

From Wikitech
Jump to navigation Jump to search

document status: draft

See also: https://lists.wikimedia.org/pipermail/cloud-announce/2020-October/000324.html

Summary

What happened? Write one paragraph or at most two, including UTC timestamps for key events like the start and end of the outage. Briefly assess the impact: for user-facing outages, estimate how many queries were lost, how many users were affected, or which populations (editors? readers? particular geographies?), etc.

Avoid assuming deep knowledge of the systems here -- but if the incident is too complex to sum up in a couple of paragraphs, this lightweight format may be a bad fit.

Actionables

Create a list of action items that will help prevent this from happening again as much as possible. Link to or create a Phabricator task for every step.

  • To do #1 (TODO: Create task)
  • To do #2 (TODO: Create task)

TODO: Add the #Sustainability (Incident Followup) Phabricator tag to these tasks.