Incident response/Lightweight report template

From Wikitech

document status: draft

Summary

What happened? Write one or two paragraphs, including UTC timestamps for key events such as the start and end of the outage. Do not assume that readers have deep knowledge of the systems involved. If the incident is too complex to sum up in a couple of paragraphs, this lightweight format may be a bad fit.

Impact: Briefly assess the impact. For user-facing outages, estimate how many queries were lost, which regions were affected, and which types of clients were affected (editors, readers, bots).
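One rough way to estimate lost queries is to multiply the outage duration by the shortfall against the normal request rate. The sketch below is only an illustration: the function name, the timestamp format, and all numbers are hypothetical, and it assumes roughly constant baseline and observed rates across the outage window. Real figures should come from your own graphs or logs.

```python
from datetime import datetime, timezone

def estimate_lost_queries(start_utc, end_utc, baseline_qps, observed_qps):
    """Return (outage duration in seconds, estimated queries lost).

    Assumes constant baseline and observed rates over the whole window.
    """
    fmt = "%Y-%m-%d %H:%M"  # hypothetical timestamp format, interpreted as UTC
    start = datetime.strptime(start_utc, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_utc, fmt).replace(tzinfo=timezone.utc)
    duration_s = (end - start).total_seconds()
    if duration_s < 0:
        raise ValueError("end must not precede start")
    # Shortfall per second; clamp at zero if traffic was above baseline.
    shortfall_qps = max(baseline_qps - observed_qps, 0)
    return duration_s, shortfall_qps * duration_s

# Made-up example: a 30-minute outage, 1000 qps normally, 400 qps served.
duration_s, lost = estimate_lost_queries(
    "2021-01-01 10:00", "2021-01-01 10:30", 1000, 400
)
print(f"Outage lasted {duration_s:.0f} s; roughly {lost:.0f} queries lost")
```

An estimate like this is only an order-of-magnitude figure; say so in the report and link to the graph the rates were read from.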

Documentation:

  • To do (link to relevant source code, graphs, or logs)

Actionables

List action items that will help prevent this from happening again, as far as possible. Link to or create a Phabricator task for each item.

  • To do #1 (TODO: Create task)
  • To do #2 (TODO: Create task)

TODO: Add the #Sustainability (Incident Followup) Phabricator tag to these tasks.