Incident documentation meeting/QR201403

From Wikitech

Quarterly Review of post-mortems - 2014-03

Questions we want to be able to answer

  • Have all of the issues that came out of the post-mortem been addressed? If not, why not?
  • Are we satisfied with the current state of that part of the infra? Are there further actions to take (upon further reflection)?
  • Anything else?

Agenda

  • Go through the post-mortems and their respective action items and make sure they have been followed up appropriately.
    • If you have details relevant to a post-mortem in BZ or elsewhere, please link to them from the post-mortem.
  • Discuss whether there is anything else we learned from each incident, and follow up to better inform future decisions.
  • Notes are written up collaboratively by everyone, so that others in the organization can learn from them as well.

The post-mortems

Site outage ~ 2014-01-11 22:10 UTC

  • TODO: Follow up with Sean and Tim about this. (Greg) - Status: Unresolved
    • Greg pinged Sean on 2014-03-20

20140113-Poolcounter

Incident documentation meeting/20140113-Poolcounter

20140203-LVS

Incident documentation meeting/20140203-LVS

20131205-Swift

Incident documentation meeting/20131205-Swift

20140206-Math

Incident documentation meeting/20140206-Math

20140211-Parsoid

Incident documentation meeting/20140211-Parsoid

20140228-Cirrus

Incident documentation meeting/20140228-Cirrus

20140313-API-Parsoid

Incident documentation meeting/20140313-API-Parsoid

20140313-Deploy

Incident documentation meeting/20140313-Deploy