PuppetSWAT

From Wikitech

What is it?

PuppetSWAT is similar to MediaWiki SWAT in structure and cadence: it typically occurs twice weekly. During each window, Operations Engineers (opsen) look at the patches submitted for PuppetSWAT and ideally merge them or provide comments. The goal is to encourage more people to write patches for operations/puppet, and to offer a Service Level Agreement (SLA) of sorts for patches to be looked at and/or merged.

PuppetSWAT windows are typically held twice weekly, but the time slots can shift (in advance) to accommodate other deployments. Two operations team members are typically signed up per window. During (or before) the SWAT, those (and other) operations team members should review the listed patches and provide feedback.

How to get a patch in PuppetSWAT

  • Add it to the designated PuppetSWAT window on the Deployments Page
    • This is intentionally 'fast and loose'. As long as a patch is on the listing before the PuppetSWAT window, Operations will do what it can to review and merge it. Because PuppetSWAT windows recur regularly, a patch submitted too close to a window to be properly reviewed may be pushed to the next PuppetSWAT window.
  • Be present during the PuppetSWAT window for real-time conversations regarding the patch and testing/deployment/reverting.

What kind of patches can go through SWAT?

Patches from people who are not in the operations team that are trivial, or easy to verify and test. Patches that have potentially far-reaching impact (ssh, varnish, Apache) will probably be rejected. Ideally, a patch has at least one +1 from someone familiar with the area of code it changes. The ops person doing the SWAT has final discretion on which patches they merge, since they are ultimately responsible for the stability of the cluster.

Also, do not think of this as a way to speed up work you're already doing with any opsen on a specific project. If those patches are lagging behind, there is a specific reason, and you should talk to the person you're working with or escalate.

Changes to the Apache configuration for the MediaWiki application server cluster are not eligible for SWAT: because of their potentially far-reaching impact (including unavailability), they need extensive testing.

This guideline is still evolving.

Examples (from Giuseppe):

   Changes that can go through SWAT:
   - https://gerrit.wikimedia.org/r/#/c/230382/ (this has been specifically -2'd by me because I was already working on a better solution, but is limited in scope, easy to understand and works on something already existing)
   - https://gerrit.wikimedia.org/r/#/c/207140/
   - https://gerrit.wikimedia.org/r/#/c/226901/
   
   Changes that cannot go through SWAT:
   - https://gerrit.wikimedia.org/r/#/c/221827/ (lots of code, a lot of things to review, will need some opsen to follow the deployment with the author)
   - https://gerrit.wikimedia.org/r/#/c/229727 (large, introduces new functions, is part of work already being followed by multiple opsen)
   - basically everything that is not both trivial and clear-cut enough

The ideal PuppetSWAT patch has...

  1. The author, or someone else involved with the patch, around on IRC during PuppetSWAT
  2. A +1 on the patch from someone
  3. A Puppet Compiler run on the patch that has given the go-ahead
  4. A clean rebase onto master
  5. If it can be tested on the beta cluster, a test there, done by cherry-picking the patch onto the beta cluster puppetmaster

The ideal PuppetSWAT patch does *not* have...

  1. Any sudo / access rights changes
  2. Any outstanding -1s / unaddressed concerns
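
The two checklists above can be read as a simple pre-flight check. The following is only an illustrative sketch, not a tool that exists on Wikitech or in Gerrit: the `Patch` class, its field names, and the `swat_concerns` function are all hypothetical, and the values would have to be filled in by hand from what you see on the change.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patch:
    """Hypothetical hand-filled summary of a change; not a Gerrit API object."""
    author_on_irc: bool            # author or someone involved is around on IRC
    plus_ones: int                 # number of +1 votes on the change
    compiler_ok: bool              # Puppet Compiler was run and gave the go-ahead
    rebases_cleanly: bool          # rebases cleanly onto master
    touches_access_rights: bool    # contains sudo / access rights changes
    open_minus_ones: int           # outstanding -1s or unaddressed concerns
    testable_on_beta: bool = False # can be tested on the beta cluster at all
    tested_on_beta: bool = False   # was cherry-picked and tested on beta


def swat_concerns(patch: Patch) -> List[str]:
    """Return the checklist items the patch does not yet satisfy."""
    concerns = []
    if not patch.author_on_irc:
        concerns.append("nobody involved with the patch is around on IRC")
    if patch.plus_ones < 1:
        concerns.append("no +1 on the patch")
    if not patch.compiler_ok:
        concerns.append("no successful Puppet Compiler run")
    if not patch.rebases_cleanly:
        concerns.append("does not rebase cleanly onto master")
    if patch.testable_on_beta and not patch.tested_on_beta:
        concerns.append("testable on beta but not tested there")
    if patch.touches_access_rights:
        concerns.append("contains sudo / access rights changes")
    if patch.open_minus_ones > 0:
        concerns.append("has outstanding -1s or unaddressed concerns")
    return concerns
```

An empty list means the patch matches the ideal described above. It does not guarantee a merge, since the ops person doing the SWAT still has final discretion.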

Who is going to do it?

The operations team allocates two members for PuppetSWAT during its weekly operations meetings. These members are designated in advance of their assigned weeks. They must then update the Deployments page to list themselves for their allotted SWAT windows.