Performance/Runbook/Puppet patches

From Wikitech

This is the runbook for testing and staging Puppet patches that affect Puppet roles applied to servers maintained by the Performance Team.

Meta

  • Source code (roles): Gerrit
  • Source code (webperf classes): Gerrit
  • Source code (arclamp classes): Gerrit

For changes to our services that run on these hosts, see instead the runbooks for Webperf-processor services and Webperf-tools services.

Writing a patch

See Puppet coding.

Testing a patch

When you submit a patch for operations/puppet.git, Jenkins typically reports back within a minute or two with the results of the syntax, coding-convention, and unit tests.

Staging a patch

Before we deploy a patch to production, there are two kinds of tests we apply:

  1. Puppet compiler tests. This asks Puppet to simulate what would happen given all the production realm variables. The result is identical to what would happen in actual production if the patch were applied to a fresh server with a clean install of the HEAD-1 state and no private overrides.
  2. Beta Cluster testing. This actually applies the patch to a real server in the Beta Cluster, so it catches everything that would happen on a real server. However, it runs with the betacluster realm variables instead of the production ones, so there may be intentional differences.

Puppet compiler

Prerequisites:

  • Wikimedia Developer account (same as Gerrit account), with "wmf" or "nda" user group.

Steps:

  • Use the build form for the puppet-compiler job on Jenkins.
  • Enter the Gerrit change number.
  • Enter the list of nodes to simulate before/after. For our patches this is usually: webperf1001.eqiad.wmnet,webperf1002.eqiad.wmnet,webperf2001.codfw.wmnet,webperf2002.codfw.wmnet
  • Start the build and view its console output. Once done, review its result. (example)

Beta Cluster testing

Once the patch passes the Puppet compiler without errors, and the effective changes are what you want them to be, it's time to cherry-pick the Puppet patch to the Beta Cluster.

Prerequisites:

  • Wikimedia Developer account (same as wikitech.wikimedia.org account).
  • Shell access to Wikimedia Cloud VPS (see Help:Access).
  • In user group "Administrators" for the "Beta Cluster" VPS project in Wikimedia Cloud (existing admins can add you in Horizon).

Steps:

  • Connect with SSH to the current puppetmaster in Beta Cluster (deployment-puppetmaster03.deployment-prep.eqiad.wmflabs).
  • Enter sudo mode (sudo -i).
  • Navigate to /var/lib/git/operations/puppet.
  • Ensure git status is clean.
  • Copy the "Cherry Pick" command from the change page on Gerrit, under "Download".
  • Run the command on the puppetmaster in the operations/puppet directory.
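
The puppetmaster steps above can be sketched as follows. This is only an illustration: the helper names gerrit_ref and ensure_clean are hypothetical, the change number 123456 and patch set 1 are placeholders, and the fetch URL assumes the usual Gerrit layout for operations/puppet. In practice, copy the exact "Cherry Pick" command from the change page rather than building it by hand.

```shell
# Build the Gerrit ref for a change and patch set.
# Gerrit stores patch sets under refs/changes/<last two digits>/<change>/<patchset>.
gerrit_ref() {
    change="$1"
    patchset="$2"
    printf 'refs/changes/%02d/%s/%s\n' "$((change % 100))" "$change" "$patchset"
}

# Succeed only if the work tree has no modified, staged, or untracked files.
ensure_clean() {
    [ -z "$(git status --porcelain)" ]
}

# On the Beta Cluster puppetmaster, after `sudo -i` (shown as comments, not executed here):
#   cd /var/lib/git/operations/puppet
#   ensure_clean || { echo "work tree is not clean, stopping" >&2; exit 1; }
#   git fetch https://gerrit.wikimedia.org/r/operations/puppet "$(gerrit_ref 123456 1)"   # placeholder change
#   git cherry-pick FETCH_HEAD
```

Checking for a clean work tree first means that a failed cherry-pick can only be caused by the patch itself, not by leftover local edits.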

Now, in a separate terminal (so that you can easily undo or fixup if something goes wrong):

  • Connect with SSH to the Beta Cluster host that the change should be applied to. For example, if the change affects webperf1001 in production, connect to deployment-webperf11.deployment-prep.eqiad.wmflabs. If the change affects multiple hosts, carefully consider whether it should really be a single commit. If the in-between state is harmless, go ahead and apply it to the other hosts at roughly the same time from a third terminal.
  • Trigger a Puppet agent run on this host: sudo -i puppet agent --test --verbose.
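
When reading the result of the agent run, note that puppet agent --test turns on detailed exit codes, so a non-zero exit status is not always a failure. The sketch below maps the documented exit statuses to a short verdict; the function name puppet_verdict and the wording of the messages are illustrative only.

```shell
# Translate the exit status of `puppet agent --test` into a short verdict.
# --test enables --detailed-exitcodes, so 2 means "changes applied", not failure.
puppet_verdict() {
    case "$1" in
        0) echo "ok: no changes" ;;
        2) echo "ok: changes applied" ;;
        1) echo "failed: run did not complete" ;;
        4) echo "failed: some resources failed" ;;
        6) echo "failed: changes applied, some resources failed" ;;
        *) echo "unknown exit status $1" ;;
    esac
}

# On the Beta Cluster host (shown as comments, not executed here):
#   sudo -i puppet agent --test --verbose
#   puppet_verdict "$?"
```

A catalog compilation error shows up as exit status 1, because the run stops before any resource is applied.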

If Puppet fails with an error about compilation of the Puppet catalog, that means the Puppet master is now unable to serve any host in the Beta Cluster, not just the one you are testing on. In that case, undo your change on the puppetmaster right away by running git rebase -i and removing your cherry-pick from the list.
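
As a non-interactive alternative to the git rebase -i step above, a single commit can also be dropped with git rebase --onto. This is a sketch, not the procedure this runbook prescribes: the helper name drop_commit is hypothetical, and the log command in the comments assumes that the local cherry-picks sit on top of origin/production.

```shell
# Drop one commit from the current branch without opening an editor.
# Replays everything after <sha> onto <sha>'s parent, which leaves <sha> out.
drop_commit() {
    sha="$1"
    git rebase --onto "${sha}^" "$sha"
}

# On the puppetmaster, in /var/lib/git/operations/puppet (shown as comments, not executed here):
#   git log --oneline origin/production..HEAD   # assumption: lists the local cherry-picks
#   drop_commit <sha-of-your-cherry-pick>
```

Other people's cherry-picks that were applied after yours are kept, since they are replayed on top of the parent of the dropped commit.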

Once any Puppet compilation error or other error has been addressed with an amended version of the patch, confirm that the host now has the new behaviour your patch is intended to create.

Report back to Gerrit and ask SRE to merge it:

  • Leave a link to the clean Puppet compiler result in a Gerrit comment.
  • Add hash tag "beta-picked" to the Gerrit change.
  • Add a Gerrit comment stating that the patch is live on the Beta Cluster and working as intended.