Portal:Cloud VPS/Admin/Runbooks
Runbooks for Cloud VPS
Overview
This page contains basic resources for developers who want to author and publish runbooks for Cloud VPS on Wikitech.
Before you begin
Permissions: Some procedures documented in runbooks for Cloud VPS require elevated permissions. Make sure you have the permissions your task requires before you start.
What is a runbook?
A runbook is a detailed set of instructions that explains how to perform a common task or procedure, so that others can repeat it easily and accurately. Runbooks are particularly useful for incident response. Creating a runbook in response to a specific incident makes it possible for others to repeat the steps when a similar incident occurs. Because runbooks often help people respond to incidents quickly, they should be easy to read and follow, consistent, and accurate.
When should you use a runbook?
A runbook should be used whenever a common task or procedure may need to be repeated by multiple people.
Tips for building useful runbooks
- Keep them up-to-date. Revisit the instructions after each incident to make sure they are clear and accurate.
- Keep them simple. These are sets of detailed instructions, and they do not require extra history or context.
- Publish a separate runbook for each issue or incident.
- Test them. Make sure the instructions are repeatable by others by asking others to follow them and provide feedback.
- Follow a template. Make sure your runbooks follow a template, so that people can find and update information easily.
Cloud VPS runbook templates
Each issue or alert should have its own page, so that readers do not have to sort through unrelated information. For an example, see the runbook Check for VMs leaked by the nova-fullstack test. Note that the page addresses one issue only.
You can find a template/outline for creating runbooks here: Portal:Cloud_VPS/Admin/Runbooks/Runbook_template
Where to publish runbooks for Cloud VPS
Where to publish Cloud VPS runbooks |
---|
Portal:Cloud_VPS/Admin/Runbooks/(PAGE NAME) |
Note: This is a subpage of admin documentation. Some procedures may require advanced admin permissions to complete. |
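The naming pattern above can be sketched as a small helper. This is only an illustration: the function name `runbook_title` and the constant `RUNBOOK_PREFIX` are hypothetical, and the sketch assumes the usual MediaWiki convention that spaces and underscores in page titles are interchangeable.

```python
# Hypothetical helper: builds the full Wikitech page title for a runbook
# from its page name, following the pattern in the table above.
RUNBOOK_PREFIX = "Portal:Cloud_VPS/Admin/Runbooks/"


def runbook_title(page_name: str) -> str:
    """Return the full page title for a Cloud VPS runbook."""
    # Collapse whitespace runs and join words with underscores, which
    # MediaWiki treats as equivalent to spaces in page titles.
    cleaned = "_".join(page_name.split())
    if not cleaned:
        raise ValueError("page name must not be empty")
    return RUNBOOK_PREFIX + cleaned


print(runbook_title("Runbook template"))
```

For the page name "Runbook template", this yields the same title as the template page linked earlier on this page.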
Note: Many runbooks for Cloud VPS will include procedures that can only be followed by individuals with admin access. In order to avoid confusion and frustration for general users, you should note at the content level when a procedure will require admin permissions.
When an entire runbook requires admin permissions to complete procedures, mark it with the following template:
When runbooks include information for general users and special instructions for admins, mark any instructions (inline) for admins with the following template:
Requires admin permissions
Existing runbooks
- CephClusterAllMonsDown
- CephClusterInError
- CephClusterInWarning
- Check for snapshots leaked by cinder backup agent
- Check unit status of backup cinder volumes
- Check unit status of backup vms
- Check unit status of maintain dbusers
- Cloud VPS alert Puppet failure on
- DRBDConnectedAndUpToDate
- DRBDOnlyOnePrimary
- LabstoreSlowGetent
- NodeDown
- NodeDownForLong
- NovaFullstackStaleStats
- NovafullstackSustainedFailures
- NovafullstackSustainedLeakedVMs
- OpenstackAPIs
- PuppetStaleCertificates
- RabbitmqNetworkPartition
- Refresh sssd caches
- Runbook template
- SystemdUnitDown
- SystemdUnitDownForLong
- TooManyCloudcontrolsDown
- TooManyCloudgwsDown
- TooManyCloudnetsDown
- TooManyCloudvirtsDown
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia Movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud, the bridged Telegram group, or the bridged Mattermost channel
- Discuss via email after you subscribe to the cloud@ mailing list
See Also
- Category:Runbooks - Wikitech pages marked as runbooks.
- WMCS Admin documentation - Advanced documentation for WMCS administrators.
- Runbooks entry on Wikipedia