Portal:Cloud VPS/Admin/Cookbooks
Cloud VPS administrators routinely perform tasks such as adding users to or removing them from a project, or adding nodes to or removing them from a cluster. We have created a set of cookbooks to automate those tasks and to make sure they are performed in a consistent, repeatable and traceable way. Our cookbooks are implemented using the Wikimedia Spicerack library and are maintained in the wmcs-cookbooks repository.
Cookbooks vs Runbooks
In WMCS, we use the word cookbook for the automation scripts described on this wiki page, which are implemented using the Spicerack library. We use the word runbook for manual procedures documented in Wikitech. A runbook might require you to run a cookbook as one of its steps, or it might not involve any cookbook at all.
How to run a cookbook
You can run a cookbook from a cloudcumin host, or from your laptop.
Running from a cloudcumin host is recommended because:
- It ensures you are using the tip of the "main" branch
- Logs will be stored in the cloudcumin host, where other admins can see them if they need to
- If the cookbook takes a long time to complete, you can use screen or tmux to let the cookbook continue to run even if you lose your internet connection, or if you turn your computer off.
Some reasons why you might want to run a cookbook from your laptop instead:
- The cookbook does not work correctly when run from a cloudcumin host. If this happens, please open a bug in Phabricator as ideally all cookbooks should work fine from a cloudcumin host.
- You don't have "sudo" privileges in cloudcumin hosts. This is something we would like to solve in the future, and is tracked in phab:T343330.
- You are developing a new cookbook and need quicker feedback than pushing to the repo and fetching allows.
- You are developing a patch for Spicerack or any dependent library and need to test it or adapt cookbooks to it.
Running a cookbook from a cloudcumin host
We have two cloudcumin hosts, cloudcumin1001.eqiad.wmnet and cloudcumin2001.codfw.wmnet. They are almost identical, but 1001 is configured to target the eqiad1 OpenStack cluster, and 2001 is configured to target the codfw1dev OpenStack cluster (see /etc/cumin/config.yaml).
Both hosts are Ganeti virtual machines, managed by the Infra Foundation team.
At the moment, some cookbooks are failing when run from cloudcumins. These issues are tracked as subtasks of phab:T343330.
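The host-to-cluster mapping described above can be captured in a small shell helper on your laptop, so that you always connect to the host that targets the cluster you intend to change. This is only a sketch: the function name `cloudcumin_for` is hypothetical and not part of wmcs-cookbooks, and it assumes the mapping stays as documented here.

```shell
# Hypothetical helper (not part of wmcs-cookbooks): print the cloudcumin
# host that is configured to target the given OpenStack cluster.
cloudcumin_for() {
    case "$1" in
        eqiad1)    echo "cloudcumin1001.eqiad.wmnet" ;;
        codfw1dev) echo "cloudcumin2001.codfw.wmnet" ;;
        *)
            # Unknown cluster name: report on stderr and fail.
            echo "unknown cluster: $1" >&2
            return 1
            ;;
    esac
}

# Example usage (commented out so that sourcing this file has no side effects):
# ssh "$(cloudcumin_for codfw1dev)"
```

Failing on unknown names, rather than falling back to a default host, avoids accidentally running a cookbook against the wrong cluster.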
Listing available cookbooks
fnegri@cloudcumin1001:~$ sudo cookbook -l
Showing documentation for a single cookbook
fnegri@cloudcumin1001:~$ sudo cookbook wmcs.cookbook.name -h
Running a cookbook
fnegri@cloudcumin1001:~$ sudo cookbook wmcs.cookbook.name --project PROJECT_NAME --task-id PHAB_TASK_ID
Using Screen/tmux
If you are running a cookbook that you expect will take a long time to complete, you should run it inside a Screen or tmux session, so you can detach from the session and let the cookbook continue to run.
Screen example:
fnegri@cloudcumin1001:~$ screen
fnegri@cloudcumin1001:~$ sudo cookbook ...
# Press "Ctrl+a d" to detach while the cookbook is running
fnegri@cloudcumin1001:~$ screen -x # To reattach
fnegri@cloudcumin1001:~$ exit # To terminate the session
Tmux example:
fnegri@cloudcumin1001:~$ tmux
fnegri@cloudcumin1001:~$ sudo cookbook ...
# Press "Ctrl+b d" to detach while the cookbook is running
fnegri@cloudcumin1001:~$ tmux attach # To reattach
fnegri@cloudcumin1001:~$ exit # To terminate the session
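If you keep several long-running cookbooks going at once, named sessions make it easier to reattach to the right one. The sketch below derives a session name from a Phabricator task ID. The `cookbook-` prefix and the function name `session_name` are assumptions made for illustration, not an established WMCS convention.

```shell
# Hypothetical naming scheme: one tmux session per Phabricator task.
session_name() {
    # "." and ":" are separators in tmux target names (session:window.pane),
    # so replace them with "_" to keep the name unambiguous.
    printf 'cookbook-%s' "$1" | tr '.:' '__'
}

# Example usage on a cloudcumin host (commented out; requires tmux):
# tmux new-session -s "$(session_name PHAB_TASK_ID)"      # start a named session
# tmux attach-session -t "$(session_name PHAB_TASK_ID)"   # reattach to it later
# tmux ls                                                 # list existing sessions
```

The Screen equivalents are `screen -S NAME` to start a named session and `screen -r NAME` to reattach to it.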
Running a cookbook from your laptop
Local setup
To run the cookbooks locally, follow the docs at https://gerrit.wikimedia.org/g/cloud/wmcs-cookbooks
Common cookbook options
All WMCS cookbooks have some common options:
- --task-id to indicate a related Phabricator task. This ID will be included in SAL messages, and SAL messages will be displayed in Phab comments.
- --project to indicate the related Cloud VPS project. This will make sure that SAL messages are displayed under the correct project in sal.toolforge.org. If the option is not included, it defaults to admin.
- --no-dologmsg to disable sending messages to SAL. Note: this only affects messages sent with the SALLogger class and not the ones sent with sal_logger. For more information, read phab:T343528.
Some additional common options are provided by Spicerack itself and are documented at Spicerack/Cookbooks#Cookbook_Operations.
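As an illustration of how the WMCS common options fit together, the following sketch assembles a cookbook command line and prints it instead of running it. The function name `build_cookbook_cmd` is hypothetical, `wmcs.cookbook.name` and `PHAB_TASK_ID` are the same placeholders used earlier on this page, and the explicit `admin` fallback simply mirrors the documented default for `--project`.

```shell
# Hypothetical dry-run helper: print the cookbook invocation, do not run it.
# Arguments: COOKBOOK_NAME TASK_ID [PROJECT]
build_cookbook_cmd() {
    cookbook_name=$1
    task_id=$2
    # Mirror the documented default: --project falls back to "admin".
    project=${3:-admin}
    printf 'sudo cookbook %s --project %s --task-id %s\n' \
        "$cookbook_name" "$project" "$task_id"
}

# Example usage: print the command for a cookbook in the toolsbeta project.
build_cookbook_cmd wmcs.cookbook.name PHAB_TASK_ID toolsbeta
```

Printing the command first lets you check that the project and task ID are the ones you want recorded in SAL before anything is executed.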
Cumin
The cloudcumin hosts are also capable of running Cumin commands. Example:
user@cloudcumin1001:~ $ sudo cumin 'O{project:toolsbeta name:toolsbeta-test-k8s-*}' 'apt-get update'
SRE cookbooks and WMCS cookbooks
WMCS admins with Global Root permissions can also run the SRE cookbooks (the ones with a name starting with sre.). SRE cookbooks (from the operations/cookbooks repo) are not installed in cloudcumin hosts, but only in production cumin hosts (cuminXXXX).
TODO: Create "shared" cookbooks that are installed in both production and cloud cumin hosts. (phab:T343894)
Roadmap
These are some of the enhancements we would like to make to WMCS Cookbooks. We don't have a timeline for these, but please leave a comment in the Phab tasks if a feature is particularly useful or relevant to you.
- phab:T343330 Provide shared hosts for people without root privileges
- phab:T343528 Use a single bot for SAL logging