User:LToscano (WMF)/Cloud Usage
This page collects the workflows and best practices for requesting a public cloud provider account and a set of resources. The Infrastructure Foundations team is the root account owner for Amazon Web Services (AWS) and for any other public cloud providers we may use in the future.
Introduction
The Wikimedia production infrastructure (excluding Wikimedia Enterprise) runs on bare metal hosts. This helps control and protect PII and adhere to the privacy policy.
There are, however, certain use cases where using a public cloud service is beneficial and doesn't undermine privacy. These include, but are not limited to, testing software and proof-of-concept deployments.
Requesting capacity
The requester should provide the following two things in a Phabricator task (using this quicklink):
- Budget approval to cover the cost of the public cloud provider.
- A brief justification for the use case. Please make sure to explain why the use case cannot be run on the available bare metal or Wikimedia Cloud infrastructure, what kind of data it is going to handle/need, and the project's timeline (how long it is expected to run, etc.).
The above will be reviewed and, upon approval, a new account will be opened with the selected public cloud provider and the credentials will be shared using 1Password vaults.
Dos and don'ts
Please make sure to read the following:
- There is no expectation of getting any SRE team's support to deploy/debug/improve your project running on public cloud resources.
- Puppet and any other configuration tools will not be adapted/configured to run on public cloud resources.
- No PII data of any kind can leave the production premises to be tested on public cloud resources. The SRE team will not create any special VPN tunnel to copy data from production to public cloud. If any public-only data is needed, you'll likely need to create a dump and copy it to the public cloud.
- If the scope of the project is to find a hardware configuration for a specific use case that will run in production, please make sure that an SRE reviews your final document where all the details are explained. There are subtle differences between a VM running on public cloud and a bare metal server that need to be taken into account when making comparisons.
- You have the liberty to install/deploy software on public cloud VMs/containers, but please be aware that in production the SRE team will require all software to be in internal repositories (and not coming from external vendor distributions, etc.).
- BILLING ALERTS ARE MANDATORY. The Infrastructure Foundations team will help you create billing alerts: we need to know whether the expected budget for a given time window is being respected. It is really easy to miss some hidden cost of running special/expensive public cloud products, and we don't want to burn more budget than was assigned (or burn it in no time without being able to complete a project).
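
To make the idea behind billing alerts concrete, here is a minimal sketch of the kind of check an alert performs for a budget window. It is only an illustration, not how the provider or the Infrastructure Foundations team implements alerts. All names (`BudgetWindow`, `projected_spend`, `triggered_thresholds`, `should_alert`), the 50%/80%/100% thresholds and the dollar figures are assumptions made up for this example. In practice the spend figure would come from the provider's billing tools.

```python
from dataclasses import dataclass


@dataclass
class BudgetWindow:
    """Hypothetical description of one budget window (not a provider API)."""
    budget_usd: float   # approved budget for the whole window
    window_days: int    # length of the window in days
    elapsed_days: int   # days elapsed since the window started
    spent_usd: float    # spend reported so far by the provider


def projected_spend(w: BudgetWindow) -> float:
    """Linearly extrapolate the current spend to the end of the window."""
    if w.elapsed_days <= 0:
        # Nothing to extrapolate from yet; report what has been spent.
        return w.spent_usd
    return w.spent_usd / w.elapsed_days * w.window_days


def triggered_thresholds(w: BudgetWindow, thresholds=(0.5, 0.8, 1.0)) -> list:
    """Return the budget fractions that the actual spend has already reached."""
    used = w.spent_usd / w.budget_usd
    return [t for t in thresholds if used >= t]


def should_alert(w: BudgetWindow) -> bool:
    """Alert if a threshold was crossed or the projection exceeds the budget."""
    return bool(triggered_thresholds(w)) or projected_spend(w) > w.budget_usd


if __name__ == "__main__":
    # Example: 400 USD spent after 10 of 30 days on a 1000 USD budget.
    window = BudgetWindow(budget_usd=1000.0, window_days=30,
                          elapsed_days=10, spent_usd=400.0)
    print(projected_spend(window), triggered_thresholds(window),
          should_alert(window))
```

In this example no threshold has been crossed yet, but the linear projection already exceeds the budget, which is exactly the "burning it too fast" case that an alert should surface early.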