Production shell access
For instructions on accessing public Cloud VPS and Toolforge instances, see Help:Access.
This page explains how to access hosts in the production Wikimedia network using SSH.
Remember that production access is extremely sensitive! Take it seriously and immediately contact SRE if you make a mistake or something goes wrong.
Read and remember the server access responsibilities, including the overall philosophy:
- The Wikimedia SRE Team will do whatever is necessary to keep all machines and services working and running in a secure fashion.
- Don't by any wilful, deliberate, reckless or unlawful act interfere with the work of another developer or jeopardize the integrity of data networks, computing equipment, systems programs, or other stored information.
- Don't use Wikimedia facilities for private purposes, including consultancy or any other work outside the scope of official duties or functions for the time being, without specific authorization to do so.
- This is not your personal machine. Many things that are fine to do on your personal machine are not okay in a production environment.
- When in doubt, ask questions first and act second. In this case forgiveness is much more difficult to get than permission.
- 1 Requesting access
- 2 Setting up your access
- 3 Debugging
- 4 See also
- 5 Notes
Production shell access is granted strictly on an as-needed basis, and is entirely under the purview of the Engineering Department and SRE team managers. They can approve or deny access for any reason, as security is of the highest priority.
To acquire shell access, you must have projects or responsibilities that require this access on a regular and ongoing basis. Requests based on a one-time need will not be granted. If you have a one-time need for data, request the data instead.
There are some things you'll need before you start the process.
- A non-disclosure agreement with the Wikimedia Foundation. If you work for the Foundation, this was probably included as part of your employment agreement. Otherwise, you'll have to follow the volunteer NDA process.
- A Phabricator account. If you don't have one, see the instructions for creating an account on mediawiki.org.
- A Wikimedia developer account.
- Read and sign the Acknowledgement of Wikimedia Server Access Responsibilities.
- Create a ticket requesting access.
- In the title, replace "RESOURCE" and "USER" with the resource you need access to and your name, respectively. (For new user requests, make a separate ticket for each user.)
- Add the following information to the description:
- Your full name
- Your developer access username (that is, the one you use for Cloud VPS SSH, not Wikitech login. Wikitech shows this as "instance shell account name" in preferences). We will use this as your production shell username.
- The public key from your SSH keypair. This must not be the same one you use to access Cloud VPS or Gerrit.
- A detailed reason for your request. In particular, describe which specific servers you need access to and why. We err on the side of giving fewer permissions rather than more, so the more detailed your request, the more likely you are to get all the permissions you need.
- Get approvals from the following people as comments to the Phabricator task. The comments should be made directly through the web interface, not via email.
- At least one comment of support from a Wikimedia Foundation employee, explaining why it is a good idea to accept your request. The comment of support should be from your supervisor if you're an employee, or from the employee you will be collaborating with if you're not.
- The project lead where your access will be granted.
- For most requests, a three business day waiting period must be observed after the request is filed.
- When your request is approved, you will be asked to provide your full legal name, preferred email address for contact, and physical address to the Wikimedia Foundation Legal team (or your employee contact may forward this information on your behalf). This information will be used to customize a non-disclosure agreement, which you will be asked to read, comprehend, and electronically sign through the Foundation's contract management system. The agreement will be similar to the Volunteer NDA.
- The Wikimedia Foundation employee who will be supervising your work will coordinate final sign-off by an executive-level staff member of the Wikimedia Foundation once all other criteria have been met, before your access is granted.
If you feel an unreasonable amount of time has passed, you can comment on the ticket to request an update, or ask the Operations team member on Ops Clinic Duty that week directly.
Additional permissions for existing users
To escalate shell access, you must be working on a project that requires this access on a regular and ongoing basis. Access is not granted for one-time needs; if you have one, request the data you need instead.
- Read and sign the Acknowledgement of Wikimedia Server Access Responsibilities, if you have not already.
- Create a ticket requesting access.
- In the title, replace "RESOURCE" and "USER" with the resource you need access to and your name, respectively. (Group tickets are acceptable when a group is being escalated.)
- Add the following information to the description.
- Your full name
- Your shell username
- A three business day waiting period must be observed after the request is filed.
- This waiting period may not be required when the change corrects a previous request, or in some other circumstances, but it should be followed for escalations that include permissions not previously approved.
Production shell users, their keys, and their permissions are managed in modules/admin/data/data.yaml in the operations/puppet.git repository.
Setting up your access
Generating your SSH key
First, you'll have to generate a new SSH keypair. Do not reuse an existing key which has been used anywhere else. GitHub has a good help page (note that you can switch between Mac, Windows, and Linux documentation right under the title).
We recommend that you use an ED25519 key (or, alternatively, a 4096-bit RSA key). Do not use DSA keys as they are insecure and rejected by our ssh servers.
To generate an ED25519 key, run the following command in your terminal:
ssh-keygen -t ed25519
To generate an RSA key, run the following command in your terminal:
ssh-keygen -t rsa -b 4096 -o
Some systems don't support the newer -o option, which saves private keys in a slightly more secure format (OpenSSH rather than PEM), but those should be fairly rare: the option was introduced in OpenSSH 6.5.
The minimum bit length for an RSA key is 2048 (-b 2048), which was the default length in OpenSSH before version 8.0 (newer releases default to 3072). More bits won't hurt.
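Before pasting a public key into an access request, it can help to confirm which key type it is. The following is a minimal sketch, not an official tool: the function name check_pubkey_type is hypothetical, and it only inspects the type prefix at the start of the public key line (ssh-ed25519, ssh-rsa, or ssh-dss).

```shell
# Classify a public key line by its type prefix.
# check_pubkey_type is a hypothetical helper name, not part of OpenSSH.
check_pubkey_type() {
  case "$1" in
    "ssh-ed25519 "*) echo "ED25519: recommended" ;;
    "ssh-rsa "*)     echo "RSA: confirm the bit length with ssh-keygen -l -f" ;;
    "ssh-dss "*)     echo "DSA: rejected"; return 1 ;;
    *)               echo "unrecognized key type"; return 1 ;;
  esac
}
```

For example, check_pubkey_type "$(cat ~/.ssh/your_production_ssh_key.pub)" classifies the key generated above. For an RSA key, running ssh-keygen -l -f on the .pub file prints the bit length along with the fingerprint.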
Once your new SSH key is set up, follow the instructions above to submit your access request. Remember: the key you use for production access must be different from the key you use for Cloud VPS (that is, do NOT paste it into the OpenStack field under Special:Preferences on this wiki).
Setting up your SSH config
The standard configuration for users without root access is to connect to a bastion host and proxy the connection through it to the target host inside the cluster. To do this, add the following to your SSH config file (usually located at $HOME/.ssh/config):
Host bast1002.wikimedia.org
    # Direct connection for the bastion host
    ProxyCommand none
    ControlMaster auto

Host *.wikimedia.org *.wmnet !gerrit.wikimedia.org !git-ssh.wikimedia.org
    User your_username_here
    # Everything else goes via bastion acting as a proxy
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org
    # Do not offer other identities loaded in ssh-agent
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_production_ssh_key
Troubleshooting: If ssh first hangs for a few minutes and then produces a lot of "Connection closed by remote host" messages, you probably created an infinite proxy loop, with connections to bast1002 trying to proxy via bast1002. This can happen if the proxy host name in the second entry does not match the host name in the first entry. It will also happen if you put these entries in the wrong order: for each option, ssh uses the first value it finds among the matching entries and ignores any later values. Since bast1002.wikimedia.org also matches *.wikimedia.org, placing the wildcard entry first applies its proxy command to the connection to bast1002, creating an infinite loop.
In the example above you may replace bast1002.wikimedia.org with the bastion that is physically closest to you:
- bast1002.wikimedia.org in the eqiad cluster in Virginia, United States
- bast2002.wikimedia.org in the codfw cluster in Texas, United States
- bast3002.wikimedia.org in the esams cluster in Amsterdam, The Netherlands
- bast4002.wikimedia.org in the ulsfo cluster in San Francisco, United States
- bast5001.wikimedia.org in the eqsin cluster in Singapore
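To check for the proxy loop described in the troubleshooting note without actually connecting, you can ask ssh which ProxyCommand it resolves for a given host. This is a rough sketch: it assumes your ssh client supports the -G option (print the effective configuration, available in OpenSSH 6.8 and later), and show_proxy_command is a hypothetical helper name.

```shell
# Print the ProxyCommand line that ssh resolves for a host, without connecting.
# Assumes "ssh -G" is available; show_proxy_command is a hypothetical name.
# Returns non-zero if no ProxyCommand line is printed for the host.
show_proxy_command() {
  ssh -G "$1" | grep -i '^proxycommand '
}
```

Running show_proxy_command with your bastion's host name should not show a command that connects through that same bastion. Running it with a host inside the cluster should show the ssh -a -W command that names the bastion you chose.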
Advanced: operations config
If you will be setting up new servers or doing other administration work, you can use the below advanced configuration instead. Otherwise, skip this section. If you're not sure, you almost certainly don't need this!
Advanced $HOME/.ssh/config for production root users
## Production & External Zones
Host iron.wikimedia.org bast1002.wikimedia.org bast2002.wikimedia.org bast3002.wikimedia.org bast4002.wikimedia.org bast5001.wikimedia.org bastion-restricted.wmflabs.org
    StrictHostKeyChecking yes
    ProxyCommand none
    ControlMaster auto
    IdentitiesOnly yes

Host *.wikimedia.org !gerrit.wikimedia.org !git-ssh.wikimedia.org
    User your_username_here
    StrictHostKeyChecking yes
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_production_ssh_key
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-prod
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org

## Internal Zones
Host *.mgmt.eqiad.wmnet *.mgmt.codfw.wmnet *.mgmt.ulsfo.wmnet *.mgmt.esams.wmnet *.mgmt.eqsin.wmnet
    User root
    StrictHostKeyChecking no

Host *.wmnet
    User your_username_here
    StrictHostKeyChecking yes
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_production_ssh_key
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-prod

Host *.eqiad.wmnet
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org

Host *.codfw.wmnet
    ProxyCommand ssh -a -W %h:%p bast2002.wikimedia.org

Host *.esams.wmnet
    ProxyCommand ssh -a -W %h:%p bast3002.wikimedia.org

Host *.ulsfo.wmnet
    ProxyCommand ssh -a -W %h:%p bast4002.wikimedia.org

Host *.eqsin.wmnet
    ProxyCommand ssh -a -W %h:%p bast5001.wikimedia.org

## Networking Equipment
Host *-eqiad.wikimedia.org *-eqord.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org

Host *-codfw.wikimedia.org *-eqdfw.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast2002.wikimedia.org

Host *-esams.wikimedia.org *-knams.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast3002.wikimedia.org

Host *-ulsfo.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast4002.wikimedia.org

Host *-eqsin.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast5001.wikimedia.org

## Gerrit and Cloud VPS
Host gerrit.wikimedia.org
    User your_username_here
    StrictHostKeyChecking yes
    ProxyCommand none
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_development_ssh_key
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-cloud

Host *.wmflabs.org *.wmflabs
    User your_username_here
    IdentityFile ~/.ssh/your_development_ssh_key
    StrictHostKeyChecking no
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-cloud
    ProxyCommand ssh -a -W %h:%p bastion-restricted.wmflabs.org
Known host files
To ensure the validity of the hosts you connect to, enable the StrictHostKeyChecking yes option and create a local list of known hosts. A utility script is available to generate that list and keep it up to date. Read the instructions in the script's header for help on usage. If you need any additional help, contact the script's author.
Before you can use the script, you'll need to bootstrap this setup with at least one bastion host. Disable strict host key checking, ssh to a bastion, and make sure the fingerprint matches what's listed at Help:SSH Fingerprints.
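As an alternative way to look at a bastion's fingerprints before the first login, you can fetch its public host keys and print their fingerprints locally. This is only a sketch under assumptions: it is not the procedure described above, bastion_fingerprints is a hypothetical helper name, and it assumes your ssh-keygen can read keys from standard input with "-f -" (supported in recent OpenSSH releases). The output still has to be compared by hand against Help:SSH Fingerprints, since ssh-keyscan itself does not authenticate the host.

```shell
# Print the fingerprints of the host keys that a bastion offers.
# bastion_fingerprints is a hypothetical helper name.
# Assumes ssh-keygen accepts "-f -" to read public keys from stdin.
bastion_fingerprints() {
  ssh-keyscan "$1" 2>/dev/null | ssh-keygen -l -f -
}
```

For example, bastion_fingerprints bast1002.wikimedia.org prints one fingerprint line per host key type offered by that bastion.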
Do not use SSH agent forwarding (the -A command line option). Agent forwarding does not make it possible to steal your private key itself, but it does make it possible for someone to hijack your SSH agent and thus your identity, so we do not do it. The -a option (with a lower case "a") disables agent forwarding, and is thus included in the sample configurations on this page.
This page used to recommend that you add the following lines to protect against an SSH bug from 2016:
Host *
    UseRoaming no
However, we are now using an updated version which removed the vulnerable options, so you will get an error if your config includes the lines above. Just remove them from your config to connect.
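If you are not sure whether your config still contains the old setting, a case-insensitive search will find it. A minimal sketch, with find_useroaming as a hypothetical helper name and $HOME/.ssh/config assumed as the default location:

```shell
# List lines in an SSH config file that still set UseRoaming.
# Prints matching lines with their line numbers; returns non-zero if none are found.
# find_useroaming is a hypothetical helper name.
find_useroaming() {
  grep -n -i 'useroaming' "${1:-$HOME/.ssh/config}"
}
```

Call find_useroaming with no arguments to check the default config file, or pass a path to check another file.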
Do not use your production cluster SSH key for any other service, including Gerrit or Cloud VPS.
- Fundraising infrastructure config
- Greg Grossmeier's SSH config
- Managing multiple SSH agents
- Notes on configuring SSH for production shell access (for the purpose of working with the stats servers stat1002/3/4), by Zareen
- (experimental) Bash script to detect the correct bastion and auto-fix SSH config
If your production access has been approved but you aren't able to log in, you can ask for help in the Phabricator ticket for your access request. If you got access a long time ago and it's a new problem, you can file a new ticket and tag it with #operations.
Wherever you ask for help, make sure you include your SSH configuration (but not your key itself!) and the output you get when you run your ssh command with the -v option (verbose mode).
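One way to capture the verbose output for a ticket is to redirect it to a file. The sketch below relies on the fact that ssh writes its -v debug messages to standard error. The function name collect_ssh_debug and the default file name ssh-debug.log are hypothetical.

```shell
# Save verbose ssh output for a support request.
# ssh writes -v debug messages to stderr, so stderr is redirected to the log.
# collect_ssh_debug and ssh-debug.log are hypothetical names.
collect_ssh_debug() {
  host="$1"
  log="${2:-ssh-debug.log}"
  # Run a trivial remote command; record the exit status if ssh fails.
  ssh -v "$host" true 2> "$log" || echo "ssh exited with status $?" >> "$log"
  echo "Wrote $log"
}
```

Call it with the host you are unable to reach, then attach the resulting log together with your SSH configuration. Review the log before posting it, and never attach your private key.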
- Help:Access for instructions on accessing Cloud VPS and Toolforge instances
- Help:SSH Fingerprints for fingerprints of ssh bastion servers
- Proxy access to cluster for direct web access to production servers behind the firewall
- Yubikey-SSH and Yubikey4 and gpg-agent for instructions on using a YubiKey device to manage your ssh key
- Managing multiple SSH agents for help configuring separate ssh-agent instances for different security realms
- Fundraising/tech/ssh config for help configuring ssh for access to hosts in the frack environment
- The form automatically adds the ticket to the Ops-Access-Requests project so the SRE team will see your request.
- You can also put your public key on your wiki user page, in a Phabricator paste, or in a Gerrit patchset you upload, but you can't include it in an email reply to the task.
- This protects against email spoofing.
- If you request any level of sudo privileges, your request must have a security review at a weekly SRE meeting. Sudo access is granted on an extremely limited basis, and will typically apply to the smallest permissions possible (user/process restricted over all). Expect this process to take at least one business week.
- Your manager's approval is usually not required, as you've already been granted access to the cluster; the project lead of the cluster you request access to should sign off (if in doubt, ask the Ops Clinic Duty person for the week.)