Portal:Data Services/Admin/Runbooks/Enable NFS for a project

All procedures in this runbook require admin permissions to complete.

Overview

NFS is the primary shared storage system for projects in Cloud VPS, and it is the main mechanism by which users place code and data on the Toolforge execution environment. When a Cloud VPS project needs shared storage, we provide a fairly simple path to enable it. Generally all of this is done in response to a Phabricator ticket.

We use NFS for several purposes within Cloud VPS: per-project $home and /data/project shares, the shared 'scratch' service, and read-only content dumps.

In every case, you will need to enable NFS on each VM where you want it mounted.

Create a project-internal NFS server

If a project only requires access to the shared 'scratch' service or content dumps, skip this section -- those servers already exist.

There are ready-made cookbooks for creating a new project-local NFS server; those cookbooks are documented at Portal:Data_Services/Admin/Runbooks/Create_an_NFS_server. You may also want to create a second, failover server if users of the project are sensitive to downtime.

Typically for a project named 'foo' you would create a server with the prefix 'foo-nfs' and the volume name 'foo'.
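For example, a hypothetical invocation of the wmcs.nfs.add_server cookbook for a project named 'foo' might look like the sketch below; flag names vary between cookbook versions, so check cookbook wmcs.nfs.add_server --help before running:

 $ cookbook wmcs.nfs.add_server --project foo --prefix foo-nfs --service-ip foo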

Scaling up to a larger VM (e.g. with more cores or RAM) is fairly easy, so feel free to start with a small server flavor. Scaling up the storage size of an NFS server should also be fairly straightforward, but it's best to provision some slack from the start.

Once the new NFS server is built, note the path to the NFS volume. It will be something like /srv/foo. If you plan to host multiple shares from this server (e.g. both $home and /data/project), create subdirectories for those shares: /srv/foo/home and /srv/foo/project.
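A minimal sketch on the new server, assuming the volume is mounted at /srv/foo:

 $ sudo mkdir -p /srv/foo/home /srv/foo/project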

Note that the new server will not run the NFS service or export any shares until the YAML configuration step below is complete.

Get the new service FQDN

Any server created using the wmcs.nfs.add_server cookbook with the --service-ip flag will already have a service address associated with it, which you can find via the Horizon DNS -> Zones interface.

We will need this FQDN later to update the YAML file.
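To confirm that the service name resolves, you can query it from any VM; the name below follows the 'foo' example used in this runbook:

 $ dig +short foo-nfs.svc.foo.eqiad1.wikimedia.cloud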

Update the YAML config in Puppet

First, we'll need to find the project's GID:

Find out the GID for the project

The NFS server will want to know the project GID. On any Cloud VPS VM, you can run:

 $ getent group project-$project_name
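The GID is the third colon-separated field of the output. For example, for the 'testlabs' project used below (the user names here are illustrative):

 $ getent group project-testlabs
 project-testlabs:x:50302:user1,user2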

Add an entry to the YAML

Add a section for the new project to modules/labstore/templates/nfs-mounts.yaml.erb with the project GID and whichever mounts are required. An entry that mounts every available share would look like this:

 testlabs:
   gid: 50302
   mounts:
     dumps: true
     home: testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud:/srv/testlabs/home
     project: testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud:/srv/testlabs/project
     scratch: scratch.svc.cloudinfra-nfs.eqiad1.wikimedia.cloud:/srv/scratch

Dumps is a special case and only needs to be set to 'true' to work. Any other mount must specify the address of the NFS server followed by the path to the share on that server. Rather than a specific VM's FQDN, use the service name; this will ease any future failovers (as noted above, servers created with the --service-ip flag already have one).

Once the Puppet patch is merged, run run-puppet-agent on labstore1004. This will trigger nfs-exportd to regenerate its configuration and restart. That should create a new file for the project under /etc/exports.d on labstore1004, configured with the project's IPs. There should also be a lot of 'nfsd' processes running on the server.
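To verify the result on labstore1004 (a minimal sketch; the exact file name under /etc/exports.d depends on the project):

 $ ls /etc/exports.d/
 $ pgrep -c nfsd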

Note that this same YAML file is also consumed by the NFS client hosts: it tells them what to mount.

Enabling on the VMs

Use the Hiera key mount_nfs to opt in or out (e.g. mount_nfs: true). The default is currently false. Once the above work is complete, a Puppet run on a VM with this key set to true will mount NFS as specified.
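On the VM itself, after setting the key, a quick check might look like this (a sketch; mount points depend on the project's YAML entry):

 $ sudo run-puppet-agent
 $ findmnt -t nfs,nfs4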

Users can be instructed to do this step themselves. Note that enabling NFS also enables tc traffic shaping on the client VM, which will not remove itself if NFS is later removed. Setting mount_nfs: false will not remove existing NFS mounts; you must unmount them by hand after changing Hiera.
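For example, to remove a mount by hand after setting mount_nfs: false (a sketch, assuming the share was mounted at /data/project; any leftover /etc/fstab entry may also need cleaning up):

 $ sudo umount /data/project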

Historical notes for bare-metal pre-NFS servers

As of 2022-02-15 we are rapidly moving Cloud VPS projects onto project-specific NFS servers. Straggler projects may still use the bare-metal servers, but any new projects should follow the newer docs above.

Old $home and /data/project mounts were stored on labstore1004; the old maps and scratch mounts were on cloudstore1009, in /srv/maps and /srv/scratch respectively.

Support contacts

Communication and support

Support and administration of WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support
Stay aware of critical changes and plans
Track work tasks and report bugs: use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
Read stories and WMCS blog posts on the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)

Related information

Portal:Data Services/Admin/Shared storage