Portal:Toolforge/Admin/Runbooks/ToolsNfsAlmostFull
The ToolsNfsAlmostFull alert fires when the Toolforge NFS server is almost out of disk space. This happens surprisingly often as the NFS share has no quotas.
Error / Incident
The Toolforge NFS server is almost out of disk space. This generally means that some space needs to be freed up.
Note that this alert comes in multiple severity levels: a warning alert means that there is much more space available than for a critical or a page alert.
As of 2024-01-03, the NFS server is tools-nfs-2.tools.eqiad1.wikimedia.cloud.
Debugging
Try what was done last time
If the alert fires again shortly (within about a week) after the last cleanup, it is usually caused by the same thing as last time. Look at the task for that cleanup to see what was done, clean up the same files again, and nudge the maintainers of the affected tools.
Locate disk hogs
# ionice -c 3 nice -n 19 find /srv/tools -type f -size +100M -printf "%k KB %p\n" | sort -h > tools_large_files_$(date +%Y%m%d).txt
This will take a few hours to complete.
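Once the listing exists, it can help to aggregate it per directory so that the tools holding the most large-file data stand out. The following is a minimal sketch, not an established part of this runbook: the function name summarize_large_files and the default grouping depth of 4 path components are assumptions, and the depth should be adjusted to match the real directory layout under /srv/tools.

```shell
# Sum the sizes from a listing produced by find -printf "%k KB %p\n",
# grouped by the leading directory components of each path.
# Usage: summarize_large_files LISTING [DEPTH]
# DEPTH (assumed default: 4) is how many leading path components to group by.
summarize_large_files() {
    listing="$1"
    depth="${2:-4}"
    awk -v depth="$depth" '
    {
        kb = $1
        # Strip the leading "<number> KB " so paths containing spaces survive.
        path = $0
        sub(/^[0-9]+ KB /, "", path)
        n = split(path, parts, "/")
        # parts[1] is empty because the path starts with "/";
        # stop before parts[n] so the file name itself is never part of the key.
        prefix = ""
        for (i = 2; i <= depth + 1 && i < n; i++) prefix = prefix "/" parts[i]
        total[prefix] += kb
    }
    END {
        # 1 GiB = 1048576 KiB
        for (p in total) printf "%.1f GiB %s\n", total[p] / 1048576, p
    }' "$listing" | sort -rn | head -n 20
}
```

For example, `summarize_large_files tools_large_files_20240103.txt 4` would print up to 20 directories, largest first. Only files over 100M are in the listing, so the totals are a lower bound on what each directory actually uses.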
Common issues
Add here any new common issues you find.
Related information
- Portal:Data_Services/Admin/Runbooks/Enable_NFS_for_a_project
- Portal:Data_Services/Admin/Shared_storage
Old incidents
Add here any new tasks for incidents you might encounter.