Incidents/20151024-LabsNFS-Lag
Summary
NFS lagged for all labs instances when I (milimetric (talk)) executed a tail -n +2 on a huge 35 GB file in /data/project/milimetric/. This used up all the bandwidth.
Timeline
- 2015-10-24 00:52:44 UTC Started the operation
- 2015-10-24 01:12:42 UTC Coren alerted me on IRC
- 2015-10-24 01:33:16 UTC I killed the operation
Conclusions
- I could have asked someone from ops to do the file operation locally on labstore1002
- I could have rate limited with pv -L
- I probably should have just fixed the source file on stat1002 (where it came from originally) and re-copied it to labs
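The rate-limiting option mentioned above could look roughly like the following. This is only a sketch, not what was actually run: the function name strip_header_throttled, the default rate of 10M, and the file names in the usage line are all hypothetical, and a safe rate for the labs NFS server would need to be agreed with ops.

```shell
#!/bin/sh
# Sketch: drop the first line of a large file while capping read throughput.
# strip_header_throttled and the 10M default are hypothetical, for illustration only.
strip_header_throttled() {
    src=$1
    dst=$2
    rate=${3:-10M}

    # Refuse to run unthrottled if pv is missing.
    if ! command -v pv >/dev/null 2>&1; then
        echo "pv not installed; refusing to run unthrottled" >&2
        return 1
    fi

    # pv -L caps the transfer rate in bytes per second (K, M, G suffixes allowed);
    # -q suppresses the progress display.
    # tail -n +2 outputs everything from the second line onward.
    pv -q -L "$rate" "$src" | tail -n +2 > "$dst"
}
```

Usage, with hypothetical file names: strip_header_throttled /data/project/milimetric/input.tsv /data/project/milimetric/output.tsv 10M. Because tail can only emit what pv has passed through, the write to the output file ends up throttled to about the same rate as the read.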