Monitoring/systemd unit state
The "systemd unit state" Icinga checks test if there are any failed systemd units. Units commonly include, but are not limited to, services (.service), mount points (.mount), devices (.device) and sockets (.socket). See systemd.unit(5) for details on each of these.
For this type of alert, SSH to the server in question and run systemctl list-units --state=failed (or the shortcut systemctl --failed) to see which unit is failing.
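When several units have failed, it can help to reduce the listing to bare unit names. The following is a minimal sketch, not part of the Icinga check itself; the function name failed_units is hypothetical, and it assumes the unit name is the first column of the output when --plain and --no-legend are used.

```shell
# Hypothetical helper: print the names of failed units, one per line.
# --plain drops the leading status bullet; --no-legend drops the header and footer.
failed_units() {
    systemctl list-units --state=failed --plain --no-legend | awk '{print $1}'
}
```

Calling failed_units on the affected host prints nothing when no unit is in the failed state.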
You can use systemctl status unit, journalctl -u unit and journalctl -xn to see more details and logs and figure out why it failed.
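The three commands above can be wrapped into one helper for convenience. This is only a sketch: the function name inspect_unit and the 50-line limit are assumptions chosen for illustration, and --no-pager is added so the output can be piped or pasted into a task.

```shell
# Hypothetical helper: show status and recent logs for one unit.
inspect_unit() {
    unit=$1
    # systemctl status exits non-zero for units that are not active,
    # so its exit code is deliberately ignored here.
    systemctl status --no-pager "$unit" || true
    # Last 50 journal lines for this unit (50 is an arbitrary choice).
    journalctl --no-pager -u "$unit" -n 50
    # Last 10 journal lines system-wide, with catalog explanations (-x).
    journalctl --no-pager -x -n 10
}
```

For example, inspect_unit foo.service, where foo.service stands in for the unit reported by the alert.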
When using systemctl status unit, take note of the Docs: entry, which may point to wikitech pages with further troubleshooting steps.
Try manually starting it with systemctl start unit.
Sometimes the failure has already been fixed and you just need to clear the list of failed units with systemctl reset-failed unit.
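The two recovery steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the function names retry_unit and clear_failed_unit are hypothetical, and the result is judged with systemctl is-failed rather than is-active, so that a oneshot unit that runs and exits cleanly is not reported as a problem.

```shell
# Hypothetical helper: try to start a unit, then report whether it is still failed.
retry_unit() {
    unit=$1
    # Ignore the exit code of start; the is-failed check below decides the outcome.
    systemctl start "$unit" || true
    if systemctl is-failed --quiet "$unit"; then
        echo "$unit is still failed" >&2
        return 1
    fi
    echo "$unit is no longer failed"
}

# Hypothetical helper: clear the failed state once the underlying cause is fixed.
clear_failed_unit() {
    systemctl reset-failed "$1"
}
```

If retry_unit reports that the unit is still failed, go back to the status and journal output before retrying. Use clear_failed_unit only when you are confident the cause has been addressed, since it clears the failed state without fixing anything.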
See also
- task T199911 for an ongoing issue with "Systemd session creation fails under I/O load"
- Auditing systemd: solving failed units with systemctl
- How To Use Journalctl to View and Manipulate Systemd Logs