Monitoring/systemd unit state
The "systemd unit state" Icinga checks test if there are any failed systemd units. Units commonly include, but are not limited to, services (.service), mount points (.mount), devices (.device) and sockets (.socket). See systemd.unit(5) for details on each of these.
For this type of alert, ssh to the affected server and run systemctl list-units --state=failed (or the shortcut systemctl --failed) to see which unit has failed.
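When several units have failed, it can help to reduce the listing to bare unit names. The following is a minimal sketch, not part of the check itself; the function name failed_units is hypothetical, and it assumes the --plain and --no-legend options of systemctl are available on the host.

```shell
# List the names of failed units, one per line.
# --plain drops the leading status glyph and --no-legend drops the header
# and footer, so the first column of each line is the unit name.
failed_units() {
    systemctl list-units --state=failed --plain --no-legend | awk '{print $1}'
}
```

Calling failed_units on the affected host prints nothing when no unit is in the failed state.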
Use systemctl status unit, journalctl -u unit and journalctl -xn (where unit is the name of the failed unit) to see more details and logs and work out why it failed.
In the output of systemctl status unit, take note of the Docs: entry, which may point to additional Wikitech pages for further troubleshooting.
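The inspection steps above can be bundled into one helper. This is only a sketch: the function name inspect_unit and the 50-line log limit are arbitrary choices for illustration, and systemctl show --property=Documentation is used here as an assumed convenient way to print just the documentation pointers.

```shell
# Print status, documentation pointers and recent logs for one unit.
# Usage: inspect_unit UNIT   (UNIT is a name taken from systemctl --failed)
inspect_unit() {
    # Full status, including the Docs: line when the unit sets Documentation=
    systemctl status --no-pager "$1"
    # Only the documentation pointers, as Documentation=...
    systemctl show --property=Documentation "$1"
    # Last 50 journal lines logged for this unit
    journalctl --no-pager -u "$1" -n 50
}
```

Note that systemctl status exits with a non-zero status for a failed unit, so the helper should not be run under set -e.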
Try manually starting it with systemctl start unit.
Sometimes the failure has already been fixed and you just need to clear the unit's failed state with systemctl reset-failed unit.
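The last two steps can be sketched as a pair of small helpers. The names try_start and clear_failed are hypothetical, and the sketch assumes it is run with sufficient privileges to start units. It checks systemctl is-failed rather than is-active after starting, on the assumption that some units (for example one-shot jobs) legitimately return to inactive once they finish.

```shell
# Try to start a unit and report whether it is still in the failed state.
try_start() {
    systemctl start "$1"
    if systemctl is-failed --quiet "$1"; then
        echo "$1 is still failed" >&2
        return 1
    fi
    echo "$1 is no longer in the failed state"
}

# Clear a stale failed marker, but only if the unit is actually marked failed.
clear_failed() {
    if systemctl is-failed --quiet "$1"; then
        systemctl reset-failed "$1"
    fi
}
```

If try_start reports that the unit is still failed, go back to the unit's status and journal output before clearing anything; clear_failed is meant only for units whose underlying problem is already resolved.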
- task T199911 for an ongoing issue with "Systemd session creation fails under I/O load"
- Auditing systemd: solving failed units with systemctl
- How To Use Journalctl to View and Manipulate Systemd Logs