Systemd resource control
This page describes a mechanism for per-user server resource control using systemd.
The mechanism is based on systemd slices, which are in turn based on cgroups. Before systemd, one could use cgred for this, but that approach is considered obsolete in the systemd era.
How it works
systemd puts all processes owned by a given user into a shared cgroup. The mechanism is very elegant and robust.
However, it does not allow specifying quotas or limits per process or by other criteria, so it is not very granular.
On the other hand, on a systemd server there is no way a user can work around these limits (i.e., no risk of failing with a regexp and ending up with processes not covered by the limits).
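As a quick illustration of this grouping, you can check which cgroup (and thus which slice) a shell belongs to by reading /proc/self/cgroup; the slice and session names shown on this page are examples and will differ per user and session:

```shell
# Print the cgroup membership of the current shell. For a process in a
# user session, the path includes the user slice, e.g. something like
# /user.slice/user-18194.slice/session-694762.scope (exact names vary).
cat /proc/self/cgroup
```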
systemd creates 2 basic slices by default:
- user.slice (for user sessions and procs started by users)
- system.slice (for system daemons and other units started by systemd itself)
Additionally, each logged-in user is put into a sub-slice of their own:
- user-NNNN.slice (NNNN = numeric user id)
- user-YYYY.slice (YYYY = numeric user id)
Thus, there is a slice tree for resource control.
aborrero@puppetmaster1001:~$ systemctl status
● puppetmaster1001
State: running
Jobs: 0 queued
Failed: 0 units
Since: Tue 2018-08-28 19:38:50 UTC; 5 months 7 days ago
CGroup: /
├─user.slice <----------
│ └─user-18194.slice <----------
│ ├─session-694762.scope
│ │ ├─ 4232 systemctl status
│ │ ├─ 4233 pager
│ │ ├─32447 sshd: aborrero [priv]
│ │ ├─32471 sshd: aborrero@pts/0
│ │ └─32481 -bash
│ └─user@18194.service
│ └─[...]
└─system.slice <----------
├─lvm2-lvmetad.service
│ └─449 /sbin/lvmetad -f
├─confd.service
[...]
You can create a generic or default slice configuration for all users by creating a unit named user-.slice. Limits placed here will be applied to all child slices.
If using this templating mechanism, beware that root will get the same resource constraints, especially if you just use sudo (since the sudo process belongs to your personal user slice).
Things like the puppet agent can consume a lot of resources. You can solve this by creating a root-specific slice configuration (user-0.slice) and probably leaving it unrestricted,
and then logging in directly as root (no user ssh followed by sudo).
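A root-specific override like the one suggested above could be sketched as a drop-in for user-0.slice that only enables accounting without setting any limits (the path and file name below are assumptions following the usual systemd drop-in convention, not taken from this page):

```ini
# /etc/systemd/system/user-0.slice.d/override.conf (hypothetical path)
# Enable accounting on root's slice but set no limits, so the puppet
# agent and other root tasks are not throttled. Run
# "systemctl daemon-reload" after creating the file.
[Slice]
CPUAccounting=yes
MemoryAccounting=yes
TasksAccounting=yes
```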
Checking the configuration
A couple of handy commands to check slices are systemd-cgtop and systemd-cgls:
root@tools-sgebastion-06:~# systemd-cgtop
Control Group Tasks %CPU Memory Input/s Output/s
/ 206 10.0 754.3M - -
/user.slice 22 10.3 177.2M - -
/user.slice/user-18194.slice 9 9.9 145.7M - -
/user.slice/user-0.slice 7 0.4 19.3M - -
/system.slice 79 0.0 105.5M -
[...]
root@tools-sgebastion-06:~# systemd-cgls
Control group /:
-.slice
├─user.slice
│ ├─user-0.slice
│ │ ├─session-2.scope
│ │ │ ├─ 727 sshd: root@pts/0
│ │ │ ├─ 759 -bash
│ │ │ ├─2720 systemd-cgls
│ │ │ └─2721 pager
│ │ ├─session-6.scope
│ │ │ └─init.scope
│ │ │ ├─1156 sshd: root@pts/1
│ │ │ └─1187 -bash
│ │ └─user@0.service
│ │ └─init.scope
│ │ ├─738 /lib/systemd/systemd --user
│ │ └─739 (sd-pam)
│ ├─user-18194.slice
│ │ ├─user@18194.service
│ │ │ └─init.scope
│ │ │ ├─1145 /lib/systemd/systemd --user
│ │ │ └─1146 (sd-pam)
[...]
In the slice config, if you explicitly activate accounting (see the configuration section), you can check live numbers in the slice unit status:
root@tools-sgebastion-06:~# systemctl status user-18194.slice
● user-18194.slice
Loaded: loaded
Drop-In: /etc/systemd/system/user-.slice.d
└─puppet-override.conf
Active: active since Mon 2019-02-04 13:20:39 UTC; 34min ago
Tasks: 9 (limit: 100)
Memory: 145.7M (high: 100.0M max: 150.0M swap max: 0B)
CPU: 3min 4.431s
Configuration
You can create configuration limits for each of these slices by means of these directives:
[Slice]
# each user can use max this % of one CPU
CPUQuota=10%
# each user can run max this number of tasks/threads
TasksMax=100
# slow down procs if they use more than this memory
MemoryHigh=100M
# if more than this memory is used, OOM killer will step in
MemoryMax=150M
# users can't use swap memory
MemorySwapMax=0
# do accounting but don't limit anything by now
IOAccounting=yes
IPAccounting=yes
You can create an override for all user slices by creating a unit configuration for user-.slice, i.e., the wildcard slice for all users.
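As a sketch, such a wildcard override is a drop-in file under /etc/systemd/system/user-.slice.d/ (the same drop-in directory that appears in the slice status output earlier on this page); the file name below is arbitrary:

```ini
# /etc/systemd/system/user-.slice.d/resource-control.conf
# (hypothetical file name) Applies to every user-NNNN.slice.
# Run "systemctl daemon-reload" after creating or editing it.
[Slice]
CPUQuota=10%
TasksMax=100
MemoryHigh=100M
MemoryMax=150M
```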
Puppet examples
At the time of this writing, Toolforge uses this mechanism on user-facing bastion servers.
This is present in the profile::toolforge::bastion::resourcecontrol class:
[..]
# we need systemd >= 239 for resource control using the user-.slice trick
# this version is provided in stretch-backports
apt::pin { 'toolforge-bastion-systemd':
    package  => 'systemd udev',
    pin      => 'version 239*',
    priority => '1001',
}

$packages = [
    'systemd',
    'udev',
]

package { $packages:
    ensure          => present,
    install_options => ['-t', 'stretch-backports'],
}

systemd::unit { 'user-.slice':
    ensure   => present,
    content  => file('profile/toolforge/bastion-user-resource-control.conf'),
    override => true,
}

systemd::unit { 'user-0.slice':
    ensure   => present,
    content  => file('profile/toolforge/bastion-root-resource-control.conf'),
    override => true,
}
[..]