Portal:Toolforge/Admin/Kubernetes/RBAC and PSP
This is a proposed design for a Role-Based Access Control (RBAC) and Pod Security Policy (PSP) system that will replace two of the four custom admission controllers currently in use in our Toolforge Kubernetes cluster, in order to unblock the upgrade cycle.
This design is live in the toolsbeta and tools 2020 Kubernetes clusters.
Kubernetes RBAC Role-bindings
Both PSPs and Roles are assigned through bindings, either at the namespace level (a RoleBinding) or at the cluster level (a ClusterRoleBinding). A binding grants a user, service account, or similar system object a role, which in turn allows one or more verbs on particular API objects. These verbs do not make sense for every API object, and the documentation can be sparse outside of code-generated reference docs. In general, Toolforge user accounts are only permitted to act within their own namespace, so their roles and policies are usually applied via a RoleBinding scoped to that namespace.
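As a concrete illustration, a binding that applies a cluster-level role to a tool user only within that tool's namespace would look roughly like the following sketch; the user, namespace, and binding names here are placeholders, and the tools-user ClusterRole is defined later on this page:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tool-example-binding      # placeholder name
  namespace: tool-example         # the tool's own namespace
subjects:
- kind: User
  name: tool-example              # the tool's identity, from its x509 certificate
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: tools-user                # the shared ClusterRole described below
  apiGroup: rbac.authorization.k8s.io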
Pod Security Policies
Full documentation on PSPs is available here: https://kubernetes.io/docs/concepts/policy/pod-security-policy/
PSPs are a whitelisting system. This means that, at any given time, the object trying to take an action will use the most permissive policy its rolebindings allow. The (cluster)rolebinding verb here is, literally, "use".
PSPs are defined at the cluster scope, but they can be "use"d in a namespaced fashion, which helps us here.
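As a sketch, the rule that whitelists a single PSP within a namespace looks roughly like this (all names here are illustrative); a RoleBinding then attaches the rule to a user or service account:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: psp-use-example        # illustrative name
  namespace: tool-example      # the namespace the PSP may be "use"d in
rules:
- apiGroups:
  - policy
  resources:
  - podsecuritypolicies
  resourceNames:
  - some-psp                   # the specific policy being whitelisted
  verbs:
  - use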

The privileged policy
In the proposed PSP design, service accounts (automations) in the kube-system namespace can basically do anything, so that the cluster can actually function and its controllers can do their work. This "do anything" policy is named "privileged" and is as follows (in YAML):
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    # See https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp
    # See also https://docs.docker.com/engine/security/seccomp/
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
  name: privileged
spec:
  allowedCapabilities:
  - '*'
  allowPrivilegeEscalation: true
  fsGroup:
    rule: 'RunAsAny'
  hostIPC: true
  hostNetwork: true
  hostPID: true
  hostPorts:
  - min: 0
    max: 65535
  privileged: true
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  volumes:
  - '*'
Explanation
This policy should also be applied to other cluster-level controllers, such as the ingress controller and the registry-checking admission controller, since they have to run in privileged mode.
This policy is roughly the same as turning Pod Security Policies off for anything that can use it.
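A minimal sketch of how a policy like this could be granted to everything in kube-system (the exact objects in the cluster may differ):
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: privileged-psp-use         # illustrative name
rules:
- apiGroups:
  - policy
  resources:
  - podsecuritypolicies
  resourceNames:
  - privileged
  verbs:
  - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kube-system-psp            # illustrative name
  namespace: kube-system           # only grants the PSP within kube-system
subjects:
- kind: Group
  name: system:serviceaccounts:kube-system
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: privileged-psp-use
  apiGroup: rbac.authorization.k8s.io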
System default policy
This policy will not be applied to anything initially; it is there to be used by services maintained by Toolforge administrators for the good of the system, not by tools themselves. It prevents a service from running in any privileged context or as root, but it does not require any particular user ID. If we launch jobs or services that don't need to make changes inside Kubernetes itself, this is the policy to apply. The current proposal for it is as follows:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
  name: default
spec:
  allowedCapabilities: []  # default set of capabilities are implicitly allowed
  allowPrivilegeEscalation: false
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  hostIPC: false
  hostNetwork: false
  hostPID: false
  privileged: false
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  volumes:
  - 'configMap'
  - 'downwardAPI'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  # Restrict host paths by default
  allowedHostPaths:
  - pathPrefix: '/var/lib/sss/pipes'
    readOnly: false
  - pathPrefix: '/data/project'
    readOnly: false
  - pathPrefix: '/public/dumps'
    readOnly: false
  - pathPrefix: '/public/scratch'
    readOnly: false
  - pathPrefix: '/etc/wmcs-project'
    readOnly: true
  - pathPrefix: '/etc/ldap.yaml'
    readOnly: true
  - pathPrefix: '/etc/novaobserver.yaml'
    readOnly: true
  - pathPrefix: '/etc/ldap.conf'
    readOnly: true
Explanation
This is similar to what Toolforge users will have, except that it does not require a specific user ID (only that it is not root) and it limits host mounts to the same paths users can already see. It is meant to keep well-behaved services that need no special privileges well-behaved.
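If and when we do apply it, the binding would follow the same "use" pattern as above; for a hypothetical system service it might look like this (all names are placeholders):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: default-psp-use            # placeholder name
  namespace: example-system-svc    # hypothetical namespace for a system service
subjects:
- kind: ServiceAccount
  name: example-svc                # hypothetical service account
  namespace: example-system-svc
roleRef:
  kind: Role                       # a Role granting "use" of the "default" PSP,
  name: default-psp-use            # mirroring the earlier "use" sketch
  apiGroup: rbac.authorization.k8s.io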
Toolforge user policies
Toolforge user accounts, identified by their x509 certificates, each require an automatically generated PSP in order to restrict their actions to the user ID and group ID of their tool account. This is defined inside the maintain_kubeusers.py script using API objects, but translated into YAML it looks like:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default'
  name: tool-{$username}-psp
spec:
  requiredDropCapabilities:
  - ALL
  allowPrivilegeEscalation: false
  fsGroup:
    rule: 'MustRunAs'
    ranges:
    # May only act as the tool group
    - min: $user.id
      max: $user.id
  hostIPC: false
  hostNetwork: false
  hostPID: false
  privileged: false
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'MustRunAs'
    ranges:
    # May only act as the tool user
    - min: $user.id
      max: $user.id
  seLinux:
    rule: 'RunAsAny'
  runAsGroup:
    rule: 'MustRunAs'
    ranges:
    # May only act as the tool group
    - min: $user.id
      max: $user.id
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
    # Forbid adding the root group.
    - min: 1
      max: 65535
  volumes:
  - 'configMap'
  - 'downwardAPI'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'hostPath'
  - 'persistentVolumeClaim'
  # Restrict host paths
  allowedHostPaths:
  - pathPrefix: '/var/lib/sss/pipes'
    readOnly: false
  - pathPrefix: '/data/project'
    readOnly: false
  - pathPrefix: '/data/scratch'
    readOnly: false
  - pathPrefix: '/public/dumps'
    readOnly: true
  - pathPrefix: '/etc/wmcs-project'
    readOnly: true
  - pathPrefix: '/etc/ldap.yaml'
    readOnly: true
  - pathPrefix: '/etc/novaobserver.yaml'
    readOnly: true
  - pathPrefix: '/etc/ldap.conf'
    readOnly: true
Explanation
This is applied with a RoleBinding, which means that the only place a Toolforge user can launch a pod is in their own namespace. They can also only launch a service whose security context includes their user and group ID. They can apply supplemental groups other than the root group, but this is not likely to be used very often. The host paths are the ones currently allowed. Persistent volumes are not currently in the design, but they are included to "future proof" these policies. PSPs are defined at the cluster level, but each Toolforge user will have their own because of the UID requirement, which makes large changes annoying at the least.
Roles
Root on the control plane can use the "cluster-admin" role by default; not much else should be using that. Special roles should be defined for Toolforge services, offering only the minimum required capabilities. Toolforge users can all use the same role defined at the cluster level (a "ClusterRole") with a namespaced role binding.
Toolforge user roles
All Toolforge users share a single ClusterRole, which they can use only within their own namespaces:
# RBAC minimum perms for toolforge users:
# verbs for R/O
#   ["get", "list", "watch"]
# verbs for R/W (there are some specific quirks like deletecollection)
#   ["get", "list", "watch", "create", "update", "patch", "delete"]
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tools-user
rules:
- apiGroups:
  - ""
  resources:
  - bindings
  - events
  - limitranges
  - namespaces
  - namespaces/status
  - persistentvolumeclaims
  - pods/log
  - pods/status
  - replicationcontrollers/status
  - resourcequotas
  - resourcequotas/status
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - endpoints
  - pods
  - pods/attach
  - pods/exec
  - pods/portforward
  - pods/proxy
  - replicationcontrollers
  - replicationcontrollers/scale
  - secrets
  - services
  - services/proxy
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - apps
  resources:
  - controllerrevisions
  - daemonsets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - deployments
  - deployments/rollback
  - deployments/scale
  - replicasets
  - replicasets/scale
  - statefulsets
  - statefulsets/scale
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - extensions
  resources:
  - daemonsets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - deployments
  - deployments/rollback
  - deployments/scale
  - ingresses
  - networkpolicies
  - replicasets
  - replicasets/scale
  - replicationcontrollers/scale
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  - networkpolicies
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - get
  - list
  - watch
Explanation
The easiest way to visualize all that is as a table.
API | Resource | Verbs |
---|---|---|
CoreV1 (apiGroup: "") | bindings | get,list,watch |
CoreV1 (apiGroup: "") | configmaps | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | endpoints | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | events | get,list,watch |
CoreV1 (apiGroup: "") | limitranges | get,list,watch |
CoreV1 (apiGroup: "") | namespaces | get,list,watch |
CoreV1 (apiGroup: "") | namespaces/status | get,list,watch |
CoreV1 (apiGroup: "") | persistentvolumeclaims | get,list,watch |
CoreV1 (apiGroup: "") | pods | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/attach | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/exec | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/log | get,list,watch |
CoreV1 (apiGroup: "") | pods/portforward | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/proxy | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/status | get,list,watch |
CoreV1 (apiGroup: "") | replicationcontrollers | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | replicationcontrollers/scale | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | replicationcontrollers/status | get,list,watch |
CoreV1 (apiGroup: "") | resourcequotas | get,list,watch |
CoreV1 (apiGroup: "") | resourcequotas/status | get,list,watch |
CoreV1 (apiGroup: "") | secrets | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | services | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | services/proxy | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | daemonsets | get,list,watch |
ExtensionsV1beta1 (apiGroup: extensions) | deployments | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | deployments/rollback | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | deployments/scale | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | ingresses | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | networkpolicies | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | replicasets | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | replicasets/scale | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | replicationcontrollers/scale | get,list,watch,create,delete,deletecollection,patch,update |
NetworkingV1 (apiGroup: networking.k8s.io) | ingresses | get,list,watch,create,delete,deletecollection,patch,update |
NetworkingV1 (apiGroup: networking.k8s.io) | networkpolicies | get,list,watch,create,delete,deletecollection,patch,update |
PolicyV1beta1 (apiGroup: policy) | poddisruptionbudgets | get,list,watch |
AppsV1 (apiGroup: apps) | controllerrevisions | get,list,watch |
AppsV1 (apiGroup: apps) | daemonsets | get,list,watch |
AppsV1 (apiGroup: apps) | deployments | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | deployments/rollback | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | deployments/scale | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | replicasets | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | replicasets/scale | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | statefulsets | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | statefulsets/scale | get,list,watch,create,delete,deletecollection,patch,update |
BatchV1Api (apiGroup: batch) | cronjobs | get,list,watch,create,delete,deletecollection,patch,update |
BatchV1Api (apiGroup: batch) | jobs | get,list,watch,create,delete,deletecollection,patch,update |
AutoscalingV1Api (apiGroup: autoscaling) | horizontalpodautoscalers | get,list,watch |
The reason there is so much apparent repetition is that, in various versions of Kubernetes, the same resources appear under multiple API groups as features graduate from alpha/beta/extensions into the core or apps APIs. In later versions (1.16, for instance), many of the resources under extensions are only found under apps.
Most of this is likely not controversial, but there are some things to consider. Users can do nearly all of this in the current Toolforge. What is new are ingresses and networkpolicies. The reason users can launch ingresses is so they can run services that are accessible to the outside, and networkpolicies are, I think, required for ingresses to work properly; that last part about networkpolicies may be worth testing first. Each namespace should have quotas applied, so scaling is not something I fear. "poddisruptionbudgets" are an HA feature that I don't think we should restrict either (see https://kubernetes.io/docs/concepts/workloads/pods/disruptions/). Another consideration is that we may want to restrict deletecollection in some cases: particularly for configmaps, where deleting all configmaps in a namespace would recycle the tool's x509 certs, and for secrets, where a user might inadvertently revoke their own service account credentials (rendering Deployments non-functional).
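If we decide to restrict that, the change would be to move configmaps and secrets out of the broad read/write rule into their own rule that omits the deletecollection verb, roughly like this fragment of the rules list (a sketch, not part of the current proposal):
# Hypothetical rule: read/write on configmaps and secrets without
# deletecollection; they would be dropped from the broader rule above.
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - patch
  - update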
One important note: for this role and the PSP for Toolforge users to work right, they must be bound to both the Toolforge user and the $namespace:default service account, which is what a replicationcontroller runs as (and therefore the thing that launches pods for a Deployment object). This last piece hasn't been included in maintain_kubeusers.py yet, but it will be before launch.
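A sketch of what one such binding could look like for a single tool (here, for the PSP "use" role); the actual objects are generated by maintain_kubeusers.py and the names below are placeholders:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tool-example-psp-binding   # placeholder name
  namespace: tool-example
subjects:
- kind: User                       # the tool's x509 identity
  name: tool-example
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount             # what Deployment-managed pods run as
  name: default
  namespace: tool-example
roleRef:
  kind: Role                       # a role granting "use" of this tool's PSP,
  name: tool-example-psp-use       # as in the earlier "use" sketch (placeholder name)
  apiGroup: rbac.authorization.k8s.io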
Observer role
See also
Some other interesting information related to this topic: