Portal:Toolforge/Admin/Kubernetes/RBAC and Pod security
This page documents the design of the Role-based Access Control (RBAC) and pod security system used by the Toolforge Kubernetes cluster.
Kubernetes RBAC Role-bindings
Roles are assigned at either the namespace level (RoleBinding) or the cluster level (ClusterRoleBinding) through bindings. A binding grants a role, which is a set of verbs allowed on particular API objects, to a user, service account, or similar subject. These verbs do not universally make sense for all API objects, and the documentation can be sparse outside of code-generated docs. In general, Toolforge user accounts are only permitted to act within their own namespace, so they usually have permissions applied via a RoleBinding scoped to that namespace.
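For illustration, a namespaced RoleBinding has roughly this shape; the names here are hypothetical and not taken from the Toolforge manifests:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-binding        # hypothetical name, for illustration only
  namespace: tool-example      # the binding only has effect inside this namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole            # a RoleBinding may reference a Role or a ClusterRole
  name: tools-user
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User                   # the subject gaining the role's verbs, scoped to tool-example
  name: example-tool

Even though the referenced role is a ClusterRole, binding it through a RoleBinding limits its effect to the binding's namespace.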
Pod Security
As of today, we implement pod security controls using Kyverno policies.
The source of truth for the policy definition is in the maintain-kubeusers repository (an illustrative sketch of a Kyverno policy follows the list below):
- https://gitlab.wikimedia.org/repos/cloud/toolforge/maintain-kubeusers/
- maintain_kubeusers resources/kyverno_pod_policy.py
- maintain_kubeusers resources/kyverno_pod_policy.yaml.tpl
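For orientation only, here is a minimal sketch of what a Kyverno ClusterPolicy of this general shape looks like. It is not the actual Toolforge policy (that is generated from the templates linked above); the policy name and rule are hypothetical:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: example-disallow-privileged   # hypothetical name, not the real Toolforge policy
spec:
  validationFailureAction: Enforce    # reject non-conforming pods instead of only auditing
  rules:
  - name: disallow-privileged-containers
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Privileged containers are not allowed."
      pattern:
        spec:
          containers:
          - =(securityContext):
              =(privileged): "false"  # if securityContext.privileged is set, it must be false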
Roles
Root on the control plane can use the "cluster-admin" role by default; not much else should use it. Special roles should be defined for Toolforge services, offering only the minimum required capabilities. Toolforge users can all use the same role defined at the cluster level (a ClusterRole) bound with a namespaced RoleBinding.
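As a sketch of what such a minimal service role could look like (the service, namespace, and permissions here are hypothetical, not an existing Toolforge role):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: example-service-reader      # hypothetical: a service that only needs to read pods
  namespace: example-service
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]   # read-only; nothing beyond what the service needs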
Toolforge user roles
All Toolforge users share one ClusterRole, which they can only use within their own namespaces.
YAML
ClusterRole YAML:
# RBAC minimum perms for toolforge users:
# verbs for R/O
#   ["get", "list", "watch"]
# verbs for R/W (there are some specific quirks like deletecollection)
#   ["get", "list", "watch", "create", "update", "patch", "delete"]
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: tools-user
rules:
- apiGroups:
  - ""
  resources:
  - bindings
  - events
  - limitranges
  - namespaces
  - namespaces/status
  - persistentvolumeclaims
  - pods/log
  - pods/status
  - replicationcontrollers/status
  - resourcequotas
  - resourcequotas/status
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  - endpoints
  - pods
  - pods/attach
  - pods/exec
  - pods/portforward
  - pods/proxy
  - replicationcontrollers
  - replicationcontrollers/scale
  - secrets
  - services
  - services/proxy
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - apps
  resources:
  - controllerrevisions
  - daemonsets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - deployments
  - deployments/rollback
  - deployments/scale
  - replicasets
  - replicasets/scale
  - statefulsets
  - statefulsets/scale
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - extensions
  resources:
  - daemonsets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - deployments
  - deployments/rollback
  - deployments/scale
  - ingresses
  - networkpolicies
  - replicasets
  - replicasets/scale
  - replicationcontrollers/scale
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  - networkpolicies
  verbs:
  - get
  - list
  - watch
  - create
  - delete
  - deletecollection
  - patch
  - update
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - get
  - list
  - watch
Explanation
The easiest way to visualize all that is as a table.
API | Resource | Verbs |
---|---|---|
CoreV1 (apiGroup: "") | bindings | get,list,watch |
CoreV1 (apiGroup: "") | configmaps | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | endpoints | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | events | get,list,watch |
CoreV1 (apiGroup: "") | limitranges | get,list,watch |
CoreV1 (apiGroup: "") | namespaces | get,list,watch |
CoreV1 (apiGroup: "") | namespaces/status | get,list,watch |
CoreV1 (apiGroup: "") | persistentvolumeclaims | get,list,watch |
CoreV1 (apiGroup: "") | pods | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/attach | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/exec | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/log | get,list,watch |
CoreV1 (apiGroup: "") | pods/portforward | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/proxy | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | pods/status | get,list,watch |
CoreV1 (apiGroup: "") | replicationcontrollers | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | replicationcontrollers/scale | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | replicationcontrollers/status | get,list,watch |
CoreV1 (apiGroup: "") | resourcequotas | get,list,watch |
CoreV1 (apiGroup: "") | resourcequotas/status | get,list,watch |
CoreV1 (apiGroup: "") | secrets | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | services | get,list,watch,create,delete,deletecollection,patch,update |
CoreV1 (apiGroup: "") | services/proxy | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | daemonsets | get,list,watch |
ExtensionsV1beta1 (apiGroup: extensions) | deployments | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | deployments/rollback | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | deployments/scale | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | ingresses | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | networkpolicies | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | replicasets | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | replicasets/scale | get,list,watch,create,delete,deletecollection,patch,update |
ExtensionsV1beta1 (apiGroup: extensions) | replicationcontrollers/scale | get,list,watch,create,delete,deletecollection,patch,update |
NetworkingV1 (apiGroup: networking.k8s.io) | ingresses | get,list,watch,create,delete,deletecollection,patch,update |
NetworkingV1 (apiGroup: networking.k8s.io) | networkpolicies | get,list,watch,create,delete,deletecollection,patch,update |
PolicyV1beta1 (apiGroup: policy) | poddisruptionbudgets | get,list,watch |
AppsV1 (apiGroup: apps) | controllerrevisions | get,list,watch |
AppsV1 (apiGroup: apps) | daemonsets | get,list,watch |
AppsV1 (apiGroup: apps) | deployments | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | deployments/rollback | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | deployments/scale | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | replicasets | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | replicasets/scale | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | statefulsets | get,list,watch,create,delete,deletecollection,patch,update |
AppsV1 (apiGroup: apps) | statefulsets/scale | get,list,watch,create,delete,deletecollection,patch,update |
BatchV1Api (apiGroup: batch) | cronjobs | get,list,watch,create,delete,deletecollection,patch,update |
BatchV1Api (apiGroup: batch) | jobs | get,list,watch,create,delete,deletecollection,patch,update |
AutoscalingV1Api (apiGroup: autoscaling) | horizontalpodautoscalers | get,list,watch |
The reason there is so much apparent repetition is that the same resources appear under multiple API groups across Kubernetes versions, as features graduate from alpha/beta/extensions into the core or apps APIs. In later versions (1.16, for instance), many of the resources under extensions are only found under apps.
Most of this is likely not controversial, but there are some things to consider. Users can do nearly all of this in the current Toolforge. What is new is ingresses and networkpolicies: users can create ingresses so that they can launch services reachable from outside the cluster, and networkpolicies are, I think, required for ingresses to work properly (that last part may be worth testing first). Each namespace should have quotas applied, so scaling is not something I fear. Likewise, poddisruptionbudgets are an HA feature that I don't think we should restrict, per se (see https://kubernetes.io/docs/concepts/workloads/pods/disruptions/). Another consideration is that we may want to restrict deletecollection in some cases, particularly for configmaps, where deleting all configmaps in a namespace will recycle the tool's x509 certs, and for secrets, where users might inadvertently revoke their own service account credentials (rendering Deployments non-functional).
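If deletecollection were restricted in that way, the configmaps and secrets entries could be split out of the read/write rule above into their own rule without that verb. A sketch of such a variant (not currently applied):

- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  verbs:              # same read/write verbs as above, minus deletecollection
  - get
  - list
  - watch
  - create
  - delete
  - patch
  - update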
One important note: for this and the PSP for Toolforge users to work correctly, the role must be bound to both the Toolforge user and the $namespace:default service account, which is what a replicationcontroller runs as (and is therefore the thing that launches pods for a Deployment object). This last piece hasn't been included in maintain_users.py yet, but it will be before launch.
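A sketch of a namespaced RoleBinding covering both subjects mentioned above; the tool and namespace names are hypothetical:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-tools-user-binding   # hypothetical name
  namespace: tool-example
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: tools-user
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: example-tool                 # the Toolforge tool account
- kind: ServiceAccount
  name: default                      # the $namespace:default service account that pods run as
  namespace: tool-example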
Observer role
See also
Some other interesting information related to this topic: