Portal:Toolforge/Admin/Kubernetes/RBAC and Pod security/PSP migration
This page contains information on the PSP migration we conducted in 2024. See also: phab:T279110
PSP vs PSA feature comparison
A comparison of the old PSP tool account profile against what we can do with the PSA restricted profile; see https://v1-24.docs.kubernetes.io/docs/concepts/security/pod-security-standards/#restricted
PSP tool account restricted | PSA profile | Comment |
---|---|---|
seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'runtime/default' | included in restricted | |
seccomp.security.alpha.kubernetes.io/defaultProfileName: 'runtime/default' | included in restricted | |
requiredDropCapabilities: [ALL] | included in restricted | |
allowPrivilegeEscalation: false | included in restricted | |
fsGroup.rule: 'MustRunAs' user.id | not included in any profile, we need an alternative | |
hostIPC: false | included in baseline | |
hostNetwork: false | included in baseline | |
hostPID: false | included in baseline | |
privileged: false | not included in any profile? | maybe it was replaced by spec.containers[*].securityContext.privileged |
readOnlyRootFilesystem: false | not included in any profile, we need an alternative | |
runAsUser.rule: 'MustRunAs' user.id | not included in any profile, we need an alternative | |
seLinux.rule: 'RunAsAny' | not included in any profile, we need an alternative | |
runAsGroup.rule: 'MustRunAs' user.id | not included in any profile, we need an alternative | |
supplementalGroups.rule: 'MustRunAs' (disallow root group) | not included in any profile, we need an alternative | |
volumes: [list of allowed volume types] | no profile allows hostPath volume mounts | WARNING: this is a major blocker. We need hostPath mounts for NFS, or we may rework how we do NFS entirely. Is it worth it? |
allowedHostPaths | not included in any profile, we need an alternative | |
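For reference, PSA profiles are applied per namespace via labels rather than via a cluster-scoped object. A minimal sketch (the namespace name is a hypothetical example):

```yaml
# Sketch: enabling the PSA "restricted" profile on a tool namespace.
# The namespace name is hypothetical.
apiVersion: v1
kind: Namespace
metadata:
  name: tool-tf-test
  labels:
    # reject non-compliant pods at admission time
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.24
    # additionally record violations in the audit log and warn clients
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

Pinning `enforce-version` keeps the profile semantics stable across cluster upgrades.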
Custom admission controllers
We have several custom admission controllers:
- volume-admission -- mounts volumes into pods.
- registry-admission -- verifies and restricts container registry URLs.
- ingress-admission -- verifies and restricts Ingress resources.
- envvars-admission -- mutates pod manifests to add secrets.
Plans
- experiment with a policy agent to see if it is capable of fully replacing the PSP functionality.
- make sure the policy agent we choose can also absorb the functionality of our several custom admission controllers.
Questions and answers
- Q: Why use PSA at all if it can't cover everything we need?
- A: Using a standard mechanism gives us a baseline of expected behavior aligned with upstream practices, which may be desirable. We can additionally layer our own policies on top of the upstream standard by means of OPA Gatekeeper or Kyverno.
- Q: Why migrate from the custom admission controllers to a policy agent?
- A: Some of our custom admission controllers are non-trivial pieces of code that were crafted to enforce just two or three fields of a JSON definition. Given that we are going to introduce a policy agent anyway, the migration should be trivial, with the additional gain of no longer having to maintain the custom admission controller codebases.
Kyverno POC
Kyverno can work in different ways: checking resources and auditing them, enforcing compliance, or even mutating resources to achieve compliance. Mutating resources seems like a very cool thing to do, but the lowest-friction migration from the current PSP setup is probably an audit/enforce setup.
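For illustration, a mutation rule (not used in this POC) would look roughly like the sketch below; the policy name and the patched field are hypothetical:

```yaml
# Hypothetical sketch of Kyverno's mutation mode: patch a default seccomp
# profile into incoming pods instead of rejecting non-compliant ones.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-seccomp
spec:
  rules:
    - name: add-default-seccomp
      match:
        any:
          - resources:
              kinds:
                - "Pod"
      mutate:
        patchStrategicMerge:
          spec:
            securityContext:
              seccompProfile:
                type: "RuntimeDefault"
```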
Installing Kyverno:
user@lima-lima-kilo:~$ helm repo add kyverno https://kyverno.github.io/kyverno/
user@lima-lima-kilo:~$ helm repo update
user@lima-lima-kilo:~$ helm search repo kyverno -l
NAME             CHART VERSION  APP VERSION  DESCRIPTION
kyverno/kyverno  3.1.4          v1.11.4      Kubernetes Native Policy Management
kyverno/kyverno  3.1.3          v1.11.3      Kubernetes Native Policy Management
kyverno/kyverno  3.1.2          v1.11.2      Kubernetes Native Policy Management
kyverno/kyverno  3.1.1          v1.11.1      Kubernetes Native Policy Management
kyverno/kyverno  3.1.0          v1.11.0      Kubernetes Native Policy Management
kyverno/kyverno  3.0.9          v1.10.7      Kubernetes Native Policy Management
[..]
user@lima-lima-kilo:~$ helm install kyverno kyverno/kyverno -n kyverno --create-namespace --version 3.0.9
Kyverno version 1.10.x supports Kubernetes 1.24 (minimum) through 1.26 (maximum); see https://kyverno.io/docs/installation/#compatibility-matrix
Kyverno has two main config resources:
- Policy: namespace scope
- ClusterPolicy: cluster scope
Example ClusterPolicy to validate all pod definitions against a common set of desirable security configs:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: "toolforge-tool-account-cluster-policy"
  annotations:
    policies.kyverno.io/title: "pod security"
    policies.kyverno.io/category: "toolforge tool account"
    kyverno.io/kyverno-version: "1.10.7"
    kyverno.io/kubernetes-version: "1.24"
    policies.kyverno.io/subject: "Pod"
    policies.kyverno.io/description: "potential tool account pod security check, clusterwide"
spec:
  validationFailureAction: "Audit"
  background: false
  rules:
    - name: "pod-level validations"
      match:
        all:
          - resources:
              kinds:
                - "Pod"
              namespaces:
                - "tool-*"
      validate:
        message: "pod-level configuration must be correct"
        pattern:
          spec:
            securityContext:
              allowPrivilegeEscalation: false
              readOnlyRootFilesystem: false
              runAsNonRoot: true
              privileged: false
              hostNetwork: false
              hostIPC: false
              hostPID: false
              capabilities:
                drop:
                  - ALL
              seccompProfile:
                type: "runtime/default"
Example Policy to validate per-tool-account security configs (this one could be generated by maintain-kubeusers):
apiVersion: kyverno.io/v1
kind: Policy
metadata:
  name: "tf-test-policy"
  namespace: "tool-tf-test"
  annotations:
    policies.kyverno.io/title: "pod security"
    policies.kyverno.io/category: "toolforge tool account"
    kyverno.io/kyverno-version: "1.10.7"
    kyverno.io/kubernetes-version: "1.24"
    policies.kyverno.io/subject: "Pod"
    policies.kyverno.io/description: "potential tool account pod security check"
spec:
  validationFailureAction: "Audit"
  background: false
  rules:
    - name: "pod-level validations"
      match:
        any:
          - resources:
              kinds:
                - "Pod"
      validate:
        message: "pod-level configuration must be correct"
        pattern:
          spec:
            workingDir: "/data/project/tf-test"
            securityContext:
              runAsUser: 1001
              runAsGroup: 1001
              fsGroup: 1001
              supplementalGroups: 1001
Loading them:
user@lima-lima-kilo:~$ kubectl apply -f clusterpolicy.yaml
clusterpolicy.kyverno.io/toolforge-tool-account-cluster-policy configured
user@lima-lima-kilo:~$ kubectl apply -f policy.yaml
policy.kyverno.io/tf-test-policy unchanged
Exploring the policy effects on the pods. Note that in this POC the policies are in audit mode on purpose:
user@lima-lima-kilo:~$ kubectl get policyreport -n tool-tf-test
NAME PASS FAIL WARN ERROR SKIP AGE
cpol-toolforge-tool-account-cluster-policy 0 2 0 0 0 2d18h
pol-tf-test-policy 0 2 0 0 0 2d20h
user@lima-lima-kilo:~$ kubectl get policyreport -n tool-tf-test cpol-toolforge-tool-account-cluster-policy -o yaml
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  creationTimestamp: "2024-04-05T15:14:07Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: kyverno
    cpol.kyverno.io/toolforge-tool-account-cluster-policy: "95124"
  name: cpol-toolforge-tool-account-cluster-policy
  namespace: tool-tf-test
  resourceVersion: "95262"
  uid: 2c548864-4327-4dc6-a097-cdb7ccf0aaf6
results:
- category: toolforge tool account
  message: 'validation error: pod-level configuration must be correct. rule autogen-pod-level
    validations failed at path /spec/template/spec/securityContext/allowPrivilegeEscalation/'
  policy: toolforge-tool-account-cluster-policy
  resources:
  - apiVersion: apps/v1
    kind: Deployment
    name: test
    namespace: tool-tf-test
    uid: d8075147-0721-4c80-bbff-e1e95431ecef
  result: fail
  rule: autogen-pod-level validations
  scored: true
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1712330036
- category: toolforge tool account
  message: 'validation error: pod-level configuration must be correct. rule pod-level
    validations failed at path /spec/securityContext/allowPrivilegeEscalation/'
  policy: toolforge-tool-account-cluster-policy
  resources:
  - apiVersion: v1
    kind: Pod
    name: test-6d779f4c7b-pccrr
    namespace: tool-tf-test
    uid: 51a38062-8c27-4266-bd76-5407cb13ebe7
  result: fail
  rule: pod-level validations
  scored: true
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1712330036
summary:
  error: 0
  fail: 2
  pass: 0
  skip: 0
  warn: 0
user@lima-lima-kilo:~$ kubectl describe clusterpolicy toolforge-tool-account-cluster-policy
Name: toolforge-tool-account-cluster-policy
[..]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning PolicyViolation 30s kyverno-admission Pod tool-tf-test/test-6d779f4c7b-hlwsm: [pod-level validations] fail; validation error: pod-level configuration must be correct. rule pod-level validations failed at path /spec/securityContext/allowPrivilegeEscalation/
Warning PolicyViolation 21s kyverno-admission Deployment tool-tf-test/test: [autogen-pod-level validations] fail; validation error: pod-level configuration must be correct. rule autogen-pod-level validations failed at path /spec/template/spec/securityContext/allowPrivilegeEscalation/
Warning PolicyViolation 21s kyverno-admission Pod tool-tf-test/test-6d779f4c7b-bphqj: [pod-level validations] fail; validation error: pod-level configuration must be correct. rule pod-level validations failed at path /spec/securityContext/allowPrivilegeEscalation/
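If we later decide the policies are trustworthy, switching from auditing to blocking is a one-field change in the policy spec:

```yaml
spec:
  # "Audit" only records violations in PolicyReports;
  # "Enforce" rejects non-compliant resources at admission time
  validationFailureAction: "Enforce"
```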
OpenPolicyAgent gatekeeper POC
Installing from helm:
user@lima-lima-kilo:~$ helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
user@lima-lima-kilo:~$ helm install gatekeeper/gatekeeper --name-template=gatekeeper --namespace gatekeeper-system --create-namespace
NOTE: we may want to cache the container image, or even build our own. Both of these options are supported upstream, in the sense that they provide docs on how to do it properly. See https://open-policy-agent.github.io/gatekeeper/website/docs/install
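Unlike Kyverno, Gatekeeper policies are written in Rego and split into a ConstraintTemplate (the check logic) and a Constraint (binding the check to resources). A minimal hypothetical sketch, roughly equivalent to the privileged-container check in the Kyverno POC above (names are our own, not upstream-provided):

```yaml
# Hypothetical sketch: disallow privileged containers via Gatekeeper.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sdisallowprivileged
spec:
  crd:
    spec:
      names:
        kind: K8sDisallowPrivileged
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdisallowprivileged

        violation[{"msg": msg}] {
          c := input.review.object.spec.containers[_]
          c.securityContext.privileged
          msg := sprintf("privileged container %v is not allowed", [c.name])
        }
---
# The Constraint instantiates the template and selects which resources it covers.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sDisallowPrivileged
metadata:
  name: disallow-privileged-tool-pods
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
```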