Kubernetes/Clusters/PSP replacement
So your Kubernetes cluster needs to be migrated from Pod Security Policies (PSP) to Pod Security Standards (PSS)?
In theory, all Helm charts should already have been updated as part of https://phabricator.wikimedia.org/T362978, but it's not guaranteed that nothing was missed or that all chart updates have actually been deployed to your cluster. So please verify carefully!
Verify the current cluster state
All namespaces that use the default restricted PSP already have the restricted PSS enabled in audit mode. No user-visible warnings are created when deploying manifests that don't adhere to the PSS, but this allows cluster maintainers to verify existing workloads against the new rules.
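For reference, audit mode is enabled via a namespace label. A namespace labelled for audit-only evaluation of the restricted profile looks roughly like this (the namespace name is a placeholder):
apiVersion: v1
kind: Namespace
metadata:
  name: example-service # placeholder namespace name
  labels:
    # evaluate the restricted Pod Security Standard, but only record violations instead of blocking
    pod-security.kubernetes.io/audit: restricted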
- Check the Audit Logs Dashboard in logstash for audit violations in your cluster. The filter is pre-selected on loading the dashboard.
- Double-check that the current workloads validate against the new PSS:
kubectl get ns -l pod-security.kubernetes.io/audit=restricted -o name | while read ns; do
  kubectl label --dry-run=server --overwrite "$ns" pod-security.kubernetes.io/enforce=restricted;
done
- Ensure there is no other PSP applied in your cluster besides restricted (and privileged):
kubectl get pods -A -o=jsonpath='{range .items[?(@.metadata.annotations.kubernetes\.io/psp!="privileged")]}{@.metadata.namespace}{" "}{@.metadata.annotations.kubernetes\.io/psp}{"\n"}{end}' | sort -u | column -t -s' ' | grep -v 'restricted$'
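To get a quick overview of which Pod Security Admission labels are currently set on your namespaces, a label-column listing can help (this is an additional convenience check, not part of the steps above):
kubectl get ns -L pod-security.kubernetes.io/enforce,pod-security.kubernetes.io/audit,pod-security.kubernetes.io/warn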
Disable mutating parts of the restricted PSP
As a first step, you may disable the parts of the restricted PSP that can alter (mutate) a manifest upon deployment to make it adhere to the PSP. As these mutations are no longer supported with PSS, all workloads need to be compliant at deployment time, without requiring a mutation.
To do so, change helmfile.d/admin_ng/values/*/values.yaml:
PodSecurityStandard:
  disablePSPMutations: true # Disable PSP mutation, allow all seccomp profiles
  # enforce: true # Enforce the PodSecurityStandard profile "restricted"
  # disableRestrictedPSP: true # Disable PSP binding for the restricted PSP
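The change is then rolled out through the admin_ng helmfile as usual. A sketch of the rollout, assuming the standard deployment host layout and a placeholder environment name:
cd /srv/deployment-charts/helmfile.d/admin_ng  # assumed path on the deployment host
helmfile -e staging -i diff                    # review the change for the "staging" environment (placeholder)
helmfile -e staging -i apply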
Ideally this change should sink in for a bit, as manifests will need to be re-deployed in order to verify that they adhere to the PSP without being mutated by it. After that, repeat the verification steps from above.
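With mutations disabled, workloads have to declare the restricted-profile settings themselves. A minimal sketch of a Pod spec fragment that satisfies the restricted PSS without relying on mutation (names and image are placeholders):
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault    # restricted profile requires RuntimeDefault or Localhost
  containers:
    - name: example           # placeholder container name
      image: example:latest   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL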
Enforce the restricted PSS
To enforce the restricted PSS for all namespaces, change helmfile.d/admin_ng/values/*/values.yaml:
PodSecurityStandard:
  disablePSPMutations: true # Disable PSP mutation, allow all seccomp profiles
  enforce: true # Enforce the PodSecurityStandard profile "restricted"
  disableRestrictedPSP: true # Disable PSP binding for the restricted PSP
The restricted PSP will be completely bypassed by binding the namespaces to the privileged PSP. This does not grant further permissions though, as the PSS is now enforced. Enforcing the PSS will block deployments and scheduling of new Pods (even for existing Deployment manifests) that don't adhere to the restricted PSS, with an error message shown to the deployer.
After applying the change, you should ideally delete one Pod of each of your Deployments to verify that the replacement is still scheduled properly. Make use of the Audit Logs Dashboard!
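For example, such a check could look like this (namespace and pod names are placeholders):
kubectl -n example-service delete pod example-service-main-6d4f9c7b8-abcde
kubectl -n example-service get pods -w  # the replacement Pod should reach Running without PSS violations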
Disable PSPs in your cluster
As a last step, you should completely disable the PSP feature within the Kubernetes API server. To do so, add the following block to the cluster in hieradata/common/kubernetes.yaml:
admission_plugins:
  enable:
    - DenyServiceExternalIPs
    - NodeRestriction
  disable:
    - PodSecurityPolicy
    - StorageObjectInUseProtection
    - PersistentVolumeClaimResize
This moves PodSecurityPolicy from enabled to disabled, compared to the default we apply to each cluster.
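Once the change has rolled out and the kube-apiserver has been restarted, you can double-check the admission plugin flags on a control-plane host. This is an extra sanity check, assuming the API server runs as a local process there:
# on a control-plane host
ps -ef | grep kube-apiserver | tr ' ' '\n' | grep admission-plugins
# PodSecurityPolicy should now appear in --disable-admission-plugins and not in --enable-admission-plugins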