
User:JMeybohm (WMF)/PSP Replacement

From Wikitech

Problem

Pod Security Policies (PSPs) will be removed with Kubernetes 1.25, so we are required to switch to a different solution to enforce the policies we define for our clusters. With Pod Security Admission (PSA) there is a built-in replacement in Kubernetes, which is unfortunately limited to the three policies (Privileged, Baseline and Restricted) defined by the Pod Security Standards (PSS). These policies cannot be altered and it is not possible to create custom ones.

Certain workloads (read: MediaWiki) require features/functionality that is prohibited by all but the Privileged PSS profile, making a different solution a requirement for the wikikube cluster:

  • Mounting a hostPath volume for the GeoIP database (we can probably work around this)
  • Allowing the SYS_PTRACE capability so that php-fpm can produce slowlogs (we cannot work around this)

Options

A non-comprehensive analysis of the workloads in the wikikube cluster suggests that almost all of the services can run with the Restricted PSS profile. As mentioned above, MediaWiki cannot. So the basic idea is to run all services but MediaWiki with the Restricted PSS profile, making that a straightforward transition following upstream documentation (Migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller).
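For the services that do fit the Restricted profile, enforcement then boils down to labelling their namespaces as described in the upstream migration guide. A minimal sketch (the namespace name is illustrative):

```yaml
# Enforce the Restricted PSS profile via Pod Security Admission.
# "enforce" rejects violating Pods; "warn" and "audit" modes could be
# used during migration to surface violations without blocking anything.
apiVersion: v1
kind: Namespace
metadata:
  name: echostore          # example service namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
```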

For MediaWiki (and potentially other workloads that do not fit into the standard PSS profiles) we need to adopt a different solution:

  • Do nothing (i.e. run with no policy enforcement at all)
  • Deploy one of the PSP replacement > External Policy Management Solutions
  • Use PSP replacement > Validating Admission Policy, implementing our own policies

External Policy Management Solutions

Generic cons that apply to all of them:

  • All three are external to Kubernetes, i.e. they require admission webhooks to function
  • All three come with one or more components that need to be installed in each cluster
    • Yet another version compatibility matrix
  • All three come with CRDs which need to be installed

Gatekeeper

  • Pro:
    • Has been around the longest (Open Policy Agent has, at least); OPA has been a CNCF graduated project since 2021
    • Microsoft, Google and Red Hat are building (on) this
  • Con:
    • Policies need to be written in Rego
    • Broader scope (not just k8s)
    • Audits are stored in a custom Gatekeeper format

Kubewarden

  • Pro:
  • Con:
    • Pretty new (CNCF sandbox as of 2022)
    • Documentation seems pretty sparse compared to the others
    • Mostly SUSE/Rancher contributing

Kyverno

  • Pro:
    • Policies are plain YAML or CEL (which is used for official ValidatingAdmissionPolicy as well)
    • Narrow scope (just k8s)
    • CNCF sandbox in 2020, incubator in 2022
    • Audits are stored in kubernetes-sigs/wg-policy-prototypes/policy-report CRDs
    • Most active development
  • Con:
    • Mostly written by Nirmata, who also offer an enterprise version :/ (the enterprise offering extends support for older k8s versions and adds validating policies and some monitoring UI - maybe metrics? ⚠️)
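To illustrate the policy format: a Kyverno validating rule using CEL looks roughly like the following (a sketch modelled on the policies in the pod-security-cel repository; the policy name and expression are illustrative):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-host-path
spec:
  validationFailureAction: Enforce
  rules:
    - name: host-path
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        cel:
          expressions:
            # Reject Pods defining any hostPath volume.
            - expression: >-
                !has(object.spec.volumes) ||
                object.spec.volumes.all(v, !has(v.hostPath))
              message: "hostPath volumes are forbidden."
```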

Validating Admission Policy

Since 1.26 (in alpha stage) Kubernetes offers an in-process alternative to validating admission webhooks (which is what all of the external policy management solutions basically are). Policies are expressed using the Common Expression Language (CEL) and are evaluated within the kube-apiserver itself (i.e. no external dependency).

  • Pro:
    • Upstream Kubernetes Feature (no addition to the kubernetes/components version matrix)
    • No dependency on a webhook for scheduling workloads
    • No maintenance of external components/software
  • Con:
    • Need to maintain/test all policies ourselves
    • Hard to create "fail-fast" policies that are immediately visible to the deployer (e.g. failing when a Deployment is applied that would create a prohibited Pod)
    • No built-in report generation methods
    • No mutating of objects
    • Requires k8s >=1.26 (i.e. we can't migrate to them before the next Kubernetes upgrade)
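For reference, the same hostPath check expressed as a plain ValidatingAdmissionPolicy (using the v1alpha1 API available in 1.26; a sketch, not a finished policy):

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: disallow-host-path
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    # Same CEL expression as a webhook-based solution would use,
    # but evaluated inside the kube-apiserver.
    - expression: >-
        !has(object.spec.volumes) ||
        object.spec.volumes.all(v, !has(v.hostPath))
      message: "hostPath volumes are forbidden."
```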

Make use of existing Kyverno policies

As Kyverno policies can be written in plain CEL, they are (more or less) compatible with plain Validating Admission Policies (VAPs). The Kyverno policy repository also contains a set of policies that practically replicate the Baseline and Restricted PSS profiles (policies/pod-security-cel).

That means Kyverno CEL policies can be parsed and used to create VAPs. An existing proof of concept works as follows:

  • Parse all pod-security-cel Kyverno policies
  • Create a ValidatingAdmissionPolicy and a corresponding ValidatingAdmissionPolicyBinding for each of their rules

(a Kyverno policy can have multiple rules, a VAP cannot)

  • Each ValidatingAdmissionPolicy is bound to namespaces with the label pod-security.wmf.org/profile: baseline or pod-security.wmf.org/profile: restricted, depending on the PSS profile it is part of
  • Namespaces can opt out of specific rules by setting the label pod-security.wmf.org/<rule-name>: exclude
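The binding generated for a rule could then select namespaces by profile label and honour the opt-out label, along these lines (the label keys are the ones defined above; the rest is a sketch using the names from the transcripts further down):

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: restrict-volume-types-restricted-volumes
spec:
  policyName: restrict-volume-types-restricted-volumes
  matchResources:
    namespaceSelector:
      matchExpressions:
        # Apply to namespaces labelled with the restricted profile...
        - key: pod-security.wmf.org/profile
          operator: In
          values: ["restricted"]
        # ...unless they explicitly exclude this rule.
        - key: pod-security.wmf.org/restrict-volume-types
          operator: NotIn
          values: ["exclude"]
```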


With that we can create namespaces that adhere to the Restricted PSS profile, excluding certain rules. Additional rules can then be added to particular namespaces (like the mw- ones) to allow just what's required there:

  • Add back SYS_PTRACE capability
  • Allow a certain hostPath to be mounted
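A namespace-scoped additional rule for the mw- namespaces could then allow exactly the required exceptions, e.g. a capabilities check that permits SYS_PTRACE on top of what Restricted allows, but nothing else (the CEL expression is a hedged sketch, not a vetted policy):

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: restrict-capabilities-mediawiki
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    # Containers may only add NET_BIND_SERVICE (allowed by Restricted)
    # plus SYS_PTRACE, which php-fpm needs to produce slowlogs.
    - expression: >-
        object.spec.containers.all(c,
          !has(c.securityContext) ||
          !has(c.securityContext.capabilities) ||
          !has(c.securityContext.capabilities.add) ||
          c.securityContext.capabilities.add.all(cap,
            cap in ['NET_BIND_SERVICE', 'SYS_PTRACE']))
      message: "Only NET_BIND_SERVICE and SYS_PTRACE may be added."
```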


Obviously this has the downside of requiring us to maintain the set of policies that represent the PSS profiles ourselves, but we can rely on the Kyverno community here, just as we would if we were using Kyverno itself.
In addition to that we should have our own set of tests to verify that the generated VAPs actually do what we expect them to. This can be achieved by adopting the E2E test suite Chainsaw (which was built to test Kyverno policies in the first place). Tests could perhaps also be created semi-automatically from the ones available in the Kyverno Policies repository, but that will definitely require some additional work.
During testing (and while writing the Kyverno policy require-run-as-nonroot) I built tests for baseline/disallow-capabilities and restricted/require-run-as-nonroot to validate this method of testing (at least locally; not sure about CI).
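Such a test could look roughly like this (schema per Chainsaw's v1alpha1 API as I understand it; the file names are illustrative):

```yaml
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  name: disallow-host-path
spec:
  steps:
    - try:
        # Install the generated policy and its binding first.
        - apply:
            file: policy.yaml
        # A compliant Pod must be admitted...
        - apply:
            file: good-pod.yaml
        # ...while a Pod mounting a hostPath must be rejected.
        - apply:
            file: bad-pod.yaml
            expect:
              - check:
                  ($error != null): true
```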

Violation error handling

What happens when a Deployment is created that would create a Pod that violates a policy (in this case by mounting a hostPath volume)?

PSP

  • Deployment is created
  • Raises no warning to the user
  • Creates k8s events
$ kubectl -n restricted-psp get events  
LAST SEEN   TYPE      REASON              OBJECT                                  MESSAGE  
7s          Warning   FailedCreate        replicaset/echo-restricted-74bc5c7d67   Error creating: pods "echo-restricted-74bc5c7d67-" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]

PSA/PSS

  • Deployment is created
  • Raises a warning to the user
  • Creates k8s events
$ kubectl apply -f resticted.yaml    
namespace/restricted created  
Warning: would violate PodSecurity "restricted:latest": restricted volume types (volume "geoip" uses restricted volume type "hostPath")  
deployment.apps/echo-restricted created

$ kubectl -n restricted get events  
LAST SEEN   TYPE      REASON              OBJECT                                 MESSAGE  
28s         Warning   FailedCreate        replicaset/echo-restricted-fb9b96767   Error creating: pods "echo-restricted-fb9b96767-jzz2l" is forbidden: violates PodSecurity "restricted:latest": restricted volume types (volume "geoip" uses restricted volume type "hostPath")

VAP

  • Deployment is created
  • Raises no warning to the user
  • Creates k8s events
$ kubectl -n restricted-vap get events  
LAST SEEN   TYPE      REASON              OBJECT                                 MESSAGE  
16s         Warning   FailedCreate        replicaset/echo-restricted-fb9b96767   Error creating: pods "echo-restricted-fb9b96767-xttkl" is forbidden: ValidatingAdmissionPolicy 'restrict-volume-types-restricted-volumes' with binding 'restrict-volume-types-restricted-volumes' denied request: Only the following types of volumes may be used: configMap, csi, downwardAPI, emptyDir, ephemeral, persistentVolumeClaim, projected, and secret.

Kyverno

  • Deployment is not created
  • Raises a warning to the user
  • Creates k8s events (but always in default namespace)
$ kubectl apply -f resticted-kyverno.yaml 
Error from server: error when creating "resticted-kyverno.yaml": admission webhook "validate.kyverno.svc-fail" denied the request: 

resource Deployment/default/echo-restricted was blocked due to the following policies 

restrict-volume-types:
  autogen-restricted-volumes: 'Only the following types of volumes may be used: configMap,
    csi, downwardAPI, emptyDir, ephemeral, persistentVolumeClaim, projected, and secret.'
$ kubectl get events 
28s         Warning   PolicyViolation           clusterpolicy/restrict-volume-types   Deployment restricted/echo-restricted: [autogen-restricted-volumes] fail (blocked); Only the following types of volumes may be used: configMap, csi, downwardAPI, emptyDir, ephemeral, persistentVolumeClaim, projected, and secret.