User:BKing (WMF)/Notes/Kubernetes RBAC Troubleshooting

From Wikitech

Kubernetes RBAC Troubleshooting (WMF production environments)

Taken from this Slack thread (WMF NDA required to see). Balthazar's explanation of RBAC in WMF is illuminating, so much so that I've posted it here in the hope that it will be useful to others.

The Problem

T397246 Deploy OpenSearch in dse-k8s

When I try to deploy my application, I get the following error:

 User "opensearch-test-deploy" cannot get resource "roles" in API group "rbac.authorization.k8s.io" in the namespace "opensearch-test"
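
To reproduce the denial without going through a full deploy, you can ask the API server directly. The sketch below wraps `kubectl auth can-i` in a small helper. The `can_user` name is made up for this page, and it assumes you are using a kubeconfig that is allowed to impersonate other users (for instance the admin one, as root on the deployment host).

```shell
# Hypothetical helper (not from the original thread): ask the API server
# whether USER may perform VERB on RESOURCE in NAMESPACE.
# Assumes the current kubeconfig is allowed to impersonate users via --as.
# Example: can_user opensearch-test-deploy get roles opensearch-test
can_user() {
  local user="$1" verb="$2" resource="$3" namespace="$4"
  kubectl auth can-i "$verb" "$resource" --as "$user" -n "$namespace"
}
```

With the user and namespace from the error above, `can_user opensearch-test-deploy get roles opensearch-test` should answer `no` for as long as the error reproduces.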

So where do we configure the ClusterRole granted to the opensearch-test-deploy User?

This file is a good place to start. It is where we define the deploy ClusterRole for each cluster, depending on what is installed in that k8s cluster.

And how do we bind that deploy ClusterRole to your opensearch-operator-deploy User?

For that, we need to have a look at the ClusterRoleBinding resources.

I can find a ClusterRoleBinding such as:

- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: view
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: view
  subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: view

in helmfile_rbac.yaml but that's not what we're looking for. I'm going to assume that deploy is somehow a default ClusterRoleBinding. Let's have a look in k8s directly:

root@deploy1003:~# kubectl get clusterrolebinding | grep deploy
system:controller:deployment-controller     ClusterRole/system:controller:deployment-controller     2y205d

No joy: there is no deploy ClusterRoleBinding. Let's look at the namespaced RoleBindings instead:

root@deploy1003:~# kubectl get rolebinding -A | grep deploy | grep opensearch-operator
opensearch-operator   deploy   ClusterRole/deploy   7d13h

OK, so ClusterRole/deploy is bound per namespace, through a RoleBinding!

root@deploy1003:~# kubectl get rolebinding -n opensearch-operator -oyaml deploy
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deploy
  namespace: opensearch-operator
  
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: deploy
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: opensearch-operator-deploy

So that RoleBinding grants the deploy ClusterRole to the opensearch-operator-deploy User, but only within the opensearch-operator namespace.
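
If you want to see every namespace where a given User is bound, rather than grepping on the binding name, `-o wide` makes `kubectl get rolebinding` print the subjects as well. A minimal sketch, where `bindings_for_user` is a hypothetical helper name and the match is a plain substring search on the output:

```shell
# Hypothetical helper (not from the original thread): list the RoleBindings,
# across all namespaces, whose output line mentions USER.
# -o wide adds USERS, GROUPS and SERVICEACCOUNTS columns to the listing.
# Example: bindings_for_user opensearch-operator-deploy
bindings_for_user() {
  local user="$1"
  kubectl get rolebinding -A -o wide | grep -F -- "$user"
}
```

Since this is a substring match, a user name that is a prefix of another one will match both, so read the USERS column before drawing conclusions.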

Where can we find the associated permissions?

root@deploy1003:~# kubectl get clusterrole deploy -oyaml

Let's break things down.

- apiGroups: ["", extensions, apps, networking.k8s.io, batch]
  resources: ["*"]
  verbs: ["*"]

This means that the user can do anything to every resource in the core, extensions, apps, networking.k8s.io and batch API groups (pods, services, deployments, network policies, jobs, etc.). Basically, all the resources that are supposed to be installed alongside a "normal" app.

- apiGroups: [networking.istio.io]
  resources: [gateways, virtualservices, destinationrules]
  verbs: ["*"]

This means that this ClusterRole can do anything to the listed Istio resources (gateways, virtualservices, destinationrules). That's for ingress.

- apiGroups: [cert-manager.io]
  resources: [certificates]
  verbs: ["*"]

That's for internal X.509 certificates.

- apiGroups: [flink.apache.org] 
- apiGroups: [postgresql.cnpg.io] 
- apiGroups: [opensearch.opster.io] 

These are for the Flink, PostgreSQL (CloudNativePG) and OpenSearch CRDs.

- apiGroups: [crd.projectcalico.org]
  resources: [networkpolicies]
  verbs: ["*"]

That's for Calico NetworkPolicies.

... and that's it. Notice that nowhere in that list did we authorize the deploy ClusterRole to manage anything rbac-related.
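
One way to double-check that conclusion against the live cluster is to list what the API server thinks the user can do in the namespace and filter on the RBAC API group. This is only a sketch: `rbac_rules_for_user` is a hypothetical helper name, it assumes a kubeconfig allowed to impersonate users, and the kubectl documentation warns that the `--list` output may be incomplete, so treat an empty result as a hint rather than proof.

```shell
# Hypothetical helper (not from the original thread): print the rules that
# USER has in NAMESPACE for the rbac.authorization.k8s.io API group.
# Prints nothing and returns non-zero when no such rule is listed.
# Example: rbac_rules_for_user opensearch-operator-deploy opensearch-operator
rbac_rules_for_user() {
  local user="$1" namespace="$2"
  kubectl auth can-i --list --as "$user" -n "$namespace" \
    | grep -F 'rbac.authorization.k8s.io'
}
```

An empty result for opensearch-operator-deploy would match the reading of the ClusterRole above: nothing RBAC-related is granted.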

If we look at, e.g., a RoleBinding resource, we see:

items:
- apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  

So to add support for these resources, we'd need to add something like this to the deploy ClusterRole:

- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["roles", "rolebindings"]
  verbs: ["*"]


Now, let's take a step back and think about why we're doing this.

In the case of an operator, things are a bit more complicated, as it does need to be able to create these resources.

Operator Examples

airflow example

mediawiki-dumps-legacy link

As we can see, the opensearch-operator chart needs to create Roles and RoleBindings

brouberol@deploy1003:~$ sudo -i
root@deploy1003:~# kube_env admin dse-k8s-eqiad
root@deploy1003:~# kubectl get clusterrole admin
NAME    CREATED AT
admin   2023-02-24T08:31:56Z
root@deploy1003:~# kubectl get clusterrole admin -oyaml
…
- apiGroups: [rbac.authorization.k8s.io]
  resources: [rolebindings, roles]
  verbs: [create, delete, get, list, patch, update, watch]

So that admin user is basically the root user in Kubernetes, and only the root UNIX user can act as it, because the admin kubeconfig is readable by root alone.

root@deploy1003:/srv/deployment-charts/helmfile.d/admin_ng# ls -alh /etc/kubernetes/admin-dse-k8s-eqiad.config
-r-------- 1 root root 445 Jun  4 20:53 /etc/kubernetes/admin-dse-k8s-eqiad.config
