
Data Platform/Systems/OpenSearch-on-K8s/Administration

From Wikitech

OpenSearch on K8s (WIP)

Intended audience: SREs who operate the OpenSearch on K8s platform.

Note: This platform is not yet in production. See T408586 ☂️ OpenSearch on K8s: Ensure platform is ready for production ☂️ for the latest on production status.

If you are a service owner, or are interested in deploying on the platform, you may be more interested in this page.

Dashboards and Alerts

See our OpenSearch on K8s dashboard. The dashboard contains a number of metrics we use to gauge health. Of particular interest are cluster state and thread pools. FIXME: Add more data.

You can find alerts in the WMF's alert repo. FIXME: Update with alert details once alerts actually exist.

Deploying a New OpenSearch on K8s Cluster

  • Begin by following the general instructions for deploying a new Kubernetes service.
  • Create and merge a patch that increases the default resources for your new namespace (example patch). This is needed because the version of the operator chart we use (2.7.0) does not allow changing the resources allotted to the bootstrap pod, which needs at least 2 GB of RAM to stand up the cluster.
  • Create and add secrets. By default, you will need the following secrets:
    Secret Name     | Details
    username        | Used to access the OpenSearch REST API via basic auth. Set the username to the k8s namespace unless the service owner wants something else.
    password        | Used to access the OpenSearch REST API via basic auth.
    hashed_password | Used to populate the OpenSearch security YAML config, which in turn is pushed to the OpenSearch Security API via securityadmin.sh. This must be a bcrypt hash of the password. You can generate it with htpasswd, Python's bcrypt library, etc.
  • Create and merge a patch that tells the OpenSearch operator to watch your new namespace. Example patch
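
The hashed_password secret can be produced with Python's bcrypt library, as mentioned in the table above. The following is a minimal sketch rather than an official procedure: it assumes the third-party bcrypt package is installed, and the function names and the cost factor of 12 are illustrative choices, not platform requirements. The format check is only a sanity test to run before storing the value.

```python
import re

# Shape of a bcrypt hash: "$2a$", "$2b$" or "$2y$", a two-digit cost,
# then 53 characters of salt and digest (60 characters in total).
BCRYPT_RE = re.compile(r"\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}")


def looks_like_bcrypt(value: str) -> bool:
    """Return True if value has the shape of a bcrypt hash."""
    return BCRYPT_RE.fullmatch(value) is not None


def hash_password(password: str, rounds: int = 12) -> str:
    """Hash password with bcrypt (assumes `pip install bcrypt`)."""
    # Imported lazily so the format check above works without the package.
    import bcrypt

    salt = bcrypt.gensalt(rounds=rounds)
    return bcrypt.hashpw(password.encode("utf-8"), salt).decode("ascii")
```

Pass the plaintext password to hash_password() and store the result as hashed_password. Running looks_like_bcrypt() on a value before committing it catches the common mistake of storing the plaintext in the wrong field.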

Changing Resources on a Live Cluster

The chart has several places to change requests/limits.

As far as I can tell, the one that actually applies is under:

opensearchCluster:
  nodePools:
    - component: masters
      roles:
        - "master"
        - "data"
      resources:
        requests:
          memory: "4Gi"
          cpu: "2000m"
        limits:
          memory: "4Gi"
          cpu: "2000m"


As of this writing, that configuration is applied via this file in our deployment-charts repo. To override it, add the block to the values.yaml file specific to your deployment and/or helmfile release.

Again, as of this writing, the operator does not detect changes to the resources. As such, you'll have to delete the pods one-by-one to apply the resource changes. Take your time and ensure the cluster is healthy before moving on to the next pod!

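1. Check the pod's current memory requests and limits (k is a shell alias for kubectl):

$ k get po -o yaml opensearch-ipoid-test-masters-0 | grep -i memory
        memory: 2Gi
        memory: 2Gi

2. Delete the pod so that it is recreated with the new resources:

$ k delete po opensearch-ipoid-test-masters-0

3. Confirm that the recreated pod has picked up the new values:

$ k get po -o yaml opensearch-ipoid-test-masters-0 | grep -i memory
        memory: 4Gi
        memory: 4Gi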

The operator should be handling this automatically. We'll continue investigating, but it's likely this will be fixed when we move to a newer version of the operator and/or helm chart.
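
Until the operator handles this, the manual steps above can be scripted. The sketch below is one possible shape, not a tested tool: the names rolling_delete, wait_until and kubectl_delete are hypothetical, the timeouts are arbitrary, and the health check is left to the caller as the is_healthy callable (for example, a wrapper around the OpenSearch _cluster/health API that returns True only when the status is green).

```python
import subprocess
import time
from typing import Callable, Iterable, List


def wait_until(check: Callable[[], bool], timeout: float, poll: float,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll check() until it returns True; raise TimeoutError after timeout seconds."""
    waited = 0.0
    while not check():
        if waited >= timeout:
            raise TimeoutError("cluster did not become healthy in time")
        sleep(poll)
        waited += poll


def rolling_delete(pods: Iterable[str],
                   delete_pod: Callable[[str], None],
                   is_healthy: Callable[[], bool],
                   timeout: float = 900.0,
                   poll: float = 10.0,
                   sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Delete pods one at a time, requiring a healthy cluster around each deletion."""
    done: List[str] = []
    for pod in pods:
        # Never start on a pod while the cluster is still recovering.
        wait_until(is_healthy, timeout, poll, sleep)
        delete_pod(pod)
        # Pause once so the cluster has time to notice the missing node
        # before health is checked again.
        sleep(poll)
        wait_until(is_healthy, timeout, poll, sleep)
        done.append(pod)
    return done


def kubectl_delete(namespace: str) -> Callable[[str], None]:
    """Return a delete_pod callable that shells out to kubectl."""
    def delete(pod: str) -> None:
        subprocess.run(["kubectl", "-n", namespace, "delete", "pod", pod], check=True)
    return delete
```

A call might look like rolling_delete(["opensearch-ipoid-test-masters-0"], kubectl_delete("my-namespace"), my_health_check), where my-namespace and my_health_check are placeholders. A TimeoutError stops the loop before the next pod is touched, which keeps the "one pod at a time, only when healthy" rule from the text.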

API Calls

Audit Logging

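# Search the audit log index for a single day.
# PW is expected to hold the basic auth credentials as username:password; PT is the port the API is reachable on.
PT=43885; curl -H 'Content-Type: application/json' -XPOST -k -u "${PW}" "https://0:${PT}/security-auditlog-2025.10.31/_search?pretty"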
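
The index name in the command above embeds the date, so querying another day means editing the URL by hand. The sketch below builds the URL from a date and issues the same request from Python. It is a sketch under several assumptions: the index follows the security-auditlog-YYYY.MM.DD pattern seen above, the API is reachable on localhost, and all function names are hypothetical. Certificate verification is disabled to mirror curl's -k flag.

```python
import base64
import json
import ssl
import urllib.request
from datetime import date


def audit_index(day: date) -> str:
    """Index name for one day of audit logs, e.g. security-auditlog-2025.10.31."""
    return f"security-auditlog-{day:%Y.%m.%d}"


def audit_search_url(day: date, port: int, host: str = "localhost") -> str:
    """URL of the _search endpoint for that day's audit index."""
    return f"https://{host}:{port}/{audit_index(day)}/_search?pretty"


def search_audit_log(day: date, port: int, user: str, password: str) -> dict:
    """POST an empty search to the audit index and return the parsed JSON response."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        audit_search_url(day, port),
        data=b"{}",  # empty query body: match everything
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    # Equivalent of curl -k: skip certificate verification.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

For example, search_audit_log(date(2025, 10, 31), 43885, user, password) targets the same index as the curl command.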