
From Wikitech

OpenSearch on K8s

You've reached the documentation for the OpenSearch on K8s platform!

Intended Audience

Foundation employees who need an OpenSearch backing store for their application (small-to-medium use cases only), or who are already building on the platform.

SREs who support the platform may be more interested in this page.

Service/Availability Expectations

Here's what the OpenSearch on K8s platform offers in terms of support.

Responsibilities for the individual OpenSearch cluster are split between the platform owner and application owner, as defined below.

Glossary

  • Platform owner: the team that owns OpenSearch on Kubernetes (currently Data Platform SRE).
  • Application owner: the team or person who owns an application which uses OpenSearch on Kubernetes as a backing store.

Responsibilities

  • Restore Kubernetes services: platform owner
  • Restore OpenSearch operator service: platform owner
  • Redeploy an OpenSearch cluster: platform owner
  • Load data into the OpenSearch cluster: application owner
  • Ensure the application can withstand the loss of at least one of its availability zones (the eqiad/codfw datacenters): application owner
  • Regenerate lost OpenSearch data: application owner
  • Index settings (replica/shard counts): application owner†
  • Index state management (pruning/deleting old indices): application owner†

† See the "OpenSearch Best Practices" section below.

Response Time/SLO

The OpenSearch on K8s platform is supported on a “best effort” basis, which means the platform owner (DPE SRE) will work to restore the platform during business hours only. Alerts, notifications, and responders will be the same as other DPE SRE-supported services.

An SLO for the platform is planned, but not yet available.

Disaster Recovery/Return to Service

The OpenSearch service shall not be the primary backing source of data for any application. In case of data loss, the application owner is expected to regenerate/reload the data into their OpenSearch on Kubernetes instance. OpenSearch-native backups and restores are not part of the platform at this time.

Support Escalations

If you encounter a situation that can't be solved by the troubleshooting section below, not to worry! You can contact DPE SRE via #data-platform-sre Slack (preferred) or #wikimedia-data-platform IRC.

Lifecycle Work/Upgrades

The physical infrastructure which hosts OpenSearch on Kubernetes will need to go offline from time to time. When possible, the platform owner shall notify the application owners via email at least 5 business days before any planned service interruption.

Security Updates

In case of security updates, the platform owners will redeploy the service. When possible, the platform owner shall notify the application owners via email at least 5 business days before any planned service interruption.

Responsible Usage Policy

In the event that the platform owners reasonably suspect that platform-wide issues are being caused by a specific application, the platform owners may take steps to ensure the stability of the platform. These actions include (but are not limited to): rate-limiting, changing resource requests/limits, or temporarily or permanently disabling the OpenSearch cluster. The platform owners will notify the application owner and involve them in the mitigation process as much as is practical.

Consultations

Users can request general advice on OpenSearch optimizations, workflows, etc. from DPE SRE and other subject matter experts at WMF (such as the Search Platform team). Consultations are subject to SME availability and are not considered to be part of the SLO. OpenSearch query optimization is one example of a consultation that would not fall under the SLO.


Platform Info

OpenSearch on K8s
Owner: Data Platform SRE
Kubernetes Cluster: dse-k8s-eqiad, dse-k8s-codfw
Kubernetes Namespace: multiple. For an up-to-date list, check the list of opensearch operator-watched namespaces in the deployment-charts repo:
  eqiad: https://w.wiki/Gbed
  codfw: https://w.wiki/Gbej
Chart: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/charts/opensearch-cluster
Helmfiles: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/master/helmfile.d/dse-k8s-services/opensearch-ipoid/helmfile.yaml (example; varies per cluster)
Docker image: https://gitlab.wikimedia.org/repos/data-engineering/opensearch/
Internal service DNS: https://opensearch-ipoid.discovery.wmnet:30443 (example; varies per cluster)
Public service URL: N/A
Logs: https://logstash.wikimedia.org/goto/21f561792ee30287c25af1624a797702
Metrics: https://grafana.wikimedia.org/d/c0a89788-c6fe-4d06-aeb2-70b63049599e/opensearch-on-k8s
Monitors: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/alerts/+/refs/heads/master/team-data-platform/opensearch-k8s.yaml
Application documentation: https://github.com/opensearch-project/opensearch-k8s-operator/blob/v2.7.0/docs/userguide/main.md
Paging: false
Deployment Phabricator ticket: https://phabricator.wikimedia.org/T408586



Design

OpenSearch on Kubernetes runs in WMF’s primary datacenters (eqiad and codfw). Each datacenter’s service will be independent (data will not be replicated between datacenters).

The platform owners can offer assistance and advice on multi-DC configuration (code review, failover tests, etc) but the application owner is responsible for ensuring the application can survive the loss of a single primary DC (if so desired by the application owner; this is not a requirement for running on the platform).

Flavors

Below is the "t-shirt size" breakdown of available resources. New clusters will get size opensearch-small unless otherwise requested.

Name               Replicas  CPU cores/replica  RAM/replica (GB)  Default disk/replica (GB)  Max disk/replica (GB)
opensearch-small   3         2                  4                 30                         100
opensearch-medium  3         4                  8                 30                         100
opensearch-large   3         8                  16                30                         100

OpenSearch versions

Current versions

See our Docker images repo for the available versions.

Version update cadence

The platform owners will update the OpenSearch version at least once per quarter. Since our Docker images are sourced from upstream Debian packages, we can only update as far as the versions available there.

Application owners are welcome to request newer versions at any time.

OpenSearch Plugins

If your application requires a plugin that's not in the default OpenSearch install, open a Phab task to request the plugin. Plugins will be evaluated on a case-by-case basis. Criteria to be evaluated:

  • Whether or not the plugin is built by the OpenSearch team
  • Whether or not the plugin is owned by WMF
  • Security
  • Resources consumed by plugin
  • Responsiveness of plugin developers

You can see a list of existing plugins by looking for the phrase opensearch-plugin install in the Blubberfile that creates the Docker image for your OpenSearch version (example). FIXME: improve discoverability of installed plugins
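Alternatively, if you have kubectl access to your cluster's namespace, you can list the installed plugins directly from a running pod. This is a sketch: the namespace, pod name, and install path below are illustrative and may differ in WMF's images.

```shell
# List the plugins installed in a running OpenSearch pod.
# "opensearch-test" and the pod name are placeholders; substitute your own.
# The opensearch-plugin binary path may differ in WMF-built images.
kubectl -n opensearch-test exec opensearch-test-masters-0 -- \
  /usr/share/opensearch/bin/opensearch-plugin list
```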

Getting Started With Your New Cluster

Access

Read-only anonymous access

You can reach your new cluster at https://${kubernetes-namespace}.discovery.wmnet:30443/ from anywhere inside WMF.

For example, the opensearch-test cluster URL is

https://opensearch-test.discovery.wmnet:30443/ 

If you asked for a multi-DC deploy, your cluster has endpoints at

https://${kubernetes-namespace}.svc.${dc}.wmnet:30443

as well.
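A quick way to verify connectivity is the cluster health endpoint, which works with the anonymous read-only access described above. This sketch uses the opensearch-test example cluster; substitute your own namespace.

```shell
# Anonymous read-only request: overall health of the opensearch-test
# example cluster. Expect a JSON body with a "status" of green/yellow/red.
curl -s 'https://opensearch-test.discovery.wmnet:30443/_cluster/health?pretty'
```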

Write access

Application owner

In order to write to your cluster, you must supply a username and password. The default username is opensearch; the password can be found on the Puppet server (ask an SRE for help). This user has full permissions for writing data, but lacks some cluster permissions.
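As a sketch, an authenticated write looks like the following. The index name and document are illustrative, and the password placeholder must be replaced with the real value from the Puppet server.

```shell
# Create a document in a hypothetical "my-index" index, authenticating
# as the default "opensearch" user. Replace the password placeholder.
curl -s -u 'opensearch:REPLACE_WITH_PASSWORD' \
  -H 'Content-Type: application/json' \
  -X PUT 'https://opensearch-test.discovery.wmnet:30443/my-index/_doc/1' \
  -d '{"message": "hello"}'
```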

Platform owner

SREs needing to perform some cluster operations (such as forcing shard reallocation) will need to use the operator user until https://phabricator.wikimedia.org/T416714 is complete.

Troubleshooting

Dashboards and Alerts

Dashboards

See our OpenSearch on K8s dashboard. The dashboard contains a number of metrics we use to gauge health.

Look out for the 'interval' dropdown

Unlike other WMF dashboards, OpenSearch on K8s has an "interval" dropdown, used to compute a rolling average. If the panels show up blank, try lengthening the interval. Be aware that an overly long interval can be misleading as well.

For example, we had an incident where latency spiked to about 10x its normal value. After the incident was resolved, we still had the interval set to 24h, so the graphs were still including periods of extremely high latency, making it seem as though the incident was still ongoing. We are working to address this confusion in T417230.

Important Dashboard Panels
  • The "network probes" panel, showing the latency of all OpenSearch clusters per DC.
  • Disk Usage - Nothing good happens when you run out of disk space!
  • Memory usage - amount of memory used per pod
  • Search latency

You can find alerts in the Wikimedia Foundation's alert repo.

Alerting

These alerts will notify DPE SRE once your cluster is in "production" state in service.yaml in Puppet.

As stated in the "Service/Availability Expectations" section, DPE SRE will respond to alerts within business hours.

The table below describes whether or not the application owner has permissions to fix the problem. This is not meant to imply the application owner must fix the problem. Generally speaking, the platform owner (DPE SRE) is the right team to deal with alerts in non-emergency situations.

OpenSearchNode(Low,High)DiskWatermarkReached
  • Probable cause: the OpenSearch pods are out of disk space.
  • Probable solution: expand disk space.
  • Can the application owner fix this with their current permissions? Yes.
  • Should they, in a non-emergency? No.

OpenSearchClusterAtLeastOneRedIndex
  • Probable cause: at least one of the indices is missing data.
  • Probable solution: delete the broken index and/or regenerate the data.
  • Can the application owner fix this with their current permissions? Yes.
  • Should they, in a non-emergency? Only if the cause is well-understood ("I lost data on an index I don't care about; let me completely delete the index to make the alerts go away").

OpenSearchBulkRequestsRejectionJumps
  • Probable cause: the OpenSearch pods are overwhelmed.
  • Probable solution: 1. determine the cause (increased traffic? dying K8s worker?); 2. take action based on the cause (increase resources, block the attacker, etc).
  • Can the application owner fix this with their current permissions? Unknown.
  • Should they, in a non-emergency? Only if the cause is well-understood ("I just tried to ingest a 1-trillion-line bulk JSON file into OpenSearch").

OpenSearchJVMHeapUseHigh
  • Probable cause: the OpenSearch pods are using too much memory.
  • Probable solutions: check pod lifetimes to see if the pods are getting frequently OOMKilled; do typical Java memory troubleshooting; consider raising memory requests/limits in Kubernetes, up to a maximum of 16 GB.
  • Can the application owner fix this with their current permissions? Yes.
  • Should they, in a non-emergency? No.
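When investigating a red-index alert, the _cat APIs are a quick way to identify the affected indices. This is a sketch using the opensearch-test example cluster; anonymous read-only access is sufficient.

```shell
# Show per-index health for the opensearch-test example cluster,
# limited to red (data-missing) indices. The "v" flag adds a header row.
curl -s 'https://opensearch-test.discovery.wmnet:30443/_cat/indices?health=red&v'
```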


OpenSearch Best Practices

  • OpenSearch indices default to 1 primary shard and 1 replica. Since OpenSearch on K8s deployments have 3 nodes, create your indices with 2 replicas instead of 1; this gives you better read performance and more redundancy. To set this cluster-wide for all newly-created indices (highly recommended), use the cluster dynamic setting cluster.default_number_of_replicas. To apply it to an existing index, or while creating a new index, set the index setting index.number_of_replicas.


  • Setting up aliases will simplify future changes, such as changes to mappings or to the number of shards, which cannot be done on a live index.
  • Likewise, you should use Index State Management to manage the lifecycle of your indices (such as periodically reducing replica count or deleting older indices). Note that the OpenSearch Dashboards option mentioned in the link is not available in the OpenSearch on K8s platform, so you will need to use the API.
  • There is no inter-datacenter replication. Client applications should send writes to both clusters.
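The replica and alias recommendations above can be sketched as follows. The cluster URL, index, and alias names are illustrative, and the writes require the credentials described in the "Write access" section.

```shell
# Set the cluster-wide default replica count for newly-created indices.
curl -s -u 'opensearch:REPLACE_WITH_PASSWORD' \
  -H 'Content-Type: application/json' \
  -X PUT 'https://opensearch-test.discovery.wmnet:30443/_cluster/settings' \
  -d '{"persistent": {"cluster.default_number_of_replicas": 2}}'

# Raise the replica count on an existing index (hypothetical "my-index-v1").
curl -s -u 'opensearch:REPLACE_WITH_PASSWORD' \
  -H 'Content-Type: application/json' \
  -X PUT 'https://opensearch-test.discovery.wmnet:30443/my-index-v1/_settings' \
  -d '{"index": {"number_of_replicas": 2}}'

# Point an alias at the index, so that a future mapping or shard-count
# change can be handled by re-creating the index and swapping the alias.
curl -s -u 'opensearch:REPLACE_WITH_PASSWORD' \
  -H 'Content-Type: application/json' \
  -X POST 'https://opensearch-test.discovery.wmnet:30443/_aliases' \
  -d '{"actions": [{"add": {"index": "my-index-v1", "alias": "my-index"}}]}'
```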